Big Data Solutions for Classified Engines
Before the internet, classified ads were extremely monolithic, requiring a pair of reading glasses even for normal eyes. Then came the internet and an explosion of websites. Classifieds were being digitized too, and Craigslist was the pioneer. What followed is evident to all. More players rushed into this space, and given the […]
Amazon EC2 On-demand vs. Reserved Instance Price Calculator
If you use Amazon EC2 extensively, then you’ve fallen prey to the dilemma of “to reserve or not to reserve”. It is, however, a simple question in one sense: with a reserved instance, you pay a lower hourly fee but also pay an additional one-time fee upfront. So what is the optimal […]
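The trade-off described in this excerpt can be sketched as a break-even calculation. This is a minimal illustration, not the calculator from the post: the function names and the prices in the usage lines are hypothetical, and real EC2 pricing varies by instance type, region, and term.

```python
def break_even_hours(on_demand_hourly, reserved_hourly, upfront_fee):
    """Hours of usage at which reserved and on-demand cost the same.

    All three inputs are assumed prices in the same currency.
    """
    savings_per_hour = on_demand_hourly - reserved_hourly
    if savings_per_hour <= 0:
        # Reserving never pays off if the hourly rate is not lower.
        raise ValueError("reserved hourly rate must be lower than on-demand")
    return upfront_fee / savings_per_hour


def cheaper_option(on_demand_hourly, reserved_hourly, upfront_fee, expected_hours):
    """Return 'reserved' or 'on-demand' for the expected hours of usage."""
    on_demand_cost = on_demand_hourly * expected_hours
    reserved_cost = upfront_fee + reserved_hourly * expected_hours
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Hypothetical prices: 0.50/hour on-demand, 0.25/hour reserved, 500 upfront.
print(break_even_hours(0.5, 0.25, 500))        # 2000.0
print(cheaper_option(0.5, 0.25, 500, 3000))    # reserved
print(cheaper_option(0.5, 0.25, 500, 1000))    # on-demand
```

Below the break-even point the upfront fee is not recovered, so on-demand is cheaper; above it, the lower hourly rate wins.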
Funding geographic data for Geo-specific Market Research
Geographic data is, by far, the single most indispensable entity on which a business stands these days, even before it stands on brick and mortar. We have already discussed various use cases that validate this statement in our older posts in the What enterprises do with Big Data series. But here’s a specific use case that deserves […]
Big Data for the Enterprise – Use Cases
This post has two precursors: Part 1 and Part 2. Here’s the third batch of Big Data For Enterprise use cases. Use Cases of Big Data for Enterprise: The source site that I’d like to collect data from limits the number of results. So use a search engine to collect as many results as […]
Coming to Life- The All New Website is Here!
For those who don’t remember how the old website looked, we could have been glad about it, but we’d rather show the before-after difference. Here it is: one big blob of text versus structured information. The new website design is finally live. What led to the revamp? Design: The old website was designed […]
Web Crawler vs Hosted Web Scraping Solution
Web scraping is a widely known term these days; not just because so much data exists around us, but more because there’s already so much being done with that data. Let’s try to analyze the differences between opting for software that comes with DIY components over picking a hosted data acquisition or hosted crawl solution […]
Confluence of Data Mining and Web Crawling
1993: The ’90s saw a buzz in data mining; these were the days when tech publishers started multi-part series on mining techniques and approaches. Courses were introduced in colleges and multiple research studies were produced to ride this wave of data mining. Data mining essentially meant employing clustering or machine learning techniques to draw conclusions based on data […]
Data as a Service platform for Market Research
Traditionally, when data sources were limited, market research firms had different kinds of processes in place. Reports were created by manually entering data into systems, and results were later visualized via some standard analyses. But with the exponential rise in data volume, both online and offline (because of online resources), techniques […]
Extract product data feeds from E-commerce websites
The increasing presence of retailers online and big data soaring to new heights call for a quick look at what’s trending these days in the e-commerce landscape. The requirements that we receive from our enterprise clients for crawling and extracting products can largely be categorized into three: Collecting product information from specific […]
Customized 404 / Freshness Checker for URLs
With websites that link back to the original source for a particular piece of information, there’s an inherent problem of maintaining freshness of those links. Let’s take an example of a digital classified ad listing company that aggregates various ads from multiple sources on the web, and links each such ad back to its source […]