Confluence of Data Mining and Web Crawling
1993: the ’90s saw a buzz around data mining, the days when tech publishers started running series on mining techniques and approaches. Colleges introduced courses, and a good deal of research was produced to ride this wave. Data mining essentially meant employing clustering or machine learning techniques to draw conclusions from data […]
Data as a Service platform for Market Research
Traditionally, when data sources were limited, market research firms had different kinds of processes in place. Reports were created by manually entering data into systems, and results were later visualized via some standard analyses. But with the exponential rise in data volume, both online and offline (because of online resources), techniques […]
Extract product data feeds from E-commerce websites
The growing presence of retailers online and big data soaring to new heights call for a quick look at what’s trending these days in the e-commerce landscape. The requirements we receive from our enterprise clients for crawling and extracting products can broadly be categorized into three: Collecting product information from specific […]
Customized 404 / Freshness Checker for URLs
With websites that link back to the original source for a particular piece of information, there’s an inherent problem of maintaining freshness of those links. Let’s take an example of a digital classified ad listing company that aggregates various ads from multiple sources on the web, and links each such ad back to its source […]
Targeting International Clients – how we bypassed into them
It was only when someone asked us, “How many Indian clients do you guys serve currently?”, that we realized we were serving only 2 at that point and had gone directly international since PromptCloud’s inception. How this happened (unintentionally) and why it happened, we can closely guess. The Indian ecosystem wasn’t ready for such […]
What enterprises do with Big Data- Part 2
It’s amusing how Big Data is knocking on doors these days, and so it took a while to settle down after the overwhelming response to the previous post. Here’s the next one. Notes: a) Notes that applied to the previous batch apply here too. b) Only public data gets crawled in the process and robots.txt is strictly […]