We addressed this requirement by setting up a large-scale crawl that fetched numerous sources in parallel at regular intervals throughout the day, while still adhering to politeness policies so that the servers of these sources were not hit excessively.
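One way to picture the politeness constraint is as a per-host minimum delay: requests to different hosts may run in parallel, but two requests to the same host are kept a fixed interval apart. The sketch below is illustrative only and is not the crawler described here; the function name `schedule_fetches`, the `min_delay` value, and the example URLs are all assumptions.

```python
from collections import defaultdict
from urllib.parse import urlparse


def schedule_fetches(urls, min_delay=2.0, start=0.0):
    """Plan fetch times so that hits on the same host are spaced out.

    Returns a list of (time_offset_seconds, url) pairs sorted by time.
    `min_delay` is a hypothetical politeness interval, not a value
    taken from the original system.
    """
    # Earliest time at which each host may be contacted again.
    next_free = defaultdict(lambda: start)
    plan = []
    for url in urls:
        host = urlparse(url).netloc
        fetch_time = next_free[host]
        plan.append((fetch_time, url))
        # Reserve the host until the politeness interval has passed.
        next_free[host] = fetch_time + min_delay
    # sorted() is stable, so URLs with equal times keep their input order.
    return sorted(plan, key=lambda item: item[0])


if __name__ == "__main__":
    example_urls = [
        "http://a.example/page1",
        "http://a.example/page2",
        "http://b.example/page1",
    ]
    for offset, url in schedule_fetches(example_urls):
        print(f"t+{offset:.1f}s  {url}")
```

Everything scheduled at the same offset can be fetched concurrently, since each such URL belongs to a different host. A production crawler would typically also honour each site's `robots.txt`, which this sketch leaves out.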
Feeds from various social media platforms were aggregated through a Geo-Intelligence API that we developed, which ensured that feeds were captured only from the desired locations. The list of locations, sources, keywords and queries was modified dynamically based on client requirements and feedback. Over 200,000 feeds were collected from various continents within two months. Every week, fresh data is collated location-wise and delivered.
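The internals of the Geo-Intelligence API are not described here, so the following is only a minimal sketch of the general idea: keep a feed only if its coordinates fall within a configured radius of a desired location, then group the surviving feeds by location for delivery. The `Region` class, the function names, the feed dictionary layout, and the example cities and radii are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A desired location: a centre point plus a radius (assumed model)."""
    name: str
    lat: float
    lon: float
    radius_km: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * earth_radius_km * math.asin(min(1.0, math.sqrt(a)))


def match_region(feed, regions):
    """Return the name of the first region containing the feed, or None."""
    for region in regions:
        distance = haversine_km(feed["lat"], feed["lon"],
                                region.lat, region.lon)
        if distance <= region.radius_km:
            return region.name
    return None


def collate_by_location(feeds, regions):
    """Drop feeds outside every region and group the rest by region name."""
    grouped = {}
    for feed in feeds:
        name = match_region(feed, regions)
        if name is not None:
            grouped.setdefault(name, []).append(feed)
    return grouped


if __name__ == "__main__":
    # Hypothetical regions and feeds, for illustration only.
    regions = [
        Region("Mumbai", 19.0760, 72.8777, 50.0),
        Region("London", 51.5074, -0.1278, 50.0),
    ]
    feeds = [
        {"id": 1, "lat": 19.10, "lon": 72.90},    # near Mumbai
        {"id": 2, "lat": 48.8566, "lon": 2.3522},  # Paris, outside both
        {"id": 3, "lat": 51.50, "lon": -0.10},    # near London
    ]
    for name, items in collate_by_location(feeds, regions).items():
        print(name, [f["id"] for f in items])
```

Because the regions are plain data, the list can be changed between runs without touching the filtering code, which is in the spirit of the dynamically modified location list described above.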