According to an article published by Forbes, Walmart had 20,000 stores in 28 countries as of July 2021. It is still the largest retailer in the world, with Amazon a distant second at almost half its sales. Although it was established in 1962 and is hardly a new company, it has stepped up its tech efforts and left behind many newer players in the industry. It is also one of the leading companies working with data and enabling data-backed decision-making in its boardrooms.
In 2021, it started building the world’s largest private cloud, which could process roughly 2.5 petabytes (2,500 TB) of data per hour. To work with this massive volume of data, it has also set up an analytics hub called Data Café at its Bentonville, Arkansas headquarters. At this hub, close to 200 streams of internal and external data, as well as 40 petabytes of transactional data, can be transformed, visualized, or used to create models. Cutting the time required for crunching data from weeks to minutes has helped the company spot trends and make decisions faster, thus shortening the turnaround time for applying data effectively.
Walmart and big data
E-commerce sites and retailers often use internal and external data sources (such as competitor data) for dynamic pricing management. While this is the default (and often the only) use case for most companies, Walmart uses its data sources for several other activities:
Personalize your online shopping experience
Just like Netflix uses your previous usage data to provide you with a personalized experience and recommendations, Walmart uses your historical data to show products and deals that might be more relevant to you. This helps with customer retention and often leads to larger order sizes.
Improve in-store checkout processes
Those who still prefer to go grocery shopping at physical stores dread unmanned checkout counters and long lines. Walmart is trying to remove such last-point bottlenecks by studying historical data and computing how many associates are needed for efficient billing at any hour of the day.
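The staffing idea above can be sketched in a few lines of Python. This is only an illustration, not Walmart's actual method: the `history` records, the `CUSTOMERS_PER_ASSOCIATE_PER_HOUR` throughput figure, and the `associates_needed` helper are all assumptions made up for the example.

```python
import math
from collections import defaultdict

# Hypothetical historical records: (hour_of_day, customers_checked_out).
# Each tuple is one observed day; the figures are invented for illustration.
history = [
    (9, 40), (9, 50), (9, 45),
    (13, 120), (13, 110), (13, 130),
    (18, 200), (18, 220), (18, 210),
]

# Assumed throughput: customers one associate can bill per hour.
CUSTOMERS_PER_ASSOCIATE_PER_HOUR = 25


def associates_needed(records, per_associate):
    """Return {hour: associates} sized for the average observed demand."""
    observed = defaultdict(list)
    for hour, customers in records:
        observed[hour].append(customers)

    plan = {}
    for hour, counts in observed.items():
        average = sum(counts) / len(counts)
        # Round up so an average hour is never under-staffed.
        plan[hour] = math.ceil(average / per_associate)
    return plan


staffing_plan = associates_needed(history, CUSTOMERS_PER_ASSOCIATE_PER_HOUR)
for hour in sorted(staffing_plan):
    print(f"{hour:02d}:00 -> {staffing_plan[hour]} associates")
```

Sizing for the average is the simplest possible choice. A real planner would more likely size for a high percentile of demand so that unusually busy days are covered too.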
Supply chain management
Every item reaches customers through a series of steps, each involving a different transportation system. Walmart tries to optimize the supply chain by reducing the steps as much as possible and changing truck timings to ensure that trucks can fill their entire cargo space. It even studies routes and timings to figure out which route would get orders to customers the earliest.
Restocking pharmacies efficiently
It uses internal as well as historical data to create simulations and predict certain data points with a high degree of accuracy. These include:
- the time of day when stores see maximum footfall
- the busiest days of a month or year
- which medicines are most in demand
All this information helps in managing staff and medicines efficiently to ensure less time is required for filling every prescription.
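As a rough sketch of how the three data points listed above could be pulled out of a historical prescription log, consider the following. The `prescriptions` records and the medicine names are hypothetical sample data, and simple counting stands in for the simulations the article mentions.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical prescription log: (ISO timestamp, medicine name).
prescriptions = [
    ("2021-07-05T09:15", "metformin"),
    ("2021-07-05T17:40", "atorvastatin"),
    ("2021-07-05T17:55", "metformin"),
    ("2021-07-06T17:10", "metformin"),
    ("2021-07-06T11:30", "lisinopril"),
    ("2021-07-12T17:20", "atorvastatin"),
]

by_hour = Counter()
by_weekday = Counter()
by_medicine = Counter()

for stamp, medicine in prescriptions:
    when = datetime.fromisoformat(stamp)
    by_hour[when.hour] += 1                      # footfall by time of day
    by_weekday[DAY_NAMES[when.weekday()]] += 1   # busiest days
    by_medicine[medicine] += 1                   # medicines most in demand

busiest_hour = by_hour.most_common(1)[0][0]
busiest_day = by_weekday.most_common(1)[0][0]
top_medicine = by_medicine.most_common(1)[0][0]

print(f"Busiest hour: {busiest_hour}:00")
print(f"Busiest day: {busiest_day}")
print(f"Most requested medicine: {top_medicine}")
```

Counts like these are only the input to a forecast. They show where staff and stock are most likely to be needed, and a simulation or model would then estimate how much.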
Optimize product selections
It uses data from both online and offline sales to stock the optimal selection of brands and products on shelves at its stores and warehouses. It also tries to gauge which of its in-house brands are a hit with customers so that it can increase their availability.
Data points and sources
Discussing use cases is a great way to increase public interest in a topic. However, what we need to focus on most are the data points being collected and the sources of these data streams.
Walmart has a wide presence across international boundaries as well as in the online sphere. This is why it can gather data from multiple sources:
- 245 million customers at 10,900 physical stores as well as 10 live websites worldwide every day.
- 300,000 mentions and tags across social media websites every day.
- 200,000 associates, with close to 50,000 more hired every year, all of whom generate internal data and enable Walmart to improve its hiring process and provide better working environments.
- Customer data on 145 million US citizens, 60% of whom are adults.
This massive data hoard allows Walmart to analyze millions of keywords daily and bid on keywords accordingly to place its advertisements. It is also able to analyze thousands of products, both those that it sells and those that it doesn’t. All this has enabled it to increase its sales by stocking the products that customers want most.
Walmart has gone so far as to analyze local events, weather and social media phenomena and how they impact customer behaviour. For example, suppose a movie is a hit and the lead actor wears a watch which immediately becomes a rage among young adults. Walmart would be able to predict higher sales from social media data and would try to stock up on the product.
Converting data challenges into competitions
Every company faces data challenges when working with new datasets or trying to answer new questions using data. In 2014, Walmart needed to find an efficient way to predict sales with a limited range of historical data. It held a competition on Kaggle where it shared the sales data for 45 physical stores across multiple departments. Sales on special days and during the holiday season were also tagged in the data.
Participants were also given additional data points for the location of each store. These included weather patterns, unemployment percentages, median wage, cost of fuel, and more. This was a recruitment challenge, which allowed Walmart to kill two birds with one stone.
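To make the shape of that problem concrete, here is a minimal sketch that joins weekly sales rows with per-store context and computes a naive baseline forecast. The field names, the sample values, and the `join_features` and `baseline_forecast` helpers are assumptions for illustration. This is not the competition's actual schema or any participant's solution.

```python
from collections import defaultdict

# Hypothetical sales rows: (store, dept, week, weekly_sales, is_holiday).
sales = [
    (1, 10, "2012-11-09", 1000.0, False),
    (1, 10, "2012-11-16", 1200.0, False),
    (1, 10, "2012-11-23", 3000.0, True),
    (2, 10, "2012-11-09", 500.0, False),
    (2, 10, "2012-11-23", 900.0, True),
]

# Hypothetical regional context keyed by (store, week).
features = {
    (1, "2012-11-09"): {"fuel_price": 3.4, "unemployment": 7.1},
    (1, "2012-11-16"): {"fuel_price": 3.5, "unemployment": 7.1},
    (1, "2012-11-23"): {"fuel_price": 3.5, "unemployment": 7.0},
    (2, "2012-11-09"): {"fuel_price": 3.6, "unemployment": 8.2},
    (2, "2012-11-23"): {"fuel_price": 3.7, "unemployment": 8.2},
}


def join_features(sales_rows, feature_map):
    """Attach the store's regional context to every sales row."""
    joined = []
    for store, dept, week, amount, is_holiday in sales_rows:
        row = {"store": store, "dept": dept, "week": week,
               "weekly_sales": amount, "is_holiday": is_holiday}
        # Rows with no matching context are kept without the extra fields.
        row.update(feature_map.get((store, week), {}))
        joined.append(row)
    return joined


def baseline_forecast(rows):
    """Mean weekly sales per (store, dept, is_holiday): a naive baseline."""
    buckets = defaultdict(list)
    for row in rows:
        key = (row["store"], row["dept"], row["is_holiday"])
        buckets[key].append(row["weekly_sales"])
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


enriched = join_features(sales, features)
baseline = baseline_forecast(enriched)
print(baseline[(1, 10, False)])  # average of the two non-holiday weeks
```

Splitting the average by the holiday flag reflects the fact that holiday weeks were tagged separately in the data. A serious model would also use the joined context columns as predictors rather than just carrying them along.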
Implementing data practices from Walmart
In case you plan to scrape product or pricing data from Walmart, you should first decide on the department that you want to target. Getting all the data from all the departments might turn out to be a ginormous task. In case you are operating in a specific geographical location, it would also be wise to scrape data only related to that place. Getting all the data and filtering through it later would be a two-fold waste of time and computational resources.
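The advice above amounts to filtering the list of pages before fetching anything, rather than filtering the data afterwards. A minimal sketch of that step follows. The `CrawlTarget` type, the `select_targets` helper, and the `example.com` URLs are placeholders invented for illustration and do not correspond to real Walmart paths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlTarget:
    url: str
    department: str
    region: str


# Hypothetical crawl frontier; the URLs are placeholders.
frontier = [
    CrawlTarget("https://example.com/grocery/tx", "grocery", "TX"),
    CrawlTarget("https://example.com/grocery/ca", "grocery", "CA"),
    CrawlTarget("https://example.com/electronics/tx", "electronics", "TX"),
    CrawlTarget("https://example.com/pharmacy/tx", "pharmacy", "TX"),
]


def select_targets(targets, departments, regions):
    """Keep only the targets in the wanted departments and regions."""
    wanted_departments = {d.lower() for d in departments}
    wanted_regions = {r.upper() for r in regions}
    return [
        t for t in targets
        if t.department.lower() in wanted_departments
        and t.region.upper() in wanted_regions
    ]


to_fetch = select_targets(frontier, departments=["grocery"], regions=["TX"])
print(f"Fetching {len(to_fetch)} of {len(frontier)} pages")
```

Only the pages in `to_fetch` would then be handed to the actual scraper, so no bandwidth or compute is spent on departments or regions that are out of scope.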
Scraping data from Walmart can get you places, given the variety of markets and departments that it serves and the number of products in its catalogue. However, you’d go much further if you adopted the “data practices” at Walmart, be it from a data handling perspective or its cloud infrastructure.