At PittSmartLiving we have been trying to find ways to avoid overcrowded buses through win-win-win settings for all involved stakeholders (port authorities, riders and local businesses), improve riding conditions and, in general, “flatten the curve” for public transport. As part of our efforts we have been analyzing crowding data in buses and businesses. The latter is important, since it means that we can use the same methods to study changes in businesses’ foot traffic during the coronavirus pandemic and understand how people react to recommendations for social distancing (which, in the author’s personal opinion, should be termed physical distancing, since we are still socializing using the technological advancements of our era).
We have been using Google’s Places API to collect crowding data. You might have come across this information when you searched for your favorite cafe and got back a bunch of information about it, including the bar chart below:
This chart shows how “busy” the place is expected to be, based on historical data from the cell phones of people who use Google’s services, as well as how busy the place is right now (red-ish bar). These numbers are expressed as a percentage of the most popular time of the week for the venue. For example, if the red bar in the figure above corresponds to 40, and this place is expected to be at its weekly busiest on a Sunday at noon, then the place is currently 40% as busy as it is on a Sunday at noon. Of course, the current value can also be larger than 100, which means that the place is busier than at its expected weekly busiest time. Google’s estimates for the expected levels of crowdedness are updated in a rolling-window fashion (the details are not fully known to the public).
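To make the arithmetic concrete, here is a minimal sketch of how one might read these numbers. The function names are hypothetical and are not part of any Google API, and comparing the live value with the usual value for the same hour is our assumption about a reasonable way to quantify a change, not a description of Google's method.

```python
def fraction_of_peak(value):
    """Convert a chart value (weekly peak = 100) to a fraction of the peak."""
    return value / 100


def percent_change_from_usual(live, usual):
    """Percent difference between the live value and the usual value
    for the same hour. Both inputs are on the 0-100+ chart scale.
    Returns None when the venue is usually empty at that hour."""
    if usual <= 0:
        return None
    return (live - usual) / usual * 100


# Hypothetical reading: the live bar is at 40, the usual value is 80.
print(fraction_of_peak(40))               # share of the weekly peak
print(percent_change_from_usual(40, 80))  # negative means quieter than usual
```

A live value above 100 simply yields a fraction above 1, i.e., the venue is busier than at its expected weekly peak.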
It should be evident that we can use these data to estimate how people are distancing these days. We started collecting data for this purpose from a select number of venues in Pittsburgh on March 13th, and we found some interesting patterns. People, in general, were following recommendations (the order for sheltering in place was announced on March 19th and enforced on March 23rd in Pennsylvania). During that week, traffic was down approximately 45% in retail stores and malls, approximately 30% in restaurants, and approximately 65% in transportation hubs/stations. Following are some representative time-series examples of venues that experienced a reduction in traffic.
One significant exception was bars on March 14th, which were busier than normal, with Pittsburghers celebrating St. Patrick’s Day, as can be seen in the following time series:
The only type of business that did not see any significant decline during the first week of data collection in Pittsburgh was grocery stores. They saw a small decline of 4%, but things have changed since that week, with larger declines observed. For example, the following figure shows average daily changes for a busy grocery store:
These results seem to indicate that people stocked up and then distanced themselves even from grocery stores. Among the 30 grocery stores that we are currently monitoring, there has been an average reduction of 30% in crowdedness levels over the last 10 days. Now, it is always good to understand the data. Part of this decline may be due to policies put in place by various grocery stores on how many people can be on their premises at any given time. This will certainly have an impact on the volumes reported by Google and other providers, so it is good to keep these things in mind when trying to understand the data and draw conclusions.
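As a rough illustration of how an average reduction across venues could be computed, here is a minimal sketch. It assumes that for each venue we have a list of (live, usual) readings on the chart scale described earlier; the data layout, the function names and the choice to average per-venue changes are all our assumptions for this example, not a description of the exact pipeline we used.

```python
from statistics import mean


def venue_change(readings):
    """Percent change of total live busyness versus total usual busyness
    for one venue. `readings` is a list of (live, usual) pairs."""
    live_total = sum(live for live, _ in readings)
    usual_total = sum(usual for _, usual in readings)
    if usual_total == 0:
        return None  # no usual traffic to compare against
    return (live_total - usual_total) / usual_total * 100


def average_change(venues):
    """Average the per-venue percent changes, skipping venues
    for which no change could be computed."""
    changes = [venue_change(r) for r in venues.values()]
    changes = [c for c in changes if c is not None]
    return mean(changes) if changes else None


# Hypothetical readings for two made-up venues.
venues = {
    "grocery_a": [(30, 60), (20, 40)],
    "grocery_b": [(45, 50)],
}
print(average_change(venues))
```

Averaging per venue gives every store the same weight regardless of its size; pooling all readings first would instead let the busiest stores dominate the result.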
Recently, I also came across another dataset, from Foursquare, that captures foot traffic in venues. I was a bit skeptical initially, since people rarely check in to the places they go, but after digging a bit deeper into the data I found that these are not based on check-ins but rather on passive sensing of user locations (i.e., similar to what Google does). I was particularly interested in residential venues (which we cannot get information about from the Google API) and how foot traffic has changed there. First I took a look at the US as a whole, and following are some interesting figures:
We can see the natural progression here through the month of March across the whole country, with foot traffic in residential places being significantly reduced by the end of the month (as compared to a similar day in February)! Now again we have to understand what the data measure. Someone might be confused and say that this does not make sense, since we are staying at home more. This is true, but these Foursquare data measure foot traffic, i.e., how many people are in a building/venue. This means that by the end of March there were fewer people in a residential building than expected (as compared to a baseline from February). This points to people physically distancing from their close friends and family as well, staying home with their immediate family only. Simply put, they do not have people over. Following is the time series of these changes for Allegheny County, which tells a similar story.
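For readers who want to try a similar baseline comparison on their own data, here is a minimal sketch. It assumes that "a similar day" means the same day of the week and that the baseline is the February average for that weekday; this is our guess for illustration purposes, not Foursquare's documented method, and the function names and counts are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def weekday_baseline(feb_counts):
    """Average February count per weekday (0 = Monday).
    `feb_counts` maps a date to a daily foot-traffic count."""
    by_weekday = defaultdict(list)
    for day, count in feb_counts.items():
        by_weekday[day.weekday()].append(count)
    return {wd: mean(counts) for wd, counts in by_weekday.items()}


def change_vs_baseline(day, count, baseline):
    """Percent change of one day's count versus the February
    baseline for the same weekday, or None if there is no baseline."""
    base = baseline.get(day.weekday())
    if not base:
        return None
    return (count - base) / base * 100


# Hypothetical counts for two February Mondays and one March Monday.
feb = {date(2020, 2, 3): 100, date(2020, 2, 10): 120}
baseline = weekday_baseline(feb)
print(change_vs_baseline(date(2020, 3, 30), 88, baseline))
```

Matching on the weekday keeps weekends from being compared against workdays, which would otherwise distort the change for residential venues.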
Overall, people seem to be taking this seriously (as they should) but there is still more that we can do! Stay far from each other; it saves lives!
Note: At the beginning of April, Google published a similar analysis covering several countries; you can access these reports here.