Last summer, I wrote about the statistically improbable restaurant, the restaurant you wouldn’t expect to find in a small American city: the excellent Nepali food in Erie, PA and Akron, OH; a gem of a Gambian restaurant in Springfield, IL. Statistically improbable restaurants often tell you something about the communities they are based in: Erie and Akron have large Lhotshampa refugee populations, Nepali-speaking people who lived in Bhutan for years before being expelled from their country; Springfield has University of Illinois Springfield, which attracts lots of West African students, some of whom have settled in the area.
The existence of the statistically improbable restaurant implies a statistically probable restaurant distribution: the mix of restaurants we’d expect to find in an “average” American city. Of course, once you dig into the idea of an “average” city, the absurdity of the concept becomes clear.
There are 343 cities in the US with populations of over 100,000 people, from 8.47 million in New York City to 100,128 in Sunrise, Florida (a small city in the Ft. Lauderdale, FL metro area). Within that set are global megacities like New York and LA, state capitals, college towns, towns growing explosively and those shrinking slowly.
I’ve retrieved data about the restaurants in 340 of these cities using the Google Places API. This is a giant database of geographic information from across the world – not only does it include information about restaurants, but about parks, churches, museums and other points of interest. The API was designed to make it easy to search by proximity – “return all restaurants within 2km of this point” – but it’s recently gained an “aggregate” attribute, which allows you to ask questions like “How many Mexican restaurants are there in Wichita Falls, Texas?”.
The API is not perfect. I tested my queries on my hometown of Pittsfield, MA and while it got some questions (the number of Dunkin’ Donuts) completely correct, it missed others entirely, failing to identify our two excellent Brazilian restaurants when I searched for that category. We’re going to proceed with the assumption that the data is imperfect, and sanity-check when we get surprising results.
For starters, we look to see whether there’s a relationship between the population of a city and the number of restaurants located within city limits. It seems obvious that New York City should have significantly more restaurants than Lincoln, Nebraska, and indeed, that’s true. When we look at the whole set of cities, from 100,000 to 8 million people, there’s a straightforward linear relationship between population and restaurants with a few interesting outliers: Houston has more restaurants than we might expect for a city its size, and Phoenix has fewer.
The data is messier as we look at smaller sets of cities. Looking at cities with populations over 250,000, lopping off the four largest US cities (New York, Los Angeles, Chicago, Houston), a linear regression no longer fits as well. Some of the cities that are celebrated for their “creative economies” – Austin, San Francisco, Portland, Seattle, Nashville, Boston – have more restaurants than we might expect, while some less celebrated cities of comparable size – Fort Worth, Jacksonville, Indianapolis, El Paso, Oklahoma City – have fewer than we might expect.
Exploring the cities between 100,000 – 250,000, there’s still a clear relationship between population and the number of restaurants, but that relationship explains just over half the variance in the data (R² = 0.5333). Some of the cities that are especially restaurant-rich are relatively small capital cities – Little Rock, AR; Providence, RI; Baton Rouge, LA; Tallahassee, FL – and college towns – Knoxville, TN; Tempe, AZ. Some of the cities that have fewer restaurants than expected are close to larger cities – Cape Coral, FL is next to Fort Myers; Yonkers, NY is next to New York City; Moreno Valley, CA may be overshadowed by Riverside and San Bernardino.
(These are rough guesses based on staring at scatterplots. I’ll want to try some regressions before positing that capital cities have a higher than usual number of restaurants because lobbyists need to take legislators out to eat.)
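For anyone who wants to move from staring at scatterplots to something more systematic, here is a minimal sketch of the kind of check described above: fit a straight line to population versus restaurant count, compute R², and rank cities by how far they sit above or below the line. This is an illustration under stated assumptions, not the analysis behind the numbers in this post. The function names (`fit_line`, `rank_by_residual`) are hypothetical, and the four "cities" and their counts are invented placeholders, not data from the Places API.

```python
from typing import NamedTuple


class Fit(NamedTuple):
    slope: float       # extra restaurants per additional resident
    intercept: float   # predicted restaurants at population zero
    r_squared: float   # share of variance explained by the line


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return Fit(slope, intercept, 1 - ss_res / ss_tot)


def rank_by_residual(cities, fit):
    """Sort cities from most restaurant-rich to most restaurant-poor,
    relative to what the fitted line predicts for their population."""
    rows = [
        (name, count - (fit.intercept + fit.slope * pop))
        for name, pop, count in cities
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented placeholder data: (name, population, restaurant count).
CITIES = [
    ("City A", 100_000, 210),
    ("City B", 150_000, 340),
    ("City C", 200_000, 380),
    ("City D", 250_000, 560),
]

fit = fit_line([pop for _, pop, _ in CITIES], [cnt for _, _, cnt in CITIES])
for name, residual in rank_by_residual(CITIES, fit):
    print(f"{name}: {residual:+.1f} restaurants vs. the fitted line")
print(f"R^2 = {fit.r_squared:.4f}")
```

Cities at the top of the ranking have more restaurants than the line predicts and cities at the bottom have fewer. Testing a hunch like the one about capital cities would take a further step, such as comparing the residuals of capitals against everyone else, which this sketch does not attempt.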
With all this data, we can now imagine an “average” American city of 100,000 people. We’ll call our imagined city “New Springfield, California”. (California has 76 cities with 100,000 or more people, ahead of Texas with 42. There are three Springfields in our set of cities, and five cities that start with “New”.)