A friend of ours once described Big Data as the belief that if you just look hard enough through a big enough pile of horseshit, somewhere in there you’ll find yourself a pony.
Food website Chef's Pencil took up the challenge. In an article that appears to have since been deleted, they took a big data approach to finding the best barbecue in the country.
Chef's Pencil compiled tens of thousands of TripAdvisor ratings from over two thousand barbecue joints in seventy-five US cities. They ranked the cities by the average rating of the barbecue joints in each, and came up with this Top-10 list.
Perhaps this already offends you at first glance, or perhaps it takes a closer look. Either way....
Seattle.
Now look, we live in Seattle and it's a really nice place. We've got glorious views of the mountains and ocean, beautiful summers, a thriving arts scene, endless opportunities for outdoor recreation, wonderful coffee...but one thing we don't have is an excess of fantastic barbecue. Come visit us—but order the salmon.
What went wrong? So many things. It's hard to even know where to start. But let's focus on the single biggest issue: the data collected are not appropriate to answer the question at hand.
Crowdsourced restaurant ratings like those on TripAdvisor, Yelp, etc., predominantly serve to compare restaurants within cities, not between cities.
The reason is fairly straightforward. Most of the restaurant ratings in a given city will be left by people who live there. Their baseline will be set by the other restaurants they've visited, which will also tend to be in the same region.
So what these data tell us is that people in Seattle rate Seattle barbecue places higher than people in Fort Worth rate Fort Worth barbecue places. We don't know how people in Seattle would rate Fort Worth barbecue, or how people in Fort Worth would rate Seattle barbecue. Thus we can't make meaningful cross-country comparisons from these data. At best, we can make within-city comparisons, and note that Seattle's Jones Barbecue ("they ran out of brisket, which means it must be amazing!") is rated more highly than Seattle's Bigfoot BBQ ("technically it is BBQ").
Presumably if Texans rated Washington and Washingtonians rated Texas, we'd see a very different picture. The figure above illustrates what might happen. Pacific Northwest dude (left) visits two places in Seattle and rates them a 5 and a 4. If he visited Fort Worth, he'd rate both places there a 5, but he doesn't. Texas guy visits two places in Fort Worth, and rates them 3 and 4 respectively. He's got high standards. If he visited Seattle, he'd be disgusted with what he found, but he doesn't. In the end, Fort Worth has better barbecue but Seattle rates higher because the Seattle rater is more generous.
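If you'd rather see the arithmetic than take our word for it, here is a minimal sketch of that scenario in Python. The Seattle rater's scores and the Texan's Fort Worth scores come from the example above. Everything else is our assumption for illustration: the restaurant names are placeholders, and the Texan's Seattle scores (2 and 1) are made-up numbers standing in for "disgusted".

```python
from statistics import mean

# Stars each rater WOULD give each place if he visited it.
# The Texan's Seattle scores (2 and 1) are invented for illustration.
would_rate = {
    "seattle_rater": {"Seattle A": 5, "Seattle B": 4,
                      "Fort Worth A": 5, "Fort Worth B": 5},
    "texas_rater": {"Seattle A": 2, "Seattle B": 1,
                    "Fort Worth A": 3, "Fort Worth B": 4},
}

# Placeholder restaurant names mapped to their cities.
city_of = {"Seattle A": "Seattle", "Seattle B": "Seattle",
           "Fort Worth A": "Fort Worth", "Fort Worth B": "Fort Worth"}

# Where each rater lives, and so where his real reviews get left.
home_city = {"seattle_rater": "Seattle", "texas_rater": "Fort Worth"}


def city_averages(only_home_visits):
    """Average stars per city, optionally keeping only hometown reviews."""
    scores = {"Seattle": [], "Fort Worth": []}
    for rater, ratings in would_rate.items():
        for place, stars in ratings.items():
            city = city_of[place]
            if only_home_visits and city != home_city[rater]:
                continue  # he never visits, so this rating is never recorded
            scores[city].append(stars)
    return {city: mean(values) for city, values in scores.items()}


# What a ratings site actually records: everyone reviews close to home.
observed = city_averages(only_home_visits=True)
# The counterfactual: every rater visits and rates every place.
everyone_everywhere = city_averages(only_home_visits=False)

print("observed:", observed)
print("if everyone rated everything:", everyone_everywhere)
```

With these numbers the hometown-only averages put Seattle ahead of Fort Worth, and the order flips once both raters score both cities. The restaurants are identical in the two calculations. Only who did the rating changed.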
As for why Seattleites rate Seattle barbecue places higher? There could be any number of reasons. Maybe people in Seattle don't know what good barbecue is. Maybe they do know, but adjust their standards downward given what is available locally. Maybe they're just Seattle nice when they leave restaurant reviews. Maybe they're trolling their friends who moved to Austin. We're sure you can come up with other explanations yourself.
At the end of the day, you can't use ratings to compare A and B if the people rating them differ systematically. I admit this was a painful realization for me. You see, I thought I had this fatherhood thing dialed in, until I discovered that Jevin has a "World's Best Dad" coffee mug just like mine.