Comparing World Cup Prediction Algorithms – Ranker vs. FiveThirtyEight
Like most Americans, I pay attention to soccer/football once every four years. But I think about prediction almost daily, so this year’s World Cup will be especially interesting to me, as I have a dog in this fight. Specifically, UC-Irvine Professor Michael Lee put together a prediction model based on the combined wisdom of Ranker users who voted on our “Who will win the 2014 World Cup” list, plus the structure of the tournament itself. The methodology stands in contrast to the FiveThirtyEight model, which uses entirely different data (national team results plus the league-play results of the players who will be playing for each national team) to make predictions. As such, the battle lines are clearly drawn. Will the Wisdom of Crowds outperform algorithmic analyses based on match results? Put another way, this is a test of whether human beings notice things that aren’t picked up in the box scores and statistics that form the core of FiveThirtyEight’s predictions and of sabermetrics more generally.
So who will I be rooting for? Both methodologies agree that Brazil, Germany, Argentina, and Spain are the teams to beat. But the crowd believes that those four teams are relatively evenly matched, while the FiveThirtyEight statistical model gives Brazil a 45% chance to win. After those first four, the models diverge quite a bit: both agree on Colombia, but the crowd picks the Netherlands, Italy, and Portugal among the next few, while the FiveThirtyEight model picks Chile, France, and Uruguay. Accordingly, I’ll be rooting for the Netherlands, Italy, and Portugal and against Chile, France, and Uruguay.
In truth, the best model would combine the signal from both methodologies, similar to how the Netflix Prize was won or how baseball teams combine scout and sabermetric opinions. I’m pretty sure that Nate Silver would agree that his model would be improved by adding our data (or comparable data from betting markets, which also think that FiveThirtyEight is underrating Italy and Portugal), and that ours would be improved by adding his. Still, even though I know that chance will play a big part in the outcome, I’m hoping Ranker data wins in this year’s World Cup.
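To make the idea of combining the two signals concrete, here is a minimal Python sketch of one simple way to do it: a weighted average of the two sets of win probabilities, renormalized so they sum to one. This is only an illustration, not how either model works and not how the Netflix Prize was won. The function name, the 50/50 weight, and nearly all of the numbers are placeholders I made up; only Brazil’s 45% is the FiveThirtyEight figure quoted above.

```python
def blend_forecasts(crowd, stats, weight_crowd=0.5):
    """Weighted average of two win-probability forecasts, renormalized to sum to 1.

    crowd, stats: dicts mapping team name -> win probability.
    weight_crowd: weight on the crowd forecast (0 to 1); the rest goes to stats.
    """
    if not 0.0 <= weight_crowd <= 1.0:
        raise ValueError("weight_crowd must be between 0 and 1")
    teams = set(crowd) | set(stats)
    # A team missing from one forecast is treated as having probability 0 there.
    blended = {
        team: weight_crowd * crowd.get(team, 0.0)
        + (1.0 - weight_crowd) * stats.get(team, 0.0)
        for team in teams
    }
    total = sum(blended.values())
    if total <= 0:
        raise ValueError("forecasts must contain some positive probability")
    return {team: p / total for team, p in blended.items()}


# Placeholder numbers for illustration only. The crowd values are invented to
# look "relatively evenly matched"; in the stats forecast only Brazil's 0.45
# comes from the post, and the rest are invented.
crowd_forecast = {"Brazil": 0.20, "Germany": 0.18, "Argentina": 0.18,
                  "Spain": 0.17, "Everyone else": 0.27}
stats_forecast = {"Brazil": 0.45, "Germany": 0.12, "Argentina": 0.13,
                  "Spain": 0.08, "Everyone else": 0.22}

combined = blend_forecasts(crowd_forecast, stats_forecast, weight_crowd=0.5)
for team, p in sorted(combined.items(), key=lambda item: -item[1]):
    print(f"{team}: {p:.1%}")
```

In practice the weight would not be a guess. It would be tuned on past tournaments, by checking which mix of crowd and statistics would have predicted earlier results best.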
- Ravi Iyer