New Hampshire Results and the Limitations of Data Analysis

Before the Iowa caucus, we showed the data behind the conventional wisdom that outperforming expectations in Iowa generally leads to a polling bump in New Hampshire. The logic makes sense: positive media coverage and the bandwagon effect attract voters who are still making up their minds. Plotting the data from the last few elections, we saw a clear positive trend in support of this theory. However, the recent New Hampshire results demonstrate an important but often overlooked truth about poll-based predictions: they assume the status quo remains unchanged. The poll averages that many media predictions rely on do a decent job of measuring current support, but they cannot account for future events that have nothing to do with the numbers. So when a surging candidate like Marco Rubio trips up in a debate that draws heavy media attention, all our predictions go out the window.

For all the uncertainty on the Republican side, our simplistic model of converting Iowa support into New Hampshire success actually worked quite well for the Democrats. We first calculated by how many percentage points Bernie Sanders overperformed and Hillary Clinton underperformed against expectations in Iowa. To set those expectations, we adjusted the HuffPost Pollster average for both the viability rule, which hurt Martin O’Malley’s final vote tally, and undecided voters [1]. Against this baseline, Sanders overperformed by 1.2 percentage points and Clinton underperformed by 0.6. In our article on Iowa, we explained that in the last few elections, each percentage point above expectations in Iowa has tended to raise New Hampshire performance by 0.55 percentage points. This allowed us to calculate, using our historical model [2], a prediction for the New Hampshire vote and compare it to the actual results:


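Iowa

| Candidate | HuffPost Pollster Average | Average Adjusted for O’Malley + Undecideds | Actual Result | Result against Expectations |
|-----------|---------------------------|--------------------------------------------|---------------|-----------------------------|
| Sanders   | 44.6                      | 48.4                                       | 49.6          | +1.2                        |
| Clinton   | 47.7                      | 50.5                                       | 49.9          | -0.6                        |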


New Hampshire

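| Candidate | HuffPost Pollster Average (Feb. 1st) | Average Adjusted for O’Malley + Undecideds | Actual Result | Model’s Predicted Result |
|-----------|--------------------------------------|--------------------------------------------|---------------|--------------------------|
| Sanders   | 54.7                                 | 58.4                                       | 60.4          | 59.1                     |
| Clinton   | 39.4                                 | 41.8                                       | 38            | 41.4                     |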
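
The arithmetic behind the two tables can be sketched in a few lines of Python. This is only an illustration, and it assumes the historical model is a simple additive adjustment: the adjusted New Hampshire average plus 0.55 points for every point of Iowa overperformance. The function and variable names are ours for this sketch and do not come from the original analysis. Run on the rounded table values, it gives roughly 59.1 for Sanders and roughly 41.5 for Clinton; the small gap to the 41.4 in the table presumably comes from the original calculation using unrounded inputs.

```python
# Slope quoted in the article: each point above expectations in Iowa
# is associated with roughly 0.55 points in New Hampshire.
IOWA_TO_NH_SLOPE = 0.55

# Values copied from the tables above (percent of the vote).
iowa = {
    "Sanders": {"adjusted": 48.4, "actual": 49.6},
    "Clinton": {"adjusted": 50.5, "actual": 49.9},
}
nh_adjusted = {"Sanders": 58.4, "Clinton": 41.8}


def iowa_overperformance(adjusted, actual):
    """Actual Iowa result minus the adjusted polling expectation."""
    return actual - adjusted


def predict_nh(nh_adjusted_avg, overperformance, slope=IOWA_TO_NH_SLOPE):
    """Assumed additive model: shift the NH average by slope * Iowa surprise."""
    return nh_adjusted_avg + slope * overperformance


predictions = {}
for name, row in iowa.items():
    surprise = iowa_overperformance(row["adjusted"], row["actual"])
    predictions[name] = predict_nh(nh_adjusted[name], surprise)
    print(f"{name}: Iowa {surprise:+.1f}, predicted NH {predictions[name]:.2f}")
```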

All in all, the historical model was fairly accurate at predicting the actual New Hampshire results, missing by just 1.3 percentage points for Sanders and 3.4 for Clinton. While the shift was larger than our model predicted, the model did adjust expectations in the correct direction for both candidates. The relative success of the model for the Democrats likely reflects a Democratic race that progressed more or less as the model assumed. Sanders’ campaign beat expectations in Iowa, received energetic press, and saw a small bump in New Hampshire. The Clinton campaign faced media questions about the rising threat of Sanders, and saw a small drop in support in New Hampshire.
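
These accuracy claims can be checked directly against the New Hampshire table. The snippet below is a minimal sketch using the table's rounded values; the dictionary layout and names are illustrative only. For each candidate it computes the size of the miss and compares the direction of the predicted shift with the direction of the actual shift, both measured from the adjusted polling average.

```python
# New Hampshire values from the table above (percent of the vote).
nh = {
    "Sanders": {"adjusted": 58.4, "actual": 60.4, "predicted": 59.1},
    "Clinton": {"adjusted": 41.8, "actual": 38.0, "predicted": 41.4},
}

report = {}
for name, row in nh.items():
    # Absolute gap between the actual result and the model's prediction.
    miss = abs(row["actual"] - row["predicted"])
    # Shifts relative to the adjusted polling average.
    predicted_shift = row["predicted"] - row["adjusted"]
    actual_shift = row["actual"] - row["adjusted"]
    # The model "got the direction right" if both shifts share a sign.
    same_direction = (predicted_shift > 0) == (actual_shift > 0)
    report[name] = (round(miss, 1), same_direction)
    print(
        f"{name}: miss {miss:.1f} pts, "
        f"predicted shift {predicted_shift:+.1f}, "
        f"actual shift {actual_shift:+.1f}, "
        f"same direction: {same_direction}"
    )
```

On these numbers, the predicted shifts are +0.7 for Sanders and -0.4 for Clinton, against actual shifts of +2.0 and -3.8.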

The Republican results in New Hampshire, however, defied the expectations set by our model and by most political pundits. Most pundits declared that Marco Rubio and Ted Cruz had beaten expectations in the Iowa caucus, while Donald Trump had underperformed. Following conventional punditry as well as our quantitative model, we would thus expect Rubio and Cruz to see a bump in support in New Hampshire and Trump’s numbers to fall.

In actuality, the exact opposite happened. Trump’s final result surpassed his New Hampshire HuffPost polling average from the day of the Iowa caucus, even though his campaign had to fight off negative media coverage arguing that many of his supporters were unlikely to turn up at the polls. Meanwhile, Rubio’s and Cruz’s final vote tallies were lower than their polling averages a week earlier. What happened? Put simply, events broke the assumptions made by both our model and the pundits. Our model predicted that Rubio’s momentum from Iowa would carry over and lift his support in New Hampshire. And this prediction seemed to be coming true, as Rubio’s HuffPost polling average in New Hampshire was climbing the week before the primary. However, an unforeseen and unpredictable debate gaffe likely cost him supporters in the crucial few days before votes were cast. Cruz faced favorable demographics in Iowa but unfavorable ones in New Hampshire, which likely played a role in his disappointing finish. And the surprisingly high turnout among Trump’s supporters showed the media once again how difficult it is to compare them with historical examples.

So, in conclusion, New Hampshire’s results show that while our models and punditry occasionally get it right, both kinds of political prognostication work best when certain assumptions hold. The Democrats had no last-minute surprises, and estimates were pretty close to the final result. However, the last-second poll swings on the Republican side demonstrate that even the so-called experts base their predictions on imperfect assumptions, and it’s sometimes best to take both polls and pundits with a grain of salt.



One thought on “New Hampshire Results and the Limitations of Data Analysis”

  1. You should also take into account candidates that have different focuses in different states. Kasich only cared about New Hampshire and ignored Iowa.
