Happy New Year, With a Margin of Error
When I got up this morning, I read this embarrassing bit of journalism from Reuters, “Clinton holds lead as Romney slips in Iowa”:
Clinton, a New York senator, maintained a stable four-point edge over Illinois Sen. Barack Obama, 30 percent to 26 percent, in the Democratic race. Former North Carolina Sen. John Edwards was in third at 25 percent, down one point overnight. Huckabee, a former Arkansas governor, widened his lead over Romney among Republicans to 29 percent to 25 percent. Romney, a former Massachusetts governor who has been on the attack against Huckabee, slipped two points overnight. [Then, further down in the article:] … [the poll] has a margin of error of 3.3 percentage points…
The poll’s margin of error erases any notion of a lead by anyone. The article should instead say something along the lines of: “polls indicate a statistical dead heat between Clinton, Obama, and Edwards for the Democratic race, and between Huckabee and Romney for the Republican race.”
This is explained nicely in this fictional polling example from What is a Survey (PDF), available from the American Statistical Association:
In the case of the mayoral poll in which 55 of 100 sampled individuals support Ms. Smith, the sample estimate would be that 55 percent support Ms. Smith—however, there is a margin of error of 10 percent. Therefore, a 95 percent confidence interval for the percentage supporting Ms. Smith would be (55%-10%) to (55%+10%) or (45 percent, 65 percent), suggesting that in the broader community the support for Ms. Smith could plausibly range from 45 percent to 65 percent.
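The 10 percent figure in that example can be reproduced with the usual normal-approximation formula for a proportion. The sketch below assumes simple random sampling and a 95 percent confidence level (z of about 1.96); the function name is my own illustration, not something taken from the ASA pamphlet.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a sample proportion p from n respondents.

    Assumes simple random sampling and the normal approximation;
    z=1.96 corresponds to a 95 percent confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The mayoral example: 55 of 100 sampled individuals support Ms. Smith.
moe = margin_of_error(0.55, 100)
low, high = 0.55 - moe, 0.55 + moe

print(f"margin of error: {moe:.1%}")
print(f"95% interval: {low:.1%} to {high:.1%}")
```

Under those assumptions the margin comes out just under 10 points, which is consistent with the rounded 10 percent in the quoted example.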
The margin of error is actually even greater if you intend to use the numbers as a means of comparing one candidate’s support to another’s:
In more technical terms, a law of probability dictates that the difference between two uncertain proportions (e.g., the lead of one candidate over another in a political poll in which both are estimated) has more uncertainty associated with it than either proportion alone.
Accordingly, the margin of error associated with the lead of one candidate over another should be larger than the margin of error associated with a single proportion, which is what media reports typically mention (thus the need to keep your eye on what’s being estimated!).
Until media organizations get their reporting practices in line with actual variation in results across political polls, a rule of thumb is to multiply the currently reported margin of error by 1.7 to obtain a more accurate estimate of the margin of error for the lead of one candidate over another. Thus, a reported 3 percent margin of error becomes about 5 percent and a reported 4 percent margin of error becomes about 7 percent when the size of the lead is being considered.
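Applying that rule of thumb to the Reuters numbers is simple arithmetic. The sketch below uses only the figures quoted above (the 3.3-point reported margin and the two four-point leads) and the ASA's 1.7 multiplier; the function names are hypothetical, and the multiplier is a rule of thumb rather than an exact calculation.

```python
def lead_margin(reported_moe, factor=1.7):
    """Rule-of-thumb margin of error for the lead between two candidates.

    factor=1.7 is the multiplier suggested in the ASA pamphlet quoted above.
    """
    return reported_moe * factor

def lead_is_outside_margin(first, second, reported_moe, factor=1.7):
    """True only if the gap between two candidates exceeds the lead margin."""
    return abs(first - second) > lead_margin(reported_moe, factor)

reported_moe = 3.3  # percentage points, as reported by Reuters

races = [
    ("Clinton vs. Obama", 30, 26),
    ("Huckabee vs. Romney", 29, 25),
]

for race, first, second in races:
    verdict = "a real lead" if lead_is_outside_margin(first, second, reported_moe) else "a statistical tie"
    print(f"{race}: lead of {first - second} points, "
          f"lead margin about {lead_margin(reported_moe):.1f} points, {verdict}")
```

Both four-point leads sit well inside a lead margin of roughly 5.6 points.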
The whole thing gets even goofier in the case of the Democratic Iowa caucuses, where a candidate must pass a viability threshold:
After 30 minutes, the electioneering is temporarily halted and the supporters for each candidate are counted. At this point, the caucus officials determine which candidates are “viable”. Depending on the number of county delegates to be elected, the “viability threshold” can be anywhere from 15% to 25% of attendees. For a candidate to receive any delegates from a particular precinct, he or she must have the support of at least the percentage of participants required by the viability threshold. Once viability is determined, participants have roughly another 30 minutes to “realign”: the supporters of inviable candidates may find a viable candidate to support, join together with supporters of another inviable candidate to secure a delegate for one of the two, or choose to abstain. This “realignment” is a crucial distinction of caucuses in that (unlike a primary) being a voter’s “second candidate of choice” can help a candidate.
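To make the viability step concrete, here is a minimal sketch of the first count in a single precinct. The head counts and candidate labels are invented for illustration, and the 15 percent threshold is just the low end of the range mentioned above; the real threshold depends on how many delegates the precinct elects.

```python
def split_by_viability(counts, threshold_pct=15):
    """Split candidates into viable and non-viable after the first count.

    counts maps candidate name -> number of supporters in the room.
    threshold_pct is the viability threshold as a percentage of attendees.
    """
    total = sum(counts.values())
    viable = {name for name, n in counts.items() if n * 100 >= threshold_pct * total}
    non_viable = set(counts) - viable
    return viable, non_viable

# Hypothetical precinct with 100 attendees (made-up numbers).
first_count = {"Candidate A": 42, "Candidate B": 33, "Candidate C": 16, "Candidate D": 9}

viable, non_viable = split_by_viability(first_count)
free_agents = sum(first_count[name] for name in non_viable)

print("viable:", sorted(viable))
print("non-viable:", sorted(non_viable))
print("supporters who must realign or abstain:", free_agents)
```

In this made-up precinct, Candidate D's nine supporters are the only ones whose second choice matters.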
The Reuters article I quoted at the beginning also noted the popularity figures for voters’ second choices, but those numbers are not particularly helpful. The overall set of second choices isn’t what matters; what matters is the second choices of those who support the candidates likely to be non-viable (Dodd, Paul, etc.). Those numbers would be hard to get, because you’d have to conduct a much larger poll to get reliable figures for the small subset of people who support the likely non-viable candidates.
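A rough calculation shows how much larger. The excerpt doesn't give the poll's sample size, so the sketch below backs one out of the 3.3-point margin, assuming simple random sampling, a 95 percent confidence level, and the worst case of a 50/50 split. The 10 percent share for supporters of likely non-viable candidates is a made-up figure for illustration only.

```python
import math

def sample_size_for(moe, p=0.5, z=1.96):
    """Respondents needed for a given margin of error (normal approximation)."""
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

def worst_case_moe(n, p=0.5, z=1.96):
    """Margin of error for n respondents, assuming simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

implied_n = sample_size_for(0.033)        # sample size implied by a 3.3-point margin
subgroup_share = 0.10                     # hypothetical share backing non-viable candidates
subgroup_n = round(implied_n * subgroup_share)
subgroup_moe = worst_case_moe(subgroup_n)
needed_total = math.ceil(implied_n / subgroup_share)

print(f"implied sample size: about {implied_n}")
print(f"subgroup of {subgroup_n} respondents: margin of error about {subgroup_moe:.1%}")
print(f"total sample needed for a 3.3-point subgroup margin: about {needed_total}")
```

Under those assumptions, second-choice figures for the subgroup would carry a margin of error of around 10 points, and matching the headline margin would take roughly ten times as many respondents.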
Having said all that, the media is going to do a huge disservice to the American people by trumpeting a Republican and a Democratic “winner” tomorrow night. Whoever wins is likely to do so only because of their “second choice” support, and whoever comes in third in the Democratic race is likely to be doomed (except perhaps Clinton), even if it’s by a trivial margin. (The dynamic is different in the Republican race, since neither McCain nor Giuliani has campaigned substantially in Iowa, whereas the top three Democratic candidates have campaigned intensely.) Remembering Kerry’s virtual coronation by the media as the inevitable nominee after winning Iowa in 2004, Anonymous Liberal wrote an excellent piece yesterday:
…if Iowa had a primary (like most other states), we could be pretty confident that the final tally would resemble the poll numbers we’re seeing now. Any one of the top three [Democratic] candidates could win, but it would likely be a very narrow victory, with the other two candidates just a few percentage points back.
In a rational universe, that kind of outcome–particularly in a small, unrepresentative state like Iowa–would be virtually meaningless. We’d call Iowa a draw and everyone would move on to the next state, their prospects unchanged. After all, why should we just hand the nomination to a candidate who only bested his rivals by 1 or 2 percentage points in one state?
…We know already that after a year of campaigning, the level of support for the three major Democratic candidates among Iowans is roughly the same. That will be true regardless of the final delegate count there. I wish journalists would keep that in mind when covering the results Thursday night, but I know that they won’t.
As a result, the Democratic nominee will again be chosen through a bizarre game of red-rover played by roughly 5% of the population of Iowa.
UPDATE: Just like trying to predict the weather, it’s awfully easy to make the wrong predictions in politics, no matter how knowledgeable you are (not that I’m staking a particular claim on knowledge). Now that the Iowa caucus results are in, my prediction that the winner would be put over the top by “second choice” voters was wrong. It looks like Edwards got the lion’s share of the second choice votes, which means this was a solid win by Obama:
…According to the entrance poll, which only measured first preferences of the participants going in, the numbers were: Obama 35%, Hillary 27%, Edwards 23%.
If we assume that the final state delegate numbers actually approximated the votes of the caucus participants, this means John Edwards was the big second-choice winner, as he boosted his final score by seven points, compared to only three points for Obama and two for Hillary. It was enough to just overtake Hillary for second place, but not enough for first — because it turned out that Obama entered as the clear winner from first choices alone.
The other key aspect to Obama’s victory was the huge increase in Democratic caucus turnout (almost 90% higher than 2004), and his impressive numbers among these new voters:
Here’s another figure from the entrance poll: An astonishing 57% of caucusers were first-time participants. And how did they vote? Barack Obama carried them with 41% of the people going in and before second-choice reallocations, followed by Hillary Clinton at 29% and John Edwards at 18%.
And among the returning caucus-goers? Edwards was carrying them with 30%, with Obama at 26% and Hillary with 24%.
This tells us two things. First, Obama’s strategy of bringing in new caucus-goers worked, the first time in recent history where such a strategy actually did so in the caucus. It’s a big change from when Howard Dean tried it with less than impressive results…
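As a quick sanity check, the first-timer and returning-caucuser splits quoted above should blend back into the overall entrance-poll numbers. The sketch below uses only the percentages from the quoted passages and weights them by the 57 percent first-timer share; the variable names are mine.

```python
first_time_share = 0.57  # share of caucusers who were first-time participants

first_timers = {"Obama": 41, "Clinton": 29, "Edwards": 18}
returning = {"Obama": 26, "Clinton": 24, "Edwards": 30}
reported_overall = {"Obama": 35, "Clinton": 27, "Edwards": 23}

def blended_support(candidate):
    """Weighted average of first-timer and returning-caucuser support."""
    return (first_time_share * first_timers[candidate]
            + (1 - first_time_share) * returning[candidate])

for candidate, reported in reported_overall.items():
    blended = blended_support(candidate)
    print(f"{candidate}: blended {blended:.1f}%, reported {reported}%")
```

The blended figures land within about half a point of the reported first-preference numbers, so the two sets of entrance-poll figures are consistent with each other.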


