Archive for October 24th, 2008

How to Read Pre-Election Polls, Part 1

One confusing aspect of this election is the ubiquity of, and apparent disagreement between, a large number of different polls.  How should one read them?   First, the national polls.  Yesterday we were treated to a variety of national polls on the state of the race:

Rasmussen:  Obama 52  McCain 45
Gallup (trad):  Obama 50  McCain 46
Gallup (exp):  Obama 51  McCain 45
Reuters/Zogby: Obama 52 McCain 40
Hotline/FD:   Obama 48  McCain 43
IBD/TIPP:       Obama 45  McCain 44
GWU/Battleground:  Obama 49 McCain 45
CBS/NYT:  Obama 52 McCain 39
ABC/WP:  Obama 54 McCain 43

A few basic facts about polls:

1.  All polls post a margin of error, which can vary widely.  Gallup is at 2%, IBD/TIPP at 3.5%, and some are up near 5%.   That margin of error means there is a 95% chance that the true result lies within that range.  We can say with 95% certainty that if the methodology of IBD/TIPP is accurate, the race is anywhere from Obama up 48.5 to 40.5 (Obama +8) to McCain up 47.5 to 41.5 (McCain +6).  This is also assumed to follow a “normal curve,” meaning the probability that the poll is accurate falls off at an even rate as you head toward the edge of the margin of error.
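The IBD/TIPP example above can be sketched as a few lines of arithmetic (a minimal illustration using the numbers from the text, with each candidate's share allowed to move independently within the margin of error):

```python
# Numbers from the IBD/TIPP example: Obama 45, McCain 44, MoE +/- 3.5 points.
obama, mccain, moe = 45.0, 44.0, 3.5

# Each candidate's share could plausibly be anywhere within +/- MoE.
obama_range = (obama - moe, obama + moe)     # (41.5, 48.5)
mccain_range = (mccain - moe, mccain + moe)  # (40.5, 47.5)

# Extreme spreads, if the errors cut in opposite directions:
best_obama = obama_range[1] - mccain_range[0]   # Obama 48.5 vs. 40.5
best_mccain = mccain_range[1] - obama_range[0]  # McCain 47.5 vs. 41.5

print(best_obama)   # 8.0  (Obama +8)
print(best_mccain)  # 6.0  (McCain +6)
```

Note this is the widest possible band; values near the center of each range remain far more likely than values at the edges, per the normal-curve point above.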

However, odds are that one in twenty polls (and we get over 20 national polls a week) falls outside the margin of error.  That means some polls we see are, even with correct methodology, way off.  So when on one day last week one poll had Obama up by 1 and another had him up by 14, it’s likely at least one of them was at the edge of, or outside, its margin of error.  In such cases, I usually ignore obvious outliers on either side.
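The "one in twenty" point can be checked with a quick back-of-envelope calculation, assuming each poll independently has a 5% chance of landing outside its stated 95% margin of error:

```python
# With 95% confidence intervals, each poll misses ~5% of the time.
n_polls = 20
p_outside = 0.05

# Expected number of misses in 20 polls:
expected_outliers = n_polls * p_outside  # 1.0 on average

# Chance that AT LEAST one of the 20 is outside its margin:
p_at_least_one = 1 - (1 - p_outside) ** n_polls

print(expected_outliers)         # 1.0
print(round(p_at_least_one, 2))  # 0.64
```

So in any given week of 20+ national polls, a genuine outlier is not just possible but close to a two-in-three proposition, which is why ignoring the extremes is a reasonable habit.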

2.  There are polls, and there are tracking polls.  Most of the polls being cited are tracking polls, which come out daily but represent three or more days of polling.  This means not all of the data is new.  One poll (GWU/Battleground) does no polling on Fridays and Saturdays and uses five days of polling data, so the poll posted today includes data from October 16th.  They do this because they conduct only 200 interviews a day, compared to, say, Gallup and Rasmussen, who both do 1000 a day and use three days of data, including weekends.   Does this make Gallup and/or Rasmussen better than GWU/Battleground?   Not necessarily, though I generally prefer larger samples and recent data.  It does mean that with tracking polls it’s often more important to look for trends, and the GWU poll might show a trend later than the Gallup poll.
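A toy sketch, with invented daily numbers, of why a five-day window like GWU/Battleground's reports a shift later than a three-day window like Gallup's or Rasmussen's:

```python
# Hypothetical daily Obama leads: steady at +2, then a shift to +6 on day 6.
daily_lead = [2, 2, 2, 2, 2, 6, 6, 6]

def rolling_average(series, window):
    """What a tracking poll reports each day: the average of the
    most recent `window` days of data available so far."""
    out = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# The 3-day tracker fully reflects the shift two days sooner than
# the 5-day tracker, which is still diluted by pre-shift data.
print(rolling_average(daily_lead, 3))
print(rolling_average(daily_lead, 5))
```

The underlying opinion shift is identical in both cases; only the reporting lag differs, which is the sense in which a wider window "shows the trend later."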

Some polls are NOT tracking polls, however.   On Wednesday, Fox Opinion Dynamics published a poll showing Obama up 49-40, with a margin of error of 3%, based on a sample of 1100 taken over two days.  These polls generally give a clearer snapshot because they use only one or two days of data.  They usually have fewer data points than the “big” tracking polls (hence a larger margin of error), but because they are conducted over a briefer period, the snapshot may be more accurate.  Unlike tracking polls, though, they make it harder to gauge trends.
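The post doesn't give the formula, but the standard 95% margin-of-error approximation for a proportion reproduces the Fox poll's stated 3%, and shows how the bigger pooled samples of the daily trackers buy a smaller margin:

```python
import math

# Standard approximation: MoE ~= z * sqrt(p*(1-p)/n), maximized at p = 0.5,
# with z = 1.96 for 95% confidence. Returned in percentage points.
def margin_of_error(n, p=0.5, z=1.96):
    return 100 * z * math.sqrt(p * (1 - p) / n)

# Fox Opinion Dynamics: n = 1100 over two days, stated MoE 3%.
print(round(margin_of_error(1100), 1))  # 3.0

# A tracker pooling 1000 interviews/day over three days (n = 3000):
print(round(margin_of_error(3000), 1))  # 1.8
```

This is only the sampling error; it says nothing about errors from a flawed likely-voter model or weighting, which the methodology discussion below takes up.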

Since most emphasis is now being placed on tracking polls, I use a couple of rules of thumb.  First, watch for trends, and compare trends between polls.  If the polls agree on a particular trend, it’s probably real.  If not, look at their data gathering.  GWU/Battleground showed a different trend than the others until two days ago; it turns out that was precisely because they still had older data in their results than the others.  Second, don’t over-react to sudden changes.  Because a tracking poll dumps its oldest data each day and replaces it with new data, there can be a quick jump.  After McCain’s decision to suspend his campaign, the next Gallup poll had McCain and Obama at 49-49.  Within two days Obama had a seven-point lead.  But that was due to the replacement of dumped data covering five days.  Finally, remember that on average one in every twenty polls may be outside the margin of error, and it is even more common for any single day’s data to be off.  So wait for broad trends, not sudden jumps one way or another.

3.  Polls have different methodologies.   Polling all registered voters usually does not give a very good sense of the final turnout on election day.  Pollsters have learned that asking whether a voter is likely to vote, and then about the voter’s recent voting history, yields a better result.  So does using demographic data.   Some pollsters, like AP/Roper (who had Obama up only one point yesterday), take a generally conservative approach, focusing primarily on voters with a history of voting.  This tends to downplay (how much depends on the assumptions made) the impact of first-time voters.  This election, however, may bring in a larger number of first-time voters than usual, so many question Roper’s results and methodology.

Gallup, the granddaddy of pollsters, has done the unusual thing of posting three sets of results: all registered voters (Obama +7), an expanded likely-voter model based on voter intent rather than history (Obama +6), and a ‘traditional’ likely-voter model that weighs history more heavily (Obama +4).  If the Obama ground game is as strong as some believe (and early voting may indicate), then the expanded model is probably more accurate.  If, however, this is another false Democratic hope for a sudden surge in turnout, the traditional model is probably accurate.

If you want to really dissect a poll, many of them post their complete results.  The Roper poll, for instance, notes that 23% of total voters, and 21% of likely voters considered themselves liberal or very liberal.  39% of total and 38% of likely voters considered themselves conservative.   From this one can learn that for Obama to lead, he must be getting support from some conservatives, or if not, from nearly all the independents.  One might also wonder if Roper didn’t oversample conservatives.
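The Roper arithmetic can be roughed out as follows. Treating the moderate/other share as the swing group is a simplification of my own (the post speaks of independents, which is a party label rather than an ideological one), but it illustrates the inference:

```python
# Roper's likely-voter ideology shares, from the text above.
liberal = 0.21
conservative = 0.38
moderate = 1 - liberal - conservative  # ~0.41 moderate/other

# Suppose Obama carries every liberal and no conservatives.
# To reach 50% overall, he needs this share of the remaining group:
needed_from_moderates = (0.50 - liberal) / moderate

print(round(needed_from_moderates, 2))  # ~0.71
```

Needing over 70% of the middle, with zero conservative crossover, is a demanding assumption; hence the suspicion that either Obama is drawing some conservatives or Roper's sample leans conservative.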

The CBS poll notes that every registered voter is weighted (to fit demographics) and then assigned a probability of likelihood to vote, and the likely voter turnout is based on that (using every voter, plus the probability calculation.)  They do not explain how they make that calculation, though they do show how they weight votes.  For instance, total registered voters interviewed were 1046, weighted to 1010.  Republicans interviewed were 326 weighted to 287, Democrats 391 weighted to 411, and independents 329 weighted to 312.  While the population ratio of Democrats to Republicans is correct, one can wonder if perhaps the larger lead for Obama comes from overweighting Democrats.
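The CBS figures quoted above imply a per-group adjustment factor, which makes the direction of the weighting easy to see (a simple ratio of weighted to raw counts; CBS's actual procedure, as the post notes, is not fully disclosed):

```python
# Raw interviews vs. weighted totals, from the CBS poll's released data.
raw      = {"Republicans": 326, "Democrats": 391, "Independents": 329}
weighted = {"Republicans": 287, "Democrats": 411, "Independents": 312}

# Average weight applied to each group's respondents:
for group in raw:
    factor = weighted[group] / raw[group]
    print(group, round(factor, 2))
# Republicans ~0.88 (weighted down), Democrats ~1.05 (weighted up),
# Independents ~0.95 -- the adjustment that could pad Obama's lead
# if the assumed party ratio is wrong.
```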

But most people don’t want to dig into the polls themselves.   Just know that polls use different methods, and most don’t go into much public detail about exactly how they weight their samples.   All want to be accurate, so many may be putting less emphasis on historical voting patterns than in the past, hoping not to be caught out by Obama’s GOTV efforts.   This all makes Gallup’s publication of three sets of numbers all the more interesting.

In general, political scientists trust polls because polling is a science, and most pollsters are quite professional.  But we know the limits of polling: a mistaken assumption can yield a faulty methodology, and inevitably some polls will fall outside the margin of error.   I like tracking polls for watching trends, and in general the high-“N” (and low margin of error) polls like Rasmussen and Gallup are best in that regard.

But, of course, national polls are relevant at this stage primarily to see a ‘big trend’ — Obama pulling away, or McCain mounting a surge.  The real interest now is in state contests.  So tomorrow I’ll tackle the question of how to read state polls.
