Race42008.com’s Sean Oxendine Explains Polling Very Clearly- “On Error Margins”

One of the many fine posters at Race42008.com is Sean Oxendine. His post on polling, “On Error Margins,” explains the margin-of-error concept in lay terms for those of us who enjoy the horse-race aspects of politics, so it’s worthwhile to post it here:

In comments on DaveG’s analysis of McCain leading Obama by a point in the Franklin & Marshall Poll, long-time favorite commenter of mine Caroline writes:

With a MoE of +/-3.9% McCain does not “lead”.

This is pretty much true. But this raises an important point for those of you who come here for the horse race analysis. If you *really* want to be technical, with a sample size of 640, we *can* be 20% certain that McCain is leading based on the F&M poll. Now that is to say, we’d be much better off relying on a coin flip rather than this poll in choosing the correct leader, but nonetheless, there are conclusions we can draw about who is leading based on the F&M poll. Just not very useful conclusions.

And this is just a point that I want to make going forward, and it is a very important one to remember with error margins. Pointing to error margins is one of those things in the blogosphere that is often used to show up people who have no idea what they are talking about. Most people who always say “the polls were wrong” or “the polls are all over the place” or “YAY CANDIDATE A IS WINNING YOU REPUBLICLONE THUGS ARE GOING TO LOSE” (And in fairness, you’re just as likely to find a similar all-cap post about “al-qaeda-loving DhimmocRATS” going down), can quickly be shown up with a reference to error margins (btw Caroline, I’m no longer pointing fingers at you, you were 100% correct in the criticism of DaveG as I quoted it; I just used your post as a jumping-off point).

At a simple level, one thing most people don’t understand is that the error margins apply per data point, not per spread. In other words, in every Obama/McCain poll, there is an error margin for Obama, and an error margin for McCain. With a 3.5% error margin, then, a poll showing the two tied could mean that McCain is ahead by 7 points or behind by 7. A poll showing Obama up 4 is still within the error margin. A poll showing McCain up by 7 could mean the two are tied, or it could mean that McCain is up by 14.
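The doubling described above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb: the function name `spread_range` and the sample poll numbers (46/46, 50/43, 48/44) are made up for the example, and the worst case assumes both candidates' numbers are off by the full margin in opposite directions.

```python
def spread_range(candidate_a, candidate_b, margin):
    """Smallest and largest possible A-minus-B gap, in percentage points.

    Assumes each candidate's share may be off by up to `margin` points,
    so the gap between them may be off by up to twice that.
    """
    spread = candidate_a - candidate_b
    return spread - 2 * margin, spread + 2 * margin


# Hypothetical polls, each with a 3.5-point margin per candidate.
print("Tied poll:", spread_range(46, 46, 3.5))
print("Leader up 7:", spread_range(50, 43, 3.5))
print("Leader up 4:", spread_range(48, 44, 3.5))
```

If the range includes zero, the poll cannot rule out a tie at the reported confidence level.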

Most people get this, or pick up on it quickly.

But I am still being imprecise, and this is an important nuance that very few people get. My paternal grandfather… of overwhelming common sense, has said when talking about my hobby “how is it possible to say anything with certainty about how millions of people will vote based on what 500 people say in a poll?” (actually there’s usually a lot more adjectives thrown in, but we won’t go there).

And it is an important point. The answer is that you can never say with 100% certainty, based on a poll, that X or Y is ahead. It is possible that, in a state with 5,000,000 voters, of whom 4,999,749 are going to vote for Obama, you could get a poll that includes all of the 251 McCain voters, and end up with a poll predicting a McCain win. It just ain’t very likely.

But this is the important thing about horserace analysis. Most declarations based on a poll that “X” or “Y” is ahead will drop an important caveat: With “x%” certainty.

Pollsters use 95% confidence as an “industry standard,” which they do for some very specific, technical reasons I won’t go into here. So with a sample size of, let’s say, 500, it is correct to say that you can be 95% certain that the “true” outcome is within 4.4% either way of your polling outcome. So if Obama comes in at 50%, and McCain comes in at 42%, you can’t be 95% certain that Obama is leading.
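For anyone who wants to check the 4.4% figure, here is a minimal sketch using the textbook margin-of-error formula for a simple random sample, evaluated at a 50% share (where the margin is widest). The function name `margin_of_error` is my own, and real pollsters may apply weighting or design adjustments that this does not model.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample_size, confidence=0.95, share=0.5):
    """Half-width of the confidence interval for one candidate's share.

    Assumes a simple random sample; `share` defaults to 50%, the worst case.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(share * (1 - share) / sample_size)


moe = margin_of_error(500)
print(f"95% margin of error for n=500: +/-{moe:.2%}")

# The post's rule of thumb: the lead is outside the error margin only
# if the spread exceeds twice the per-candidate margin.
spread = 0.50 - 0.42
print("Lead outside the margin at 95%?", spread > 2 * moe)
```

Run as written, the margin should come out at roughly 4.4%, and the 50/42 example should fall just short of the threshold, in line with the paragraph above.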

But 95% confidence isn’t the be-all, end-all. Like I said, it is the industry standard that is selected for specific reasons, that may or may not always apply to your needs. Sometimes, for example, you may have an extremely important survey to make. Let’s say you’re thinking of going to intrade and plunking down your life’s savings on Obama to win. At that point, you may decide that you need to be 99% confident in your given range. Other times, you might want to be less sure.

What if, for your purposes, you only wanted to be 90% certain Obama was leading? Your error margin shrinks to 3.68%. In other words, in the preceding example, we *can* be 90% certain Obama is leading. And 90% certain is pretty darned good!

But what if you wanted to be REALLY certain, like 99% certain? With a 500-voter sample, we’d have to have a spread of 11% before we can be THAT certain.

Now you’ll notice something here. The error margin for 90% is +/-3.68%. The error margin for 95% is +/-4.38%. And the error margin for 99% is +/-5.76%.

The relationship is not linear here. To go from 90% certainty to 95% certainty, your error margin goes up .7%. To go from 95% to 99%, your error margin goes up 1.4% (with sample sizes of 500).
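The non-linearity comes from the bell curve itself: each extra bit of certainty requires reaching further into the tails. The sketch below tabulates the three confidence levels for a 500-person sample. It assumes a simple random sample and a 50% share, and the helper name `z_score` is just a label chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

SAMPLE_SIZE = 500
# Standard error of a 50% share in a simple random sample (an assumption).
standard_error = 0.5 / sqrt(SAMPLE_SIZE)


def z_score(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


# Margin of error at each confidence level discussed in the post.
margins = {level: z_score(level) * standard_error for level in (0.90, 0.95, 0.99)}

for level, margin in margins.items():
    print(f"{level:.0%} confidence: z = {z_score(level):.3f}, margin = +/-{margin:.2%}")

step_90_to_95 = margins[0.95] - margins[0.90]
step_95_to_99 = margins[0.99] - margins[0.95]
print(f"90% -> 95% adds {step_90_to_95:.2%}; 95% -> 99% adds {step_95_to_99:.2%}")
```

The last four points of certainty cost about twice as much margin as the previous five, which is why relaxing the confidence level a little can buy a noticeably tighter range.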

In other words, if you are willing to accept a lower degree of certainty, you can often draw inferences even based on poll results that are within the reported error margin (which is almost always based upon 95% certainty), especially if you’re close to being outside the MOE, without sacrificing that much certainty.

For example, let’s take a look at the recent poll from SUSA showing Obama up 3 on McCain in Ohio. It has an error margin of +/- 4.3%, meaning that in common parlance, it is a “statistical dead heat.”

But it’s not really. The poll sampled 542 registered voters, and showed them three points apart. If all we wanted to know was whether it was “more likely than not” that Obama led McCain, we could say “yes,” because we are 55% certain that Obama’s “true” score lies between 46.6% and 48.4% and that McCain’s lies between 42.6% and 46.4%. In other words, we are 55% certain that Obama is leading in Ohio.

Let’s say that Obama is *four* points ahead. Under those circumstances, we can be 65% certain that he is actually leading. A six point lead means we’re 88% sure he’s leading.
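One way to reconstruct these percentages is to ask: at what confidence level does the per-candidate error margin shrink to exactly half the spread, so that the two ranges just stop overlapping? The sketch below does that. It is a guess at the reasoning, not the author's confirmed method; the function name `implied_confidence` is invented here, and it assumes a simple random sample with shares near 50%. It reproduces the 20% (F&M) and 65% (four-point) figures closely and lands within a few points of the others.

```python
from math import sqrt
from statistics import NormalDist


def implied_confidence(spread, sample_size):
    """Confidence level at which the per-candidate margin equals half the spread.

    Assumption: simple random sample, both shares close to 50%.
    """
    standard_error = 0.5 / sqrt(sample_size)
    z = (spread / 2) / standard_error
    # Two-sided coverage of a normal interval that is z standard errors wide.
    return 2 * NormalDist().cdf(z) - 1


examples = [
    ("F&M, 1-point lead, n=640", 0.01, 640),
    ("SUSA Ohio, 3-point lead, n=542", 0.03, 542),
    ("4-point lead, n=542", 0.04, 542),
    ("6-point lead, n=542", 0.06, 542),
]
for label, spread, n in examples:
    print(f"{label}: {implied_confidence(spread, n):.0%}")
```

Note that this measures how sure we can be that the two candidates' ranges do not overlap, which is a stricter standard than a formal test of who is ahead.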

In other words, you can still draw inferences from polls within the error margin, some of which are quite useful. For my purposes, if I see a lead of greater than 4 points in a poll with a sample size of 600, I’m willing to say candidate A probably does have the lead. This becomes especially useful when you have the really good polls with large sample sizes; for example the Gallup tracking poll has 1200 participants in a given sample; with that we can be 66% sure that a three-point lead is a real one. This also is important when you have a large series of polls; the midpoint of the polls will tend to be the actual result since the polls will generally be distributed evenly around the actual result.
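The point about a series of polls can be checked with a small simulation. Everything in it is hypothetical: a “true” share of 52%, 200 independent polls of 600 respondents each, and a fixed random seed so the run is repeatable. It also assumes every poll is an unbiased simple random sample, which real polls only approximate.

```python
import random
from statistics import mean


def simulate_poll(true_share, sample_size, rng):
    """Share of respondents backing the candidate in one simulated poll."""
    supporters = sum(rng.random() < true_share for _ in range(sample_size))
    return supporters / sample_size


rng = random.Random(2008)  # fixed seed, chosen arbitrarily
TRUE_SHARE = 0.52          # hypothetical "actual result"

polls = [simulate_poll(TRUE_SHARE, 600, rng) for _ in range(200)]
average = mean(polls)

print(f"Lowest poll:  {min(polls):.1%}")
print(f"Highest poll: {max(polls):.1%}")
print(f"Average:      {average:.1%} (true share {TRUE_SHARE:.0%})")
```

Individual polls scatter by several points in either direction, but their average should sit much closer to the true share than a typical single poll does.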

Now, all of this assumes polls use proper methodology that doesn’t bias the result, and it assumes a “normal” distribution of the populace, which isn’t really the case (e.g., it assumes that the populace is spread out evenly like a giant bag of M&Ms, rather than segregated like a Snickers bar. Mmmmmmm…Snickers bars). Still, there are ways to correct for that, though that is a subject for another post.

Well done.

You can contact Election Night HQ at publisher@electionnighthq.com

One Response to “Race42008.com’s Sean Oxendine Explains Polling Very Clearly- “On Error Margins””

  1. An outstanding piece on polls, which generally don’t tell us quite as much as we think (or wish).

    Hi Fellow Supporter of John McCain. On my blog, I’ve written recently about Rush Limbaugh’s endorsement of Alaska Governor Sarah Palin for McCain’s V-P choice, new information from Texas about the race there, and finally about Barack Obama’s bizarre support of infanticide (in the case of so-called “live birth abortions”). If you’d ever like to use any of my material, please feel free to do so. I’d only ask you to cite my blog at: http://camp2008victorya.blogspot.com. If I can ever offer any assistance to help you get your pro-McCain message out, please let me know. Comments are always welcome. All the best to you and your blog visitors.

    Stephen Maloney’s last blog post..McCain Palin Ticket, Texas Revelations
