By David L Miller / CBS / February 11, 2009, 4:29 PM

It's All About The Sample

By Kathy Frankovic, CBS News director of surveys

Why do pollsters make such a big deal about probability sampling?

You'd think that choosing people to interview should be easy. Take three of these, four of those, and two of the other. But it's not. Polls work because they rely on probability sampling, which ensures that all members of the sampling frame (the people being represented by the poll) have a measurable chance of being selected. (For instance, random digit dialing procedures select numbers from all existing numbers, and exit pollsters can talk with "every fourth voter" who leaves the polling place.) As far as possible, this takes the decision of whom to interview out of the hands of humans — no self-selection ("Please interview me!"), and no choosing people who are easiest to reach. Some famous polling mistakes bring that point home.
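As a rough illustration of the two selection rules mentioned above, here is a minimal Python sketch. The list of 1,000 voters, the function names, and the seeds are all hypothetical and chosen only for the example; real random digit dialing and exit poll procedures involve more steps than this. The point is that a random number generator, not the interviewer, decides who is picked.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; each unit has the same known chance, n / len(frame)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def systematic_sample(frame, k, seed=None):
    """Pick a random starting point among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return frame[start::k]

# Hypothetical frame: 1,000 voters, numbered in the order they leave the polling place.
voters = list(range(1000))

every_fourth = systematic_sample(voters, k=4, seed=1)    # the "every fourth voter" rule
random_100 = simple_random_sample(voters, n=100, seed=1)  # 100 voters drawn at random

print(len(every_fourth), len(random_100))
```

With either rule, every voter's chance of selection is known in advance (1 in 4 for the systematic rule, 100 in 1,000 for the random draw), which is what "measurable chance" means here.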

Early U.S. polling met with significant success until the 1948 election, when all of the pre-election polls went horribly wrong. They predicted a victory by New York State Republican Governor Thomas Dewey over incumbent Democratic President Harry Truman. In 1948, polling methodology routinely used "quota" samples, which gave interviewers the ability to choose respondents more or less on their own, with "quotas" for the number of men, women, young people, old people, "high" status and "low" status respondents.

The accuracy of polls conducted this way had gone essentially unchallenged by the media and the public, although some government statisticians and some academics had noted the possibility of error. So, after the mis-predicted 1948 election, one academic body, the Social Science Research Council, created a blue-ribbon panel to review the polling process. It recommended the use of probability sampling at all levels of respondent selection, and the elimination of quotas.

There was, of course, another reason for this most famous of polling mistakes: simple overconfidence. From their work in the 1936, 1940 and 1944 presidential elections, pollsters believed there were few — if any — changes in public opinion that could be attributed to the fall campaign. Consequently, they assumed that Dewey's lead in the early fall could not be affected by anything that either he or Truman did. So they stopped polling two weeks before the election.

After the horrors of their 1948 error, pollsters switched to a probability-based process, and no longer left the choice of respondents to their interviewers. In addition, they decided to poll much closer to Election Day. Their poll predictions and reports for the 1952 election were both more accurate and more humble!

Yet the quota habit persisted elsewhere. In the 1992 British parliamentary election, polls erroneously predicted a Labour Party victory. In fact, the Conservative Party won by eight percentage points. As the U.S. Social Science Research Council did in 1948, Britain's Market Research Society conducted an investigation and found similar problems with the sampling methods. Many British pollsters had relied on in-person interviews and used quota (not probability) sampling. Once again, interviewers chose whom to interview. After 1992, some British polling organizations adopted telephone polling, which made probability sampling of households and individuals much easier.

Exit polls are probably the most famous recent polling innovation. The first was conducted inadvertently in the United States by Ruth Clark, a well-known newspaper researcher who began her research career as an interviewer. In 1964, she was going door-to-door in Maryland, looking for voters to interview on the state's primary election day. Tired of walking, she decided to go to a polling place and talk with voters as they were leaving. As she later put it, "I told Lou [pollster Lou Harris] what I had done, and by the [Republican] California primary in June, the exit poll was put to full use." That day, Barry Goldwater voters dropped blue beans into a jar, while Nelson Rockefeller voters dropped red beans.

But exit polls can also fall victim to selection problems. The largest error in presidential election exit poll history occurred in 2004, when the disparity between the vote count and the exit poll exceeded six percentage points. And the source of the problem, as in 1948 and 1992, came back to the issue of who was interviewed. Interviewers weren't supposed to use quotas, but many (especially younger) interviewers ran into difficulties at the polling place. Sometimes they did not select voters by the prescribed probability method (e.g., every second voter, or third voter, or 10th voter, depending on the total number of voters at the precinct). Instead, they interviewed whomever it was most convenient for them to interview (often people most like themselves). The outcome was much the same as the one that plagued the 1948 and 1992 quota-based systems.
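A small simulation can show how this kind of convenience selection distorts an estimate. The Python sketch below rests entirely on invented assumptions: a hypothetical set of 200,000 voters, 30 percent of them young, with young voters backing candidate A at 60 percent and other voters at 45 percent, and an interviewer who stops young voters three times as readily as others. None of these figures come from the 2004 exit polls.

```python
import random

def make_voters(n, seed):
    """Build hypothetical voters as (is_young, votes_for_a) pairs, in exit order."""
    rng = random.Random(seed)
    voters = []
    for _ in range(n):
        is_young = rng.random() < 0.30       # assumed share of young voters
        p_a = 0.60 if is_young else 0.45     # assumed support for candidate A
        voters.append((is_young, rng.random() < p_a))
    return voters

def share_for_a(sample):
    """Fraction of the sample that voted for candidate A."""
    return sum(votes_a for _, votes_a in sample) / len(sample)

def every_kth(voters, k, seed):
    """Probability selection: random start, then every k-th voter."""
    start = random.Random(seed).randrange(k)
    return voters[start::k]

def convenience(voters, seed):
    """Convenience selection: the interviewer stops young voters three times as readily."""
    rng = random.Random(seed)
    return [v for v in voters if rng.random() < (0.30 if v[0] else 0.10)]

voters = make_voters(200_000, seed=2004)
truth = share_for_a(voters)
systematic_est = share_for_a(every_kth(voters, k=10, seed=1))
convenience_est = share_for_a(convenience(voters, seed=1))

print(f"actual {truth:.3f}  every-10th {systematic_est:.3f}  convenience {convenience_est:.3f}")
```

Under these assumptions, the every-10th-voter estimate should land close to the actual share, while the convenience sample should overstate candidate A by roughly four points, because the group the interviewer finds easiest to approach also votes differently from everyone else.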

Fixing this problem, once again, meant going back to the basics — especially meeting the rules of probability.

Copyright 2009 CBS. All rights reserved.
6 Comments

cbs_oliver says:
In some cases the media has failed to endorse polls with valid sampling methods and has instead endorsed polls with biased sampling - presumably for political reasons. The standout example is the polling intended to reveal citizen casualties in Iraq. Most mainstream media effectively sneered at the large estimates of Iraqi casualties calculated in the Lancet/Johns Hopkins study. They did not trot up to a local university to interview sampling and statistics experts about the report. Instead they spoke to President Bush and to the Brookings Institution, both of which criticized the report.

It was interesting later when an internal UK government memo directed at Tony Blair leaked out in which an internal official explained that the methodology of the Lancet/Johns Hopkins study was beyond reproach and other means than criticism of the methodology would need to be found to deal with the findings. He was clearly wrong about what could be done.

Driven to consider larger casualty numbers, the media preferred the severely constrained, officially vetted body count accumulation put forward by the Iraq Body Count group, even though their sampling procedure would make a select golf club blush with envy at its extreme exclusivity.

Improvement is needed for media treatment of polls.

cbs_oliver says:
A good article.

There have been a few problems with recent polling and analysis.

Even accurate polling will not predict the results of elections if the election polling process is biased. When this happens - for example when an election exit poll result differs significantly from the result of an election - it needs to be reported and analyzed. The disparity may be a sign of bias in the election polling. Instead the media seems to cover this up. Not good.

The media seems to enjoy presenting dog and cat fights between partisan spin masters over the interpretation of poll results. These media events provide nothing of value. They may even provide negative value if participants are shills - which they often are.

j-whitman says:
"The main plank in the Nationalist Socialist program is to abolish the liberalistic concept of the individual and the Marxist concept of humanity and to substitute for them the folk community, rooted in the soil and bound together by the bond of its common blood." -- Adolf Hitler

mbcsmith says:
Reminds me of 2000, when all the pollsters vacationed in heavily Dem-leaning Miami during the election, rather than sampling the entire state. I guess there wasn't any fine dining or celebs in Palatka.

gangesdak says:
Statisticians think that they can have the "big" answer by smartly taking a small sample and extrapolating the answer, to the wonder of the rest of the public. Well, when the difference is between black and white, or between 10 and 10000, then their "prediction" is applauded. When it comes down to the wire, then all bets are off. They become like common people.

tucano2 says:
The ONLY sample that counts is the result of registered voters and the casting of ballots of those who actually vote.