Public Opinion Watch: Labor Day Leads
In this edition of Public Opinion Watch:
(covering polls and related articles from the week of September 6-12, 2004)
- Why the Race Is a Lot Closer Than People Think
- Labor Day Leads and Possible Election Outcomes This Year
Why the Race Is a Lot Closer Than People Think
Is Bush ahead by a little or a lot? Is the race close to a tie, or has Bush surged to a commanding lead?
The conventional wisdom inclines to the latter, not the former. The reason has a great deal to do with two persistent problems with contemporary polls that, at least at this point, tend to considerably inflate Bush's apparent lead. But once you dissect the available data with these problems in mind, a truer picture of the race comes into focus, and it suggests that the race continues to be very close.
The two problems are: (1) samples that have an unrealistic number of Republican identifiers and hence tend to favor Bush; and (2) the widespread and highly questionable practice of using likely voters (LVs) instead of registered voters (RVs) to measure voter sentiment this far before the election.
First, the issue of partisan distribution in samples. Lately, and very suddenly, many polls have been turning up more Republican identifiers than Democratic identifiers in their samples – in some cases, many more (as high as a nine- to ten-point Republican advantage).
How realistic is it to be suddenly turning up a Republican lead on party identification, much less a large one? Not very. The weight of the academic evidence is that, while the distribution of party identification among voters can and does change over time, it changes slowly, not in big lurches from week to week.
And the weight of the empirical evidence is that the distribution of party identification among voters has favored and continues to favor the Democrats. In 2000, the exit polls showed Democrats with a four-point advantage over Republicans. In 1996, it was also four points; in 1992, it was three points; and in 1988, it was also three points.
The data also indicate that there were two shifts in party identification over the 2001-2004 period which largely canceled each other out. The first shift, in the period after September 11, shaved several points off the Democrats' lead and brought the Republicans close to even (but never ahead) in party identification. The second shift took place in late 2003 and 2004 and reconstituted the Democrats' lead on party identification to about four points, exactly where it was in the 2000 election according to the exit polls (see the useful study "Democrats Gain Edge in Party Identification" by the Pew Research Center for more details).
But if the party identification distribution is fairly stable and tends to change rather slowly, why would polls suddenly be turning up unrealistically high numbers of Republican identifiers? The best explanation, in my view, is that when the political situation jazzes up supporters of one party, they are more likely to want to participate in a public opinion telephone poll and express their views. An increased rate of interview acceptance by that party's supporters would then skew the sample toward that party without the underlying distribution having changed very much, if at all.
In this case, the Republican convention, coming on the heels of the Swift Boat controversy, may have helped raise political enthusiasm among Republican partisans, leading to more interview acceptances and a disproportionate number of Republicans in recent samples.
But whatever the explanation for the disproportionate number of Republicans in recent samples, if those numbers are unrealistic, they are skewing reported horse race results toward Bush. What, if anything, should be done about this?
One possible solution is to weight poll results by a more reasonable distribution of party identification. The issue of whether to use this approach to the problem is well-summarized by Alan Reifman in his invaluable essay "Weighting Pre-Election Polls for Party Composition: Should Pollsters Do It or Not?" on his website.
As Reifman puts it:
One factor (among many) that may contribute to discrepancies between different outfits' polls in their Bush-Kerry margins... is polling firms' different philosophies as to whether it's advisable to mathematically adjust their samples – after all the interviews have been completed – to make the percentages of D's and R's in their survey sample match the partisan composition that is likely to be evident at the polls on Election Day. The latter can be estimated from exit polls from previous elections, party registration figures (in states where citizens declare a party ID when registering to vote), and surveys.
(Another issue that often comes up in evaluating pre-election surveys, with which many of you may be familiar, is whether results are reported for "registered" or "likely" voters. That is a different issue from what is being discussed [in this essay]. Whether a pollster reports results for registered voters, likely voters, or both, weighting by party ID is a separate, independent decision.)
Note well Reifman's point that the issue of whether and how to use LVs, not RVs, to report results is separate from the issue of whether and how to do party-weighting. I discuss the LV issue below after the party-weighting discussion.
Given that party identification does shift some over time, my instinct has generally been to avoid party-weighting if possible and promote a full-disclosure approach. This is how I recently put it in Public Opinion Watch:
[B]ecause the distribution of party identification does shift some over time... polls should be able to capture this. What I do favor is release and prominent display of sample compositions by party identification, as well as basic demographics, whenever a poll comes out. Consumers of poll data should not have to ferret out this information from obscure places – it should be given out-front by the polling organizations or sponsors themselves. Then people can use this information to make judgements about whether and to what extent they find the results of the poll plausible.
But this approach increasingly seems unrealistic to me. The polling organizations and sponsors do not routinely release the data I call for and certainly do not prominently display them. And even if they did, the typical consumer of polling data lacks the time and skills to use these data to re-weight or adjust reported results. The fact of the matter is that people pay attention to reported results, period; therefore they are at the mercy of whichever results are reported and emphasized (an issue that also looms large in the LVs vs. RVs issue, discussed below).
This suggests that weighting poll results by a reasonable distribution of party identification may be necessary to avoid giving the public distorted impressions of the state of the race.
What is a reasonable distribution of party identification to use in such weighting? One obvious candidate is the exit poll distribution from 2000: 39 percent Democrats, 35 percent Republicans, 26 percent independents. Moreover, the Democratic advantage in this distribution – four points – closely matches the average Democratic advantage in 2004, as measured by the Pew Research Center (see above) and other polling organizations, making it an even more attractive option.
But political analyst Charlie Cook probably has the best idea, even though it can really only be implemented by the polling organizations themselves: "dynamic party identification weighting." Cook's idea is that polls should weight their samples by a rolling average of their unweighted party identification numbers taken over the previous several months. This would allow the distribution of party identification to change some over time, but eliminate the effects of sudden spikes in partisan identifiers in samples (such as we are experiencing now).
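Cook's rolling-average scheme can be put in concrete terms with a short sketch. This is a minimal illustration, not Cook's actual procedure; the monthly party-ID shares below are invented for the example, not real polling data:

```python
# Illustrative sketch of "dynamic party identification weighting":
# weight the current sample toward a rolling average of recent
# unweighted party-ID shares, smoothing out one-poll spikes.
# All numbers here are hypothetical.

# Unweighted (Dem, Rep, Ind) shares from the last several months' samples
monthly_party_id = [
    (0.37, 0.33, 0.30),
    (0.38, 0.32, 0.30),
    (0.36, 0.34, 0.30),
    (0.33, 0.40, 0.27),  # sudden Republican spike in the latest sample
]

def rolling_target(samples):
    """Average the party-ID shares across the window; this average
    becomes the weighting target for the current sample."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(3))

target = rolling_target(monthly_party_id)
current = monthly_party_id[-1]

# Weight applied to each partisan group in the current sample:
# target share divided by the share actually observed this month
weights = tuple(t / c for t, c in zip(target, current))
```

Under these assumed numbers, the spike month's Republican respondents get weighted down toward the smoothed target while Democrats and independents are weighted up slightly, so a single enthusiasm-driven surge cannot move the reported horse race by itself.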
Lacking such a dynamic weighting, however, the best we can probably do at this point is to use the exit poll distribution mentioned above. How much difference would this make if we applied it to recent polls?
Quite a bit. Here are Bush's leads in a number of recent polls, ordered by size of his lead, once the horse race question is weighted by the 2000 exit poll distribution. (Note: not all recent polls can be included, because the procedure requires the horse race figures among Democrats, Republicans, and independents separately, and not all polls release these figures. In addition, Zogby and Rasmussen results are party-weighted to begin with and therefore do not need to be re-weighted. RV results are used unless only LV results are available.)
CBS News, September 6-8, RVs: +5
Zogby, September 8-9, LVs: +2
Rasmussen, September 10-12, LVs: +1
Fox News, September 7-8, LVs: +1
Washington Post, September 6-8, RVs: +1
Newsweek, September 9-10, RVs: -2
Gallup, September 3-5, RVs: -4
These data present a clear picture of a tight race, with Bush likely running a small lead, but not the solid – and even large – advantage that has been conveyed to the public.
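For readers who want to see the arithmetic behind this re-weighting, here is a minimal sketch. Only the 39/35/26 exit poll distribution comes from the discussion above; the within-party support figures are invented for illustration, not taken from any of the polls listed:

```python
# Sketch of the re-weighting procedure: combine a poll's horse-race
# results among Democrats, Republicans, and independents using the
# 2000 exit-poll party distribution (39% D, 35% R, 26% I).
# The subgroup support figures below are hypothetical.

EXIT_2000 = {"dem": 0.39, "rep": 0.35, "ind": 0.26}

# Hypothetical share supporting each candidate within each party group
bush  = {"dem": 0.10, "rep": 0.92, "ind": 0.46}
kerry = {"dem": 0.86, "rep": 0.06, "ind": 0.44}

def reweight(support, party_dist):
    """Weighted overall support: sum over party groups of
    (group's share of the electorate) x (support within that group)."""
    return sum(party_dist[g] * support[g] for g in party_dist)

bush_total = reweight(bush, EXIT_2000)
kerry_total = reweight(kerry, EXIT_2000)
margin = (bush_total - kerry_total) * 100  # Bush lead in points
```

With these assumed subgroup figures, the weighted totals work out to roughly a one-point Bush lead, the same order of magnitude as most of the re-weighted polls listed above.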
The other problem that is afflicting the polls and considerably inflating perceptions of Bush's lead is the widespread, and highly questionable, use of LVs, instead of RVs, to report horse race results far in advance of the actual election. The reason why using LVs instead of RVs is a bad idea is simple: the LV approach is being asked to do a job – gauging voter sentiment and how it changes from week to week (and even day to day) – that it was never designed to do. What the LV approach was designed to do was measure voter sentiment on the eve of an election and predict the outcome. That was, and remains, an appropriate application of the LV approach.
But applied as many polling organizations currently do, it is highly inappropriate and frequently very misleading. As political scientists Robert Erikson, Costas Panagopoulos, and Christopher Wlezien put it in their important forthcoming paper, "Likely (and Unlikely) Voters and the Assessment of Campaign Dynamics," in Public Opinion Quarterly:
[E]stimates of who may be likely voters in the weeks and months prior to Election Day in large part reflect transient political interest on the day of the poll, which might have little bearing on voter interests on the day of the election. Likely voters early in the campaign do not necessarily represent likely voters on Election Day. Early likely voter samples might well represent the pool of potential voters sufficiently excited to vote if a snap election were to be called on the day of the poll. But these are not necessarily the same people motivated to vote on Election Day.
And of course, since the group of people "sufficiently excited to vote if a snap election were to be called on the day of the poll" changes from poll to poll, it raises the uncomfortable possibility that observed changes in the sentiments of "likely voters" represent not actual changes in voter sentiment, but rather changes in the composition of likely voter samples as political enthusiasm waxes and wanes among the different parties' supporters. Or, as Erikson et al. put it:
At one time, Democratic voters may be excited and therefore appear more likely to vote than usual. The next period the Republicans may appear more excited and eager to vote. As Gallup's likely voter screen absorbs these signals of partisan energy, the party with the surging interest gains in the likely voter vote. As compensation, the party with sagging interest must decline in the likely voter totals.
And this is exactly what their analysis of Gallup data from the 2000 election finds – "shifts in voter classification as likely or unlikely account for more observed change in the preferences of likely voters than do actual changes in voters' candidate preferences."
This is an important result and helps nail down what has always been disturbing about the use of likely voter methods far in advance of the actual election. Instead of giving you a better picture of voter sentiment and how it is changing than conventional RV data, it gives you a worse one since true changes in voter sentiment are swamped by changes in who is classified as a likely voter.
Does this matter? You bet it does. When Gallup told the world on September 6 that Bush was leading Kerry by seven points among LVs, the world listened and absorbed that figure as a trustworthy indicator of where the race was. Completely lost, except to those who bother to look at such things, was the Gallup finding that Bush only led by a single point among RVs – in other words, that the race was about tied. Gallup and its sponsoring organizations implicitly and explicitly encouraged people to treat the LV finding as the real story and the RV finding as an unreliable afterthought (after all, those voters aren't "likely"!). The incredible irony, of course, is that the real situation was exactly the reverse: as the Erikson et al. findings suggest, it was the RV data that provided the best gauge of voter sentiment and the LV data that should have been an unreliable afterthought.
Or take the Gallup data gathered in Ohio in the last two months, perhaps the key state in this election and the subject of endless media stories about "the battle for Ohio." On September 8, Gallup released data showing Bush ahead of Kerry by eight points among LVs in Ohio, a fourteen-point swing from late July when Kerry led by six points. Again, completely lost in the Gallup, newspaper, and television reports on the poll was the poll's finding that Bush had just a one-point lead among RVs in the state, representing a much more modest swing of six points since late July.
Guess which figures are still with us as coverage of the battle for Ohio continues? That's right: Bush's eight-point lead among LVs and fourteen-point swing. In fact, just this Sunday, the New York Times practically built its Ohio campaign story around these figures, which allegedly showed just how well Bush is doing and just how much the situation has changed.
In short, these LV figures, especially from Gallup, are contributing mightily to the impression that Bush has built a substantial lead and is even surging ahead in some of the key swing states. But, as we have seen, these LV data are fundamentally inappropriate for measuring the state of the race, and how it is changing, this far ahead of election day. For that, you need the RV data and they suggest something far different: the race is damn close and Bush's substantial lead is a myth.
Sources for this section:
Gallup poll of 1,018 adults for CNN/USA Today, released September 6, 2004 (conducted September 3-5, 2004)
TNS poll of 1,202 adults for ABC News/Washington Post, released September 9, 2004 (conducted September 6-8, 2004)
CBS News poll of 1,058 adults, released September 9, 2004 (conducted September 6-8, 2004)
Opinion Dynamics poll of 1,000 likely voters for Fox News, released September 9, 2004 (conducted September 7-8, 2004)
Zogby poll of 1,018 likely voters, released September 10, 2004 (conducted September 8-9, 2004)
Princeton Survey Research poll of 1,166 adults for Newsweek, released September 11, 2004 (conducted September 9-10, 2004)
Rasmussen Research poll of 1,000 likely voters, released September 13, 2004 (conducted September 10-12, 2004)
Labor Day Leads and Possible Election Outcomes This Year
Frank Newport, "Can Election Probabilities Be Established at This Point?" Gallup News Service, September 8, 2004
Speaking of the harm done by use of early LV data, here's another example. Gallup posted an analysis on their site September 8 about estimating election probabilities based on Labor Day poll data that is almost completely worthless. The reason is that they focus on Kerry's seven-point deficit among LVs on Labor Day (Can he overcome it?), while basing their analysis almost entirely on data about RVs on Labor Day.
How do I know their Labor Day poll data are almost entirely (prior to 1996) based on RVs? Because they published these data, clearly marked as being from RVs for 1952-92 and from national adults for 1936-48, in an analysis on their own site just five days before they posted the analysis mentioned above.
Don't they read their own stuff? Clearly it makes no sense to analyze a lead among LVs this Labor Day, and its possible relation to the final outcome this year, on the basis of historical data mostly about RV leads on Labor Day and how much they changed by Election Day.
Thus, the question Gallup should have been asking is: can Kerry overcome his one-point deficit among RVs by Election Day, based on historical patterns? Turns out the answer to this question – really, the only question that their data can properly answer – looks pretty favorable for Kerry.
In thirteen of thirteen cases, going back to 1952, the Labor Day margin between the candidates changed enough for Kerry to tie or surpass Bush in the popular vote and, in eleven of those thirteen cases, the change was in Kerry's direction (that is, in the direction of the candidate who was behind among RVs on Labor Day).
Moreover, if you compare Bush's position to the position of incumbent presidents who won their campaigns for re-election in this period, it doesn't look auspicious. In the five cases that qualify (Eisenhower, 1956; Johnson, 1964; Nixon, 1972; Reagan, 1984; and Clinton, 1996), winning incumbent presidents on Labor Day had an average lead of twenty points and a median lead of nineteen points among RVs. Wow.
I sent some of the comments above to Gallup's editors' blog and Frank Newport, editor in chief of the Gallup poll, was sporting enough to print an edited version of my comments on their blog, along with his reply. In Newport's reply, his key rationale for conducting the analysis the way he did was that he wished to "[use] Gallup's best available estimates at Labor Day for each year for which we have data." (But by all means read his argument in full through the link.)
To further this discussion, here are some additional remarks on the issue replying to Newport's argument. I should add that I don't believe that Gallup has any particular axe to grind in how they did this analysis – I just think in this case they got it wrong.
Thanks for your thoughtful reply. But I still don't buy it. You surely must see that it makes a difference when people read these analyses with "seven-point deficit to overcome" in mind rather than "one-point deficit to overcome." And in fact that's how your analysis was written, focusing reader attention on the seven-point LV deficit.
And the fact remains that apples-to-apples comparisons are far preferable to apples-to-oranges comparisons. Therefore the proper comparison is between this year's RV Labor Day results and previous years'. Otherwise, you are not analyzing the same change (RV Labor Day gap vs. final gap) across years.
Using a consistent time series would make a real difference to your analysis. Instead of your summary:
"In summary, the history of presidential elections since 1936 suggests that in about half of the cases, the type of gap change that would be necessary for Kerry to tie or move ahead of Bush has occurred. About half the time it has not. If a gap change does occur, the odds are higher than 50-50 that it would be in Kerry's direction (i.e., a shrinkage rather than an expansion of Bush's current lead)."
You would have:
"In summary, the history of presidential elections since 1952 suggests that in all cases, the type of gap change that would be necessary for Kerry to tie or move ahead of Bush has occurred. If a gap change does occur, the odds are very strong (eleven out of thirteen) that it would be in Kerry's direction (i.e., a shrinkage rather than an expansion of Bush's current lead)."
This clearly sounds quite a bit different. And thinking Kerry is behind by one point, rather than seven points, clearly makes a big difference when considering elections like 1960 and 1980, which loom large in your analysis. Kennedy was behind by a point in 1960 among RVs – the same as Kerry – and Reagan was behind by four points in 1980 – more than Kerry. If you're thinking seven points behind, those races look a lot different.
In short, lacking a consistent time series of any length on LVs, you just shouldn't use 'em in an analysis like this.