Voters being a little shy about declaring who they will vote for is one of the big challenges facing polling companies in certain elections around the world, and it has led to some notorious polling errors in recent history.
The level of shy voting in any one election varies dramatically with the character of the parties and candidates contesting the vote. It tends to be much more prevalent in elections where extreme left-wing, right-wing or nationalist parties or candidates are in contention. In Poland, for example, after the fall of communism people were shy to admit that they were still planning to vote communist, which led to some big polling errors that under-represented the scale of the communist vote. Similarly in France, many people have historically been shy about declaring their votes for the Front National. This led to a major polling upset in the 2002 presidential election, where Jean-Marie Le Pen's vote share was under-represented by more than 6%, enough for him to move from 4th to 2nd place in the first round of voting, which put him through to the second-round run-off.
Accounting for shy voting is extremely difficult, as shy voters tend to opt out of taking part in polls altogether, which makes it difficult to measure directly. I have first-hand experience of this from polling we conducted during the 2016 US election. At the height of the Trump "pussygate" scandal we didn't see any noticeable drop in support for Trump among the Republican voters who participated in our surveys, but what we did observe was a drop of around 3% in the number of declared Republicans participating in our survey compared to the previous waves. This all changed when the FBI announced they were investigating Clinton a week or two later: the number of Republicans participating in the next wave of our survey suddenly leaped by 6%. Because we weighted our sample to balance it by party representation, this difference was invisible in the weighted results, and we only spotted it in hindsight when analyzing our unweighted responses.
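A minimal Python sketch of why weighting by party can hide this kind of participation shift. All respondent counts, the party targets and the function names here are hypothetical, chosen only to mirror a drop of roughly 3% in Republican participation; they are not the actual survey data or the weighting scheme that was used.

```python
from collections import Counter

# Hypothetical party targets used for weighting (not real population figures).
TARGETS = {"Republican": 0.40, "Democrat": 0.40, "Independent": 0.20}


def make_wave(spec):
    """Build a list of (party, vote) responses from {party: (n_trump, n_other)}."""
    responses = []
    for party, (n_trump, n_other) in spec.items():
        responses += [(party, "Trump")] * n_trump + [(party, "Other")] * n_other
    return responses


def party_shares(responses):
    """Unweighted share of respondents by declared party."""
    counts = Counter(party for party, _ in responses)
    total = sum(counts.values())
    return {party: n / total for party, n in counts.items()}


def weighted_support(responses, targets, candidate="Trump"):
    """Candidate support after weighting each party group to its target share."""
    support = 0.0
    for party, target in targets.items():
        votes = [vote for p, vote in responses if p == party]
        if votes:
            support += target * votes.count(candidate) / len(votes)
    return support


# Made-up waves: support *within* each party is identical in both waves,
# but fewer declared Republicans take part in the second wave.
wave_before = make_wave(
    {"Republican": (360, 40), "Democrat": (40, 360), "Independent": (100, 100)}
)
wave_during = make_wave(
    {"Republican": (333, 37), "Democrat": (42, 378), "Independent": (105, 105)}
)

for name, wave in [("before", wave_before), ("during", wave_during)]:
    rep_share = party_shares(wave)["Republican"]
    topline = weighted_support(wave, TARGETS)
    print(f"{name}: unweighted Republican share={rep_share:.2f}, weighted support={topline:.2f}")
```

In this toy example the weighted support figure is the same in both waves, while the unweighted Republican share falls from 40% to 37%. The shift only shows up if you look at the unweighted party composition.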
Shy voting issues are a lot more prevalent in face-to-face and human-based phone polls than in the more anonymous format of the online survey. This is something that the Kantar Public team have studied in some detail. In the 2012 French election, for example, there was a 4% difference in the Front National vote share between phone/face-to-face and online polls.
This perhaps explains one of the reasons why the state-wide polls, which are nearly all phone-based, had much larger errors in the 2016 US election than the national polls, which tended to be more online-based. There were clearly other factors as well, but without doubt there were some people who were shy about declaring they would be voting for Trump. You can almost measure the scale of it by looking at the size of the systematic errors in the polls after accounting for some of the other issues that were a factor, such as education bias. By my rough calculations, shy voting accounted for upwards of 2% under-representation of the eventual Trump vote in some of the state-level polls.
This brings me to the mid-term state election polling in the US. Is shy voting likely to be an issue this round too? I am afraid to say it looks like it could well be. There has been some learning since the big state-wide polling errors of 2016, but some elements of shy voting error are almost impossible to eradicate for one-off polls like these.
At this stage it's extremely difficult to measure shy voting directly, but one indicator is the level of undecided voters in each election district. One of the opt-outs for a shy voter talking to a phone pollster is simply to say you don't know who you are going to vote for.
I have done some analysis of the undecided polling data from 44 of the mid-term elections as published in the New York Times https://www.nytimes.com/interactive/2018/upshot/elections-polls.html. I have examined the level of undecided voters when either a Republican or a Democratic candidate holds the lead in the current poll, and compared it based on the relative size of that lead. What I am observing are some differences.
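The comparison described above can be sketched roughly as follows in Python. This is only an illustration of the grouping logic, not the exact analysis: the field names, the 3-point threshold for a "close" race and the poll figures are all hypothetical placeholders, not values taken from the New York Times data.

```python
from collections import defaultdict
from statistics import mean


def lead_bucket(lead, close_threshold=3.0):
    """Label a race 'close' or 'clear' by the size of the lead (assumed threshold)."""
    return "close" if abs(lead) <= close_threshold else "clear"


def undecided_by_leader(polls, close_threshold=3.0):
    """Average undecided share, grouped by leading party and size of lead."""
    groups = defaultdict(list)
    for poll in polls:
        lead = poll["rep"] - poll["dem"]
        if lead == 0:
            continue  # an exact tie has no leader, so it is left out
        leader = "Republican" if lead > 0 else "Democrat"
        key = (leader, lead_bucket(lead, close_threshold))
        groups[key].append(poll["undecided"])
    return {key: mean(values) for key, values in groups.items()}


# Placeholder polls (percentages), invented purely to exercise the code.
polls = [
    {"district": "A", "rep": 47, "dem": 45, "undecided": 8},
    {"district": "B", "rep": 45, "dem": 46, "undecided": 9},
    {"district": "C", "rep": 44, "dem": 47, "undecided": 9},
    {"district": "D", "rep": 52, "dem": 41, "undecided": 7},
    {"district": "E", "rep": 40, "dem": 53, "undecided": 7},
    {"district": "F", "rep": 46, "dem": 45, "undecided": 9},
]

averages = undecided_by_leader(polls)
for (leader, bucket), value in sorted(averages.items()):
    print(f"{leader} lead, {bucket} race: {value:.1f}% undecided")
```

With only a handful of polls in each group, differences of a point or so can easily be noise, so a comparison like this is an indicator rather than a measurement.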
Now it is very difficult to explain these differences definitively. It could simply be random error variance, but it's a basic piece of human psychology that voters tend to be more shy about their intention to vote for a more controversial candidate when they feel they might be in a minority. When a candidate starts to get even a slight lead, people start to feel more comfortable declaring that they will vote for them too. There are also some types of wavering voter who don't want to be seen backing the losing candidate. So in close races you tend to get higher recorded levels of shy voting when a candidate who is subject to shy voter factors is slightly behind than when they are in front. When one or the other candidate has a clear lead, shy voting tends to be less prevalent, as that means more of the wavering middle have clearly made up their minds.
What the results of this analysis show is that where the Democrats hold a lead there are slightly more undecided voters than where the Republicans hold a lead, and the closer the race the bigger the difference. The gap is around 1%. This suggests to me that there could well be an element of shy voting, caused by the negative image of Trump and some elements of the Republican party voiced by the media, that might affect the accuracy of some of these mid-term election polls. With nearly 20 of the 44 elections I examined being near dead-heat races, we might have to prepare ourselves for some polling miscalls.
Now I have to stress this is very much my personal conjecture about what this data is telling me. I am very much open to hearing other people's interpretations.
...and if this does happen I would ask you not to leap to blame the polling companies, as ahead of time it is almost impossible to factor in methodologically for shy voting issues like these in state-level polling.