Surveys Aren't Broken, People's Expectations of Them Are.

The first reaction I had when I saw this table from this ToM piece was “it was a good cycle for polling.”

Worryingly, that wasn’t the reaction most people had.

I will try to put these numbers in perspective, and argue why unrealistic expectations poison one of the most important wells democracy has.

Putting the Numbers in Perspective

In 2018, Jennings and Wlezien published “Election Polling Errors Across Time and Space” in Nature Human Behaviour. It remains the most comprehensive empirical audit of pre-election polling accuracy so far assembled, drawing on more than 30,000 national polls from 351 general elections across 45 countries between 1942 and 2017.

We will use this as the backdrop against which to assess the local numbers.

Every point is a poll, with its absolute error relative to the election result on the vertical axis and the number of days before the election on the horizontal. I’ve filtered for polls within 300 days of an election to make things a bit clearer, and added a smoothing curve to show the average poll error.
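For readers who want to see the mechanics, here is a minimal sketch of that calculation. The poll records are made-up numbers for illustration only (not the Jennings and Wlezien data), and the smoothing is a crude binned average standing in for whatever smoother the chart actually uses.

```python
from collections import defaultdict

# Hypothetical records: (days_before_election, poll_share, actual_share), in percent.
# These values are invented purely to illustrate the calculation.
polls = [
    (280, 41.0, 45.0),
    (150, 48.5, 45.0),
    (90, 42.0, 45.0),
    (20, 46.5, 45.0),
    (5, 44.0, 45.0),
    (400, 39.0, 45.0),  # more than 300 days out, so it gets filtered away
]


def absolute_errors(records, max_days=300):
    """Return (days, |poll - result|) pairs for polls within max_days of the election."""
    return [(d, abs(p - a)) for d, p, a in records if d <= max_days]


def binned_mean_error(points, bin_width=100):
    """Average absolute error per bin of days-before-election (a crude smoother)."""
    bins = defaultdict(list)
    for days, err in points:
        bins[(days // bin_width) * bin_width].append(err)
    return {start: sum(errs) / len(errs) for start, errs in sorted(bins.items())}


points = absolute_errors(polls)
curve = binned_mean_error(points)

for start, mean_err in curve.items():
    print(f"{start}-{start + 99} days out: mean absolute error {mean_err:.2f} points")
```

Each key of `curve` is the start of a 100-day window, so the entry for 0 covers the final stretch before election day.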

Two things to note. On average, polls land within around 4 percentage points of the election result across time and space, narrowing to around 3 points in the final stretch (and this is a LOT of elections).

Misses do happen, but they are rare.

So how do our local numbers look in comparison?

And zooming in to the relevant bit:

I genuinely find it hard to find words when researchers are being mocked and threatened after performing slightly better than average. The situation is made worse by many continuing to entertain a meaningless vote gap abstraction, and although there have been some efforts to fight against this, I think there’s still a way to go.

Xjenza™️

One aspect that I see all the time, and that remains relatively unaddressed, is the scientific argument. Every pollster emphasizes, to some degree, the scientific merits of their approach, but I think modern polling has advanced so far beyond most people’s introductory statistics class that they don’t understand where this xjenza (science) ends.

We sample because we cannot (realistically) ask everyone. If we could infer the will of the people en masse, we would not need to hold costly elections or referenda. We could govern through push notification. The “sciencey” bit of polling remains probability sampling: specifically, the key insight by statisticians like Neyman that if you give everyone a known, non-zero chance of being picked, you can use probability theory to make defensible claims about how wrong you might be.
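To make "how wrong you might be" concrete, here is a minimal sketch of the textbook margin of error for a proportion. It assumes a simple random sample and a 95% normal approximation (z = 1.96), and it ignores weighting and design effects, so it is not a description of any particular pollster's method. The sample sizes are hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval for a proportion.

    p: estimated share as a fraction (0.5 is the worst case)
    n: number of respondents, assumed to be a simple random sample
    z: critical value (1.96 corresponds to roughly 95% confidence)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical sample sizes, worst-case share of 50%.
for n in (600, 1000):
    moe = margin_of_error(0.5, n)
    print(f"n={n}: +/- {moe * 100:.1f} percentage points")
```

With 600 respondents the worst-case margin comes out at roughly 4 percentage points, and it shrinks only slowly as the sample grows, because the error falls with the square root of the sample size.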

Now, modern polling has evolved quite a bit. All sorts of fancy statistical techniques (yes, including imputation) have made their way in. And yes, response rates continue to be abysmal (although local ones are roughly 3 to 4 times better than what US and UK pollsters work with), but most polls these days are complicated statistical models with a whole host of covariates to try to correct for this (which is why the argument about response rates being low is bogus).

But fundamentally, and I will bold this, probability sampling remains the only part of the polling process that has a rigorous mathematical guarantee behind it.

This does not mean what pollsters are doing is not rigorous. But there is a whole lot that goes on after this scientific guarantee, which means that with the same raw numbers you can end up with different results. This is exactly what Nate Cohn demonstrated 10 years ago in this really good article, where he gave four really good pollsters the same raw numbers.
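A toy illustration of how that can happen, under loud assumptions: the respondents, the two age groups, and both sets of population targets below are invented, and simple post-stratification stands in for the far richer models real pollsters use. It is not a reconstruction of what the pollsters in Cohn's article did.

```python
# Hypothetical raw sample: (age_group, supports_party_A) for each of 100 respondents.
raw = (
    [("young", True)] * 30
    + [("young", False)] * 10
    + [("old", True)] * 24
    + [("old", False)] * 36
)


def weighted_support(sample, target_shares):
    """Post-stratified estimate: each group's support rate times its assumed population share."""
    estimate = 0.0
    for group, share in target_shares.items():
        answers = [supports for g, supports in sample if g == group]
        estimate += share * (sum(answers) / len(answers))
    return estimate


# Same raw data, two defensible but different assumptions about who will turn out.
unweighted = sum(supports for _, supports in raw) / len(raw)
pollster_1 = weighted_support(raw, {"young": 0.30, "old": 0.70})
pollster_2 = weighted_support(raw, {"young": 0.45, "old": 0.55})

print(f"unweighted: {unweighted:.1%}")
print(f"pollster 1: {pollster_1:.1%}")
print(f"pollster 2: {pollster_2:.1%}")
```

Neither set of targets is wrong on its face. They simply encode different judgment calls about the electorate, and the headline number moves by several points as a result.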

I will emphasise again: the fact that these house effects exist does not mean anyone is trying to “cheat”. I have yet to meet a pollster who didn’t make assumptions that were justifiable to them in the moment. But modern polling is as much an art as it is a science.

Why Polling Remains Crucial for the Democratic Process

The most genuinely absurd sequence of words I find written on the internet remains “l-aqwa survey tibqa’ l-elezzjoni” (“the best survey remains the election”).

Elections are a very blunt tool, collapsing all our differences into a single percentage. Polling disaggregates opinion, reveals minority views, and gives texture to what the public actually thinks on specific issues. More importantly, polling is a way for public opinion to matter continuously, not just on election day. The fact that polls are used, and assessed, primarily as a crystal ball for election day is to our detriment. We are poisoning the well that serves as a communication channel between the governed and the government.

To borrow a metaphor from Friggieri: in Fil-Parlament ma jikbrux fjuri he writes that politics is like the dance between the brake pedal and the accelerator. Polling is the speedometer, giving crucial insight into when to use which. It is also the only way some people’s opinion actually matters.

A related thread is the discussion about having some sort of blackout period. I believe it is wrong. Days of silence and blackouts stem from a continental European tradition of assuming people are less intelligent or discerning than you. If your politics cannot compete in the free marketplace of ideas on its own merits, it has no place existing in the first place.

It’s also no accident that countries with a strong Anglo-Saxon free speech culture have the most advanced polling scenes. The emphasis there has never been on limiting what people are exposed to, but on explaining it better so that they can make up their own minds.