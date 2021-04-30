This Wednesday saw the publication of the last polls that may be published before the May 4 elections in the Community of Madrid. Those polls keep the PP in first place (at around 41% of the vote), followed by PSOE (21%), Más Madrid (16%), Vox (9%-10%), Unidas Podemos (7%) and Ciudadanos (4%). Below we offer the latest EL PAÍS prediction based on the published polls.
Before that, it is worth looking at the trends of recent days. On the right, the only movement is a slight decline for Ciudadanos, which drifts further from the 5% threshold and sees its chances of winning representation shrink. On the left the shift is clearer: the PSOE has lost 5 or 6 points since March, while Más Madrid has risen from 11% to 16%.
The result of the elections depends on a duel between two percentages: the combined vote of PP and Vox (around 50%) and that of PSOE, Más Madrid and Podemos (45%). The gap has narrowed from six points to five in recent days, but the left still needs the polls to be wrong in order to win. The roughly three points that would have to switch blocs are, in essence, what determines each side's chances of victory.
The prediction of seats
The graph below shows our seat estimate based on the polling average. The PP would win around 59 deputies, followed by PSOE (30), Más Madrid (23), Vox (13), Unidas Podemos (9) and Ciudadanos (0 as the most likely result; 2 on average).
To make this estimate we use a statistical model and simulate the elections 25,000 times, as explained in the methodology at the end of the text. The model is fed by polls and incorporates a key piece of information: their historical accuracy. In Spain, polls deviate from the result by about two points per party, on average, and it is not uncommon for them to miss by three or more points with one of them. That is why it is important to know how accurate the polls are when making predictions.
It is easy to see the uncertainty that still surrounds these elections. For example, according to our calculations the most likely outcome for the PP is 59 seats, but its 90% probability interval runs from 49 to 69 seats. In other words, one time in twenty we would see the PP above that band, and one time in twenty below it. The case of Ciudadanos is also striking: the probability that it wins seats is only 20%, but if it does, it would win about seven, so that on average it achieves two (although that exact result is impossible).
The key: who will win the majority
The main advantage of having a prediction model is that it lets you attach probabilities to different outcomes, something that polls cannot do on their own. This allows us to answer the fundamental question of these elections: which parties have a chance of reaching the 69 deputies needed for a majority? The graph shows the summary:
- 7 out of 10 times (71%) there will be a right-wing majority (PP and Vox). That is how often, across the 25,000 simulations, the two parties together reach the 69 seats they need. The PP wins a majority on its own roughly 1 time in 20 (6%).
- 1 out of 6 times (16%) there will be a left-wing majority (PSOE, MM and UP). This is the probability that the polls are wrong in that direction, and by enough to produce an upset.
- 1 out of 10 times (9%) Ciudadanos will be decisive. This is the combined probability of two events: (1) that Cs exceeds 5% of the vote (20% probability), and (2) that both the right and the left need its seats to reach a majority.
- And in 1 out of 25 times (4%) there will be a tie. As the assembly has an even number of seats, PP-Vox and PSOE-MM-UP may tie at 68 seats each.
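The four categories above can be counted mechanically once the simulated seat tallies exist. The following is a minimal sketch, not the model's actual code: the function names, the party keys and the data layout (one dictionary of seats per simulation) are assumptions made for illustration.

```python
from collections import Counter

RIGHT = ("PP", "Vox")
LEFT = ("PSOE", "MM", "UP")


def classify(seats, majority=69):
    """Label one simulated assembly with the bloc that controls it."""
    right = sum(seats.get(p, 0) for p in RIGHT)
    left = sum(seats.get(p, 0) for p in LEFT)
    if right >= majority:
        return "right"
    if left >= majority:
        return "left"
    # Neither bloc reaches the majority: either Cs is out and the blocs
    # split the seats evenly, or Cs holds the seats that both sides need.
    if seats.get("Cs", 0) == 0 and right == left:
        return "tie"
    return "cs_decisive"


def outcome_shares(simulations, majority=69):
    """Fraction of simulations that falls into each category."""
    counts = Counter(classify(s, majority) for s in simulations)
    labels = ("right", "left", "cs_decisive", "tie")
    return {label: counts[label] / len(simulations) for label in labels}
```

Applied to the full set of simulations, `outcome_shares` would return the kind of percentages listed in the bullets.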
What do these figures mean? They are probabilities: the chance that each majority comes about. They say a majority of PP and Vox is the most likely outcome, but it is important not to mistake that for a certainty. On the contrary, this type of forecast can be read as a warning: the polls say the right is the favorite, but when they have been as sure of something in the past as they are now, they ended up being wrong 20% or 30% of the time.
Another way of looking at it is to imagine a tree of alternatives. Out of every 100 possible futures, the numbers above say how many the right wins and how many the left wins. What we do not know is which of those futures will be ours.
Finally, I have calculated the probability of various other scenarios, such as Más Madrid finishing ahead of the PSOE, the PP winning an absolute majority, or Vox failing to win seats.
Methodology
Predictions are produced by a statistical model based on polls and their historical accuracy. The model is similar to the one we used for the elections of April and November 2019 and for those in Mexico, France, the United Kingdom, Andalusia and Catalonia. It works in three steps: 1) aggregate and average the polls, 2) incorporate the expected uncertainty, and 3) simulate 15,000 elections to distribute seats and calculate probabilities.
Step 1. Average the polls. Our average takes dozens of polls into account to improve its accuracy. It is weighted so that each poll counts more or less according to three factors: the sample size, the polling firm, and the date.
Step 2. Incorporate the uncertainty of the polls. This is the most complicated and important step. The expected accuracy of the polls has to be estimated. How big are the usual errors? How likely are errors of 2, 3, or 5 points? To answer these questions, hundreds of polls in Spain and thousands of international ones are studied.
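As an illustration of Step 1, here is a minimal sketch of a weighted poll average. The article names the three factors but not the formulas, so everything specific here is an assumption: the square root of the sample size, the exponential decay with a seven-day half-life, the per-firm `house_weight` and the field names are hypothetical choices.

```python
import math


def poll_weight(sample_size, days_old, house_weight=1.0, half_life_days=7.0):
    """Combine the three factors into one weight (all formulas assumed)."""
    recency = 0.5 ** (days_old / half_life_days)  # older polls count less
    return math.sqrt(sample_size) * house_weight * recency


def weighted_average(polls):
    """Weighted mean vote share per party.

    Assumes every poll reports the same set of parties.
    """
    total_weight = 0.0
    sums = {}
    for poll in polls:
        w = poll_weight(poll["n"], poll["days_old"], poll.get("house_weight", 1.0))
        total_weight += w
        for party, share in poll["shares"].items():
            sums[party] = sums.get(party, 0.0) + w * share
    return {party: s / total_weight for party, s in sums.items()}
```

With two polls of equal size, one from today and one a week old, the older poll counts half as much under these assumptions.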
Calibrate the expected errors. First, the error of the polls in Spain is estimated. A database is built with all the elections since 1986. The mean absolute error (MAE) of the poll averages has been around 2 points per party. This means that deviations of 3 or 4 points were common and that the margin of error (at 95%) is close to seven points for parties with around 30% of the vote. These errors depend on at least two things: the size of the party and the proximity of the elections. To take these two factors into account, the Jennings and Wlezien database, published in Nature Human Behaviour, is used. The errors of more than 4,100 polls in 241 elections in 19 Western countries have been analyzed. From this, a simple model is built that estimates the MAE of the poll average for each party, taking into account: 1) its size (it is easier to estimate a party at around 5% of the vote than one that exceeds 30%), and 2) the days remaining until the elections (because polls improve toward the end).
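To make the link between an MAE and a margin of error concrete, here is a small sketch. It assumes normally distributed errors, which is a simplification (the model itself uses Student-t distributions), and the function names are illustrative.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute gap between poll averages and results, in points."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def margin95_from_mae(mae):
    """95% margin of error implied by an MAE, ASSUMING normal errors.

    For a normal distribution, MAE = sd * sqrt(2 / pi).
    """
    sd = mae * math.sqrt(math.pi / 2.0)
    return 1.96 * sd
```

Under that assumption an MAE of 2 points corresponds to a margin of roughly 5 points. A margin near seven points for a large party therefore implies a higher MAE for parties of that size, which fits the idea that bigger parties are harder to estimate.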
Choice of the type of distribution. To incorporate the uncertainty into each party's vote in each simulation, a multivariate distribution is used. Student-t distributions are used instead of normal ones because they have heavier tails (higher kurtosis), which makes very extreme events more likely. Nate Silver has explained the advantages of this assumption. I have estimated the level of kurtosis from the database. Then I define the covariance matrix of these distributions so that the sum of the votes does not exceed 100% (an idea from Chris Hanretty). I incorporate uncertainty with 53 distributions, one at the national level and one for each province. The first distribution introduces the same error in a party's vote across all of Spain. This is important because, in general, polling errors are systematic and shared across territories. If we assumed they were independent, the errors would cancel out between provinces and the model would fail through overconfidence. This happened with some models of the 2016 US elections. The second part of the uncertainty is specific to each province. Finally, the covariance matrices must be scaled so that the resulting vote distributions have the MAE and the standard deviation expected from the calibration.
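The idea of one shared error plus one local error per territory can be sketched as follows. This is a simplified stand-in, not the model described above: instead of a full covariance matrix it draws independent Student-t errors and subtracts their mean so that they sum to zero, it uses the same baseline shares in every region, and the scales and degrees of freedom are arbitrary rather than calibrated.

```python
import math
import random


def student_t(rng, df):
    """One Student-t draw: a normal divided by the root of a scaled chi-square."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def zero_sum_errors(rng, parties, scale, df):
    """Heavy-tailed errors, centered so they add up to zero across parties."""
    raw = {p: scale * student_t(rng, df) for p in parties}
    mean = sum(raw.values()) / len(raw)
    return {p: e - mean for p, e in raw.items()}


def simulate_shares(rng, base, regions, shared_scale=2.0, local_scale=1.0, df=5):
    """One simulated election: a shared error plus a local error per region."""
    shared = zero_sum_errors(rng, base.keys(), shared_scale, df)
    target = sum(base.values())
    result = {}
    for region in regions:
        local = zero_sum_errors(rng, base.keys(), local_scale, df)
        shares = {p: max(0.0, base[p] + shared[p] + local[p]) for p in base}
        total = sum(shares.values())
        # Rescale after clipping at zero so the total vote is preserved.
        result[region] = {p: v * target / total for p, v in shares.items()}
    return result
```

Because the shared error is drawn once and reused in every region, a party that beats its polls in one territory tends to beat them everywhere, which is what keeps the errors from cancelling out.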
Step 3. Simulate. The last step is to run the model 15,000 times. Each iteration is a simulation of the elections with voting percentages that vary according to the distribution defined in the previous step. The results in these simulations allow us to calculate the probabilities that each party has of achieving a certain number of seats, reaching a majority, finishing first, and so on.
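A stripped-down version of this step might look like the sketch below. Several things are assumptions that the text does not spell out: seats are allocated with the D'Hondt rule in a single constituency of 136 seats (the size implied by the 69-seat majority and the 68-68 tie), the 5% threshold is applied directly to the simulated shares, and independent normal noise stands in for the distribution of Step 2. The function names are hypothetical.

```python
import random


def dhondt(votes, seats, threshold=5.0):
    """Allocate seats by D'Hondt among parties at or above the threshold.

    `votes` maps each party to its percentage of the valid vote.
    """
    eligible = {p: v for p, v in votes.items() if v >= threshold}
    alloc = {p: 0 for p in votes}
    if not eligible:
        return alloc
    for _ in range(seats):
        # The next seat goes to the party with the highest quotient.
        winner = max(eligible, key=lambda p: eligible[p] / (alloc[p] + 1))
        alloc[winner] += 1
    return alloc


def run_simulations(base, n_sims, seats=136, error_sd=2.5, seed=0):
    """Simulate elections with simple normal noise (a placeholder for Step 2)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        draw = {p: max(0.0, v + rng.gauss(0.0, error_sd)) for p, v in base.items()}
        results.append(dhondt(draw, seats))
    return results


def probability(results, event):
    """Share of simulations in which `event(seats)` is true."""
    return sum(1 for r in results if event(r)) / len(results)
```

For example, `probability(results, lambda r: r["PP"] + r["Vox"] >= 69)` gives the share of simulated elections with a right-wing majority.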
Why polls. This model is based entirely on polls. There is a perception that polls are unreliable, but the truth is that they have not done badly lately. In the last two or three years they have been quite accurate in Spain, although with exceptions, such as the Andalusian elections of 2018. Polls are rarely perfect, but no alternative has proven to be better.
