I first needed to create a partisan lean for each state, based on presidential election results by state dating back to 1976. From these results, I calculated the two-party share of the vote and the percentage gap in that share, both for individual states and for the national aggregate, using this formula:
Subsequently, I defined the lean for a state in a given year as follows:
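As a minimal sketch of these two steps, assume the gap is the Democratic minus Republican share of the two-party vote, and that a state's lean is its gap minus the national gap for the same year. The function names and the sign convention are illustrative assumptions, not taken from the original analysis.

```python
def two_party_gap(dem_votes: float, rep_votes: float) -> float:
    """Democratic-minus-Republican gap in two-party vote share, in percentage points."""
    total = dem_votes + rep_votes
    if total <= 0:
        raise ValueError("two-party vote total must be positive")
    return 100.0 * (dem_votes - rep_votes) / total


def state_lean(state_dem: float, state_rep: float,
               national_dem: float, national_rep: float) -> float:
    """Assumed definition: state gap minus national gap for the same election."""
    return two_party_gap(state_dem, state_rep) - two_party_gap(national_dem, national_rep)


# Hypothetical vote counts: a state that splits 60/40 in a year the nation splits 52/48.
lean = state_lean(60, 40, 52, 48)  # 20-point state gap minus 4-point national gap
```

Under this convention a positive lean means the state voted more Democratic than the country as a whole, and a negative lean means more Republican.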
To compute each state’s partisan lean for the 2020 election, I applied an Exponentially Weighted Moving Average (EWMA) to its lean values from past elections. The final lean values are available on my GitHub.
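For reference, the standard EWMA recursion is written out below for a state's lean series, where $L_{s,t}$ is the lean of state $s$ in the $t$-th election and $\tilde{L}_{s,t}$ is its smoothed value. The smoothing factor $\alpha$ and the choice to seed the average with the earliest election are assumptions; the text does not state either.

```latex
\[
\tilde{L}_{s,1} = L_{s,1}, \qquad
\tilde{L}_{s,t} = \alpha\, L_{s,t} + (1-\alpha)\,\tilde{L}_{s,t-1},
\quad 0 < \alpha \le 1
\]
```

A larger $\alpha$ puts more weight on the most recent election, while a smaller one keeps more of the state's longer-run history.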
The forecasting process consists of approximately 2,200 iterations for each state, applying various model weights. These weights are determined by the relative importance of national versus state-level polls, as well as the alpha values used in EWMAs for polling averages and polling counts.
First, I assigned each poll a weight based on the square root of its sample size. This reflects the diminishing returns in poll accuracy as the sample size increases:
I then computed the weighted average of all polls conducted on a given day, and applied an EWMA to those daily averages to obtain a running polling average.
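The weighting and smoothing steps can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the tuple layout, the function names, the `alpha` value, and the decision to skip days without polls are all hypothetical, and sample sizes are assumed to be positive.

```python
import math
from collections import defaultdict


def daily_weighted_average(polls):
    """Average polls per day, weighting each poll by sqrt(sample_size).

    polls: iterable of (day, candidate_pct, sample_size) tuples.
    Returns a dict mapping day -> weighted average, in ascending day order.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for day, pct, sample_size in polls:
        w = math.sqrt(sample_size)  # diminishing returns in sample size
        weighted_sum[day] += w * pct
        weight_total[day] += w
    return {day: weighted_sum[day] / weight_total[day] for day in sorted(weighted_sum)}


def ewma(values, alpha):
    """Standard EWMA, seeded with the first observation (an assumption)."""
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(x)
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical polls: two on day 1 (n=400 and n=1600), one on day 2.
polls = [(1, 50.0, 400), (1, 53.0, 1600), (2, 51.0, 900)]
daily = daily_weighted_average(polls)
trend = ewma(list(daily.values()), alpha=0.5)
```

In this example the day 1 poll with four times the sample size receives only twice the weight (40 versus 20), which is the diminishing-returns effect the square root is meant to capture.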
Next, I wanted to account for the influence of national polls on state-level polling, so I used a second EWMA to maintain a running weight based on the count of state-level polls. For instance, three state polls conducted yesterday are more indicative of state sentiment than a single national poll conducted today, so the state polls should be weighted more heavily in the polling EWMA.
To translate the national polling split into a state-level split, I used the following formulas:
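One plausible reading of this translation, offered as a sketch only: shift the national two-party gap by the state's lean, then split the shifted gap evenly around 50%. The function name, the Democratic-minus-Republican sign convention, and the two-party normalization (which drops undecided and third-party respondents) are assumptions.

```python
def national_to_state_split(national_dem_pct: float,
                            national_rep_pct: float,
                            lean: float) -> tuple[float, float]:
    """Shift a national two-party split by a state's lean (Dem minus Rep, in points).

    Returns the implied (dem_pct, rep_pct) two-party shares for the state.
    """
    two_party_total = national_dem_pct + national_rep_pct
    if two_party_total <= 0:
        raise ValueError("national two-party total must be positive")
    national_gap = 100.0 * (national_dem_pct - national_rep_pct) / two_party_total
    state_gap = national_gap + lean  # assumed: lean is additive on the gap
    return 50.0 + state_gap / 2.0, 50.0 - state_gap / 2.0


# Hypothetical inputs: a 52/48 national split and a state leaning 6 points Republican.
dem, rep = national_to_state_split(52.0, 48.0, lean=-6.0)
```

Here the 4-point national gap becomes a 2-point Republican gap in the state, giving an implied two-party split of 49% to 51%.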
Finally, I computed the final average percentage for a particular candidate in a state using the following:
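A simple way to picture this last step is as a convex combination of the state polling average and the state-level figure implied by national polls. The sketch below assumes exactly that form. The `state_weight` parameter is hypothetical and stands in for whatever weight the model derives from the poll-count EWMA and the relative importance of national versus state polls.

```python
def blend_state_and_national(state_poll_avg: float,
                             national_implied: float,
                             state_weight: float) -> float:
    """Convex combination of two estimates of a candidate's share in a state.

    state_weight is the fraction of the final figure taken from state polls;
    the remainder comes from the national-implied estimate.
    """
    if not 0.0 <= state_weight <= 1.0:
        raise ValueError("state_weight must be between 0 and 1")
    return state_weight * state_poll_avg + (1.0 - state_weight) * national_implied


# Hypothetical values: state polls say 48%, national polls imply 50%,
# and recent state polling is plentiful enough to earn a 0.75 weight.
final_pct = blend_state_and_national(48.0, 50.0, state_weight=0.75)
```

With these inputs the final figure is 48.5%, closer to the state polls because they carry three quarters of the weight.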