
LASSOing the Governor’s Mansion: A Machine-Learning Approach to Forecasting Gubernatorial Elections

Published online by Cambridge University Press:  15 October 2024

Gregory J. Love
Affiliation:
University of Mississippi, USA
Ryan E. Carlin
Affiliation:
Georgia State University, USA
Matthew M. Singer
Affiliation:
University of Connecticut, USA

Abstract

Despite governors’ crucial roles in shaping important policies, including abortion, education, and infrastructure, forecasters have paid little attention to gubernatorial elections. We posit that institutional idiosyncrasies and lack of public opinion data have exacerbated the classic problem facing all election forecasts: there are too many predictors and too few cases, leading to overfitting. To address these problems, we combine new governor and state-level presidential approval data with a machine-learning approach, LASSO, for variable selection. LASSO examines numerous variables but retains only those that substantively improve model performance. Results demonstrate the efficacy of gubernatorial and presidential approval ratings measured two quarters preelection in predicting both incumbent-party vote share and election winners in out-of-sample predictions. For 2022, our approach outperformed the Cook Political Report’s Partisan Voting Index and compared well with 538’s Election Day prediction. For 2024, our LASSO-Popularity model predictions indicate that it will likely be a difficult year for Democrats in gubernatorial contests.

Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of American Political Science Association

Governors receive short shrift in the biennial intrigue among political scientists and pundits surrounding American election forecasts. To our knowledge, only two peer-reviewed forecasts of multiple US gubernatorial elections exist (Hummel and Rothschild 2014; Klarner 2012). Such inattention is unsurprising, if regrettable. Although national politics receives more media and voter attention than state politics (Martin and McCrain 2019), who wins governorships deeply affects the quality of American life and democracy. This terra incognita opens a range of theoretical possibilities, but institutional idiosyncrasies and data scarcity require creative solutions from would-be forecasters. In that spirit, we develop a machine-learning approach that leverages political and economic fundamentals to make predictions for 11 US governors’ elections in 2024.

If our focus on governors is novel, our fundamentals approach is fairly traditional.Footnote 1 Even recent high-profile forecasts from 538 and The Economist, which funnel the deluge of horse-race polls into frequently updated forecasts, take the fundamentals of politics (executive approval) and economics as their starting points (Gelman et al. 2020). Yet, until now, data scarcity has hindered the advancement of fundamentals forecasts of gubernatorial elections. Our fundamentals-based model of US gubernatorial elections gains traction by incorporating recently developed data on gubernatorial approval and state-level presidential approval.

Overall, our forecast performs admirably, correctly predicting the winner in 75% of 345 US governor races from 1978 to 2019 with data collected two quarters out from the election. Although not trained on the 2022 cycle, our model correctly predicted 86% of winners (30 of 36) with an incumbent-party vote share mean absolute error (MAE) of 4.16 percentage points. Using the same model weights for 2024, our forecasts suggest that Democrats face a difficult election environment: their president is unpopular, and gubernatorial elections are being held in several states with popular Republican governors.

RISKS OF OVERFITTING FUNDAMENTALS FORECASTS OF GUBERNATORIAL RACES

Forecasting gubernatorial elections from fundamentals requires grappling with a major methodological challenge: a high predictor-to-observation ratio risks overfitting. Overfitted models predict in-sample elections well but out-of-sample elections poorly. This looms large for US presidential elections, which offer relatively few races on which to train and evaluate model performance. Subnational races and subnational data potentially increase the number of cases, although the degree to which these cases are independent is unclear.Footnote 2

Forecasting gubernatorial races has the potential to avoid this “small-n problem” (Turgeon and Rennó 2012) by pooling elections in more than 50 jurisdictions and many historical elections.Footnote 3 Realistically, however, scholars often lack data for key fundamentals—governor approval, state-level presidential approval, and economic conditions—from state-specific sources.Footnote 4 Thus, without a strategy to pool data from different sources, previous gubernatorial election forecasts have used national-level presidential approval as the key political fundamental, which does not vary across states (Hummel and Rothschild 2014; Klarner 2012). Novel data described later partially alleviate this issue, but data scarcity continues to depress the predictor-to-observation ratio’s denominator by restricting certain years and states from the training set. A particular challenge is low-population states in which gubernatorial and presidential approval are infrequently polled (Singer 2023). More importantly, extant fundamentals-based forecasts of governors’ races raise the specter of overfitting by including numerous predictors and temporal lags. Among the 30 variables in Klarner’s forecast (2012, 689), for example, are a lagged dependent variable and the lags of 16 other predictors; only 3 reach conventional levels of statistical significance. Although this approach may have some advantages in exploratory settings, it exacerbates problems of overfitting and collinearity that hamper the model’s interpretability and parsimony. Absent micro-foundations to guide variable selection, effective and efficient models can prove elusive. Yet, we argue, a fundamentals-based forecast of subnational elections is possible if forecasters have a systematic strategy to address limited data and manage degrees of freedom.

THE LASSO-POPULARITY MODEL: REDUCING THE FORECAST’S PREDICTOR-TO-OBSERVATION RATIO

To avoid overfitting a fundamentals forecast of gubernatorial elections, we seek to reduce the numerator of the predictor-to-observation ratio via regularized regression using LASSO (least absolute shrinkage and selection operator). This machine-learning approach explores many variables but selects only those with significant predictive power. Initially developed in geophysics, LASSO has been adopted in economics and finance for forecasting (Santosa and Symes 1986; Tian, Yu, and Guo 2015). Variable selection proceeds as follows. Variables are introduced into the model and weighted with a penalty that attempts to shrink their coefficients to zero without increasing a loss function. When a coefficient cannot be shrunk to zero without increasing the loss function, the variable is selected. Through this recursive routine, LASSO selects a parsimonious model that is a “good fit” for the training data.Footnote 5 And because it is not overfitted, the model should generate more robust out-of-sample forecasts. Overall, we expect LASSO to perform well in forecast model development and selection for our fundamentals forecast of governors’ races and for any forecast with a high predictor-to-observation ratio, including the other forecasts in this symposium.
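The selection routine described above can be sketched with off-the-shelf tools. The snippet below is a minimal illustration on simulated data, not the authors’ actual pipeline; the variable names and data-generating process are hypothetical, and cross-validation stands in for the penalty-selection step.

```python
# Minimal LASSO variable-selection sketch on simulated data.
# Variable names and the data-generating process are hypothetical;
# this is an illustration, not the authors' pipeline.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300  # roughly the size of a pooled gubernatorial training set
names = ["gov_approval", "pres_approval", "state_unemp",
         "natl_unemp", "coincident_index", "gov_gender"]
X = rng.normal(size=(n, len(names)))

# Only the first two predictors actually drive the simulated vote share.
y = 50 + 2.5 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(scale=5.0, size=n)

# Standardize so the L1 penalty treats all predictors symmetrically,
# then let cross-validation choose the penalty strength (alpha).
Xs = StandardScaler().fit_transform(X)
model = LassoCV(cv=5, random_state=0).fit(Xs, y)

# Predictors whose coefficients the penalty could not shrink to zero
selected = [nm for nm, c in zip(names, model.coef_) if abs(c) > 1e-6]
print(selected)
```

As the penalty weakens, more variables enter the model; choosing its strength by cross-validated prediction error is one standard way to operationalize “retains only those that substantively improve model performance.”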

We also raise the denominator of our predictor-to-observation ratio with new data. Whereas presidential approval is the dominant political fundamental in presidential election forecasts, gubernatorial popularity, as well as state-level presidential approval, likely shapes gubernatorial elections. Lacking data on both, previous gubernatorial forecasts relied on national-level presidential approval as a proxy (Hummel and Rothschild 2014; Klarner 2012). Our model, however, uses governor popularity from the State Executive Approval Dataset (SEAD; Singer 2023).Footnote 6 This resource updates Beyle, Niemi, and Sigelman’s (2002) database of governor approval and then uses Stimson’s (1991) dyads-ratio algorithmFootnote 7 to combine all available polls on citizens’ evaluations of governor performance, extract their latent common variance, account for “house effects,”Footnote 8 and produce quarterly estimates of governor approval ratings. We complement SEAD data with state-level measures of presidential approval generated using the same methodology.Footnote 9

In selecting input variables for the LASSO routine, we build from the substantial fundamentals forecast literature for presidential races and the small literature on state-level races. Specifically, we incorporate governor and state-level presidential approval estimates alongside a range of other political and economic fundamentals described later and measured two quartersFootnote 10 or more before Election Day into forecasts of (1) vote share and (2) incumbent-party victory.

Political Fundamentals

In addition to gubernatorial and state-level presidential approval, LASSO selected the incumbent governor’s party and the incumbency status of the incumbent party’s candidate; that is, dummies for not incumbent, incumbent, and unelected incumbent.Footnote 11 Interactions between these dummies and gubernatorial approval were also selected, suggesting that the effects of governors’ popularity are stronger for incumbents. LASSO also indicates that shared partisanship between the president and governor conditions the effects of state-level presidential approval. A final selection is the state’s general partisan lean, measured as the Democratic share of the two-party vote in the previous presidential election (conditioned by the incumbent governor’s partisanship). LASSO selected neither the partisan composition of the state legislature nor the governor’s gender.Footnote 12

Economic Fundamentals

Drawing on the gubernatorial support literature (Singer 2023) and forecasts (Klarner 2012), we introduce several state- and national-level economic fundamentals into the LASSO routine: the Federal Reserve’s “coincident index” of state-level outcomes,Footnote 13 state-level unemployment,Footnote 14 national-level unemployment, and the state-level unemployment rate relative to the national rate in the second quarter of the election year. LASSO selected none of the state-level variables. Although results indicate that national-level unemployment could be selected, its inclusion does not meaningfully improve the model’s out-of-sample forecast or fit. The same holds for national GDP growth and for increases in mean household income.Footnote 15 Once political fundamentals are considered, economic fundamentals appear to provide little additional purchase in forecasting gubernatorial elections (see Supplemental Table A4).

Election Outcomes

Our training dataset consists of 346 pre-pandemic gubernatorial elections for which we have data. We train models to forecast two common election outcomes: (1) incumbent-party governor vote share and (2) a dichotomous measure of whether the incumbent-party candidate wins. These two outcomes are related, but imperfectly: a model could predict vote share well but not reliably pick winners in close races.

TRAINING THE LASSO-POPULARITY FUNDAMENTALS FORECAST MODEL

Table 1 displays the results of an OLS training model using the predictors selected by LASSO on incumbent-party vote share. We present OLS coefficients instead of the direct LASSO estimates because the coefficients between the two estimation approaches are nearly identical (appendix 3), and LASSO’s penalized regression approach does not produce estimates of uncertainty or model fit (standard errors, R2, etc.).

Table 1 Training Models: LASSO-Popularity Forecast of U.S. Governors’ Races

Notes: OLS estimates based on model selected by linear LASSO.

Standard errors in parenthesis. **p < 0.01, * p < 0.05 (two-tailed).

Overall, the LASSO-Popularity model performs well. Most coefficients are precisely estimated, and model fit is good but not overly good (Adj. R2 = 0.44). The model also exhibits substantial out-of-sample accuracy over the dataset’s range. Using leave-one-out validation (LOOV),Footnote 16 its mean absolute error (MAE) is 5.6 percentage points, a 37% improvement in accuracy over a naïve, intercept-only model (MAE = 7.7 percentage points).
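Leave-one-out validation of this kind can be reproduced in a few lines. The sketch below uses simulated stand-ins for the fundamentals and vote share; the 5.6- and 7.7-point MAEs reported above come from the authors’ real data, not from this toy example.

```python
# Leave-one-out validation (LOOV) sketch on simulated data; the MAE
# figures in the text come from the authors' real dataset.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))                             # stand-in fundamentals
y = 52 + 3.0 * X[:, 0] + rng.normal(scale=5.0, size=n)  # stand-in vote share

# Each election is predicted by a model trained on the other n - 1 cases.
preds = cross_val_predict(LinearRegression(), X, y, cv=LeaveOneOut())
loov_mae = np.mean(np.abs(y - preds))

# Naive benchmark: an intercept-only model that always predicts the mean.
naive_mae = np.mean(np.abs(y - y.mean()))
print(round(loov_mae, 2), round(naive_mae, 2))
```

Comparing the two MAEs mirrors the text’s 37%-improvement calculation: the informative model should beat the intercept-only benchmark whenever the predictors carry real signal.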

We can also assess the LASSO-Popularity model’s validity by using only “before-the-fact” training data for each election since 1990. To do this, we forecast each gubernatorial race using only data that occurred before that date; that is, 1990 predictions are based only on pre-1990 elections. Given diminished training data, the MAE for this approach is unsurprisingly larger but only marginally so (5.9).

The substantive results are consistent with a fundamentals-based model. Gubernatorial elections significantly reflect the incumbent governor’s popularity, especially if the incumbent is running for reelection, but even when not.Footnote 17 National politics also influence state politics. Presidential approval matters for gubernatorial races, particularly when the governor and the president are from different parties. These factors are predictive apart from the effects of state-level tendencies to vote for either the Democrats or the GOP.

Visually, we observe a strong correlation between gubernatorial approval two quarters from Election Day and incumbent-party candidate vote share (figure 1A). The “before-the-fact” MAE for this bivariate model is 6.9 percentage points.Footnote 18 But, as figure 1B shows, “before-the-fact” predictions from our LASSO-Popularity fundamentals model substantially improve model fit.

Figure 1 Model Fit: Bivariate vs. Full LASSO-Popularity Forecast (“Before-the-Fact”)

A logit implementation of LASSO selects the same predictors for election winner as for vote share. Though typically the main focus of pundits and the public, predicting the winner is often a less rigorous test of a forecast because many races are uncompetitive. Again, our fundamentals-based forecast is quite accurate,Footnote 19 predicting the winner in 75% of races: a 10% improvement over assuming the incumbent party always wins and a 50% improvement over a coin toss. The LOOV accuracy, 73%, is nearly as high as the within-sample accuracy. Our “before-the-fact” forecast for 2012 performs substantially better than the 2012 out-of-sample forecasts of Klarner (2012) and Hummel and Rothschild (2014): it correctly predicted 91% (10 of 11) of races, compared to their success rates of 70% and 79%, respectively. Overall, our model correctly predicts winners more reliably than extant models but with greater parsimony.
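The Brier score cited in note 19 is simply the mean squared error of the forecast probabilities. A toy example, with hypothetical win probabilities rather than the model’s actual output:

```python
# Brier score on hypothetical win probabilities (not the model's output).
import numpy as np

p = np.array([0.9, 0.8, 0.7, 0.4])  # forecast incumbent-party win probabilities
won = np.array([1, 1, 1, 0])        # 1 if the incumbent party actually won
brier = np.mean((p - won) ** 2)     # lower is better; 0 is a perfect forecast
coin = np.mean((0.5 - won) ** 2)    # a 50/50 guess scores 0.25 on any outcome
print(round(brier, 3), coin)        # 0.075 vs. 0.25
```

The score rewards both calibration and confidence, which is why the model’s .149 handily beats the coin-toss baseline of .25.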

VALIDATING THE LASSO-POPULARITY FUNDAMENTALS FORECAST: PREDICTING THE 2022 GUBERNATORIAL ELECTIONS

Thus far, the OLS results in table 1 and the patterns in figure 1 suggest our LASSO-Popularity forecast is substantially accurate. The model’s low LOOV MAE and low “before-the-fact” MAE, especially compared to a naïve model, indicate high out-of-sample predictive power. To further gauge out-of-sample robustness, we registered a preelection forecast for the governors’ races in 2022, an election cycle not included in our training data.Footnote 20 Because the SEAD dataset stops in 2020, we use Morning Consult’s state-level governor and presidential approval estimates to generate predictions.

The 2022 cycle included three races that, a priori, were considered highly likely to defy a fundamentals forecast: Massachusetts, Maryland, and Hawaii. Massachusetts and Maryland both had very popular, term-limited Republican governors. Given these states’ strong Democratic partisan makeup, any Republican other than the outgoing governor would have low odds of retaining the Governor’s Mansion. Hawaii’s outgoing incumbent hailed from the state’s dominant Democratic Party but was among the nation’s least-liked governors. Thus, long before Election Day, we knew our model was likely to substantially overestimate Republican performance in these three states.

Figure 2 graphs our forecast for the incumbent-party candidate in each 2022 governor’s race. Most were not expected to be competitive, and most were not. The median vote percentage won by the incumbent-party candidates was 56%. The combination of partisan sorting, polarization, and incumbent advantage makes it straightforward to forecast the incumbent party’s likelihood of winning. A higher hurdle is accurately predicting the percentage of votes won by the incumbent-party candidate. To assess the LASSO-Popularity model’s performance in 2022, we look at its error, both directionally and in absolute terms, and compare it to two other forecasts that covered all races in 2022.

Figure 2 LASSO-Popularity Forecast of 2022 Governors’ Races and Actual Vote Share

Forecast of governors’ races with 95% confidence intervals; actual vote share shown with gray diamonds.

The first alternative is 538’s Election Day prediction. Updated daily throughout the campaign, this model comes to be dominated by horse-race polls (replacing fundamentals) as Election Day nears. A second alternative is the baseline prediction from the Cook Political Report’s Partisan Voting Index (PVI). The 2022 PVI boils down to a very simple fundamentals model: the average of the past two presidential elections. Comparing our model to the PVI shows whether, and by how much, governor approval, state-level presidential approval, and incumbency improve accuracy over the heuristic of presidential election history. Comparing 538’s approach, with its final prediction made mere hours before voting began, against our strict fundamentals approach, based on data from two quarters before the election, tells us how much fundamentals anchor these races.

Figure 3A displays the forecast errors by state for 538’s Election Day model, our LASSO-Popularity model, and Cook’s PVI model. The largest errors of LASSO-Popularity were the expected cases—Hawaii, Maryland, and Massachusetts—and in the anticipated direction. PVI produced the largest errors, highlighted by Vermont’s popular Republican governor coasting to reelection despite running in a deeply Democratic state.

Figure 3 Forecast Errors across 538, LASSO-Popularity, and Cook’s PVI Models

Although the lack of a consistent error (bias) lends these models validity, their absolute error rates help assess their accuracy and reliability. Figure 3B shows the absolute error by model and state. To examine any potential partisan bias in the forecasts, we split the states by the party of the incumbent governor. With few exceptions, both 538 and our LASSO-Popularity forecast appear highly accurate, often with errors of less than five percentage points. Cook’s PVI is notably less accurate. We observe no evidence that any of these models performed better or worse in races where Republicans held the governor’s office.

For a more quantitative assessment of our model’s relative accuracy, we can examine the forecasts’ root mean squared error (RMSE) and MAE. RMSE weights larger errors as more influential, whereas the MAE weights all errors equally. Table 2 shows the RMSE and MAE for the three models with and without Hawaii, Maryland, and Massachusetts. Not surprisingly, the forecast made closest to the election, 538’s, performed best, and the forecast with the least data, Cook’s PVI, performed worst. Our LASSO-Popularity model performed well in this full out-of-sample forecast with an MAE of 4.16, substantially better than the LOOV MAE from the training data (5.06). If we remove the three a priori outlier cases, then the LASSO-Popularity’s performance, based on data collected two quarters or more before the election, equaled that of 538’s Election Day model. This substantiates the notion that, in most cases, the myriad of vote intention polls released during the campaign contribute little beyond the fundamentals.

Table 2 Forecast Accuracy of 2022 Gubernatorial Races by Model
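The RMSE/MAE contrast described above is easy to see with a hypothetical error vector in which one large miss sits among small ones:

```python
# RMSE vs. MAE on hypothetical forecast errors (percentage points).
import numpy as np

errors = np.array([1.0, 2.0, 2.0, 10.0])  # one outlier miss among small errors
mae = np.mean(np.abs(errors))             # weights all errors equally
rmse = np.sqrt(np.mean(errors ** 2))      # squares first, so the 10 dominates
print(mae, rmse)                          # 3.75 vs. about 5.22
```

Because squaring magnifies the outlier, a model prone to occasional large misses (such as a forecast blindsided by a Hawaii- or Maryland-type race) is penalized more by RMSE than by MAE.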

LASSO-POPULARITY FUNDAMENTALS FORECASTS OF 2024 GUBERNATORIAL ELECTIONS

Our LASSO-Popularity fundamentals forecast has thus far performed well, both within- and out-of-sample, demonstrating the technique’s utility and range of application. How robustly will it perform in 2024? Unlike in 2022, we have no a priori expectations that our model will perform poorly in any specific 2024 race. Leveraging the same model from the training dataset used to forecast the 2022 governors’ races, we make predictions for 11 gubernatorial races in 2024, again based on approval estimates from Morning Consult (table 3).

Table 3 LASSO-Popularity Forecasts for 2024 Gubernatorial Elections: Incumbent-Party Vote Share and Win Probability

* Incumbent running for reelection.

In all, the LASSO-Popularity model predicts the incumbent-party candidate, including governors seeking reelection, will win in 10 of the 11 races based on the climate in that state. North Carolina is the exception: the incumbent governor is not particularly popular (53% approval) and is not running for reelection, President Biden is unpopular in the state (40% approval), and the two-party vote has favored the GOP; our model therefore predicts the governorship will change party hands. However, the model predicts races in Washington and Delaware, two traditionally Democratic states, will be closer than in previous elections. Moreover, it forecasts that New Hampshire Republicans, buoyed by outgoing Governor Sununu’s high popularity (66%) and Biden’s unpopularity (43%), will retain the governorship. In sum, our LASSO-Popularity model presages a tough climate for Democrats in gubernatorial contests across most states in 2024. Deviations from model predictions will likely illustrate candidate-specific factors, including candidate quality, that sharply diverge from the “average” candidacies on which this fundamentals-based model rests.

SUPPLEMENTARY MATERIAL

To view supplementary material for this article, please visit http://doi.org/10.1017/S1049096524000866.

DATA AVAILABILITY STATEMENT

Research documentation and data that support the findings of this study are openly available at the PS: Political Science & Politics Harvard Dataverse at https://doi.org/10.7910/DVN/2KT0XS.

CONFLICTS OF INTEREST

The authors declare no ethical issues or conflicts of interest in this research.

Footnotes

1. See Stegmaier and Norpoth (2021) for a review.

2. For example, the 56 Electoral College contests can be modeled separately with state-level fundamentals (e.g., DeSart 2021; Jérôme et al. 2021; Enns and Lagodny 2021), with the assumption that each presidential election represents 56 independent tests of the model. Yet campaign strategies, media environments, and governance challenges render presidential election results highly correlated across states. See Turgeon and Rennó (2012) for an application to Brazil.

3. Pooling data across comparable countries can potentially provide similar leverage; see Lewis-Beck and Dassonneville (2015) and Bunker (2020).

4. State-level Electoral College forecasts face similar issues. Enns and Lagodny (2021) estimate state-level presidential approval using multilevel regression and poststratification on election-year national surveys. But their data do not cover most of our sample or include governors.

5. There are alternative tools that could be used for the same problem, notably Bayesian LASSO, random trees and forests, gradient boosting, and Bayesian model selection (via BIC), among others. We believe that LASSO’s simplicity recommends it.

6. See appendix 1 for information about this dataset.

7. See Carlin et al. (2023) for a discussion of this method, its long history of reliability and validity, and applications to national-level data.

8. A small body of work in single-country studies at the national (e.g., Bunker and Bauchowitz 2016; Gschwend et al. 2022) and subnational level (e.g., Kang and Oh 2024; Montalvo et al. 2019) uses Bayesian approaches for combining limited public opinion data in horse-race polls. Bunker’s (2020) approach addresses both the small-n problem, by pooling across Latin American elections, and data-quality issues in a cross-national study.

9. See appendix 2 for a description and Singer (2024) for the data.

10. Erikson and Wlezien (2012) suggest that fundamentals models gain predictive power about 200 days before elections.

11. Political variables were coded by the authors.

12. We expect any effects of these variables are captured by the president’s popularity, the state’s long-term partisan dynamics, and the governor’s popularity.

14. All unemployment data are from the Bureau of Labor Statistics. https://www.bls.gov/news.release/laus.toc.htm.

15. Data are from the Federal Reserve Bank of St. Louis’s FRED database.

16. LOOV, sometimes termed “jackknife,” is a cross-validation approach where N training datasets are created, each excluding one observation; the model is estimated and then used to predict the outcome of the excluded case.

17. The negative coefficient for incumbency is conditional on a 0% gubernatorial approval rating; for most ranges of governor popularity, incumbency is an advantage.

18. Nearly identical to the LOOV MAE of 6.8.

19. The forecast accuracy for predicting a win has a Brier score of .149, substantially better than the baseline score of a coin toss (.25). Forecast misses are more likely when the incumbent is not a candidate (62% accuracy) than when an incumbent runs (86% accuracy).

20. See appendix 4 for the forecast preregistered on Open Science Framework Sept. 24, 2022, a link to which will be included in the published article. We exclude the Alaska race because the electoral system changed to a nonpartisan, top-four blanket primary system. Morning Consult governor approval (https://pro.morningconsult.com/trackers/governor-approval-ratings) and presidential approval scores are released on a semi-regular basis. https://pro.morningconsult.com/trackers/joe-biden-approval-rating-by-state.

REFERENCES

Beyle, Thad, Niemi, Richard G., and Sigelman, Lee. 2002. “Gubernatorial, Senatorial, and State-Level Presidential Job Approval: The U.S. Officials Job Approval Ratings (JAR) Collection.” State Politics & Policy Quarterly 2 (3): 215–29.
Bunker, Kenneth. 2020. “A Two-Stage Model to Forecast Elections in New Democracies.” International Journal of Forecasting 36 (4): 1407–19.
Bunker, Kenneth, and Bauchowitz, Stefan. 2016. “Electoral Forecasting and Public Opinion Tracking in Latin America: An Application to Chile.” Política 54 (2): 207–33.
Carlin, Ryan E., Hartlyn, Jonathan, Hellwig, Timothy, Love, Gregory J., Martínez-Gallardo, Cecilia, and Singer, Matthew. 2023. “The Executive Approval Database: Conceptual and Empirical Bases.” In Economics and Politics Revisited: Executive Approval and the New Calculus of Support, eds. Hellwig, Timothy, and Singer, Matthew. New York: Oxford University Press, 32–53.
DeSart, Jay. 2021. “A Long-Range State-Level Forecast of the 2020 Presidential Election.” PS: Political Science & Politics 54 (1): 73–76.
Enns, Peter, and Lagodny, Julius. 2021. “Forecasting the 2020 Electoral College Winner: The State Presidential Approval/State Economy Model.” PS: Political Science & Politics 54 (1): 81–85.
Erikson, Robert S., and Wlezien, Christopher. 2012. The Timeline of Presidential Elections: How Campaigns Do (and Do Not) Matter. Chicago: University of Chicago Press.
Gelman, Andrew, Hullman, Jessica, Wlezien, Christopher, and Morris, George Elliot. 2020. “Information, Incentives, and Goals in Election Forecasts.” Judgment and Decision Making 15 (5): 860–80.
Gschwend, Thomas, Müller, Klara, Munzert, Simon, Neunhoeffer, Marcel, and Stoetzer, Lukas F. 2022. “The Zweitstimme Model: A Dynamic Forecast of the 2021 German Federal Election.” PS: Political Science & Politics 55 (1): 85–90.
Hummel, Patrick, and Rothschild, David. 2014. “Fundamental Models for Forecasting Elections at the State Level.” Electoral Studies 35: 123–39.
Jérôme, Bruno, Jérôme, Véronique, Mongrain, Philippe, and Nadeau, Richard. 2021. “State-Level Forecasts for the 2020 US Presidential Election: Tough Victory Ahead for Biden.” PS: Political Science & Politics 54 (1): 77–80.
Kang, Seungwoo, and Oh, Hee-Seok. 2024. “Forecasting South Korea’s Presidential Election via Multiparty Dynamic Bayesian Modeling.” International Journal of Forecasting 40 (1): 124–41.
Klarner, Carl. 2012. “State-Level Forecasts of the 2012 Federal and Gubernatorial Elections.” PS: Political Science & Politics 45 (4): 655–62.
Lewis-Beck, Michael, and Dassonneville, Ruth. 2015. “Forecasting Elections in Europe: Synthetic Models.” Research & Politics 2 (1): 2053168014565128.
Love, Gregory J., Carlin, Ryan E., and Singer, Matthew M. 2024. “Replication Data for: LASSOing the Governor’s Mansion: A Machine Learning Approach to Forecasting Gubernatorial Elections.” PS: Political Science & Politics. DOI: 10.7910/DVN/2KT0XS.
Martin, Gregory, and McCrain, Joshua. 2019. “Local News and National Politics.” American Political Science Review 113 (2): 372–84.
Montalvo, José G., Papaspiliopoulos, Omiros, and Stumpf-Fétizon, Timothée. 2019. “Bayesian Forecasting of Electoral Outcomes with New Parties’ Competition.” European Journal of Political Economy 59: 52–70.
Santosa, Fadil, and Symes, William W. 1986. “Linear Inversion of Band-Limited Reflection Seismograms.” SIAM Journal on Scientific and Statistical Computing 7 (4): 1307–30.
Singer, Matthew M. 2023. “Dynamics of Gubernatorial Approval: Evidence from a New Database.” State Politics & Policy Quarterly 23 (3): 306–26.
Singer, Matthew M. 2024. “State Executive Approval Dataset (SEAD) Version 1.5 (Presidential Approval).” Harvard Dataverse, V1. https://doi.org/10.7910/DVN/3IPCR1.
Stegmaier, Mary, and Norpoth, Helmut. 2021. “Election Forecasting.” In Oxford Bibliographies in Political Science, ed. Rick Valelly. DOI: 10.1093/OBO/9780199756223-0023.
Stimson, James A. 1991. Public Opinion in America: Moods, Cycles, and Swings. Boulder: Westview Press.
Tian, Shaonan, Yu, Yan, and Guo, Hui. 2015. “Variable Selection and Corporate Bankruptcy Forecasts.” Journal of Banking & Finance 52 (1): 89–100.
Turgeon, Mathieu, and Rennó, Lucio. 2012. “Forecasting Brazilian Presidential Elections: Solving the N Problem.” International Journal of Forecasting 28 (4): 804–12.
