Presidential prediction models using down ballot election results

Publication Type	honors thesis
School or College	College of Humanities
Department	Information Systems
Faculty Mentor	Chong Oh
Creator	McBride, Ryan
Title	Presidential prediction models using down ballot election results
Date	2023
Description	Research has demonstrated that polling in American elections has accurately predicted election results within a reasonable margin of error. However, due to the nature of the electoral college system, small deviations from polling averages in important states can result in large, surprising underdog victories. This has caused many in the American public to lose faith in polling and polling institutions, calling for the creation of better predictive models for ideological shifts in the country. We propose to utilize state ballot measure results to understand the correlation between the shifts in how American voters swing politically down ballot and the results of presidential elections. In particular, we want to determine whether the former can act as a precursor for how Americans will vote for president in future elections. This is conducted by labeling state ballot measures as Republican or Democratic party leaning, measuring whether the bills passed and by what margin, and correlating these scores with future presidential election results. Even if the results were not conclusive, our research found that ballot measures are correlated with future election results, making room for further studies using more and better data. This research demonstrates how data scientists and researchers may further use non-conventional data metrics to support data models that can better predict election results, especially in elections that are close calls as recent American presidential elections have been.
Type	Text
Publisher	University of Utah
Subject	ballot measure; elections; predictive models; polling; American politics
Language	eng
Rights Management	© Ryan McBride
Format Medium	application/pdf
Permissions Reference URL	https://collections.lib.utah.edu/ark:/87278/s6xaf0sn
ARK	ark:/87278/s6rvp9wr
Setname	ir_htoa
ID	2290174
OCR Text	PRESIDENTIAL PREDICTION MODELS USING DOWN BALLOT ELECTION RESULTS

by Ryan McBride

A Senior Honors Thesis Submitted to the Faculty of The University of Utah In Partial Fulfillment of the Requirements for the Honors Degree in Bachelor of Science In Information Systems

Approved:
Chong Oh, Thesis Faculty Supervisor
Glen Schmidt, Chair, Department of Information Systems
Sylvia D. Torti, PhD, Dean, Honors College
Vandana Ramachandran, Honors Faculty Advisor

December 2023
Copyright © 2023. All Rights Reserved.

ABSTRACT

Research has demonstrated that polling in American elections has accurately predicted election results within a reasonable margin of error. However, due to the nature of the electoral college system, small deviations from polling averages in important states can result in large, surprising underdog victories. This has caused many in the American public to lose faith in polling and polling institutions, calling for the creation of better predictive models for ideological shifts in the country. We propose to utilize state ballot measure results to understand the correlation between the shifts in how American voters swing politically down ballot and the results of presidential elections. In particular, we want to determine whether the former can act as a precursor for how Americans will vote for president in future elections. This is conducted by labeling state ballot measures as Republican or Democratic party leaning, measuring whether the bills passed and by what margin, and correlating these scores with future presidential election results. Even if the results were not conclusive, our research found that ballot measures are correlated with future election results, making room for further studies using more and better data.
This research demonstrates how data scientists and researchers may further use non-conventional data metrics to support data models that can better predict election results, especially in elections that are close calls as recent American presidential elections have been.

Keywords: ballot measure, elections, predictive models, polling, American politics

TABLE OF CONTENTS
ABSTRACT
INTRODUCTION
BACKGROUND
THEORY
DATA
METHODS
RESULTS
CONCLUSION
REFERENCES

INTRODUCTION

Political polling is one of the staples of the American political cycle. As elections heat up, sites like FiveThirtyEight aggregate hundreds of polls in an attempt to predict who the next president will be. But, as the 2016 election demonstrated, polls in close elections can struggle to predict election outcomes, especially at the regional level. One example is Pennsylvania, where in 2016 Trump won by less than a percentage point even though FiveThirtyEight’s aggregated polls showed Trump 4% behind Clinton leading up to the election (“Pennsylvania: 2016 election forecast”, 2016). In the month leading up to the election, out of the dozens of pollsters surveyed only one had Trump leading, and that poll came from a low-graded pollster (“Pennsylvania: 2016 election forecast”, 2016). Trump’s victory was barely within the margin of error, but the shock of his victory on a data level was deeply felt. This problem was exacerbated by sources like the New York Times claiming that Hillary Clinton had a 91% chance of winning based on their data models (Katz, 2016).

Polls are nevertheless one of the more reliable data sources that we have today. But there is still a benefit in exploring second-tier data metrics to see if there are ways to more accurately forecast black swan events like the 2016 election.
During that election, Trump won by less than 1% in Wisconsin, Pennsylvania and Michigan, which together carry 46 electoral votes, enough to decide the election. But this was only one of many US elections that were won by an incredibly narrow margin. The most infamous example may be the 2000 Bush-Gore election. In that election the winner was decided by Florida and its electoral votes, and Florida was decided by 537 votes. In an election with more than 100 million votes tallied, 537 decided the outcome. In short, although polling (as will be discussed later in the background section) is fairly accurate in US elections, American elections are extremely close. This means that accurate polling is not always enough to predict election results.

One second-tier data metric not often used in political data models is the state-level ballot measure, which on some level acts as an opinion poll for a large electoral constituency. Ballot measures are proposals, laws, or other issues that are voted on directly by the electorate and not by a representative of the electorate. Essentially, if an item is on a ballot and is not a vote for a specific individual (such as for senate or president), it is a ballot measure. A state-level ballot measure is any ballot measure that the entire constituency of a state votes on; this excludes ballot measures voted on by a smaller portion of a state’s constituency, such as county-level ballot measures.

This project investigates whether state-level ballot measures correlate with electoral changes to help improve electoral forecasting methods, thereby hopefully reducing election surprises while also benefiting political institutions’, media outlets’, and politicians’ abilities to gauge the desires of the American public. The first goal of this research is to be able to better predict the outcomes of close elections.
This project serves as an exploration into whether there are additional, often underused data that can be utilized to predict which direction election results are headed. If there are predictive undercurrents to polling that can be partially gauged through second-tier data sources such as state-level ballot measures, we could improve the models that use polling as the sole source of data.

The second goal of this study focuses on how data science is used in our current democratic process. Politicians and political institutions need accurate polling to cater to the American people and to shift strategies based on realities on the ground. Having better data models to help narrow down campaign issues is important in a representative democracy. Studies conducted by Seabrook & Dyck (2014) found that ballot measures do not increase general political knowledge for the public, but ballot measures may increase knowledge for pollsters on the direction in which an election is turning.

This paper is organized as follows. First, the background section gives a general overview of polling during the 2016 election and how that event inspired the ideas behind this project. In the theory section we dissect previous research on similar topics and how it affected elements of this paper. The data section details how we gathered and aggregated the data. The methods section details how we built models from our data to test our hypotheses. Finally, the results and discussion sections detail what our research found and how this research could be used to inform further studies.

BACKGROUND

Before examining the potential benefits of a study such as this, there must first be an examination of the extent to which our models need improving. This study is attempting to see if ballot measure data would improve our election prediction models, but how good are poll-based models currently?
It should be noted that even in a case like the very close 2016 American presidential election, most polls were fairly accurate (Kennedy et al., 2018). Research from the American Association for Public Opinion Research has suggested that this is especially true at the national level, where polling was incredibly accurate. Kennedy et al. (2018, p. 29) discovered that “The national polls were generally correct (with respect to the popular vote) and accurate by historical standards… The most glaring problems occurred in state level general election polling, particularly in the Upper Midwest.” That research further underscores the use of polling as the standard in political forecasting, demonstrating how polling is still the strongest tool in a political data scientist’s arsenal. In an election where polling was held under much scrutiny, the polling was still fairly good outside of a few regional areas.[1] It could even be thoughtfully argued that the betrayal many Americans felt at the polls might be better pointed at the electoral college.

At the same time, Kennedy et al. (2018) mention a late, unexpected swing in the polls that contributed to Donald Trump’s victory. During the last weeks of the campaign, Trump’s approval peaked in many states (Kennedy et al., 2018). This unexpected swing is a factor that motivated this project: could ballot measures foreshadow election outcomes?

[1] The upper Midwest was where the greatest swings occurred. Kennedy et al. (2018) point to the possibility that this was due to the overrepresentation of college-educated voters in polls.

If we understand polling and the public’s opinions as a measure of underlying trends, then no matter how accurate the polls are, unexpected polling changes can still occur. Polls are merely the current position of public opinion. If we can start to gauge the velocity and acceleration of public opinion, polling swings leading up to elections would be less surprising (more predictable) than before.
Ballot measures are one such way we might predict shifting electoral opinions. If Americans are voting more conservatively or liberally down ballot than they are for president, this could be indicative of a future electoral change. If we assume Americans are ideologically consistent, which is not a given, then [...] With American elections often being decided by margins of victory of less than one percent, understanding shifting electoral politics is of utmost importance.

THEORY

This study examines whether down ballot election results are correlated with future election results, and thus might be useful for predicting future presidential election results. To do this we build upon the findings of past research, especially when it comes to informing how we understand ballot measures and their data implications. Luckily, there is a plethora of research that looks at ballot measures and their political impact. However, there is a paucity of studies that examine the relationship between ballot measure results and larger election results, and this paper addresses this gap.

One such study, published in Political Research Quarterly, examined whether states in the process of legalizing gay marriage through ballot measure had strong Republican showings on election day. Garretson (2016, p. 286) found that the data challenged conventional thinking, explaining, “The conventional wisdom that measures to ban [same sex marriage] benefit Republican candidates more than Democratic candidates.” It was widely believed that pushes to ban same-sex marriage (SSM) would help bring more Republicans to the polls. If this were the case, it would provide a sturdy foundation for the claim that, at least in some instances, ballot measures help shift electoral results.
However, the outcomes in Garretson (2016) indicate that ballot measures have a more complex political relationship with the American public than it seems at first, insofar as the research shows there is no clear link between some highly partisan ballot measures and partisan turnout. Still, it should be noted that Garretson’s study examines ballot measures’ effects on same-year election results, which frames the causal relationship between ballot measures and election results as contained within a single election cycle. It remains to be tested whether ballot measures can predict the results of the next election.

Take this hypothetical scenario for example: there is a swing state that has voted slightly Republican over the last few elections. But, while there is a slight Republican bent, on state-level ballot measures the state has been voting more in favor of Democrat-led ballot initiatives. Its voters have been voting to raise the minimum wage, expand abortion rights, and impose heavier corporate regulations. Could this voting pattern be used meaningfully to predict a shift to the Democratic column in the next presidential election? Would a better understanding of this relationship, quantified over multiple states, allow organizations such as the New York Times to make more accurate election predictions, and avoid instances like in 2016 when they erroneously predicted Hillary Clinton to win with 91% certainty (Katz, 2016)?

When considering whether polls are or have been reliable, it is also important to examine the biases that are at play. Americans in general have a complicated relationship when it comes to belief in polls. Researchers from the University of Michigan have found that for many Americans their belief in polls is based on whether the poll agrees with their perspective (Pasek & Traugott, 2017). When asking Americans about the perceived credibility of polls they were being shown, Pasek & Traugott (2017, p.
432) found:

    For a typical respondent for whom all other variables were at their mean values, being in disagreement (as opposed to agreement) with the poll result decreased the credibility of a poll by 14 percent of the range of the outcome index.

This insight into human psychology demonstrates the need to consider our own biases when studying questions such as the one we undertake. It is a reminder to stay data focused when we approach data-driven political issues. Our belief in data models is often built on whether we like what they are telling us. As we move into our discussion of data and methods, studies like the abovementioned helped guide our research methods philosophy to be as neutral as possible.

Turning to the accuracy of polls in general, leading research suggests that polls are also fairly robust to different environmental factors. A study of the relationship between polling accuracy and election turnout conducted by British political scientist Jean-François Daoust (2021) found that, even given variable factors like turnout, polling is still highly accurate. This study used data spanning 206 elections held in 33 unique countries, including the US, from 1942 to 2017. Daoust (2021, p. 5) reported:

    The predicted poll error never substantially differs from 2 points, regardless of voter turnout. Hence, [the data] suggests that turnout is not substantially linked to the quality of polls’ predictions of the electoral outcomes, regardless of whether the estimation technique is parametric or not.

This finding encourages data scientists to double down on the current status quo and continue to use public polling as the pillar of election data models. Elections and the circumstances around them are complicated, and studies like these reaffirm that polls are not just generally accurate but also less affected by certain external factors than previously thought.
If so, then what is the utility in improving the models that predict presidential election outcomes? It appears that while American polls have been quite accurate for elections that are not won by close margins, additional data could be helpful in elections that are close. Many American presidential elections have been extremely close. In the last election in 2020, Joe Biden won Pennsylvania, Michigan, Wisconsin, and Georgia by 1.2%, 2.8%, 0.6% and 0.2% respectively. With even a 1% polling error, half of the swing states mentioned (Wisconsin and Georgia) could not have been called from the polls. This motivates us to seek better models with higher predictive power. If other data reveal useful insights that cover the blind spots of the current polling models, we could make poll-based models accurate enough to matter in states that are currently too close to call.

It is in this regard that we consider employing state ballot measures, which have heretofore not been used to predict the outcomes of presidential elections. By making our models more comprehensive and including more data like ballot measure results, the public can consume data such as poll results more thoughtfully and in context, making the reality of the numbers more trustworthy and informative.

One of the difficulties in conducting this study is the lack of previous, similar research. The previously mentioned study by Garretson (2016) examined whether there was any correlation between same-sex marriage (SSM) bills being on the ballot and Republican electoral victories. The study examined the effects of SSM bills over time and revealed the growing unpopularity of anti-LGBT bills. This type of research is highly beneficial for the larger field of political data science and helps us understand trends in American political culture. Our study, while also aiming to contribute to our understanding of ballot measures and their effects on general elections, has a different goal.
We specifically study the relationship between ballot measures and the subsequent presidential election. To do so, we 1) analyzed all ballot measures and not just those referring to a specific issue, 2) classified those ballot measures into categories based on political affiliation, and 3) examined whether the performance of ballot measures would tell us anything about the outcomes of future presidential elections.

DATA

The first basic decision that had to be made was which elections to use in our sample. We chose every presidential election from 1992 to 2020, looking at all state-level ballot measures for those years. This decision was made because we wanted to analyze modern elections, since modern electoral politics are much different from those of early American elections. Reagan is largely considered the first modern president, but we chose to start with Clinton’s elections because his campaigns led to the start of the modern political map. Reagan’s electoral victories were both landslides: he won more than 90% of the electoral vote in his first election in 1980 and 97% in his second in 1984. Reagan’s successor George H. W. Bush won a smaller, yet similar, landslide in 1988, winning 79% of the electoral college vote. Although Clinton also won by no small margin (68.7% of the electoral college vote), 1992’s political map looks more similar to the political map we have today. For example, the most populous state, California, started its Democratic run in 1992 and has not voted for a Republican since. The Northeast became a largely blue stronghold, and we started to see the emergence of familiar swing states such as Florida, Ohio, and Wisconsin in the 1992 election.

We also made the decision not to use ballot measure results from midterm elections. One reason for this is that midterms have lower turnouts, and we wanted to capture America’s total populace as best as possible.
Another reason for this decision was to make apples to apples comparisons. Midterm elections do not include a vote for president, which is the most powerful position in American politics, so there are different foci in the minds of voters when comparing midterm versus presidential election results. There is also a practical component to not including midterm elections:[2] it is not clear how to weight midterm election results versus presidential election results. Presidential elections have larger turnouts, but midterm elections are closer in time to the next presidential election. Due to the complications of the data, budget constraints, and a desire to capture a larger percentage of Americans’ opinions, we left midterm election results to be analyzed in future projects.

[2] Further, doing so would add a thousand or so ballot measures to be analyzed by our participants, which we did not have the budget for at the time. We leave this for future research.

The next data-related issue that we tackled was how to classify ballot measures. Our initial idea was to use two categories, ‘conservative’ and ‘liberal’. But these terms are subjective and mean different things to different people. Instead, we chose to rank bills on a scale from most Democrat leaning to most Republican leaning. This is advantageous for several reasons. First, it factors in the changing tides of political policy in America. In our surveys (described later), we asked respondents to rate ballot measures on a scale from most Democrat to most Republican based on the year that was being analyzed. This addresses the question of whether ballot measures, like crime bills, are left or right leaning in a given year. It is a fact of American politics that the parties have shifted positions over time, and factoring this into our equation is important. Because the parties have changed their exact positions on the political spectrum, asking respondents where they thought ballot measures fell on the political spectrum is not in itself useful to this study. If both parties are pushing for relatively conservative policies on an issue, finding out that conservative ballot measures surrounding that issue are being passed is not a good measure of what the voters’ partisan bent is. For this reason, we did not go with the political spectrum categorizations.

A critical reader may question whether Republican and Democrat are just as subjective as conservative and liberal, but the difference here is that parties have platforms where they openly state their opinions on issues. We asked participants to research the party platforms for the years they were judging, making this endeavor a lot less subjective than it may seem at first glance.

With these categorization issues out of the way, we were able to begin conducting our baseline research. We recruited a total of 8 respondents, all pursuing undergraduate political science degrees at the University of Utah, to parse through ballot measures from our specified years. There were 1,414 ballot measures over the 8 elections that we included in our study. The data for all were gathered from Ballotpedia, a digital encyclopedia of American politics and elections. Almost all states had ballot measures at least once over these elections, with the exception of Delaware, North Carolina, Tennessee, and Vermont. The respondents on average assessed and categorized about 530 ballot measures each, and each ballot measure was rated by three respondents. Ballot measures were rated from 1 to 7, with 1 being most Democrat leaning and 7 being most Republican leaning, leaving 4 as neutral. By neutral we mean not aligned with either party; usually bills that fit into this category have agreement by both political parties or concern matters that are incredibly mundane.
For example, there are many bills that modernize language in old laws or state constitutions, and political parties generally advocate removing antiquated language. It should be noted that, especially during earlier elections, many of the bills were marked as 4, since they often concerned orders of business more than partisan politics. For example, take the 1992 Oklahoma ballot measure “State Question 643”, which all three respondents marked as a 4. This bill removed obsolete language concerning railroads and, for technical reasons, required a ballot measure to get done.

Once we received three numeric ratings for each ballot measure, we averaged them to obtain a single score, labeled Ballot Political Score, describing how partisan a bill was. A higher score means a ballot measure is more Republican party leaning and a lower score means a ballot measure is more Democrat party leaning.

METHODS

Once this score was calculated, we created a formula to aggregate all the ballot measures for each state and year combination into a measure labeled Political Sway Score. This measure represents the average political sway for a given state and year. The formula, measured as a percent, is:

\[
\text{Political Sway Score}_{s,t} = \frac{1}{n} \sum_{i=1}^{n} \left[ \left( \text{Ballot Political Score}_{i} - 4 \right) \times \left( \text{Ballot Result}_{i} - 0.5 \right) \right]
\]

In this formula, n represents the number of ballot measures for an individual state s in year t. The Ballot Political Score and Ballot Result both come from the individual ballot measures pertaining to that state and year. The Ballot Political Score variable is the previously mentioned average of three 1-7 scores, and ranges from 1 to 7. The Ballot Result variable represents the share of votes in favor that the ballot measure received (0-100%). In order to capture by how much a ballot measure passes (or fails), we subtract 0.5 from this rate, 0.5 being the threshold needed to pass a ballot measure into law.
By doing so, we capture not only whether a ballot measure passed (or failed), but also the amount of support that it received. Essentially, in the above formula the Political Sway Score is the amount by which a ballot measure passed or failed relative to the threshold, weighted by the extent of partisanship of the measure, averaged over all ballot measures for each state-year. These scores are measured as a percent because Ballot Result is measured as a percentage whereas Ballot Political Score is unitless. This matches the election victory differential, which in US elections is measured in percentages.

Let’s illustrate with an example. If a bill passed with 51% of the votes and was given a score of 7, the score for that ballot measure would be 3%. A Republican-leaning ballot measure that passes yields a positive percentage, while a Democrat-leaning bill that passes yields a negative percentage. This also means that if the ballot measure mentioned above failed to pass with 49% of the votes, the resulting score would be -3%, so a Republican-leaning bill failing to pass garners the same result as an equally partisan Democrat-leaning bill passing by the same margin. The Political Sway Score thus incorporates the following two effects: the more Republican or Democrat a bill is, and the larger its margin of passage or failure, the larger the effect that bill has on the overall Political Sway Score.

This score represents how Republican or Democrat leaning a state’s vote was on down ballot measures in each considered year. Thus, a positive Political Sway Score means that in a given year a particular state supported more Republican-leaning proposals, whereas a negative score means that the state supported more Democrat-leaning proposals.

Our main outcome of interest is the result of presidential elections. In particular, we are interested in knowing whether our Political Sway Score measure helps predict the results of the subsequent presidential election.
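The Political Sway Score described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used in the study: the function names and the `(ratings, yes_share)` data layout are hypothetical, and vote shares are expressed as fractions, so a result of 0.03 corresponds to 3%.

```python
from statistics import mean

NEUTRAL_SCORE = 4.0    # midpoint of the 1-7 rating scale
PASS_THRESHOLD = 0.5   # vote share needed for a measure to pass


def ballot_political_score(ratings):
    """Average the respondents' 1-7 ratings for one ballot measure."""
    return mean(ratings)


def measure_contribution(political_score, yes_share):
    """Partisanship (score - 4) times the margin over the pass threshold.

    yes_share is a fraction in [0, 1]; the result is also a fraction,
    so 0.03 corresponds to 3%.
    """
    return (political_score - NEUTRAL_SCORE) * (yes_share - PASS_THRESHOLD)


def political_sway_score(measures):
    """Average contribution over all measures for one state-year.

    measures: list of (ratings, yes_share) pairs (hypothetical layout).
    """
    if not measures:
        raise ValueError("a state-year needs at least one ballot measure")
    return mean(
        measure_contribution(ballot_political_score(ratings), yes_share)
        for ratings, yes_share in measures
    )


# Worked example from the text: a measure rated 7 that gets 51% of the vote.
example = measure_contribution(7, 0.51)   # about 0.03, i.e. 3%
```

Under this sketch, a measure rated 7 that fails with 49% and a measure rated 1 that passes with 51% both contribute about -0.03, which is the symmetry the text describes.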
We compute the state’s differential percentage vote for president in an election year (labeled Presidential Election Score). This score is calculated by subtracting the Democratic party’s percentage share of the vote from that of the Republican party in a presidential election year, such that a positive value indicates a win by the Republican candidate, while a negative value indicates a win by the Democratic candidate. For example, if state A has a Political Sway Score of -10% in the 2008 election, this would mean that the state is Democrat leaning (in its ballot measure votes). In the subsequent election in 2012, if the Democratic candidate beat the Republican candidate in that state 55% to 45%, then the Presidential Election Score would be -10%.

To answer our focal question of whether data about down ballot election results and their political leanings could be used to improve the predictability of presidential elections, we ran basic models to see if there was a correlation between our Political Sway Score and our Presidential Election Score. We began by creating two different models. In the baseline model, we include only information on which party each state voted for in the previous election and by what margin of victory. In the main model, we add the Political Sway Score for that state. The target variable was the party differential vote percentage in the subsequent presidential election. This allows us to test whether our Political Sway Score holds any additional predictive value over and beyond the effect of past election outcomes.

To analyze the relationship between our Political Sway Score and future election results, we used simple correlation to see if this metric was worth investigating further. We ran a few simple correlation models that broke down results by year and region.
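As a rough illustration of how the model inputs described above might be assembled, the following Python sketch computes the Presidential Election Score and pairs each state-year Political Sway Score with the result of the following presidential election. The function names and dictionary layouts are hypothetical, and treating the same-year presidential result as the baseline model's "previous election" margin is an assumption, not something the text specifies.

```python
def presidential_score(rep_pct, dem_pct):
    """Republican minus Democratic vote share, in percentage points."""
    return rep_pct - dem_pct


def build_model_rows(sway_scores, pres_scores, cycle_years=4):
    """Pair each state-year sway score with the next election's result.

    sway_scores: {(state, year): Political Sway Score}
    pres_scores: {(state, year): Presidential Election Score}
    Both layouts are hypothetical. State-years without a result for the
    same year or for the following election are skipped.
    """
    rows = []
    for (state, year), sway in sorted(sway_scores.items()):
        previous = pres_scores.get((state, year))
        target = pres_scores.get((state, year + cycle_years))
        if previous is None or target is None:
            continue
        rows.append({
            "state": state,
            "year": year,
            "prev_margin": previous,   # baseline-model predictor (assumed)
            "sway": sway,              # added in the main model
            "target": target,          # next election's score
        })
    return rows


# Example from the text: state A has a sway score of -10% in 2008 and a
# 55-45 Democratic win in 2012. The 2008 presidential result is made up.
pres = {
    ("A", 2008): presidential_score(47.0, 53.0),
    ("A", 2012): presidential_score(45.0, 55.0),
}
rows = build_model_rows({("A", 2008): -10.0}, pres)
```

With rows in this shape, a baseline model would use only `prev_margin` to predict `target`, and the main model would add `sway` as a second predictor.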
We also ran one correlation between Political Sway Score and same-year election results as a comparative metric, to try to ascertain whether the Political Sway Score was more of a lagging or leading indicator.

RESULTS

We ran simple correlation statistics to help measure the relationship between down ballot election results and future presidential election results. Below is a table of our overall findings. The Dependent Variable column shows which outcome and which subset of the data each model used. We explain these findings individually after the table.

Dependent Variable                                      P-Value   b1      b0
Presidential Score Next Election                        1.95E-06  0.35    1.56
Presidential Score Same Year                            1.54E-08  0.39    0.71
∆ Presidential Score                                    0.244     -0.04   0.85
Presidential Score Next Election in 1992                0.199     -0.29   -6.19
Presidential Score Next Election in 1996                0.021     0.47    5.27
Presidential Score Next Election in 2000                0.014     0.55    7.39
Presidential Score Next Election in 2004                0.036     0.29    -2.47
Presidential Score Next Election in 2008                0.159     0.4     0.24
Presidential Score Next Election in 2012                0.003     0.61    3.15
Presidential Score Next Election in 2016                0.113     0.31    1.77
Presidential Score Next Election in American Northeast  0.941     0.01    -16.88
Presidential Score Next Election in American Midwest    0.079     0.17    5.91
Presidential Score Next Election in American South      0.025     0.22    8.34
Presidential Score Next Election in American West       0.012     0.61    2.67

Simple Correlation

We first examined whether there was any correlation between our independent and dependent variables. Using Pearson correlation, we found the coefficient to be 0.347. In Figure 1, we plot the Political Sway Score on the X-axis and the party differential percentage score for the next election, or Presidential Score Next Election, on the Y-axis. We then fit an OLS regression and found that the presidential score of the next election equaled 0.35 multiplied by the Political Sway Score, plus an intercept of 1.56.

Figure 1. Scatterplot of Presidential Score Next Election Vs Political Sway Score

Interpreted in real terms, the coefficient means that for every one unit of Political Sway Score percentage in the direction of a party, we can expect that state on average to vote 0.35% more for that same party in the next election. So, if the Political Sway Score shows that a state voted 1% more for Republican-leaning initiatives, we can expect that state to vote 0.35% more for Republicans in the next election.

As a comparison, below is a graph comparing same-year election results with the Political Sway Score.

Figure 2. Scatterplot of Presidential Score Same Election Vs Political Sway Score

What is interesting about this graph is that it shows that the Political Sway Score is slightly more correlated with same-year election results than with future election results. This tells us that our Political Sway Score is not a leading indicator, and instead appears to be more of a lagging indicator.

Correlation by Year

Next, we looked at our correlation coefficients one year at a time. The results of these models were difficult to interpret, as there was no clear pattern in which years were statistically significant and which were not. The years with a p-value less than 0.05 were 1996, 2000, 2004, and 2012, whereas 1992, 2008 and 2016 were not statistically significant. For the years that were statistically significant, the coefficients in order of year were 0.466, 0.552, 0.127 and 0.608 respectively.

Figure 3. Scatterplot of Presidential Score Next Election Vs Political Sway Score in 1996
Figure 4. Scatterplot of Presidential Score Next Election Vs Political Sway Score in 2000
Figure 5. Scatterplot of Presidential Score Next Election Vs Political Sway Score in 2004
Figure 6. Scatterplot of Presidential Score Next Election Vs Political Sway Score in 2012

There are a few notable things to discern from these graphs.
One interesting item is that all of these years had strong Republican outliers or generally leaned Republican on our political sway score. This idea is further substantiated by the fact that the graphs for the years that were not statistically significant leaned more heavily Democrat overall. One such graph, from 2008, is shown below. There are myriad speculative conclusions one could draw from these facts, but any data-driven conclusions would require further studies to test.

Figure 7. Scatterplot of Presidential Score Next Election Vs Political Sway Score in 2008

Correlation by Region

Our final breakdown was by region. We broke the US into four regions: the Northeast, Midwest, South, and West. We assigned each state to a region using the United States Census Bureau’s definitions of the four statistical regions of the US (United States Census Bureau). We found that the Northeast and Midwest do not have significant correlations, while the South and West do. This may be due to differences in sample size, since the Northeast and Midwest have the smallest sample sizes of the four regions, but it is also clear from the data that these two regions have much weaker correlations. Below are the graphs for the South and West regions of the US. The West had a coefficient of 0.613. The South had a coefficient of 0.216.

Figure 8. Scatterplot of Presidential Score Next Election Vs Political Sway Score in American South
Figure 9. Scatterplot of Presidential Score Next Election Vs Political Sway Score in American West

There is a question as to why different regions have different correlative factors. This may have to do with the historical political and ideological trends that these regions have been subject to, but again, untangling the variables that might cause these effects goes far outside the scope of this paper.
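The year and region breakdowns above amount to computing the same correlation separately within each subset of the data. A minimal sketch of that grouping step follows, under stated assumptions: the record layout, the helper name `correlation_by_group`, and all numbers are hypothetical and are not taken from the thesis data set. It also reports each group’s sample size, since small groups make a weak or insignificant correlation harder to interpret.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation; returns None when either variable has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def correlation_by_group(rows):
    """Group (key, sway_score, next_presidential_score) rows by key.

    Returns {key: (sample_size, r)}; r is None for groups with fewer
    than three observations, where a correlation is not informative.
    """
    groups = defaultdict(lambda: ([], []))
    for key, sway, score in rows:
        groups[key][0].append(sway)
        groups[key][1].append(score)
    result = {}
    for key, (xs, ys) in groups.items():
        r = pearson_r(xs, ys) if len(xs) >= 3 else None
        result[key] = (len(xs), r)
    return result


# HYPOTHETICAL records: (region, political sway score, next presidential score).
# The key could equally be an election year for the by-year breakdown.
records = [
    ("West", -6.0, -4.0),
    ("West", 2.0, 3.0),
    ("West", 9.0, 7.0),
    ("West", 14.0, 12.0),
    ("Northeast", -12.0, -15.0),
    ("Northeast", -3.0, -20.0),
    ("Northeast", 5.0, -14.0),
]

by_region = correlation_by_group(records)
for region, (n, r) in by_region.items():
    label = "n/a" if r is None else f"{r:.3f}"
    print(f"{region}: n = {n}, r = {label}")
```

Passing the election year instead of the region as the first element of each record gives the by-year breakdown with the same function.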
CONCLUSION

The goal of the project was to find out whether state-level ballot results are useful in discerning future presidential election results. In short, we see a correlation between the two variables, although it is fairly weak. Despite this, there are still many things we can derive from the data and consequential conclusions to be drawn from it.

First, although our correlations are mild, they still exist. That is, state-level ballot results are at least somewhat correlated with the general election results from that state. This is further corroborated by our finding that ballot measure results are also mildly correlated with same-year presidential election results. This tells us that there is some connection between a populace’s ideology and the success of a ballot measure among that populace. This is not very surprising, but the fact that the correlation is so mild is.

There are a few takeaways for those working in politics based on this data. The first and main takeaway is that if you can’t win an election in a state, it might be a better idea to pass a ballot measure in that state. Our data shows that Americans’ political ideology when voting for president and when voting down ballot are not very connected. There are many areas in our study where Democrats had not won a presidential election in a state for decades, but Democrat-leaning ballot measures were still passing, and vice versa. There are ways other than winning elections to make real changes in policy, and ballot measures may be a way to separate Americans from their party politics.

Another takeaway from our correlation data is that some regions are more ideological than others. The South and West regions of America are much more likely to vote for ballot measures that agree with their elected president’s ideology than the Midwest and Northeast are. This may mean that our first takeaway is truer in the Midwest and Northeast.
We now discuss some limitations of our data and models. First, we worked with a fundamentally small sample size when parsing through data (due to a lack of funds). For our data, each ballot measure was reviewed by three undergraduate political science majors. Future studies could improve on this by using seasoned experts to sort the data and by getting more than three respondents for each ballot measure. Second, our political sway score is designed for a two-party political system and would need to be modified before being applied to situations with more than two parties. Notwithstanding these limitations, this project aims to show the validity and promise of using unique data gathering methods to supplement current data models and hopes to spur further research into the topic.

There are many places further research could go from here. One area begging to be explored is a similar study that uses only highly contentious ballot measures. For example, if a study examined only ballot measures with 10 million dollars or more spent on advertising, it might show predictive effects on election results with more robust correlations than this study found. Another area of further research could be looking into how ballot measures are correlated with previous election results, to see how past elections affect future ballot measures. This could show ballot measure results to be a lagging indicator rather than a leading indicator in partisan politics and once again shift how we think about the nature of ballot measures. Whether or not ballot measures are used in predictive models in the future, the field of political data science will continue to grow, searching for a way to reveal more of the future.

REFERENCES

Pennsylvania: 2016 election forecast. FiveThirtyEight. (2016, November 8). Retrieved April 29, 2022, from https://projects.fivethirtyeight.com/2016-election-forecast/pennsylvania/

Kuru, O., Pasek, J., & Traugott, M. W. (2017). Motivated reasoning in the perceived credibility of public opinion polls. Public Opinion Quarterly, 81(2), 422–446. https://doi.org/10.1093/poq/nfx018

Katz, J. (2016, October 18). Hillary Clinton has a 91% chance to win. The New York Times. Retrieved December 12, 2022, from https://www.nytimes.com/newsgraphics/2016/10/18/presidential-forecast-updates/newsletter.html

Seabrook, N. R., Dyck, J. J., & Lascher, E. L. (2014). Do ballot initiatives increase general political knowledge? Political Behavior, 37(2), 279–307. https://doi.org/10.1007/s11109-014-9273-5

Kennedy, C., Blumenthal, M., Clement, S., Clinton, J. D., Durand, C., Franklin, C., McGeeney, K., Miringoff, L., Olson, K., Rivers, D., Saad, L., Witt, G. E., & Wlezien, C. (2018). An evaluation of the 2016 election polls in the United States. Public Opinion Quarterly, 82(1), 1–33. https://doi.org/10.1093/poq/nfx047

Garretson, J. J. (2014). Changing with the Times. Political Research Quarterly, 67(2), 280–292. https://doi.org/10.1177/1065912914521897

Daoust, J.-F. (2021). Blame it on turnout? Citizens’ participation and polls’ accuracy. The British Journal of Politics and International Relations, 23(4), 736–747. https://doi.org/10.1177/1369148120986092

United States Census Bureau. (n.d.). Statistical Groupings of States and Counties. Retrieved December 12, 2022, from https://www2.census.gov/geo/pdfs/reference/GARM/Ch6GARM.pdf

Name of Candidate: Ryan McBride
Date of Submission: May 8, 2023
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6rvp9wr