A Quick Look at the Election 2018 Data

First Look at the Election 2018 Data

Disclaimer: I am not a political analyst and have no intention of becoming one. The things I look at are not necessarily great insights into elections. My goal in these posts is just to show how to use R for some data analytics.

According to the news, midterm voter turnout hit a 50-year high. I was curious whether this increase in the number of votes favored any particular party, so I decided to take a first look at the election data. To reproduce the plots and numbers in this post, you have to download the following two datasets. I provide the R code for reproducing the numbers and plots at the end of each section.

Governor Race - 2014 vs. 2018

First, I compared gubernatorial election results in 2014 and 2018 using the 36 states that had a governor race. In this comparison, I defined two variables:

  • The percent change in the number of votes from 2014 to 2018
  • The change in the percent difference between the Democratic and Republican parties (Democratic share minus Republican share, 2018 minus 2014)
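
As a minimal sketch of how these two variables are computed, the snippet below uses made-up vote counts for a single state (they are not taken from the datasets) and mirrors the arithmetic in the R code at the end of this section. The last line repeats the same calculation with the Nevada margins quoted in this post.

```r
# Hypothetical vote counts for one state (NOT real data), used only to
# illustrate the two definitions above.
votes <- data.frame(
  year  = c(2014, 2018),
  dem   = c(400, 600),
  rep   = c(550, 550),
  other = c(50, 100)
)
votes$total <- votes$dem + votes$rep + votes$other

# Democratic minus Republican vote share, in percentage points
votes$margin <- (votes$dem - votes$rep) / votes$total * 100

# Variable 1: percent change in the number of votes from 2014 to 2018
pct_change_votes <- (votes$total[2] / votes$total[1]) * 100 - 100

# Variable 2: change in the margin (positive = shift toward Democrats)
net_shift <- votes$margin[2] - votes$margin[1]

# Same arithmetic with the Nevada margins reported in this post:
# Republicans led by 46.7 points in 2014, Democrats led by 4.1 in 2018.
nevada_shift <- 4.1 - (-46.7)

print(round(c(pct_change_votes, net_shift, nevada_shift), 1))
```

With these toy numbers, the total goes from 1,000 to 1,250 votes (a 25% increase) and the margin moves from -15 to 4 points (a shift of 19 points toward Democrats).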

In the picture below, a vertical line (x=0) and a horizontal line (y=0) create four zones. States in the upper right had more votes in 2018 than in 2014 and a net shift in favor of the Democratic Party. States in the bottom right had more votes in 2018 than in 2014 and a net shift in favor of the Republican Party. States in the upper left had fewer votes in 2018 than in 2014 and a net shift in favor of the Democratic Party. No state in the table below falls in the bottom-left zone (fewer votes and a net shift in favor of the Republican Party). I labeled the states that are most interesting (to me).

  • The most striking shift appears in Nevada, the state with the largest increase in the number of votes in 2018 compared to 2014 (76.4%). The increase clearly favored the Democratic Party. The Republican Party led by 46.7% in 2014, while the Democratic Party led by 4.1% in 2018, indicating a net 50.8% shift in favor of Democrats.

  • Similarly big shifts in favor of the Democratic Party occurred in South Dakota (42.5%), Ohio (26.8%), Tennessee (26.8%), and Illinois (19.6%), along with considerable increases in the number of votes: South Dakota (23.5%), Ohio (43.4%), Tennessee (61.6%), and Illinois (24.4%).

  • There is a big increase in the number of votes in Texas (76.0%), New York (59.6%), and Georgia (54.2%). However, the shift in favor of the Democratic Party was smaller: Texas (7.1%), New York (8.9%), and Georgia (6.3%).

  • In California and Florida, the shift is almost 0%, indicating that the difference in vote share between the two parties stayed about the same even though the number of votes increased, by 37.2% in Florida and 7.7% in California.

  • Massachusetts is another interesting one. It is the state with the largest shift in favor of the Republican Party (31.7%), while the number of votes increased by 20.4%.

You can look at the rest and dig deeper by yourself.
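
One way to dig deeper is to tag every state with the zone it falls in. The snippet below is only a sketch: `classify_zone` is a hypothetical helper that is not part of the original code, and the three example rows are copied from the table in this section. Points that sit exactly on a zero line are assigned to the "more votes" or "toward Democrats" side, which is an arbitrary choice.

```r
# Hypothetical helper: name the zone of the plot a point falls in,
# based on the sign of the two variables.
classify_zone <- function(change_votes, shift) {
  turnout <- ifelse(change_votes >= 0, "more votes", "fewer votes")
  party   <- ifelse(shift >= 0, "toward Democrats", "toward Republicans")
  paste(turnout, party, sep = ", ")
}

# Three rows copied from the table in this section
example <- data.frame(
  state        = c("Nevada", "Massachusetts", "Maine"),
  shift        = c(50.8, -31.7, 13.4),
  change_votes = c(76.4, 20.4, -4.0)
)
example$zone <- classify_zone(example$change_votes, example$shift)
print(example)
```

After running the code at the end of this section, the same helper could be applied to the full data with `classify_zone(states$p.increase.vote.gov, states$p.diffgov)`. States with missing values would need to be dropped first.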

State Shift (%) Change in Votes (%)
Nevada 50.8 76.4
South Dakota 42.5 23.5
New Mexico 28.5 34.7
Tennessee 26.8 61.6
Ohio 26.8 43.4
Illinois 19.6 24.4
Iowa 18.8 16.8
Hawaii 16.6 6.7
Maine 13.4 -4.0
Michigan 13.1 32.1
Rhode Island 10.9 15.3
New York 8.9 59.6
Kansas 8.4 20.8
Alabama 8.1 45.7
Texas 7.1 76.0
Pennsylvania 7.0 42.0
Wisconsin 6.9 10.9
Minnesota 6.4 33.8
Georgia 6.3 54.2
South Carolina 6.1 36.5
Colorado 3.8 -2.7
Oklahoma 2.7 44.0
Florida 0.4 37.2
California 0.1 7.7
Nebraska -0.1 29.3
Oregon -0.2 25.5
Connecticut -0.6 25.2
Arizona -5.7 15.1
Wyoming -6.2 28.8
Idaho -7.1 36.5
Maryland -8.8 29.7
New Hampshire -12.1 17.3
Vermont -16.1 40.9
Arkansas -19.7 4.5
Massachusetts -31.7 20.4

require(ggrepel)
require(ggplot2)
require(scales)
require(knitr)
options(scipen=999)

#########################
setwd("C:/Users/c.zopluoglu1/Documents")
past    <- read.csv("https://raw.githubusercontent.com/MEDSL/2018-elections/master/election-context-2018.csv")
present <- read.csv("election2018.csv")
d       <- merge(past,present,by=c("state","county"),all=TRUE)
#########################

states <- aggregate(cbind(repsen18,demsen18,othersen18,
                          trump16,clinton16,otherpres16,
                          demgov14,repgov14,othergov14,
                          demgov18,repgov18,othergov18) ~ state,
                          data = d, sum,na.action=NULL,na.rm=TRUE)

states$total.vote2016pres   <- states$trump16 + states$clinton16 + states$otherpres16
states$total.vote2014gov    <- states$demgov14 + states$repgov14 + states$othergov14
states$total.vote2018gov    <- states$demgov18 + states$repgov18 + states$othergov18
states$total.vote2018sen   <- states$repsen18 + states$demsen18 + states$othersen18


states$p.trump16   <- states$trump16/states$total.vote2016pres
states$p.clinton16 <- states$clinton16/states$total.vote2016pres
states$p.demgov14   <- states$demgov14/states$total.vote2014gov
states$p.repgov14   <- states$repgov14/states$total.vote2014gov
states$p.demgov18   <- states$demgov18/states$total.vote2018gov
states$p.repgov18   <- states$repgov18/states$total.vote2018gov

states$p.diffgov14 <- states$p.demgov14 - states$p.repgov14 
states$p.diffgov18 <- states$p.demgov18 - states$p.repgov18 

states$p.diffgov   <- (states$p.diffgov18 - states$p.diffgov14 )*100
states$p.increase.vote.gov <- (states$total.vote2018gov/states$total.vote2014gov)*100-100

labels = as.character(states[,1])
labels[-c(4,5,9,10,13,19,20,21,28,32,35,41,42,43)] = ""

ggplot(states,aes(p.increase.vote.gov,p.diffgov)) +
  geom_point(aes(size=total.vote2018gov,color=total.vote2014gov)) +
  theme_bw() + 
  theme(plot.title   = element_text(size=10),
        axis.title.y = element_text(size=rel(0.7)),
        axis.title.x = element_text(size=rel(0.7)),
        axis.text.y  = element_text(size=rel(0.7)),
        axis.text.x  = element_text(size=rel(0.7)),
        legend.text  = element_text(size=rel(0.5)),
        legend.title = element_text(size=rel(0.5)))+
  xlab("% Change in the Number of Total Votes")+
  ylab("% Change in Difference between Democratic and Republican Party")+
  xlim(c(-10,90))+
  geom_hline(yintercept=0, linetype=2, color="black", size=.5)+
  geom_vline(xintercept=0, linetype=2, color="black", size=.5)+
  labs(title="Gubernatorial Elections (2014 vs. 2018)",
       size ="Total Votes in 2018",
       color="Total Votes in 2014")+
  geom_label_repel(aes(label=labels),hjust=0.5,vjust=-0.5,size=2)


tab.states <- na.omit(states[,c("state","p.diffgov","p.increase.vote.gov")])
tab.states <- tab.states[order(tab.states[,2],decreasing=TRUE),]
colnames(tab.states) <- c("State","Shift (%)","Change in Votes (%)")
kable(tab.states,digits=1,row.names=FALSE) 

Governor Race 2018 vs. Presidential Race 2016

Some may argue that this is a better comparison for understanding the dynamics after President #45. This comparison is also limited to the 36 states that had a governor race. I looked at the same two variables.

  • South Dakota had one of the smallest losses in the number of votes (-8.4%) along with a net shift of 26.4% in favor of the Democratic Party. A similar level of shift in favor of the Democratic Party also occurred in Kansas (25.1%) and Oklahoma (24.3%), with larger reductions in the number of votes: Kansas (-13.5%) and Oklahoma (-18.4%).

  • In Florida and New York, there is no noticeable shift in favor of either party.

  • In Massachusetts, there is a net shift of 61.2% in favor of the Republican Party, while the number of votes dropped by 20.9%. Maryland, Vermont, Arizona, Connecticut, and California are the other states with a net shift of more than 10% in favor of the Republican Party.

State Shift (%) Change in Votes (%)
South Dakota 26.4 -8.4
Kansas 25.1 -13.5
Oklahoma 24.3 -18.4
Pennsylvania 17.5 -19.7
Minnesota 10.1 -10.5
Idaho 9.7 -13.1
Michigan 9.2 -13.6
Alabama 8.6 -19.4
Wyoming 6.6 -21.0
Iowa 6.4 -16.1
Nebraska 6.3 -19.7
Maine 5.8 -21.3
South Carolina 5.8 -20.0
New Mexico 5.6 -15.1
Tennessee 5.3 -12.9
Ohio 3.9 -21.4
Georgia 3.5 -4.8
Wisconsin 1.9 -10.2
Colorado 1.7 -30.3
Nevada 1.7 -14.2
Florida 0.5 -13.5
Rhode Island 0.0 -19.8
New York -0.3 -26.4
Illinois -2.3 -21.5
Hawaii -3.2 -8.9
Texas -4.3 -7.6
Oregon -6.0 -15.8
Arkansas -6.7 -21.9
New Hampshire -7.4 -23.5
California -11.3 -50.6
Connecticut -11.8 -17.7
Arizona -14.2 -34.6
Maryland -39.9 -23.5
Vermont -41.4 -13.4
Massachusetts -61.2 -20.9

states$p.diffpres16 <- states$p.clinton16 - states$p.trump16
states$p.diffgov18  <- states$p.demgov18 - states$p.repgov18 

states$p.diffpresgov       <- (states$p.diffgov18 - states$p.diffpres16)*100
states$p.increase.vote.gov <- (states$total.vote2018gov/states$total.vote2016pres)*100-100

labels = as.character(states[,1])
labels[-c(2,4,6,9,16,20,21,23,32,36,38,41,45)] = ""

ggplot(states,aes(p.increase.vote.gov,p.diffpresgov)) +
  geom_point(aes(size=total.vote2018gov,color=total.vote2016pres)) +
  theme_bw() + 
  theme(plot.title   = element_text(size=10),
        axis.title.y = element_text(size=rel(0.7)),
        axis.title.x = element_text(size=rel(0.7)),
        axis.text.y  = element_text(size=rel(0.7)),
        axis.text.x  = element_text(size=rel(0.7)),
        legend.text  = element_text(size=rel(0.5)),
        legend.title = element_text(size=rel(0.5)))+
  xlab("% Change in the Number of Total Votes")+
  ylab("% Change in Difference between Democratic and Republican Party")+
  xlim(c(-55,0))+
  geom_hline(yintercept=0, linetype=2, color="black", size=.5)+
  geom_vline(xintercept=0, linetype=2, color="black", size=.5)+
  labs(title="Gubernatorial Election 2018 vs. Presidential Election in 2016",
       size ="Total Votes in 2018",
       color="Total Votes in 2016")+
  geom_label_repel(aes(label=labels),hjust=0.5,vjust=-0.5,size=2)

tab.states2 <- na.omit(states[,c("state","p.diffpresgov","p.increase.vote.gov")])
tab.states2 <- tab.states2[order(tab.states2[,2],decreasing=TRUE),]
colnames(tab.states2) <- c("State","Shift (%)","Change in Votes (%)")
kable(tab.states2,digits=1,row.names=FALSE)

Focusing on FL Counties - Gubernatorial Race 2014 vs. 2018

I made the same comparison between the 2014 and 2018 gubernatorial elections at the county level in FL.

  • There is a net shift in favor of the Republican Party in the majority of counties, particularly the small ones.

  • Three relatively large counties, Duval, Orange, and Osceola, show the largest net shift in favor of the Democratic Party, along with a substantial increase in the number of votes.

  • In Hillsborough, there is also a net shift in favor of the Democratic Party, though a smaller one, again with a substantial increase in the number of votes.

  • In Miami-Dade, the net shift of 1.8% in favor of the Democratic Party is barely noticeable, even though the number of votes increased by 53.5%.

  • In three other large counties, there is a net shift in favor of the Republican Party: Palm Beach (3.9%), Broward (1.8%), and Pinellas (8.3%).
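
The small-versus-large pattern described above can be checked with a simple cross-tabulation. The sketch below runs on toy data: the county names and numbers are hypothetical, not the real results, and the 100,000-vote cutoff for a "large" county is my own assumption. The column names match the ones created in the R code at the end of this section, so `toy` could be swapped for `fl` after running that code.

```r
# Toy county-level data (hypothetical names and numbers, NOT real results)
toy <- data.frame(
  county            = c("A", "B", "C", "D", "E", "F"),
  total.vote2018gov = c(5000, 8000, 12000, 300000, 450000, 600000),
  p.diffgov         = c(-20, -15, -8, 6, -4, 12)
)

# Assumed cutoff: a county is "large" if it cast 100,000+ votes in 2018
toy$size      <- ifelse(toy$total.vote2018gov >= 100000, "large", "small")
toy$direction <- ifelse(toy$p.diffgov >= 0, "Democratic", "Republican")

# Count counties by size class and direction of the net shift
tab <- table(toy$size, toy$direction)
print(tab)
```

A table like this only counts counties. It says nothing about how many votes each county contributes, which is why many small counties shifting one way can coexist with a statewide shift close to zero.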

County Shift (%) Change in Votes (%)
Duval 17.2 40.2
Orange 13.8 54.7
Osceola 12.1 68.0
Escambia 10.9 32.0
Okaloosa 10.9 37.1
Alachua 10.6 46.9
Clay 10.2 37.2
St. Johns 9.8 51.4
Seminole 9.1 37.6
Hillsborough 6.2 40.5
Collier 5.2 36.9
Santa Rosa 4.4 41.9
Bay 3.3 11.0
Nassau 2.0 41.9
Miami-Dade 1.8 53.5
Leon 1.7 30.8
Walton 0.5 48.7
Indian River -0.3 41.4
Sumter -0.8 34.9
Lee -1.5 38.0
Broward -1.8 42.7
Hendry -2.0 32.5
Lake -2.9 38.1
Polk -3.6 27.3
Palm Beach -3.9 35.4
Brevard -5.3 27.0
Manatee -5.4 36.4
Sarasota -5.6 30.7
Putnam -5.6 23.8
St. Lucie -5.8 39.2
Gadsden -5.9 14.0
Gulf -6.9 14.2
Volusia -7.0 30.1
Marion -7.6 29.2
Flagler -8.0 39.5
Monroe -8.0 26.8
Pinellas -8.3 23.7
Martin -9.0 27.0
Highlands -9.5 24.5
Columbia -10.0 35.0
Levy -11.3 29.0
Washington -11.4 9.6
Franklin -11.4 25.8
Madison -12.8 20.5
Bradford -13.5 23.0
Charlotte -14.0 29.5
Jefferson -14.0 17.2
Pasco -14.5 31.2
Glades -14.9 26.8
Hardee -15.0 18.7
Jackson -16.0 5.7
Suwannee -17.3 24.7
Taylor -18.8 19.0
Gilchrist -19.4 23.6
Hamilton -19.9 22.4
Holmes -20.0 18.4
Baker -20.1 24.8
Okeechobee -20.2 26.4
Hernando -22.0 26.0
Citrus -22.1 21.6
Calhoun -22.5 9.7
Wakulla -23.2 23.0
Dixie -31.5 8.2
Lafayette -32.1 7.8
Liberty -32.4 0.3
Union -41.4 -2.8

#########################
setwd("C:/Users/c.zopluoglu1/Documents")
past    <- read.csv("https://raw.githubusercontent.com/MEDSL/2018-elections/master/election-context-2018.csv")
present <- read.csv("election2018.csv")
d       <- merge(past,present,by=c("state","county"),all=TRUE)
fl      <- d[which(d$state=="Florida"),]
##########################
fl$total.vote2016pres  <- fl$trump16 + fl$clinton16 + fl$otherpres16
fl$total.vote2014gov   <- fl$demgov14 + fl$repgov14 + fl$othergov14
fl$total.vote2018gov   <- fl$demgov18 + fl$repgov18 + fl$othergov18
fl$total.vote2018sen   <- fl$repsen18 + fl$demsen18 + fl$othersen18


fl$p.trump16   <- fl$trump16/fl$total.vote2016pres
fl$p.clinton16 <- fl$clinton16/fl$total.vote2016pres
fl$p.demgov14  <- fl$demgov14/fl$total.vote2014gov
fl$p.repgov14  <- fl$repgov14/fl$total.vote2014gov
fl$p.demgov18  <- fl$demgov18/fl$total.vote2018gov
fl$p.repgov18  <- fl$repgov18/fl$total.vote2018gov

fl$p.diffgov14 <- fl$p.demgov14 - fl$p.repgov14 
fl$p.diffgov18 <- fl$p.demgov18 - fl$p.repgov18 

fl$p.diffgov   <- (fl$p.diffgov18 - fl$p.diffgov14 )*100
fl$p.increase.vote.gov <- (fl$total.vote2018gov/fl$total.vote2014gov)*100-100

labels = as.character(fl[,2])
labels[-c(6,13,15,16,29,34,39,44,49,50,51,53,64,66)] = ""

ggplot(fl,aes(p.increase.vote.gov,p.diffgov)) +
  geom_point(aes(size=total.vote2018gov,color=total.vote2014gov)) +
  theme_bw() + 
  theme(plot.title   = element_text(size=10),
        axis.title.y = element_text(size=rel(0.7)),
        axis.title.x = element_text(size=rel(0.7)),
        axis.text.y  = element_text(size=rel(0.7)),
        axis.text.x  = element_text(size=rel(0.7)),
        legend.text  = element_text(size=rel(0.5)),
        legend.title = element_text(size=rel(0.5)))+
  xlab("% Change in the Number of Total Votes")+
  ylab("% Change in Difference between Democratic and Republican Party")+
  xlim(c(-10,75))+
  geom_hline(yintercept=0, linetype=2, color="black", size=.5)+
  geom_vline(xintercept=0, linetype=2, color="black", size=.5)+
  labs(title="FL Gubernatorial Elections (2014 vs. 2018)",
       size ="Total Votes in 2018",
       color="Total Votes in 2014")+
  geom_label_repel(aes(label=labels),hjust=0.5,vjust=-0.5,size=2)


tab.fl <- na.omit(fl[,c("county","p.diffgov","p.increase.vote.gov")])
tab.fl <- tab.fl[order(tab.fl[,2],decreasing=TRUE),]
colnames(tab.fl) <- c("County","Shift (%)","Change in Votes (%)")
kable(tab.fl,digits=1,row.names=FALSE) 

Focusing on FL Counties - Gubernatorial Race 2018 vs. Presidential Race 2016

County Shift (%) Change in Votes (%)
Franklin 7.9 -10.0
St. Lucie 6.0 -12.2
Duval 5.8 -12.8
Clay 5.5 -12.3
Pasco 5.2 -12.6
St. Johns 4.8 -4.1
Alachua 4.6 -11.1
Pinellas 4.1 -11.7
Hernando 3.9 -14.1
Nassau 3.7 -6.6
Escambia 3.6 -15.3
Walton 3.6 -9.9
Union 3.5 -14.5
Seminole 3.3 -11.5
Santa Rosa 3.2 -14.0
Monroe 3.2 -15.0
Columbia 2.9 -13.6
Brevard 2.9 -10.6
Sarasota 2.9 -8.0
Okaloosa 2.9 -16.9
Gulf 2.9 -19.3
Martin 2.8 -9.4
Leon 2.5 -8.0
Volusia 2.3 -12.4
Citrus 2.3 -11.1
Indian River 2.3 -7.4
Lake 2.2 -9.1
Flagler 2.0 -8.2
Hillsborough 2.0 -12.9
Marion 2.0 -11.4
Putnam 1.8 -15.1
Bradford 1.7 -13.2
Levy 1.6 -11.9
Broward 1.6 -19.8
Charlotte 1.5 -9.9
Manatee 1.4 -8.9
Jackson 1.3 -24.2
Palm Beach 1.2 -14.4
Holmes 1.0 -20.7
Madison 1.0 -10.4
Orange 1.0 -13.0
Wakulla 0.7 -7.4
Polk 0.7 -14.1
Bay 0.4 -27.7
Taylor 0.3 -14.6
Liberty 0.2 -18.7
Okeechobee 0.2 -17.8
Dixie 0.2 -19.5
Washington 0.2 -18.9
Suwannee 0.0 -14.8
Gadsden -0.1 -9.4
Hamilton -0.8 -16.7
Glades -0.9 -12.1
Baker -1.0 -16.6
Gilchrist -1.0 -12.5
Sumter -1.0 -3.0
Hardee -1.2 -15.7
Lee -1.3 -11.4
Lafayette -1.4 -17.4
Calhoun -1.5 -24.5
Jefferson -1.7 -3.7
Highlands -2.0 -13.1
Collier -4.1 -9.0
Osceola -4.2 -18.1
Hendry -4.5 -20.2
Miami-Dade -8.7 -18.5

#########################
setwd("C:/Users/c.zopluoglu1/Documents")
past    <- read.csv("https://raw.githubusercontent.com/MEDSL/2018-elections/master/election-context-2018.csv")
present <- read.csv("election2018.csv")
d       <- merge(past,present,by=c("state","county"),all=TRUE)
fl      <- d[which(d$state=="Florida"),]
##########################
fl$total.vote2016pres  <- fl$trump16 + fl$clinton16 + fl$otherpres16
fl$total.vote2014gov   <- fl$demgov14 + fl$repgov14 + fl$othergov14
fl$total.vote2018gov   <- fl$demgov18 + fl$repgov18 + fl$othergov18
fl$total.vote2018sen   <- fl$repsen18 + fl$demsen18 + fl$othersen18


fl$p.trump16   <- fl$trump16/fl$total.vote2016pres
fl$p.clinton16 <- fl$clinton16/fl$total.vote2016pres
fl$p.demgov14  <- fl$demgov14/fl$total.vote2014gov
fl$p.repgov14  <- fl$repgov14/fl$total.vote2014gov
fl$p.demgov18  <- fl$demgov18/fl$total.vote2018gov
fl$p.repgov18  <- fl$repgov18/fl$total.vote2018gov

fl$p.diffpres16 <- fl$p.clinton16 - fl$p.trump16 
fl$p.diffgov18  <- fl$p.demgov18 - fl$p.repgov18 

fl$p.diffgovpress      <- (fl$p.diffgov18 - fl$p.diffpres16)*100
fl$p.increase.vote.gov <- (fl$total.vote2018gov/fl$total.vote2016pres)*100-100

labels = as.character(fl[,2])
labels[-c(6,10,11,16,19,26,28,44,50,51,52,53,60)] = ""

ggplot(fl,aes(p.increase.vote.gov,p.diffgovpress)) +
  geom_point(aes(size=total.vote2018gov,color=total.vote2016pres)) +
  theme_bw() + 
  theme(plot.title   = element_text(size=10),
        axis.title.y = element_text(size=rel(0.7)),
        axis.title.x = element_text(size=rel(0.7)),
        axis.text.y  = element_text(size=rel(0.7)),
        axis.text.x  = element_text(size=rel(0.7)),
        legend.text  = element_text(size=rel(0.5)),
        legend.title = element_text(size=rel(0.5)))+
  xlab("% Change in the Number of Total Votes")+
  ylab("% Change in Difference between Democratic and Republican Party")+
  xlim(c(-30,0))+
  geom_hline(yintercept=0, linetype=2, color="black", size=.5)+
  geom_vline(xintercept=0, linetype=2, color="black", size=.5)+
  labs(title="FL - Gubernatorial Election 2018 vs. Presidential Election 2016",
       size ="Total Votes in 2018",
       color="Total Votes in 2016")+
  geom_label_repel(aes(label=labels),hjust=0.5,vjust=-0.5,size=2)

tab.fl <- na.omit(fl[,c("county","p.diffgovpress","p.increase.vote.gov")])
tab.fl <- tab.fl[order(tab.fl[,2],decreasing=TRUE),]
colnames(tab.fl) <- c("County","Shift (%)","Change in Votes (%)")
kable(tab.fl,digits=1,row.names=FALSE)