R, FRED, and the 2016 Texas Primary: Part 3.1

In this post I will go over how to create an aggregated table of the FRED data and merge it with the election data. This aggregated table will serve as the basis for a future post on mapping out both the FRED and election data. In my next post, I will go through how to tier and subset this data so that it can be plotted in a choropleth map using choroplethr.

Plan

According to fred.master(), the most recent data point for Per Capita Personal Income is 2013 (I’ve decided not to incorporate the education data at this time). Because of this, I will use 2013 data for all the FRED categories. This is not ideal when comparing against 2016 election results, but it keeps every series on the same year and should still give a reasonable picture.
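
The choice of 2013 amounts to taking the latest year that every series covers. Here is a minimal sketch of that reasoning in base R. The `last.year` vector is hypothetical and is not actual fred.master() output: only the 2013 value for Per Capita Personal Income comes from the text above, and the other years are placeholders.

```r
## Hypothetical last available year per FRED category.
## Only the PCPI value (2013) comes from the post; the others are placeholders.
last.year <- c(
    `Unemployment Rate`          = 2015,
    `Per Capita Personal Income` = 2013,
    `Civilian Labor Force`       = 2015,
    `Resident Population`        = 2014
)

## The latest year shared by every series is the smallest of the maxima
common.year <- min(last.year)
common.year

## The series that limits the choice
limiting.series <- names(last.year)[which.min(last.year)]
limiting.series
```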

The Script

Load Dependencies

There are quite a few dependencies here (packages, source files, and RData files). Here they are:

library(plyr)
library(dplyr)
library(lubridate)
library(ggplot2)
library(scales) ## for the percent formatter used in scale_x_continuous()
source("~/R/TexasPrimary2016/Functions/agg.functions.R")
load("~/R/TexasPrimary2016/Data/FRED/fred.cat.list.RData")
load("~/R/TexasPrimary2016/Data/Election/tex.results.RData")

Create 2013 Tables

Now that everything is loaded, we can begin creating the columns for our table. We will be aggregating the FRED data for the year 2013. I wrote a couple of functions in the agg.functions.R file to handle this task. annual.average() converts monthly data into an annual value (the average of that year’s monthly values). annual.value() extracts the annual value for data with an annual frequency.

Here’s how we use those functions to create our columns:

### Unemployment rate
    ur.2013 <- annual.average(fred.cat.list$`Unemployment Rate`, date = 2013)

### Per Capita Personal Income (PCPI)
    pcpi.2013 <- annual.value(fred.cat.list$`Per Capita Personal Income`, date = 2013)

### Civilian Labor Force (CLF)
    clf.2013 <- annual.average(fred.cat.list$`Civilian Labor Force`, date = 2013)

### Resident Population (RP)
    rp.2013 <- annual.average(fred.cat.list$`Resident Population`, date = 2013)
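
The agg.functions.R file is not shown in this post, so here is a rough sketch of what the two helpers could look like. These are hypothetical stand-ins, not the original functions: the names `annual.average.sketch()` and `annual.value.sketch()` are made up, and the code assumes each FRED category is a long data frame with CountyName, Date (class Date), and Value columns.

```r
## Hypothetical stand-ins for the helpers in agg.functions.R (not the originals).
## Assumes a long data frame with CountyName, Date (class Date), and Value columns.
annual.average.sketch <- function(df, date) {
    in.year <- as.integer(format(df$Date, "%Y")) == date
    ## Average every observation of that year, county by county
    aggregate(Value ~ CountyName, data = df[in.year, ], FUN = mean)
}

annual.value.sketch <- function(df, date) {
    in.year <- as.integer(format(df$Date, "%Y")) == date
    ## Annual data has one row per county and year, so just keep that row
    df[in.year, c("CountyName", "Value")]
}

## Toy monthly data (values are made up)
toy <- data.frame(
    CountyName = rep(c("Anderson", "Bexar"), each = 3),
    Date = rep(as.Date(c("2013-01-01", "2013-02-01", "2014-01-01")), times = 2),
    Value = c(6, 8, 5, 4, 6, 7)
)
toy.2013 <- annual.average.sketch(toy, date = 2013)
toy.2013
```

Both sketches return a data frame with just CountyName and Value, which is the shape the merge step relies on.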

Merge Data

Now that we have four individual data frames that share a common key (CountyName), we can merge them together. We also need to rename the Value columns and create a CLF/RP column (Civilian Labor Force divided by Resident Population), since this ratio will be of great use later. Here’s how we merge, rename, and create those columns:

agg.2013 <- merge(ur.2013, pcpi.2013, by = "CountyName") %>%
            rename(UnRate = Value.x, PCPI = Value.y) %>%
            merge(clf.2013, by = "CountyName") %>%
            merge(rp.2013, by = "CountyName") %>%
            rename(CLF = Value.x, RP = Value.y) %>%
            mutate(CLF_RP = (CLF/RP)/1000)
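
The chain above leans on the .x/.y suffixes that merge() adds when both inputs have a Value column. As an alternative sketch in base R, each Value column can be named before merging so the suffixes never appear. The toy data frames and their values are made up, and the extra division by 1000 assumes (as the original code implies) that RP is reported in thousands of persons while CLF is reported in persons.

```r
## Toy stand-ins for ur.2013, pcpi.2013, clf.2013, and rp.2013 (values are made up)
ur   <- data.frame(CountyName = c("Anderson", "Bexar"), Value = c(7.0, 6.1))
pcpi <- data.frame(CountyName = c("Anderson", "Bexar"), Value = c(35000, 41000))
clf  <- data.frame(CountyName = c("Anderson", "Bexar"), Value = c(20000, 900000))
rp   <- data.frame(CountyName = c("Anderson", "Bexar"), Value = c(50, 1800))

## Name each Value column after its series before merging
tables <- list(UnRate = ur, PCPI = pcpi, CLF = clf, RP = rp)
named  <- Map(function(df, nm) setNames(df, c("CountyName", nm)),
              tables, names(tables))

## Merge all four tables on the shared key
agg.toy <- Reduce(function(a, b) merge(a, b, by = "CountyName"), named)

## Labor force as a share of population (RP assumed to be in thousands)
agg.toy$CLF_RP <- (agg.toy$CLF / agg.toy$RP) / 1000
agg.toy
```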

Add Election Turnout Winner

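Now let’s add both the turnout winner (the party with the larger share in each county) and each party’s winner (the leading candidate in that party’s primary). We’ll create the vectors and then add them to the aggregated FRED data frame as new columns. Note that this assigns by position, so it assumes the rows of tex.results are in the same county order as agg.2013: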

### Party winner (democrat or republican)
    PartyWinner <- sapply(seq_along(tex.results$RepShare), function(x){
        if(tex.results$RepShare[x] > tex.results$DemShare[x]){
            "Republican"
        } else{
            "Democrat"
        }
    })

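### Democrat winner of each county
    Dems <- tex.results[15:17]
    ## ties.method = "first" keeps tied counties reproducible (the default breaks ties at random)
    DWinner <- max.col(Dems, ties.method = "first")
    DemWinner <- names(Dems)[DWinner]

### Republican winner of each county
    Reps <- tex.results[8:11]
    RWinner <- max.col(Reps, ties.method = "first")
    RepWinner <- names(Reps)[RWinner]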

### Create new columns in agg.2013
    agg.2013$PartyWinner <- PartyWinner
    agg.2013$DemWinner <- DemWinner
    agg.2013$RepWinner <- RepWinner
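
Assigning the winner vectors directly only lines up if both tables list the counties in the same order. A more defensive sketch builds a small winners table and merges it on the county key instead. Everything here is a toy: the column names in `toy.results` (including a CountyName column and the candidate columns) are assumptions about what tex.results might contain, and the vote shares are made up.

```r
## Toy stand-in for tex.results; column names and shares are made up.
## The rows are deliberately in a different order than the FRED table below.
toy.results <- data.frame(
    CountyName = c("Bexar", "Anderson"),
    RepShare   = c(0.45, 0.80),
    DemShare   = c(0.55, 0.20),
    Clinton    = c(0.70, 0.40),
    Sanders    = c(0.30, 0.60)
)

dem.cols <- c("Clinton", "Sanders")
winners <- data.frame(
    CountyName  = toy.results$CountyName,
    ## Vectorised version of the sapply()/if() loop
    PartyWinner = ifelse(toy.results$RepShare > toy.results$DemShare,
                         "Republican", "Democrat"),
    ## Column with the largest share in each row
    DemWinner   = dem.cols[max.col(toy.results[dem.cols], ties.method = "first")]
)

## Toy stand-in for agg.2013, sorted by county as merge() leaves it
toy.agg <- data.frame(CountyName = c("Anderson", "Bexar"), UnRate = c(7.0, 6.1))

## Merging on the key matches counties by name rather than by position
toy.agg <- merge(toy.agg, winners, by = "CountyName")
toy.agg
```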

Formatting the Plot

Even though these are just exploratory graphs, it helps to have a visual effect that really stands out when faceting the data. To do this, I’ve set blue for the Democratic winner and red for the Republican winner:

### Party colors (BLUE = Democratic; RED = Republican)
    party.colors.1 <- scale_color_manual(values = c("#0000FF", "#FF0000"))
    party.colors.2 <- scale_fill_manual(values = c("#0000FF", "#FF0000"))
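
With unnamed values, ggplot2 matches the colors to the scale's limits in order, which for a character column is alphabetical ("Democrat" then "Republican"), so the two scales above work as intended. A slightly more explicit sketch names each value after its level, so the mapping no longer depends on that ordering. The names `party.values` and `party.colors.named` are made up for this example.

```r
library(ggplot2)

## Named values tie each color to a PartyWinner level explicitly
party.values <- c(Democrat = "#0000FF", Republican = "#FF0000")

## A list of scales can be added to a plot with a single `+`
party.colors.named <- list(
    scale_color_manual(values = party.values),
    scale_fill_manual(values = party.values)
)
```

In a plot definition, `+ party.colors.named` could then stand in for `party.colors.1 + party.colors.2`.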

Distribution of Data

Now that the aggregated table is set, we can start looking at the distribution of the data. We’ll use histograms with overlaid density plots, keeping the default density kernel (Gaussian). I’ve included two kinds of graphs for each variable: one with both parties overlaid on the same axes and one with the parties split into a grid of facets:

### UnRate
    UnRate.dist <- ggplot(data = agg.2013, mapping = aes(UnRate, fill = PartyWinner)) +
                   geom_histogram(aes(y = ..density..), bins = 50, alpha = .5) +
                   geom_density(mapping = aes(color = PartyWinner), alpha = .1, size = 1) + 
                   party.colors.1 + party.colors.2
    UnRate.dist
    UnRate.dist + facet_grid(. ~ PartyWinner)
### PCPI
    PCPI.dist <- ggplot(data = agg.2013, mapping = aes(PCPI, fill = PartyWinner)) +
                 geom_histogram(aes(y = ..density..), bins = 50, alpha = .5) +
                 geom_density(mapping = aes(color = PartyWinner), alpha = .1, size = 1) + 
                 party.colors.1 + party.colors.2
    PCPI.dist
    PCPI.dist + facet_grid(. ~ PartyWinner)
### CLF
    CLF.dist <- ggplot(data = agg.2013, mapping = aes(CLF, fill = PartyWinner)) +
                geom_histogram(aes(y = ..density..), bins = 25, alpha = .5) +
                geom_density(aes(color = PartyWinner), alpha = .1) +
                party.colors.1 + party.colors.2
    CLF.dist
    CLF.dist + facet_grid(. ~ PartyWinner)
### RP
    RP.dist <- ggplot(data = agg.2013, mapping = aes(RP, fill = PartyWinner)) +
               geom_histogram(aes(y = ..density..), bins =  25, alpha = .5) +
               geom_density(aes(color = PartyWinner), alpha = .1) +
               party.colors.1 + party.colors.2
    RP.dist
    RP.dist + facet_grid(. ~ PartyWinner)
### CLF_RP (CLF divided by RP)
    CLF_RP.dist <- ggplot(data = agg.2013, mapping = aes(CLF_RP, fill = PartyWinner)) +
                   geom_histogram(aes(y = ..density..), bins = 50, alpha = .5) +
                   geom_density(mapping = aes(color = PartyWinner), alpha = .1, size = 1) +
                   scale_x_continuous(labels = percent) +
                   party.colors.1 + party.colors.2
    CLF_RP.dist
    CLF_RP.dist + facet_grid(. ~ PartyWinner)
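
The histograms map y to density rather than count because a density curve integrates to 1, so the bars need to be on the same scale for the two layers to line up. Here is a base R sketch of that idea using made-up values (`x` stands in for a column such as UnRate; nothing here is real FRED data):

```r
set.seed(1)
x <- rnorm(254, mean = 6, sd = 2)   # made-up values, one per Texas county

h <- hist(x, breaks = 50, plot = FALSE)   # bin the data without drawing
d <- density(x, kernel = "gaussian")      # "gaussian" is also the default kernel

## On the density scale, the bar areas sum to 1
bar.area <- sum(h$density * diff(h$breaks))
bar.area

## The area under the kernel density estimate is approximately 1 as well
curve.area <- sum(d$y) * diff(d$x[1:2])
curve.area
```

To draw the base R equivalent of the overlaid plots, `plot(h, freq = FALSE)` followed by `lines(d)` puts both layers on the same axes.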

Scatterplot – UnRate ~ PCPI

We can now look at the scatterplot data. First let’s start with UnRate ~ PCPI:

### Unfaceted
    UnRate.PCPI <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = PCPI)) +
                   geom_point(alpha = 1/4) + geom_smooth()
    UnRate.PCPI

### . ~ PartyWinner
    UnRate.PCPI.PartyWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = PCPI, color = PartyWinner)) +
                               geom_point(alpha = 1/2) + geom_smooth() +
                               ggtitle("Texas Counties 2013:\nUnemployment Rate vs Per Capita Personal Income") +
                               labs(x = "Unemployment Rate", y = "Per Capita Personal Income") + 
                               party.colors.1 + party.colors.2
    UnRate.PCPI.PartyWinner
    UnRate.PCPI.PartyWinner + facet_grid(. ~ PartyWinner)
### . ~ DemWinner
    UnRate.PCPI.DemWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = PCPI, color = factor(DemWinner))) +
                             geom_point(alpha = 1/3) + geom_smooth() +
                             ggtitle("Texas Counties 2013:\nUnemployment Rate vs Per Capita Personal Income\nBernie Sanders vs Hillary Clinton") +
                             labs(x = "Unemployment Rate", y = "Per Capita Personal Income") + 
                             scale_color_manual("Candidates", labels = c("Bernie Sanders", "Hillary Clinton"), values = c("blue", "red"))
    UnRate.PCPI.DemWinner
    UnRate.PCPI.DemWinner + facet_grid(. ~ DemWinner)
### . ~ RepWinner
    UnRate.PCPI.RepWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = PCPI, color = factor(RepWinner))) +
                             geom_point(alpha = 1/2) + geom_smooth() +
                             ggtitle("Texas Counties 2013:\nUnemployment Rate vs Per Capita Personal Income") +
                             labs(x = "Unemployment Rate", y = "Per Capita Personal Income") +
                             scale_color_manual("Candidates", labels = c("Donald Trump", "Ted Cruz"), values = c("blue", "red"))
    UnRate.PCPI.RepWinner
    UnRate.PCPI.RepWinner + facet_grid(. ~ RepWinner)

Scatterplot – UnRate ~ CLF_RP

Now let’s look at UnRate ~ CLF_RP:

### Unfaceted
    CLF_RP.UnRate <- ggplot(data = agg.2013, mapping = aes(UnRate, CLF_RP)) + geom_point(alpha = 1/4) + geom_smooth()
    CLF_RP.UnRate

### . ~ PartyWinner
    CLF_RP.UnRate.PartyWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = CLF_RP, color = PartyWinner)) +
                                 geom_point(alpha = 1/2) + geom_smooth() +
                                 ggtitle("Texas Counties 2013:\nUnemployment Rate vs (Civilian Labor Force/Residential Population)") +
                                 labs(x = "Unemployment Rate", y = "(Civilian Labor Force/Residential Population)") + 
                                 party.colors.1 + party.colors.2
    CLF_RP.UnRate.PartyWinner
    CLF_RP.UnRate.PartyWinner + facet_grid(. ~ PartyWinner)
### . ~ DemWinner
    CLF_RP.UnRate.DemWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = CLF_RP, color = DemWinner)) +
                               geom_point(alpha = 1/2) + geom_smooth() +
                               ggtitle("Texas Counties 2013:\nUnemployment Rate vs (Civilian Labor Force/Residential Population)") +
                               labs(x = "Unemployment Rate", y = "(Civilian Labor Force/Residential Population)") + 
                               scale_color_manual("Candidates", labels = c("Bernie Sanders", "Hillary Clinton"), values = c("blue", "red"))
    CLF_RP.UnRate.DemWinner
    CLF_RP.UnRate.DemWinner + facet_grid(. ~ DemWinner)
### . ~ RepWinner
    CLF_RP.UnRate.RepWinner <- ggplot(data = agg.2013, mapping = aes(x = UnRate, y = CLF_RP, color = RepWinner)) +
                               geom_point(alpha = 1/2) + geom_smooth() +
                               ggtitle("Texas Counties 2013:\nUnemployment Rate vs (Civilian Labor Force/Residential Population)") +
                               labs(x = "Unemployment Rate", y = "(Civilian Labor Force/Residential Population)") + 
                               scale_color_manual("Candidates", labels = c("Donald Trump", "Ted Cruz"), values = c("blue", "red"))
    CLF_RP.UnRate.RepWinner
    CLF_RP.UnRate.RepWinner + facet_grid(. ~ RepWinner)

Scatterplot – PCPI ~ CLF_RP

Finally, let’s look at PCPI ~ CLF_RP:

### Unfaceted
    CLF_RP.PCPI <- ggplot(data = agg.2013, mapping = aes(PCPI, CLF_RP)) + geom_point(alpha = 1/4) + geom_smooth()
    CLF_RP.PCPI

### . ~ PartyWinner
    CLF_RP.PCPI.PartyWinner <- ggplot(data = agg.2013, mapping = aes(x = PCPI, y = CLF_RP, color = PartyWinner)) +
                               geom_point(alpha = 1/2) + geom_smooth() +
                               ggtitle("Texas Counties 2013:\nPer Capita Personal Income vs (Civilian Labor Force/Residential Population)") +
                               labs(x = "Per Capita Personal Income", y = "(Civilian Labor Force/Residential Population)") + 
                               party.colors.1 + party.colors.2
    CLF_RP.PCPI.PartyWinner
    CLF_RP.PCPI.PartyWinner + facet_grid(. ~ PartyWinner)
### . ~ DemWinner
    CLF_RP.PCPI.DemWinner <- ggplot(data = agg.2013, mapping = aes(x = PCPI, y = CLF_RP, color = DemWinner)) +
                             geom_point(alpha = 1/2) + geom_smooth() +
                             ggtitle("Texas Counties 2013:\nPer Capita Personal Income vs (Civilian Labor Force/Residential Population)") +
                             labs(x = "Per Capita Personal Income", y = "(Civilian Labor Force/Residential Population)") + 
                             scale_color_manual("Candidates", labels = c("Bernie Sanders", "Hillary Clinton"), values = c("blue", "red"))
    CLF_RP.PCPI.DemWinner
    CLF_RP.PCPI.DemWinner + facet_grid(. ~ DemWinner)
### . ~ RepWinner
    CLF_RP.PCPI.RepWinner <- ggplot(data = agg.2013, mapping = aes(x = PCPI, y = CLF_RP, color = RepWinner)) +
                             geom_point(alpha = 1/2) + geom_smooth() +
                             ggtitle("Texas Counties 2013:\nPer Capita Personal Income vs (Civilian Labor Force/Residential Population)") +
                             labs(x = "Per Capita Personal Income", y = "(Civilian Labor Force/Residential Population)") + 
                             scale_color_manual("Candidates", labels = c("Donald Trump", "Ted Cruz"), values = c("blue", "red"))
    CLF_RP.PCPI.RepWinner
    CLF_RP.PCPI.RepWinner + facet_grid(. ~ RepWinner)
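
The scatterplots above all follow the same pattern: points, a smoother, a title, and a color grouping. As a sketch of how that repetition could be folded into one function, here is a hypothetical helper, `scatter.by()`, which is not part of the original script. It takes column names as strings and uses ggplot2's `.data` pronoun to look them up. The toy data frame only borrows the column names of agg.2013; its values are random.

```r
library(ggplot2)

## Hypothetical helper (not part of the original script): one scatterplot,
## colored by a grouping column, with the default geom_smooth()
scatter.by <- function(data, x, y, group, title) {
    ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = .data[[group]])) +
        geom_point(alpha = 1/2) +
        geom_smooth() +
        ggtitle(title) +
        labs(x = x, y = y, color = group)
}

## Toy data with the same column names as agg.2013 (values are random)
set.seed(42)
toy.agg <- data.frame(
    UnRate      = runif(40, 3, 12),
    PCPI        = runif(40, 25000, 60000),
    PartyWinner = rep(c("Democrat", "Republican"), each = 20)
)

p <- scatter.by(toy.agg, "UnRate", "PCPI", "PartyWinner",
                "Toy data:\nUnemployment Rate vs Per Capita Personal Income")

## Faceting works the same way as in the plots above
p.faceted <- p + facet_grid(. ~ PartyWinner)
```

The axis labels and the manual color scales from the original plots can still be layered on top with `+`.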
