Bayesian modeling of populist party success in German federal elections - A notebook from the lab

Following up on an earlier post, we will model the voting success of the (most prominent) populist party, AfD, in the recent federal elections. This time, Bayesian modeling techniques will be used, drawing on the excellent textbook my McElreath.

Note that this post is rather a notebook of my thinking, doing, and erring. I’ve made no efforts to hide scaffolding. I think it will be confusing to the uniniate and the initiate as well …


Note: Data is based on data published by the Bundeswahlleiter 2017, (c) Der Bundeswahlleiter, Wiesbaden 2017, https://www.bundeswahlleiter.de/info/impressum.html, Data is made available for unrestricted use provided the source is credited.


Setup

Some packages that will help us:

library(tidyverse)
library(rethinking)
library(pradadata)
library(sjmisc)
library(viridis)

First, get the election data.They can be accessed from the Bundeswahlleiter. More conveniently, I have put them in a R package (after some cleansing):

data("elec_results")

In order to combine socioeconomic data with the election results, we can make use of data from the same source as above. Again accessible via the same R pacakge:

data("socec")

Note that a code book is available for these data:

data("socec_dict")

These data will be used as predictors for modeling the election results.

Third, we will make use of geo data in order to geoplot the modeling results. The Bundeswahlleiter provides such data (again via pradadata):

data("wahlkreise_shp")

Note: Data objects can also be downloaded from this source.

Data preparation

Now let’s merge the data frames. There will also be some janitor work such as renaming columns etc.

First, change the names of the socec data to a common format:

socec_renamed <- socec %>%
  rename(state = V01,
         area_nr = V02,
         area_name = V03,
         total_n = V06,
         germans_n = V07,
         for_prop = V08,
         pop_move_prop = V11,
         pop_migr_background_prop = V19,
         income = V26,
         unemp_prop = V47) 

Compute some more columns:

socec2 <- socec_renamed %>% 
   mutate(foreigner_n = total_n - germans_n,
         pop_move_n = pop_move_prop * total_n,
         unemp_n = unemp_prop * total_n / 100,
         pop_migr_background_n = pop_migr_background_prop * total_n / 100) %>% 
  drop_na()

Same thing with the election data, here we only need the criterion (AfD success) and the ID variables for merging:

elec_results2 <- elec_results %>%
  rename(afd_votes = AfD_3,
         area_nr = district_nr,
         area_name = district_name,
         votes_total = votes_valid_3) %>% 
   mutate(afd_prop = afd_votes/votes_total)    # valid votes only, and of the present Zweitstimme

Note that we are focusing on the Zweitstimme of the present election (hence the 3 in votes_valid_3 and in AfD_3).

Merge

socec2 %>%
  left_join(elec_results2, by = "area_name") %>% 
  left_join(wahlkreise_shp, by = c("area_name" = "WKR_NAME")) -> d_all_with_na

After-merge preparations

Add variable for East (1) vs. West (0):

d_all_with_na <- d_all_with_na %>% 
  mutate(east = case_when(
    state %in% c("Mecklenburg-Vorpommern", "Brandenburg", "Berlin", "Sachsen-Anhalt", "Sachsen", "Thüringen") ~ "yes",
    TRUE ~ "no"
    ) 
  )

d_all_with_na$east_num <- ifelse(d_all_with_na$east == "yes", 1, 0)

Main data frame: d_short and d_short_X

We will also provide a version without the geo data, and in pure (old school) data frame form (ie., not as tibble)_

d_all_with_na %>%
  rename(area_nr = area_nr.x) %>% 
  select(state,
         area_nr,
         area_name,
         total_n,
         germans_n,
         foreigner_n,
         for_prop,
         pop_move_n,
         pop_migr_background_n,
         income ,
         unemp_n,
         unemp_prop,
         votes_total,
         afd_votes,
         afd_prop,
         state,
         east,
         east_num) -> d_short_with_nas

names(d_short_with_nas)

Remove NAs:

d_short_with_nas %>% 
  drop_na() -> d_short_nona

Add state id:

d_short_nona$state_id <- coerce_index(d_short_nona$state)

Multiply by 1000 to get the real numbers so that a count model gets the “right” data

d_short_nona %>%
  mutate_at(vars(total_n, germans_n, foreigner_n, pop_move_n,
                    pop_migr_background_n, unemp_n), funs(. * 1000)
  ) -> d_short_nona_1000
glimpse(d_short_nona_1000)

Some checks

I know that there are 299 districts in Germany, so let’s check the row numbers

nrow(d_short_nona_1000) == 299

By the way, the number 316 of the data frame d_all_1000 is the sum of the districts plus the 16 federal states plus 1 for Germany itself.

Let’s do a similar check for the district names:

identical(elec_results$district_name,socec$V03)

Does not match. Maybe the order is different?

head(elec_results$district_name, 10)
head(socec$V03, 10)

Looks reassuring.

Let’s check which one is present in one but not in the other data frame:

elec_results2 %>% 
  select(area_name, area_nr) %>% 
  full_join(select(socec2, area_name, area_nr), by = "area_nr") -> merge_test

Low let’s filter for missings, ie non-matches:

merge_test %>% 
  filter(is.na(area_name.x))

“Land insgesamt” indicats the grand total of the federal state. We don’t need that data. Seems like all is fine.

Round values

Round values in order to work with counts:

d_short_not_rounded <- d_short_nona_1000  # backup

We need to convert the variables back to integer, bacause stan need integer for count models (makes sense):

d_short_nona_1000 %>% 
  mutate_at(vars(total_n:afd_votes), funs(as.integer)) -> d_short_as_int
glimpse(d_short_as_int)

Main dataframe:

d_short <- d_short_as_int

Any NAs left:

anyNA(d_short)

z-transformation

Transform to z-values:

d_short %>% 
  sjmisc::std() %>%  
  select(-c(state_z, area_nr_z, area_name_z, state_id_z, east_z, east_num_z)) -> d_short_z

names(d_short_z)

center

d_short_as_int %>% 
  sjmisc::center() %>% 
  select(-c(state_c, area_nr_c, area_name_c, state_id_c, east_c, east_num_c)) -> d_short_c

names(d_short_c)

Coerce to normal data frame, no tibble

d_short <- data.frame(d_short)
d_short_z <- data.frame(d_short_z)
d_short_c <- data.frame(d_short_c)

Data export

On a side note, this last data frame appears useful. I will upload it to a common repository, so that others can work with it.

election_modeling_data %>% 
  write_rds("election_modeling_data.rds")

d_short %>% 
  write_csv("d_short.csv")

d_short_c %>% 
  write_csv("d_short_c.csv")

d_short_z %>% 
  write_csv("d_short_z.csv")

This dataset is now made available via osf, DOI: 10.17605/OSF.IO/2YHR9. Similarly, our data frame d is available via the same plattform (note the file names of the CSV files).

model output list

I use this list to store the model outputs.

models_stan <- list()

Helper functions

The traceplot function from rethinking appears to have a bug, or does not work as I expected. Here’s a helper function:

my_traceplot <- function(model) {
  rstan::traceplot( model@stanfit, pars=names(model@start[[1]]))
}

m9: Some (metric) linear models with east/west, foreigner, unemp as predictors

d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "east")]

m9_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <-  beta0[east] +  beta1*for_prop_z + beta2*unemp_prop_z,
    beta0[east] ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    beta2 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores = 2
)
my_traceplot(m9_stan)
precis(m9_stan, depth = 2)
models_stan[["m9_stan"]] <- m9_stan
WAIC(m9_stan)

DIC is not being produced, some issues exist.

m9a: metric model with no east/west

d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z")]

m9a_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <- alpha +  beta1*for_prop_z + beta2*unemp_prop_z,
    alpha ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    beta2 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores = 2)

Looks good:

my_traceplot(m9a_stan)
precis(m9a_stan)
models_stan[["m9a_stan"]] <- m9a_stan

Here’s the data in the S4 object for the MCMC chains:

m9_stan@stanfit@sim$samples[[1]] %>% str()
precis(m9a_stan, depth = 2)
WAIC(m9a_stan)
DIC(m9a_stan)

m10: East/West as numeric predictor, not as varying intercept


d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "east_num")]

m10_stan <- map2stan( 
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <- alpha + beta1*for_prop_z + beta2*unemp_prop_z + beta3*east_num,
    alpha ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    beta2 ~ dnorm(0, 1),
    beta3 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2
)

Model output:

precis(m10_stan)
my_traceplot(m10_stan)
DIC(m10_stan)
WAIC(m10_stan)
models_stan[["m10_stan"]] <- m10_stan

Looks ok, but why are the coefficients so large? Strange.

Predictor importance

Let’s compare two models with only one predictor - either unemployment or migration or foreigners to see which model has a better entropy.

foreigner prop

d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "east")]

m11a_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <- alpha +  beta1*for_prop_z,
    alpha ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2)

Looks good:

my_traceplot(m11a_stan)
precis(m11a_stan)
models_stan[["m11a_stan"]] <- m11a_stan

unemployment


d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "east")]

m11c_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <- alpha +  beta1*unemp_prop_z,
    alpha ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores = 2)
my_traceplot(m11c_stan)
precis(m11c_stan)
models_stan[["m11c_stan"]] <- m11c_stan

east only

d <- d_short_z[, c("afd_prop", "for_prop_z", "east")]

d$east_num <- ifelse(d$east == "yes", 1, 0)

m11d_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),
    mu <- beta1*east_num,
    beta1 ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
      ),
  data = d,
  chains = 2,
  cores = 2)

Looks good:

my_traceplot(m11d_stan)
precis(m11d_stan)
WAIC(m11d_stan)
models_stan[["m11d_stan"]] <- m11d_stan

Multilevel normal models

area as multilevel

d <- d_short_z[, c("afd_prop",  "area_nr")]

m12_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),    
    mu <- beta0[area_nr],
    beta0[area_nr] ~ dnorm(a, sigma),
    a ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1),
    alpha ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores = 2
)
#> Error in map2stan(alist(afd_prop ~ dnorm(mu, sigma), mu <- beta0[area_nr], : Something went wrong in at least one chain. Debug your model while setting chains=1 and cores=1. Once the model is working with a single chain and core, try using multiple chains/cores again.
#> argument is of length zero
precis(m12_stan, depth = 2) %>% head()
#coeftab(m12_stan)
my_traceplot(m12_stan)
models_stan[["m12_stan"]] <- m12_stan

Too many coefficients…

state as multilevel


d <- d_short_z[, c("afd_prop",  "state_id")]

m13_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),    
    mu <- beta0[state_id],
    beta0[state_id] ~ dnorm(0, sigma2),
    sigma ~ dnorm(0, 1),
    sigma2 ~ dnorm(0, 1)
  ),
  data = d,
  cores = 2,
  chains = 2
)
#> Error in map2stan(alist(afd_prop ~ dnorm(mu, sigma), mu <- beta0[state_id], : Something went wrong in at least one chain. Debug your model while setting chains=1 and cores=1. Once the model is working with a single chain and core, try using multiple chains/cores again.
#> argument is of length zero
precis(m13_stan, depth = 2)
coeftab(m13_stan)
WAIC(m13_stan)
my_traceplot(m13_stan)
models_stan[["m13_stan"]] <- m13_stan

Rhat is ok. Traceplot indicates problems. neff indicates problems.

east + for_prop + unemp_prop

d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "east")]

m14_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),    
    mu <- beta0[east] +  beta1*for_prop_z + beta2*unemp_prop_z,
    beta0[east] ~ dnorm(0, sigma2),
    sigma ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    beta2 ~ dnorm(0, 1),
    sigma2 ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores = 2
)
#> Error in map2stan(alist(afd_prop ~ dnorm(mu, sigma), mu <- beta0[east] + : Something went wrong in at least one chain. Debug your model while setting chains=1 and cores=1. Once the model is working with a single chain and core, try using multiple chains/cores again.
#> argument is of length zero
precis(m14_stan, depth = 2)
coeftab(m14_stan)
my_traceplot(m14_stan)
models_stan[["m14_stan"]] <- m14_stan

state + for_prop + unemp_prop

d <- d_short_z[, c("afd_prop", "for_prop_z", "unemp_prop_z", "state_id")]

m15_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),    
    mu <- beta0[state_id] +  beta1*for_prop_z + beta2*unemp_prop_z,
    beta0[state_id] ~ dnorm(0, sigma2),
    sigma ~ dnorm(0, 1),
    sigma2 ~ dnorm(0, 1),
    beta1 ~ dnorm(0, 1),
    beta2 ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2,
  cores  = 2
)
#> Error in map2stan(alist(afd_prop ~ dnorm(mu, sigma), mu <- beta0[state_id] + : Something went wrong in at least one chain. Debug your model while setting chains=1 and cores=1. Once the model is working with a single chain and core, try using multiple chains/cores again.
#> argument is of length zero
precis(m15_stan, depth = 2)
coeftab(m15_stan)
my_traceplot(m15_stan)
models_stan[["m15_stan"]] <- m15_stan

Normal null model

d <- d_short_z[, c("afd_prop"), drop = FALSE]

m16_stan <- map2stan(
  alist(
    afd_prop ~ dnorm(mu, sigma),    
    mu <- alpha,
    alpha ~ dnorm(0, 1),
    sigma ~ dnorm(0, 1)
  ),
  data = d,
  chains = 2
)
#> Error in map2stan(alist(afd_prop ~ dnorm(mu, sigma), mu <- alpha, alpha ~ : Something went wrong, when calling Stan. Check any debug messages for clues, detective.
#> argument is of length zero
precis(m16_stan, depth = 2)
coeftab(m16_stan)
my_traceplot(m16_stan)
models_stan[["m16_stan"]] <- m16_stan

Compare Normal models

Now check which one has a lower DIC or WAIC:

stan_normal_models <- list("m9_stan" = m9_stan, 
                           "m9a_stan" = m9a_stan, 
                           "m10_stan" = m10_stan,
                           "m11a_stan" = m11a_stan, 
                           "m11c_stan" = m11c_stan,
                           "m11d_stan" = m11d_stan,
                           "m12_stan" = m12_stan,
                           "m13_stan" = m13_stan,
                           "m14_stan" = m14_stan,
                           "m15_stan" = m15_stan,
                           "m16_stan" = m16_stan)

stan_normal_models_vec <- c("m9_stan" = m9_stan, 
                            "m9a_stan" = m9a_stan, 
                            "m10_stan" = m10_stan,
                            "m11a_stan" = m11a_stan, 
                            "m11c_stan" = m11c_stan,
                            "m11d_stan" = m11d_stan,
                            "m12_stan" = m12_stan,
                            "m13_stan" = m13_stan,
                            "m14_stan" = m14_stan,
                            "m15_stan" = m15_stan,
                            "m16_stan" = m16_stan)


#stan_normal_models

#save(stan_normal_models, file = "stan_normal_models.Rda")
stan_model_comparison <- compare(m9_stan, m9a_stan, 
                                 m10_stan,
                                 m11a_stan, m11c_stan, m11d_stan,
                                 m12_stan, m13_stan, m14_stan, m15_stan,
                                 m16_stan)
stan_model_comparison 

save(stan_model_comparison, file = "stan_model_comparison.Rda")

m11b is better than m11a - unemployment apears to be more important to predict AfD success compared to migration rate in the area.

Get best (normal) model

Which model ist best? Let’s define the model with the lowest WAIC as best:

stan_model_comparison@output %>% 
  rownames_to_column(var = "model_name") %>%  
  slice(which.min(WAIC)) -> best_model

And now extract the name of the best model:

best_model_name <-
  best_model %>%
  pull(model_name)

Geo plotting

AfD success in the election:

wahlkreise_shp %>% 
  left_join(select(d_short, area_nr, afd_prop), by = c("WKR_NR" = "area_nr")) %>% 
  ggplot() +
  geom_sf(aes(fill = afd_prop)) + 
  theme_void() +
  scale_fill_viridis() +
  labs(fill="Afd votes\n(Zweitstimme)",
       caption = "Data provided by the Bundeswahlleiter 2017") -> p_afd
p_afd

Unemployment rates in Germany per district:

wahlkreise_shp %>% 
  left_join(select(d_short, area_nr, unemp_n, total_n), by = c("WKR_NR" = "area_nr")) %>% 
  mutate(unemp_prop = unemp_n / total_n) %>% 
  ggplot() +
  geom_sf(aes(fill = unemp_prop)) + 
  theme_void() +
  scale_fill_viridis() +
  labs(fill="unemployment rate",
       caption = "Data provided by the Bundeswahlleiter 2017") -> p_unemp
p_unemp

Foreigner rates:

wahlkreise_shp %>% 
  left_join(select(d_short, area_nr, for_prop, total_n), by = c("WKR_NR" = "area_nr")) %>% 
  ggplot() +
  geom_sf(aes(fill = for_prop)) + 
  theme_void() +
  scale_fill_viridis() +
  labs(fill="Foreigner rate",
       caption = "Data provided by the Bundeswahlleiter 2017") -> p_foreign

p_foreign

Joint diagrams

library(gridExtra)

grid.arrange(p_unemp, p_afd, nrow = 1)
grid.arrange(p_foreign, p_afd, nrow = 1)

Computing prediction errors

Here’s a function to compute the modeling error, defined as the absolute difference of the estimated model value (of afd proportion) minus the observed value (of afd proportion).

comp_error <- function(model, data = d_short_z, fun = mean) {
  posterior_per_person <- link(model)
  
  
  as_tibble(posterior_per_person) %>% 
    summarise_all(fun) %>% 
    gather() %>% 
    rename(estimate = value) %>% 
    mutate(afd_prop = data$afd_votes / data$votes_total,
           error = abs(estimate - afd_prop)) %>% 
    pull(error) -> error_vec
  
  return(error_vec)
}

Apply the function on all models:

model_error <- lapply(stan_normal_models, comp_error)

#model_error
#save(model_error, file = "model_error.Rda")

Compute the median absolute error:

md_abs_error_all_models <- sapply(model_error, median) %>% unlist()
md_abs_error_all_models

best_model_name <- md_abs_error_all_models[which.min(md_abs_error_all_models)] %>% names()

Also, compute the IQR of the errors:

modell_error_IQR <- lapply(stan_normal_models, comp_error, fun = IQR)

Visualizing prediction error

Some preparation:

model_error %>% 
    as.data.frame() -> model_error_df

#glimpse(model_error_df)

# model_names_binom <- list("m0_stan", "m1_stan", "m2_stan", "m3_stan",
#                "m4_stan", "m5_stan", "m6_stan", "m7_stan") 

model_names <- c("m9_stan",
                 "m9a_stan",
                 "m10_stan" ,
                 "m11a_stan",
                 "m11c_stan",
                 "m11d_stan",
                 "m12_stan",
                 "m13_stan",
                 "m14_stan",
                 "m15_stan" ,
                 "m16_stan")


names(model_error_df) <- model_names

model_error_df %>% 
  mutate(afd_prop = d_short$afd_prop,
         id = 1:nrow(model_error_df)) -> model_error_df


modell_error_IQR %>% 
  as.data.frame() -> model_error_IQR_df


names(model_error_IQR_df) <- model_names

model_error_IQR_df %>% 
  mutate(id = 1:nrow(model_error_IQR_df)) -> model_error_IQR_df

Convert to long version for plotting:

model_error_IQR_df %>% 
  gather(key = model, value = iqr, -c(id)) %>% 
  mutate(stat = "IQR") -> model_error_IQR_df_long


model_error_df %>% 
  gather(key = model, value = error, -c(afd_prop, id)) %>% 
  mutate(stat = "median") -> model_error_df_long

model_error_df_long %>% 
  bind_rows(model_error_IQR_df_long) -> model_error_long


model_error_df_long %>% 
  left_join(model_error_IQR_df_long, by = c("id", "model")) %>% 
  select(-c(stat.x, stat.y)) -> model_error_md_iqr

glimpse(model_error_md_iqr)

#save(model_error_md_iqr, file = "model_error_md_iqr.RDa")

Now plot:


as_tibble(md_abs_error_all_models) %>% 
  mutate(model = unlist(model_names),
         best_model = ifelse(model_names == best_model_name, TRUE, FALSE)) -> md_abs_error_all_models

glimpse(md_abs_error_all_models)

model_error_md_iqr %>% 
  arrange(-error) %>% 
  ggplot(aes(x = id)) +
  facet_wrap(~model) +
  geom_hline(aes(yintercept = value), data = md_abs_error_all_models) +
    geom_errorbar(aes(ymin = error - (iqr/2),
                    ymax = error + (iqr/2)),
                alpha = .8,
                color = "gray40") +
  geom_point(aes(y = error)) +
  geom_label(aes(label = round(value, 3),
                 color = best_model), x = 1, y = .2, 
            data = md_abs_error_all_models, 
            hjust = 0) +
  guides(color=FALSE) +
  scale_color_manual(values = c("red", "darkgreen"))

Plotting prediction error against observed values

posterior_per_person_best_model <- link(m15_stan)
  
posterior_per_person_best_model %>%  
  as_tibble() %>% 
    summarise_all(median) %>% 
    gather() %>% 
    rename(estimate = value) %>% 
  add_column(area_nr = 1:nrow(.)) %>% 
  full_join(d_short) %>%
  mutate(error = abs(estimate - afd_prop),
         top05 = percent_rank(error) >= .95) -> d_short_w_pred_err 


polygon_pos <- data.frame(
  x = c(0, 0.4, 0.4,    0, 0.4, 0, 0 ),
  y = c(0, 0, 0.4,      0, 0.4, 0.4, 0),
  value = c("underestimates", "underestimates", "underestimates", "overestimates", "overestimates", "overestimates", "overestimates")
)
 
d_short_w_pred_err %>%  
  ggplot() +
  aes(x = afd_prop, y = estimate) +
  geom_abline(slope = 1, intercept = 0, color = "grey60") +
  geom_polygon(data = polygon_pos, aes(x = x, y = y, fill = value), alpha = .1) +
  geom_point(aes(color = error,
                 shape = east),
             alpha = .6,
             size = 2) +
  ggrepel::geom_label_repel(aes(label = area_name), data = filter(d_short_w_pred_err, top05 == TRUE)) +
  annotate("text", x = 0.4, y = 0, label = "model understimates", hjust = 1, vjust = 0) +
  annotate("text", x = 0, y = 0.4, label = "model overestimates", hjust = 0, vjust = 1) +
  labs(x = "Observed AfD votes (proportion)",
       y = "Estimated AfD votes (proportion)",
       caption = "n=299 electoral districts; data provided by Bundeswahlleiter 2017") +
  guides(fill = FALSE)

Sensecheck

I just wondered what the bivariate correlations of the predictors to afd_vote is`. Let’s check that.

library(GGally)

ggpairs(d_short, columns = c("foreigner_n", "unemp_n", "afd_votes"))

There is a substantial negatie correlations with number of foreigners and a positive correlation with unemployment, but (nearly) no correlation with unemployment numbers. But our analysis above suggest that these associations are spurious once the state is controlled for. To make things easy, let’s pick two states, say Sachsen und Bayern and check in each state the assoctions.

d_short %>%
  filter(state == "Bayern") %>% 
  ggpairs( columns = c("foreigner_n", "unemp_n", "afd_votes"))
d_short %>%
  filter(state == "Sachsen") %>% 
  ggpairs( columns = c("foreigner_n", "unemp_n", "afd_votes"))

Interestingly, in Sachsen manifests a strong negative correlation between AfD and foreigner rates.

Check some model assumptions

Let’s compute the predictions for each model:


model_predictions <- lapply(stan_normal_models, link)
 

Now plot the predictions against the error, as advised by Gelman and Hill.

First, get predictions of the best moel:

model_predictions[[best_model_name]] %>% 
  as_tibble %>% 
  summarise_all(mean) %>% 
  gather() -> best_model_preds

# save(best_model_preds, file = "best_model_preds.Rda")

Each observation is one row in this data frame.

Similarly, get errors of the best model:

model_error[[best_model_name]] %>% 
  as_tibble() %>%
  rename(error = value) %>% 
  mutate(pred = best_model_preds$value) -> best_model_pred_err
  
best_model_pred_err %>% 
  ggplot() +
  aes(x = pred, y = error) +
  geom_hline(yintercept = quantile(best_model_pred_err$error, .5),
             color = "grey60") +
  geom_hline(yintercept = quantile(best_model_pred_err$error, .975),
             color = "grey80", , linetype = "dashed") +
  geom_hline(yintercept = quantile(best_model_pred_err$error, .025),
             color = "grey80", , linetype = "dashed") +
  geom_point() +
  labs(title = best_model_name,
       xlab = "Model predictions",
       ylab = "Model error",
       caption = "Note. Horizontal lines denote .025, .5, and .975 quantiles, respectively") 
  

Check posterior distribution of predictors

Let’s have a look at the posterior distribution of the best_model.

post_best_model <- extract.samples(stan_normal_models[[best_model_name]])

This object is a list. Let’s convert it to a data frame for easier plotting.

post_best_model_df <- data_frame(
  sigma = post_best_model[["sigma"]],
  sigma2 = post_best_model[["sigma2"]],
  beta1 = post_best_model[["beta1"]],
  beta2= post_best_model[["beta2"]]
)

And now we plot a number of histograms:

post_best_model_df %>% 
  gather() %>%  
  rename(coef = key) %>% 
  ggplot() +
  aes(x = value) +
  facet_wrap(~coef, scales = "free") +
  geom_histogram() -> p_post_best_model
p_post_best_model

Now compute summary statistics:

post_best_model_df %>% 
  gather() %>% 
  rename(coef = key) %>% 
  group_by(coef) %>% 
  summarise(q05 = quantile(value, .05),
            q50 = quantile(value, .5),
            q95 = quantile(value, .95),
            value = mean(value)
  )  -> post_best_model_df_sum
  #gather(key = my_quantile, value = value, -coef) -> post_best_model_df_sum

head(post_best_model_df_sum)

Now plot both:

p_post_best_model +
  geom_rect(data = post_best_model_df_sum,
              aes(xmin = q05,
                  xmax = q95,
                  ymin = 0,
                  ymax = Inf),
            fill = "red",
            alpha = .2) +
  theme(axis.text.y=element_blank(),
        axis.ticks.y = element_blank()) +
  labs(title = best_model_name,
       y = "",
       caption = "Note. Shaded areas demark 90% mass intervals")