Communicating Uncertainty with Simulations

R for Data Analysis
Session 11

Viktoriia Semenova

University of Mannheim
Fall 2023

Uncertainty and Inference

Model Plots and Confidence Intervals

tidy(m2, conf.int = TRUE, conf.level = 0.99) %>%
  dplyr::select(term, estimate, starts_with("conf")) %>%
  kable()
term estimate conf.low conf.high
(Intercept) 3.7350210 3.5351075 3.9349345
beauty 0.0787286 0.0360562 0.1214009
female -0.2007897 -0.3326538 -0.0689256
modelplot(m2, conf_level = 0.99) +
  geom_vline(xintercept = 0, lty = 2)

Sampling Distributions

  • The sampling distribution of a statistic is the probability distribution of that statistic across a large number of samples of size \(N\) drawn from a given population

  • Sampling distributions represent the variability of our estimates: if we had taken different samples from the population, we would have obtained slightly different estimates

  • Sampling distributions of most estimators are approximately normal in large samples:

    • Determined by two parameters, mean (center) and standard deviation (spread)
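
The idea of a sampling distribution can be illustrated with a small simulation. The following is a minimal sketch in base R: the population mean, standard deviation, sample size, and seed are all made-up values chosen for illustration, not quantities from the course data.

```r
# Simulate the sampling distribution of a sample mean (base R only).
set.seed(11)                       # hypothetical seed, for reproducibility

pop_mean  <- 4                     # assumed population mean
pop_sd    <- 0.5                   # assumed population standard deviation
N         <- 100                   # size of each sample
n_samples <- 5000                  # number of repeated samples

# Draw many samples of size N and keep the mean of each one
sample_means <- replicate(
  n_samples,
  mean(rnorm(N, mean = pop_mean, sd = pop_sd))
)

# Center and spread of the simulated sampling distribution
mean(sample_means)                 # close to pop_mean
sd(sample_means)                   # close to pop_sd / sqrt(N) = 0.05
```

The standard deviation of the simulated means plays the role of the standard error: it describes how much the estimate would vary from sample to sample.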

Draws from Simulated Sampling Distributions

We can use our coefficient estimates and uncertainty about them to simulate sampling distributions:

Simulated Sampling Distributions

# get draws from multivariate normal distribution 
sims <- clarify::sim(m2, n = 1000)
(Intercept) beauty female
3.785188 0.0552573 -0.1306640
3.634660 0.0981444 -0.2005275
3.804106 0.0564569 -0.1298170
3.744705 0.0734567 -0.1548727
3.700973 0.0927997 -0.2474655
3.678285 0.0813183 -0.1310003
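
Conceptually, draws like these can be produced by sampling from a multivariate normal distribution centered on the coefficient estimates, with the estimated variance-covariance matrix as its spread. The sketch below shows that idea with `MASS::mvrnorm()`. It is an illustration only: the model is a stand-in fitted to the built-in `mtcars` data rather than `m2`, and the defaults that `clarify::sim()` uses internally may differ (for example, it may draw from a t-distribution for some models).

```r
library(MASS)                      # provides mvrnorm()

set.seed(11)                       # hypothetical seed
m <- lm(mpg ~ wt + am, data = mtcars)   # stand-in model, not the m2 from the slides

# Draw 1000 coefficient vectors centered at the estimates,
# with spread given by the estimated variance-covariance matrix
draws <- mvrnorm(n = 1000, mu = coef(m), Sigma = vcov(m))

head(draws)                        # one row per simulated set of coefficients

# Column means and SDs should be close to the estimates and standard errors
rbind(estimate = coef(m), sim_mean = colMeans(draws))
rbind(std_error = sqrt(diag(vcov(m))), sim_sd = apply(draws, 2, sd))
```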

Simulated Sampling Distributions

as_tibble(sims$sim.coefs) %>% 
  summarise_all(.funs = list(mean = ~ mean(.))) %>% 
  kable()
(Intercept)_mean beauty_mean female_mean
3.734268 0.0788016 -0.2021927
as_tibble(sims$sim.coefs) %>% 
  summarise_all(.funs = list(sd = ~ sd(.))) %>% 
  kable()
(Intercept)_sd beauty_sd female_sd
0.0788563 0.0162285 0.0500556
tidy(m2) %>% 
  kable()
term estimate std.error statistic p.value
(Intercept) 3.7350210 0.0772894 48.325167 0.00e+00
beauty 0.0787286 0.0164977 4.772087 2.50e-06
female -0.2007897 0.0509805 -3.938559 9.47e-05

Calculations with Simulated Coefficients

  • Instead of one equation with estimated coefficients, we now have many equations, each with slightly different simulated coefficients
  • Each equation yields a slightly different expected value

\[
\begin{aligned}
\tilde{\beta}_0^{1} \times 1 + \tilde{\beta}_1^{1} \times \text{Beauty Score}_i + \tilde{\beta}_2^{1} \times \text{Female}_i &= E(\widetilde{\text{Course Evaluation}}_i \mid \text{Beauty Score}_i, \text{Female}_i)\\
\tilde{\beta}_0^{2} \times 1 + \tilde{\beta}_1^{2} \times \text{Beauty Score}_i + \tilde{\beta}_2^{2} \times \text{Female}_i &= E(\widetilde{\text{Course Evaluation}}_i \mid \text{Beauty Score}_i, \text{Female}_i)\\
\tilde{\beta}_0^{3} \times 1 + \tilde{\beta}_1^{3} \times \text{Beauty Score}_i + \tilde{\beta}_2^{3} \times \text{Female}_i &= E(\widetilde{\text{Course Evaluation}}_i \mid \text{Beauty Score}_i, \text{Female}_i)\\
&\;\;\vdots\\
\tilde{\beta}_0^{1000} \times 1 + \tilde{\beta}_1^{1000} \times \text{Beauty Score}_i + \tilde{\beta}_2^{1000} \times \text{Female}_i &= E(\widetilde{\text{Course Evaluation}}_i \mid \text{Beauty Score}_i, \text{Female}_i)
\end{aligned}
\]

Calculating Expected Values for Chosen Scenarios

\[ {\text{Beauty Score} = 1,~\text{Female} = 1} \]

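# expected value from the first row of simulated coefficients
ev1 <- sims$sim.coefs[1, 1] + sims$sim.coefs[1, 2] * 1 +
  sims$sim.coefs[1, 3] * 1
ev1
(Intercept) 
   3.709781 
# expected value from the second row of simulated coefficients
ev2 <- sims$sim.coefs[2, 1] + sims$sim.coefs[2, 2] * 1 +
  sims$sim.coefs[2, 3] * 1
ev2
(Intercept) 
   3.532277 
# and so on for every row in the matrix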
# with clarify 
evs <- sim_setx(sim = sims, # object with simulated coefs  
         x = list(beauty = 1, # scenario 
                  female = 1))

# compare to manual calculations 
as.matrix(evs) %>% head(6) %>% kable()
1
3.709781
3.532277
3.730746
3.663289
3.546307
3.628604
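
The row-by-row calculation can also be written as a single matrix product, which covers every simulated coefficient vector at once. The sketch below is a hand-rolled illustration of that step, not the internals of `sim_setx()`. It uses a stand-in model on the built-in `mtcars` data and a made-up scenario (`wt = 3`, `am = 1`).

```r
library(MASS)                      # provides mvrnorm()

set.seed(11)                       # hypothetical seed
m <- lm(mpg ~ wt + am, data = mtcars)       # stand-in model (assumption)
draws <- mvrnorm(1000, coef(m), vcov(m))    # simulated coefficients

# Scenario vector: 1 for the intercept, then the chosen covariate values
x <- c(1, wt = 3, am = 1)

# Row-by-row version, as in the manual calculation
ev_1 <- draws[1, 1] + draws[1, 2] * 3 + draws[1, 3] * 1

# All rows at once: (1000 x 3) matrix times a length-3 vector
ev_all <- as.vector(draws %*% x)

all.equal(unname(ev_1), ev_all[1])          # the two approaches agree
```

The result `ev_all` holds one expected value per simulated coefficient vector, which is the distribution that gets summarized in the next step.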

Summarize Expected Values

Estimate 2.5 % 97.5 %
3.61296 3.470971 3.745622

We are 95% confident that the expected course evaluation lies between 3.47 and 3.75 points on the evaluation scale when \[{\text{Beauty Score} = 1,~\text{Female} = 1}\]

Expected Values for Two Scenarios

\[{\text{Beauty Score} = 1,~\text{Female} = 0}\] \[{\text{Beauty Score} = 1,~\text{Female} = 1}\]

female = 0  female = 1
3.840445    3.709781
3.732804    3.532277
3.860563    3.730746
3.818161    3.663289
3.793773    3.546307
3.759604    3.628604

            Estimate  2.5 %     97.5 %
female = 0  3.81375   3.682049  3.939730
female = 1  3.61296   3.470971  3.745622

What Is the Effect of Being Female on Course Evaluations?

fds <- transform(evs, 
          `First Difference` = `female = 1` - `female = 0`) 
fds %>% 
  summary() %>%
  kable()
                  Estimate    2.5 %       97.5 %
female = 0        3.8137496   3.6820494   3.939730
female = 1        3.6129599   3.4709714   3.745622
First Difference  -0.2007897  -0.3014803  -0.102592
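
A first difference is computed draw by draw: each simulated coefficient vector gives one expected value per scenario, and the difference between the two is taken within that draw before summarizing. The sketch below reproduces that logic by hand on a stand-in model fitted to the built-in `mtcars` data. The variable names and the 95% quantile summary are illustrative assumptions, not the internals of `transform()` or `summary()` from clarify.

```r
library(MASS)                      # provides mvrnorm()

set.seed(11)                       # hypothetical seed
m <- lm(mpg ~ wt + am, data = mtcars)       # stand-in model (assumption)
draws <- mvrnorm(1000, coef(m), vcov(m))    # simulated coefficients

# Two scenarios that differ only in the binary predictor am
x0 <- c(1, 3, 0)                   # intercept, wt = 3, am = 0
x1 <- c(1, 3, 1)                   # intercept, wt = 3, am = 1

ev0 <- as.vector(draws %*% x0)     # expected values, scenario am = 0
ev1 <- as.vector(draws %*% x1)     # expected values, scenario am = 1
fd  <- ev1 - ev0                   # one first difference per draw

# Summarize with the middle 95% of the simulated first differences
quantile(fd, probs = c(0.025, 0.975))

# Without an interaction term, each first difference is just the am draw
all.equal(fd, unname(draws[, "am"]))
```

This is also why the first-difference estimate in the table equals the coefficient on female: in an additive linear model, switching a binary predictor from 0 to 1 shifts the expected value by exactly that coefficient.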

Multiple Scenarios and First Differences

evs <- sim_setx(sim = sims, # object with simulated coefs  
         x = list(female = 0:1, # scenario with desired (plausible) values 
                  beauty = c(1, 10)))

as.matrix(evs) %>% head(3) %>% kable()
female = 0, beauty = 1  female = 0, beauty = 10  female = 1, beauty = 1  female = 1, beauty = 10
3.840445                4.337761                 3.709781                4.207097
3.732804                4.616104                 3.532277                4.415577
3.860563                4.368675                 3.730746                4.238858

Multiple Scenarios

fds <- transform(
  evs,
  `FD_female = 0` = `female = 0, beauty = 10` - `female = 0, beauty = 1`,
  `FD_female = 1` = `female = 1, beauty = 10` - `female = 1, beauty = 1`
) 

summary(fds) %>% kable()
                         Estimate  2.5 %      97.5 %
female = 0, beauty = 1   3.813750  3.6820494  3.9397304
female = 0, beauty = 10  4.522307  4.3377481  4.7072476
female = 1, beauty = 1   3.612960  3.4709714  3.7456223
female = 1, beauty = 10  4.321517  4.1326041  4.5022782
FD_female = 0            0.708557  0.4208372  0.9848888
FD_female = 1            0.708557  0.4208372  0.9848888
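
The two first differences are identical because the model, as shown in the coefficient tables, contains no interaction between beauty and female. For any simulated draw \(k\), the intercept and the female term cancel when beauty moves from 1 to 10, whatever the value of female:

\[
\text{FD}^{k} = \left(\tilde{\beta}_0^{k} + \tilde{\beta}_1^{k} \times 10 + \tilde{\beta}_2^{k} \times \text{Female}\right) - \left(\tilde{\beta}_0^{k} + \tilde{\beta}_1^{k} \times 1 + \tilde{\beta}_2^{k} \times \text{Female}\right) = 9 \times \tilde{\beta}_1^{k}
\]

Plugging in the point estimate for beauty gives \(9 \times 0.0787286 \approx 0.70856\), which matches the Estimate column for both rows. The first differences would differ by group only if the model included a beauty-by-female interaction.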

Visualising Interaction Effects

Marginal Effects Plot

Quiz: Which of these statements are correct?

Indridason and Bowler (2014) explore the determinants of cabinet size in parliamentary systems. Below you can find a plot based on one of their models.

  1. The systematic component of the model likely includes the variable Legislature Size interacted with another variable.
  2. The marginal effect of Legislature Size is constant across all values of Legislature Size.
  3. The relationship between legislature size and cabinet size is strongest for smaller values of legislature size.
  4. For legislatures with sizes above 500, there is, on average, no significant effect of legislature size on cabinet size.
  5. Legislature size seems to be inversely related to cabinet size.