Fixed Effects

Data Analytics and Visualization with R
Session 11

Viktoriia Semenova

University of Mannheim
Spring 2023

Intro

Quiz: Which of these statements are correct?

  1. If there is no overlap in joint distributions of the matching variables, then it would not be possible to find a match, hence (exact) matching would not work.
  2. Propensity score depicts the probability of the unit being treated predicted with the main dependent variable.
  3. Matching works through adding the variation in confounders so that the relationship between \(X\) and \(Y\) can only be attributed to the variation in \(X\).
  4. Matching requires some correlation between the treatment assignment and the potential outcomes for it to work properly.

Common Support Assumption

Matching is Non-parametric

# regression on the full, unmatched sample
m1 <- lm(y ~ d + x, data = df)
confint(m1)
                 2.5 %     97.5 %
(Intercept) 0.60253035 0.70733015
d           0.13179296 0.24538616
x           0.01373668 0.05222461
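library(MatchIt) # provides matchit() and match.data()
# coarsened exact matching on x, then extract the matched sample
cem_df <- matchit(d ~ x,
                  data = df,
                  method = "cem") %>%
  match.data()
# same regression on the matched sample, using the matching weights
m2 <- lm(y ~ d + x, data = cem_df,
         weights = weights)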
confint(m2)
                   2.5 %      97.5 %
(Intercept) -0.176066150 -0.07082210
d            0.915051908  1.03682727
x            0.009702613  0.04379835

Fixed Effects

Recap

  • So far we talked about two ways to isolate the relationship between \(X\) and \(Y\), i.e. blocking back doors to identify an effect
    • Statistical control: use \(W\) to explain our \(X\) and \(Y\) variables, then work with the residuals
    • Matching: restrict the data so that observations only have similar values in \(W\) and compare within these values
  • In both cases, we assumed we have the necessary variables to close the back doors (i.e. we assume conditional independence)
  • But what if there are things we need to control but we cannot measure/observe them?

DAG: We Cannot Close All Paths

[DAG: Treatment -> Outcome; Confounder -> Treatment; Confounder -> Outcome; Unobserved -> Outcome; Unobserved -> Confounder; Unmeasured -> Treatment; Unmeasured -> Outcome]

Our Solution: Fixed Effects

  • If we observe each person/party/country/etc. multiple times, then we can forget about controlling for the actual back-door variable we’re interested in
  • And just control for person/party/country/etc. identity instead!
  • This will control for everything unique to that individual, whether we can measure it or not!
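
A minimal simulated sketch of this idea, using base R only. All names and numbers here (`n_units`, `n_years`, the confounder `u`, a true effect of 2) are assumptions made up for illustration, not part of the lecture data. The confounder `u` is constant within each unit and never enters the models, yet controlling for unit identity should bring the estimate close to the true effect.

```r
set.seed(1)

n_units <- 50   # hypothetical number of units (e.g. countries)
n_years <- 10   # hypothetical number of periods per unit
n <- n_units * n_years

id <- rep(seq_len(n_units), each = n_years)

# Unobserved confounder: one value per unit, constant over time
u <- rnorm(n_units)[id]

# Treatment depends on the confounder; the true effect of x on y is set to 2
x <- u + rnorm(n)
y <- 2 * x + 3 * u + rnorm(n)

# Pooled OLS ignores u, so the back door through u stays open
pooled <- lm(y ~ x)

# Controlling for unit identity absorbs everything constant within a unit
fe <- lm(y ~ x + factor(id))

b_pooled <- unname(coef(pooled)["x"])
b_fe     <- unname(coef(fe)["x"])
c(pooled = b_pooled, fixed_effects = b_fe)
```

In this setup the pooled estimate should be biased upwards, while the fixed effects estimate should land near 2 even though `u` was never used as a control.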

Fixed Effects Illustration

Example: GDP per capita and Life Expectancy


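library(dplyr)        # glimpse(), group_by(), mutate()
library(modelsummary) # datasummary_skim()
library(gapminder)    # package that ships the gapminder dataset
data(gapminder)       # load the dataset into the workspace
glimpse(gapminder)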
Rows: 1,704
Columns: 6
$ country   <fct> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanist…
$ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia…
$ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, 2002, 2007…
$ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40.822, 41.674…
$ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, 12881816, 13…
$ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134, 978.0114, …
datasummary_skim(gapminder)
           Unique (#)  Missing (%)        Mean           SD      Min     Median           Max
year               12            0      1979.5         17.3   1952.0     1979.5        2007.0
lifeExp          1626            0        59.5         12.9     23.6       60.7          82.6
pop              1704            0  29601212.3  106157896.7  60011.0  7023595.5  1318683096.0
gdpPercap        1704            0      7215.3       9857.5    241.2     3531.8      113523.1

Example: GDP per capita and Life Expectancy


# subtract each country's own mean from both variables
gapminder <- gapminder %>%
  group_by(country) %>%
  mutate(
    lifeExp_c = lifeExp - mean(lifeExp),
    logGDP_c = log(gdpPercap) - mean(log(gdpPercap))
  ) %>%
  ungroup()

# compare correlations
cor(gapminder$lifeExp, log(gapminder$gdpPercap))
[1] 0.8076179
cor(gapminder$lifeExp_c, gapminder$logGDP_c)
[1] 0.6404051

Gapminder Illustration

We Control For Multiple Confounders at Once

[DAG: GDP per capita -> Life Expectancy; Political Institutions, History, Economic Institutions, War, and Pandemic each -> GDP per capita and -> Life Expectancy]

  • If these factors stay constant within country, we don’t need a big long list of back doors.

Control for Country

[DAG: GDP per capita -> Life Expectancy; Country -> GDP per capita; Country -> Life Expectancy]

How Fixed Effects Work

  • We ignore the baseline differences between Germany, Britain, China, etc., in their GDP per capita and life expectancy, and look only within each country
  • We are comparing countries to themselves at different time periods
  • We are ignoring all differences between countries and looking only at differences within countries
  • Fixed Effects is sometimes also referred to as the “within estimator”
  • Within variation: variation that occurs within an individual (usually) across different periods of time.
  • Between variation: variation of a variable that occurs between different individuals, usually at the same period of time, or comparing over-time averages.
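
The split into within and between variation can be sketched with a few lines of base R. The data are simulated and the names (`id`, `x`, `unit_mean`) are illustrative assumptions. Each observation's deviation from the grand mean is separated into the deviation from its own unit mean (within) and the deviation of that unit mean from the grand mean (between).

```r
set.seed(2)

id <- rep(1:4, each = 5)                    # 4 hypothetical units, 5 periods each
x  <- rnorm(20, mean = c(0, 2, 4, 6)[id])   # units differ in their average level

unit_mean <- ave(x, id)             # each unit's own mean, repeated on every row
x_within  <- x - unit_mean          # within variation: deviation from own unit mean
x_between <- unit_mean - mean(x)    # between variation: unit mean vs. grand mean

ss_total   <- sum((x - mean(x))^2)
ss_within  <- sum(x_within^2)
ss_between <- sum(x_between^2)

c(total = ss_total, within = ss_within, between = ss_between)
```

The within and between sums of squares add up to the total. Fixed effects discard the between part and use only `x_within`.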

Within Variation: One Country

Regression Equations for Fixed Effects

\[ \text{Life Expectancy}_{it} = \beta_{i} + \beta_1 \cdot \text{GDP per capita}_{it} + \varepsilon_{it} \]

\[ \text{Life Expectancy}_{it} = \beta_0 + \beta_1 \cdot \text{GDP per capita}_{it} + \beta_2 \cdot \text{Country}_{it} + \varepsilon_{it} \]

  • Subscript \(it\) indicates that the data varies both between countries (\(i\)) and over time (\(t\))
  • The intercept term has a subscript \(i\) instead of a \(0\), making it \(\beta_{i}\)
  • Units in the data are constrained to have the same slope (there’s no \(i\) subscript on \(\beta_1\)), but they have different intercepts
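
The two ways of writing the model lead to the same slope estimate. The sketch below uses simulated data (all names and parameter values are assumptions for illustration) to compare a regression with one dummy per unit against a regression on variables with the unit means subtracted.

```r
set.seed(3)

id    <- rep(1:30, each = 6)              # 30 hypothetical units, 6 periods each
alpha <- rnorm(30, sd = 2)[id]            # unit-specific intercepts
x     <- 0.5 * alpha + rnorm(180)
y     <- alpha + 1.5 * x + rnorm(180)

# Version 1: one dummy per unit
lsdv <- lm(y ~ x + factor(id))

# Version 2: subtract unit means from y and x, then regress without dummies
y_c <- y - ave(y, id)
x_c <- x - ave(x, id)
within <- lm(y_c ~ x_c)

b_lsdv   <- unname(coef(lsdv)["x"])
b_within <- unname(coef(within)["x_c"])
c(dummies = b_lsdv, demeaned = b_within)
```

The slopes coincide. The standard errors reported by `lm()` for the demeaned version are not directly comparable, because that model does not account for the degrees of freedom used up by the unit means.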

Interpretation

\[ \text{Life Expectancy}_{it} = \beta_0 + \beta_1 \cdot \text{GDP per capita}_{it} + \beta_2 \cdot \text{Country}_{it} + \varepsilon_{it} \]

  • If we have fixed effects for country, we are comparing that country to itself over time
  • And if we had fixed effects for continent, we would be comparing countries within a continent only to other countries in that same continent
  • We can include more than one FE where needed

Pooled Ordinary Least Squares

With Country Fixed Effects

When to Use Fixed Effects

  • Panel data: refers to situations where the number of time periods is quite small and the number of units quite large.
    • The NES panel is like this: 2000 respondents are asked questions at various points in time over the course of an election (or multiple elections).
  • Time-series cross-sectional data: has fewer units and many time periods (e.g., U.S. states over time or Western European countries over time)

Two-way Fixed Effects

  • We can include multiple fixed effects in one equation
  • Two-way fixed effects is common for TSCS data
    • FE for both individual and time

\[ Y_{it} = \beta_i + \beta_t + \beta_1X_{it} + \varepsilon_{it} \]

  • Here we are looking at variation within individual as well as within year, i.e. the variation that remains relative to what we would expect given that individual and given that year
  • Estimator focuses more heavily on individuals that have a lot of variation over time
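
A minimal sketch of two-way fixed effects on simulated data, assuming a balanced panel (every unit observed in every year). The variable names and effect sizes are made up for illustration. With a balanced panel, adding unit and year dummies gives the same slope as subtracting unit means and year means and adding back the grand mean.

```r
set.seed(4)

n_units <- 20
n_years <- 8
panel <- expand.grid(year = seq_len(n_years), id = seq_len(n_units))

unit_effect <- rnorm(n_units)[panel$id]     # constant within unit
year_effect <- rnorm(n_years)[panel$year]   # common shock in each year

panel$x <- unit_effect + year_effect + rnorm(nrow(panel))
panel$y <- 1 * panel$x + 2 * unit_effect - 2 * year_effect + rnorm(nrow(panel))

# Two-way fixed effects via dummies for unit and year
twfe <- lm(y ~ x + factor(id) + factor(year), data = panel)

# Equivalent double demeaning (valid here because the panel is balanced)
demean2 <- function(v, id, year) v - ave(v, id) - ave(v, year) + mean(v)
y_dd <- demean2(panel$y, panel$id, panel$year)
x_dd <- demean2(panel$x, panel$id, panel$year)
dd <- lm(y_dd ~ x_dd)

b_twfe <- unname(coef(twfe)["x"])
b_dd   <- unname(coef(dd)["x_dd"])
c(dummies = b_twfe, double_demeaned = b_dd)
```

In unbalanced panels the simple double-demeaning shortcut no longer matches the dummy regression, so the dummy version (or a dedicated package) is the safer choice there.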

Clustered Standard Errors

  • Regression assumption: error terms are independent of each other, but with hierarchical data, this is likely violated
  • Clustered SEs account for any sort of correlation between errors within each grouping, e.g. country
  • Including a group indicator as a control and clustering at that group level are different things, but they are often complementary
    • The indicator variable suggests that group membership is an important predictor of the outcome and possibly lies on an important back door
    • Clustering suggests that group membership is related to the ability of the model to predict well
  • Good practice: use theory and/or cluster at the level of treatment
  • Clustered SEs are often the default in software implementations of fixed effects
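
To make the idea concrete, here is a hand-rolled sketch of cluster-robust standard errors in base R on simulated data. It implements the plain sandwich formula without any small-sample correction, which is an assumption of this sketch. Packages such as sandwich or fixest typically apply corrections, so their numbers will differ somewhat. All variable names are illustrative.

```r
set.seed(5)

n_groups <- 40
n_per    <- 10
n <- n_groups * n_per
g <- rep(seq_len(n_groups), each = n_per)

# Regressor and error both share a group-level component,
# so errors are correlated within groups
x <- rnorm(n_groups)[g] + rnorm(n)
e <- rnorm(n_groups)[g] + rnorm(n)
y <- 1 + 0.5 * x + e

fit <- lm(y ~ x)
X <- model.matrix(fit)
u <- residuals(fit)

bread <- solve(crossprod(X))             # (X'X)^{-1}

# One row per cluster: the sum of x_i * u_i over observations in that cluster
scores <- rowsum(X * u, group = g)
meat   <- crossprod(scores)              # sum over clusters of s_g s_g'

vcov_cluster <- bread %*% meat %*% bread

se_classical <- unname(sqrt(diag(vcov(fit))))
se_cluster   <- unname(sqrt(diag(vcov_cluster)))
rbind(classical = se_classical, clustered = se_cluster)
```

Because the errors here are correlated within groups, the classical standard error for `x` understates the uncertainty relative to the clustered one.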

What Fixed Effects Do Not Fix

  • Reverse causality (e.g., crime rates vs. police spending per capita)
  • Time-variant unobserved heterogeneity (e.g., war or pandemic)
  • Fixed effects framework does not straightforwardly extend to non-linear models, especially when the number of groups is large
    • Approach 1: Use OLS
    • Approach 2: Look for dedicated implementations, such as feglm()/fenegbin() in the fixest package

Factor Varies Over Time

[DAG: GDP per capita -> Life Expectancy; Country -> GDP per capita; Country -> Life Expectancy; War -> GDP per capita; War -> Life Expectancy]

  • Solution: include War in the regression model
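
A simulated sketch of this situation. The variable `war`, its effects, and the true slope of 1 are hypothetical values chosen for illustration. Because `war` changes within units over time, unit fixed effects alone do not close this back door, and adding `war` as a control does.

```r
set.seed(6)

n_units <- 40
n_years <- 10
n  <- n_units * n_years
id <- rep(seq_len(n_units), each = n_years)

unit_effect <- rnorm(n_units)[id]   # constant within unit: absorbed by fixed effects
war <- rbinom(n, 1, 0.2)            # varies within unit over time

# Hypothetical data-generating process: war lowers both x and y,
# and the true effect of x on y is 1
x <- unit_effect - 1 * war + rnorm(n)
y <- 1 * x + unit_effect - 3 * war + rnorm(n)

fe_only <- lm(y ~ x + factor(id))         # back door through war stays open
fe_war  <- lm(y ~ x + war + factor(id))   # war is controlled for explicitly

b_fe_only <- unname(coef(fe_only)["x"])
b_fe_war  <- unname(coef(fe_war)["x"])
c(fe_only = b_fe_only, fe_plus_war = b_fe_war)
```

The fixed-effects-only estimate should be biased away from 1, while the model that also includes `war` should come close to it. This only works when the time-varying confounder is observed.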

Example

  • Suppose you want to know the effect of a teacher on the test scores of high school students
  • Some potential back doors might go through: parents’ intelligence, age, demographics, school, last year’s teacher
  • If you used fixed effects for students, what back doors would still be open?

Main Takeaways

  • Fixed Effects is essentially a dummy variable regression
  • It is useful for panel data (i.e. when we have repeated observations across units) and it allows us to isolate the effect within individuals
  • Fixed effects combine many different constant-within-country back doors into a single control, which lets us identify the effect even if we cannot measure them all
  • Use clustered SEs when working with FEs