Problem Set: Minimum Legal Drinking Age
Minimum legal drinking age and mortality
This problem is based on data from Carpenter & Dobkin (2009; AEJ Applied), who study the causal effect of alcohol consumption on mortality in US youths. In 1984, the US Federal Government introduced legislation that compelled states to introduce a minimum legal drinking age (MLDA) of 21. Before then, different states had different minimum ages at which it was legal to purchase alcohol: some had an MLDA of 18, others 19, and others 20. Carpenter & Dobkin (2009) do not rely upon variation in states’ MLDAs before 1984. (If you’re interested to know why, see the introduction of their paper.) Instead, they take a regression discontinuity approach.
Since 1984, a US resident’s birthday has created a sharp change in ease of access to alcohol. The day before your 21st birthday, you cannot buy alcohol legally; on the day itself and forever after you can. If we view the treatment as being able to buy alcohol legally, this is a sharp RD design. The outcome of interest is all-cause mortality. If legal access to alcohol causes an increase in mortality, we should see a “jump” in mortality rates just after people turn 21. For more background, see the paper.
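The sharp design described above can be written compactly. This is only a sketch in notation that does not appear in the original: $A_i$ is age in years, $D_i$ indicates legal access to alcohol, and $Y_i(0)$, $Y_i(1)$ are potential mortality outcomes without and with legal access. The second equality assumes that the conditional means of both potential outcomes are continuous in age at 21.

```latex
\begin{align*}
D_i &= \mathbf{1}\{A_i \geq 21\}, \\
\tau_{21} &\equiv \mathbb{E}\left[Y_i(1) - Y_i(0) \mid A_i = 21\right]
  = \lim_{a \downarrow 21} \mathbb{E}[Y_i \mid A_i = a]
  - \lim_{a \uparrow 21} \mathbb{E}[Y_i \mid A_i = a].
\end{align*}
```

The “jump” in mortality at the 21st birthday is the difference between the two one-sided limits on the right-hand side.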
The Data
Because access to the underlying individual mortality data is restricted, here we will work with group averages. The data you will need to complete the following is available from https://ditraglia.com/data/mlda.dta. The dataset contains multiple columns, but you’ll only need two of them. The first is `agecell`. This variable contains age in months, stored as a whole number of years plus a decimal. For example, `agecell == 19.06849` means roughly 19 years and a month. (It’s a bit inelegant, I agree!) The second is `all`, which gives all-cause mortality rates per 100,000 individuals.
These variables were constructed as follows: underlying both of them is individual data on mortality. These individual data were grouped into fifty “bins” based on age in months. The average age in the bin was stored in `agecell` and the mortality rate in the bin was stored in `all`. (I provide this explanation only in case you’re curious: you won’t need to worry about it below!)
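To make the `agecell` encoding concrete, here is a minimal sketch in R. The helper names `load_mlda()` and `split_age()` are hypothetical, not part of the problem set; the first assumes that the `haven` package is installed and that the URL above is reachable, and the second assumes a 365-day year.

```r
# Hypothetical helper: download the data with haven::read_dta() and keep the
# two columns used in the exercises. Needs the haven package and network access.
load_mlda <- function(url = "https://ditraglia.com/data/mlda.dta") {
  mlda <- haven::read_dta(url)
  mlda[, c("agecell", "all")]
}

# Hypothetical helper: split a decimal age into whole years plus days,
# assuming a 365-day year.
split_age <- function(agecell) {
  years <- floor(agecell)
  days <- round((agecell - years) * 365)
  data.frame(agecell = agecell, years = years, days = days)
}

# The example value from the text: 19 whole years plus a little under a month
split_age(19.06849)
```

Calling `load_mlda()` with no arguments would then return a data frame containing only `agecell` and `all`.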
Exercises
- Load the data (it’s a .dta file, so you’ll want to use `read_dta()` from the `haven` package). Typeset the linear RD model and define the Conditional Average Treatment Effect (CATE).
- Use a linear RD model to estimate the causal effect of legal access to alcohol on death rates. Plot your results and carry out appropriate statistical inference. Interpret the CATE and discuss your findings. Hint: Consult the “Regression discontinuity” lecture slides if you need a reminder of the econometrics or the implementation in R.
- Repeat the preceding part using a quadratic rather than linear specification. Compare and contrast your findings.
- RD analysis is fundamentally local in nature: the mortality rates of individuals far from the cut-off should not inform us about the causal effect for 21-year-olds. Repeat the linear and quadratic analyses from the two preceding parts after restricting your sample to ages between 20 and 22, inclusive, to check the sensitivity of your results. Discuss your findings.
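The mechanics of the exercises above can be sketched on simulated data, as a reminder of the general workflow rather than a model answer. Everything here is an assumption: the simulated numbers are made up and say nothing about the real results, the names `sim`, `age`, and `over21` are hypothetical, and the parameterization (centered age interacted with a treatment dummy) is one common choice that may differ from the one in the lecture slides.

```r
set.seed(1234)

# Simulated stand-in for the real data: 50 age bins with a true jump of 8 at
# the cutoff. These values are invented purely for illustration.
cutoff <- 21
sim <- data.frame(agecell = seq(19.02, 22.94, length.out = 50))
sim$age <- sim$agecell - cutoff           # running variable centered at 21
sim$over21 <- as.numeric(sim$age >= 0)    # sharp treatment indicator
sim$all <- 93 - sim$age + 8 * sim$over21 + rnorm(nrow(sim), sd = 2)

# Linear RD with a separate slope on each side of the cutoff. Because age is
# centered, the coefficient on over21 is the estimated jump at age 21.
fit_linear <- lm(all ~ over21 * age, data = sim)

# Quadratic RD: add squared centered age, again interacted with treatment
fit_quad <- lm(all ~ over21 * (age + I(age^2)), data = sim)

# Same linear specification on a narrower window around the cutoff
sim_local <- subset(sim, agecell >= 20 & agecell <= 22)
fit_local <- lm(all ~ over21 * age, data = sim_local)

# Point estimate, standard error, and 95% confidence interval for the jump
coef(summary(fit_linear))["over21", ]
confint(fit_linear, "over21")
```

With the real data, `sim` would be replaced by the data frame returned by `read_dta()`, and the same `coef()` and `confint()` calls apply to `fit_quad` and `fit_local`.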