# Exercise - NSW Experiment

This problem is based on Dehejia & Wahba (2002; *ReStat*), “Propensity Score Matching Methods for Nonexperimental Causal Studies.” To answer it you will need two datasets: `nsw_dw.dta` and `cps_controls.dta`. Both of these are available online from `https://users.nber.org/~rdehejia/data/`. In answering the following questions you may find it helpful to consult the associated paper. While we will not replicate the whole paper, you will be able to compare some of your results to theirs. The file `nsw_dw.dta` contains *experimental data* from the National Supported Work (NSW) Demonstration, a program that randomly assigned eligible workers to either receive on-the-job training (treatment) or not (control). The dataset contains observations for 445 men, of whom 185 were assigned to the treatment group and 260 were assigned to the control group:

| Name | Description |
|---|---|
| `treat` | Dummy variable: `1` denotes treated, `0` control |
| `age` | Self-explanatory |
| `education` | Years of education |
| `black` | Dummy variable: `1` denotes black |
| `hispanic` | Dummy variable: `1` denotes hispanic |
| `married` | Dummy variable: `1` denotes married, `0` unmarried |
| `nodegree` | Dummy variable: `1` denotes no high school degree |
| `re74` | Real earnings in 1974 (pre-treatment) |
| `re75` | Real earnings in 1975 (pre-treatment) |
| `re78` | Real earnings in 1978 (post-treatment) |

The file `cps_controls.dta` contains observations of the *same variables* for 15,992 men who were *not* subjects in the NSW experiment. This information is drawn from the Current Population Survey (CPS). Because none of these men were in the experiment, none of them received the treatment. Hence `treat` equals zero for all of them.

Below we will compare treatment effect estimates from the *experimental data* from the NSW with alternatives constructed by applying regression adjustment and propensity score weighting to a “composite” sample that includes treated individuals from the NSW and untreated individuals from the CPS. Here’s the idea. The NSW was a randomized controlled trial, so we can easily compute an unbiased estimate of the ATE. There’s no need for selection-on-observables assumptions, valid instruments, or any other clever identification strategies. But in many real-world situations observational data are all that we have to work with. Can we somehow use the NSW data to see how well various observational methods *would have performed* compared to the experimental ideal? Here’s a possible way to accomplish this. The problem of causal inference is one of constructing a valid control group. How would our causal effect estimates change if we replaced the *real* NSW control group with a “fake” control group constructed from the CPS using statistical modeling? The challenge is that NSW participants were not a random sample from the US population, whereas CPS respondents were.

- Data Cleaning:
  - Load the experimental data from `nsw_dw.dta` and store it in a tibble called `experimental`.
  - Rename `re74` to `earnings74` and do the same for `re75` and `re78`.
  - Convert the dummy variables `black` and `hispanic` into a single character variable `race` that takes on the values `white`, `black`, or `hispanic`. Hint: use `case_when()` from `dplyr`.
  - Convert `treat`, `nodegree`, and `married` from dummy variables to character variables. Choose meaningful names so that each becomes self-explanatory. (E.g. a binary `sex` dummy becomes a character variable that is either `male` or `female`.)
  - Earnings of zero in a particular year indicate that a person was unemployed. Use this fact to create two variables: `employment74` and `employment75`. Each of these should be a character variable that takes on the value `employed` or `unemployed` to indicate a person’s employment status in 1974 and 1975.
  - Drop any variables that have become redundant in light of the steps you carried out above. You can also drop `data_id` since it takes on the same value for every observation in this dataset.
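The cleaning steps above can be collected into a single function; here is a sketch, assuming the raw data have been read with `haven::read_dta()` and that you choose names like `degree` and `marriage` for the converted dummies (the particular names are up to you):

```r
library(dplyr)

# Sketch of the cleaning pipeline described above.
# `raw` is assumed to come from haven::read_dta("nsw_dw.dta").
clean_nsw <- function(raw) {
  raw |>
    rename(earnings74 = re74, earnings75 = re75, earnings78 = re78) |>
    mutate(
      # Collapse the two race dummies into one character variable
      race = case_when(
        black == 1    ~ "black",
        hispanic == 1 ~ "hispanic",
        TRUE          ~ "white"
      ),
      # Convert the remaining dummies to self-explanatory character variables
      treat        = if_else(treat == 1, "treated", "control"),
      degree       = if_else(nodegree == 1, "no degree", "degree"),
      marriage     = if_else(married == 1, "married", "unmarried"),
      # Zero earnings in a year indicate unemployment in that year
      employment74 = if_else(earnings74 == 0, "unemployed", "employed"),
      employment75 = if_else(earnings75 == 0, "unemployed", "employed")
    ) |>
    # Drop variables made redundant above; data_id is constant in this dataset
    select(-black, -hispanic, -nodegree, -married, -any_of("data_id"))
}
```

Writing the steps as a function pays off later, when the same cleaning must be applied to the CPS data.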
- Experimental Results:
  - Use `datasummary_skim()` to make two tables of summary statistics for `experimental`: one for the numerical variables and another for the categorical ones.
  - Use `datasummary_balance()` to make a table that compares average values of the variables in `experimental` across treatment and control groups. Comment on the results.
  - Construct an approximate 95% confidence interval for the average treatment effect of the NSW program based on the data from `experimental`. Interpret your findings.
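One standard way to build the interval is the large-sample difference-in-means construction (a sketch; $\bar{Y}$ denotes a sample mean of 1978 earnings, $s^2$ a sample variance, with 185 treated men and 260 controls):

```latex
\widehat{ATE} = \bar{Y}_T - \bar{Y}_C,
\qquad
\widehat{ATE} \pm 1.96 \times \sqrt{\frac{s_T^2}{185} + \frac{s_C^2}{260}}
```

Because treatment was randomly assigned, the simple difference of means is an unbiased estimate of the ATE here.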
- Construct the Composite Sample:
  - Load the CPS data from `cps_controls.dta` and store it in a tibble called `cps_controls`.
  - Clean `cps_controls` using the same steps that you applied to `experimental` above. (Consider writing a function so you don’t have to do the same thing twice!)
  - Use `bind_rows()` from `dplyr` to create a tibble called `composite` that includes *all* of the individuals from `cps_controls` and *only the treated individuals* from `experimental`.
  - Use `datasummary_balance()` to compare the two groups in `composite`: the treatment group from the NSW and the “controls” from the CPS.
  - Comment on your findings. What, if anything, does the difference of mean outcomes between the two groups in `composite` tell us about the treatment effect of interest?
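The composite construction itself is one line of `dplyr`; a sketch, assuming both tibbles have been cleaned so that `treat` takes the value `"treated"` or `"control"`:

```r
library(dplyr)

# Sketch: keep *all* CPS rows and *only* the treated NSW rows.
# `experimental` and `cps_controls` are the cleaned tibbles from above.
make_composite <- function(experimental, cps_controls) {
  bind_rows(
    filter(experimental, treat == "treated"),
    cps_controls
  )
}
```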
- Regression Adjustment:
  - Regress 1978 earnings on the other variables in `composite` and display the results.
  - Explain in detail how, and under what assumptions, the regression from the preceding part can be used to estimate the treatment effect of interest. If these assumptions hold, what is our estimate? How does it compare to the experimental results from above?
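To fix notation, a typical regression-adjustment specification looks like this (a sketch; $D_i$ is the treatment dummy and $X_i$ collects the remaining covariates):

```latex
\text{earnings78}_i = \alpha + \tau D_i + X_i'\beta + \varepsilon_i
```

where $\widehat{\tau}$ serves as the treatment-effect estimate. Part of your answer should spell out what must be true of $\varepsilon_i$, given $D_i$ and $X_i$, for this interpretation to be valid.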
- Propensity Score Weighting:
  - Run a logistic regression to estimate the *propensity score* using the data from `composite`. Because `glm()` requires a numeric outcome variable, I suggest creating a tibble called `logit_data` that makes the necessary adjustments beforehand. Think carefully about which variables to include. You don’t necessarily have to match the precise specification that the authors use in their paper (although you can if you like: see note A from Table 2), but there is one variable in `composite` that definitely should *not* be included in your model. Which one is it and why?
  - Make two histograms of your estimated propensity scores from the preceding part: one for the treated individuals and one for the untreated. What do your results suggest? (Feel free to make additional plots or compute additional summary statistics to support your argument.)
  - Calculate the propensity score weighting estimator. (You should obtain a *crazy* result!)
  - Repeat the preceding part, except this time *drop* any observations with a propensity score less than 0.1 or greater than 0.9 before calculating the propensity score weighting estimator. (You should get a *non-crazy* result!)
  - Find an explanation for the difference in your results between parts (c) and (d).
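For reference, one common form of the weighting estimator for the treatment effect on the treated is the following (other normalizations exist; $\hat{p}(X_i)$ is the estimated propensity score and $n_T$ the number of treated individuals):

```latex
\widehat{ATT} =
\frac{1}{n_T}\sum_{i:\, D_i = 1} Y_i
\;-\;
\frac{\sum_{i:\, D_i = 0} w_i Y_i}{\sum_{i:\, D_i = 0} w_i},
\qquad
w_i = \frac{\hat{p}(X_i)}{1 - \hat{p}(X_i)}
```

When thinking about parts (c) through (e), pay attention to the role that $\hat{p}(X_i)/\bigl(1 - \hat{p}(X_i)\bigr)$ plays in the weights on the untreated observations.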