Exercise - The Colonial Origins of Comparative Development

This question is based on Acemoglu, Johnson, and Robinson (2001), henceforth AJR. To answer it you will need to consult the paper. You will also need a copy of the dataset ajr.dta, which is available from my website at https://ditraglia.com/data/ajr.dta. Note that ajr.dta is a Stata file, so you’ll need to use an appropriate R package to open it. Here is a description of the variables from ajr.dta that you’ll need below:

Name      Description
longname  Full name of country, e.g. Canada
shortnam  Abbreviated country name, e.g. CAN
logmort0  Natural log of early European settler mortality
risk      Average protection against expropriation risk, 1985-1995 (0 to 10)
loggdp    Natural log of 1995 GDP/capita at purchasing power parity
latitude  Absolute value of latitude (scaled between 0 and 1)
meantemp  1987 mean annual temperature in degrees Celsius
rainmin   Minimum monthly rainfall
malaria   % of population living where falciparum malaria was endemic in 1994
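
One possible way to load the data is sketched below. It assumes the haven package, which is only one of several R packages that can read Stata files; the exercise itself does not prescribe a package. The object names url, tmp, needed, and missing_vars are illustrative choices, not part of the exercise.

```r
# Assumption: the haven package is installed, e.g. via install.packages("haven").
library(haven)

# Download the Stata file to a temporary location (binary mode), then read it.
url <- "https://ditraglia.com/data/ajr.dta"
tmp <- tempfile(fileext = ".dta")
download.file(url, tmp, mode = "wb")
ajr <- read_dta(tmp)

# Check that the variables described above are present in the dataset.
needed <- c("longname", "shortnam", "logmort0", "risk", "loggdp",
            "latitude", "meantemp", "rainmin", "malaria")
missing_vars <- setdiff(needed, names(ajr))
print(missing_vars)
```

If missing_vars is empty, every variable in the table was found. Some variables may contain missing values for some countries, so it is worth inspecting the data with summary(ajr) before running any regressions.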

The most important variables are loggdp, the outcome variable (\(Y\)); risk, the regressor of interest (\(X\)); and logmort0, which AJR propose as an instrumental variable (\(Z\)) for risk. Both loggdp and logmort0 are fairly self-explanatory, but risk is a bit counterintuitive: the larger the value of risk, the more protection a country has against expropriation. In other words, large values of risk indicate better institutions, as described in the first paragraph of AJR.

For simplicity we will not consider the possibility of heterogeneous treatment effects in this problem. Moreover, because the original AJR paper does not report robust standard errors, feel free to use “plain vanilla” standard errors throughout. (The sample size is small enough that the robust standard errors would be very noisy in any case!)
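
For reference, the homogeneous-effects setup described above can be written as a simple linear model. This is a sketch in standard textbook notation: the symbols \(\alpha\), \(\beta\), and \(U\) are assumptions introduced here and do not appear elsewhere in the exercise.

\[
Y = \alpha + \beta X + U, \qquad
\operatorname{Cov}(Z, X) \neq 0 \ \text{(relevance)}, \qquad
\operatorname{Cov}(Z, U) = 0 \ \text{(exogeneity)}.
\]

Taking the covariance of both sides of the model with \(Z\) and applying the two conditions gives

\[
\operatorname{Cov}(Z, Y) = \beta \operatorname{Cov}(Z, X) + \operatorname{Cov}(Z, U) = \beta \operatorname{Cov}(Z, X)
\quad \Longrightarrow \quad
\beta = \frac{\operatorname{Cov}(Z, Y)}{\operatorname{Cov}(Z, X)}.
\]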


  1. Read the abstract, introduction and conclusion of AJR and answer the following:
    a. What is the key question that AJR try to answer?
    b. Give an overview of AJR’s key theory.
    c. For \(Z\) to be a valid instrument, it must satisfy two assumptions: relevance and exogeneity. Explain what these assumptions mean in the context of AJR. Can either of them be checked using the available data?
  2. OLS Regression:
    a. Regress loggdp on risk and store the result in an object called ols.
    b. Display the results of part (a) in a cleanly formatted regression table, using appropriate R packages.
    c. Discuss your results from (b) in light of your readings from AJR. Can we interpret the results of ols causally? Why or why not?
  3. IV Regression:
    a. Estimate the first-stage regression of risk on logmort0 and store your results in an object called first_stage. Display and discuss your findings.
    b. Estimate the reduced-form regression of loggdp on logmort0 and store your results in an object called reduced_form. Display and discuss your findings.
    c. Use the ivreg function from the AER package to carry out an IV regression of loggdp on risk using logmort0 as an instrument for risk and store your results in an object called iv.
    d. Display your results from iv. How do they compare to the results of ols? Discuss in light of your answer to 2(c) above.
    e. Verify that you get the same estimate as in part (d) by running IV “by hand” using first_stage and reduced_form.
  4. This question asks you to consider a potential criticism of AJR. The critique depends on two claims. Claim #1: a country’s current disease environment, e.g. the prevalence of malaria, is an important determinant of GDP/capita. Claim #2: a country’s disease environment today depends on its disease environment in the past, which in turn affected early European settler mortality.
    a. Explain how claims #1 and #2 taken together would call into question the IV results from Question 3 above.
    b. Suppose that we consider re-running our IV analysis from Question 3 including malaria as an additional regressor. Explain why this might address the concerns you raised in the preceding part.
    c. Repeat Question 2 including malaria as an additional regressor.
    d. Repeat Question 3 part (a) adding malaria to the first-stage regression.
    e. Repeat Question 3 parts (c) and (d) including malaria in the IV regression. Treat malaria as exogenous. This means we will not need an instrument for this variable: instead it serves as its own instrument. See “Details” in the help file for ivreg for how to specify this.
    f. In light of your results from this question, what do you make of the criticism of AJR based on a country’s disease environment?
  5. This question asks you to consider another potential criticism of AJR promoted by Jeffrey Sachs, who stresses “geographical” explanations of economic development.
    a. Repeat part (e) from Question 4 but add latitude, rainmin, and meantemp as control regressors in addition to malaria. Each of these variables will serve as its own instrument. Continue to instrument risk using logmort0.
    b. Discuss your results. What do you make of AJR’s view vis-a-vis Sachs’s critique?
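
The mechanics behind Questions 3(e) and 4(e) can be illustrated on simulated data. The sketch below is not a solution to the exercise: the data frame sim, the variables y, x, z, w, u, and all coefficient values are hypothetical and have nothing to do with AJR's data. It shows two things: with a single instrument and no controls, the IV estimate equals the reduced-form slope divided by the first-stage slope, and in ivreg an exogenous control appears on both sides of the | in the formula so that it serves as its own instrument.

```r
# Assumption: the AER package is installed; it provides ivreg().
library(AER)

# Hypothetical simulated data (not AJR's): u is an unobserved confounder
# that makes x endogenous, z is a valid instrument, w is an exogenous control.
set.seed(1234)
n <- 500
z <- rnorm(n)
w <- rnorm(n)
u <- rnorm(n)
x <- 0.8 * z + 0.5 * w + u + rnorm(n)
y <- 1 + 2 * x + 0.3 * w - 1.5 * u + rnorm(n)
sim <- data.frame(y, x, z, w)

# First stage, reduced form, and IV with no controls.
first_stage <- lm(x ~ z, data = sim)
reduced_form <- lm(y ~ z, data = sim)
iv <- ivreg(y ~ x | z, data = sim)

# IV "by hand": reduced-form slope divided by first-stage slope.
by_hand <- coef(reduced_form)["z"] / coef(first_stage)["z"]
print(c(by_hand = unname(by_hand), ivreg = unname(coef(iv)["x"])))

# IV with an exogenous control: w is listed after the | as well,
# so it acts as its own instrument while z instruments x.
iv_w <- ivreg(y ~ x + w | z + w, data = sim)
print(coef(iv_w))
```

The two printed estimates in the first comparison agree up to floating-point error. The ratio shortcut applies to the single-instrument case without controls; once controls are added, both the first stage and the reduced form must include the same controls for the analogous ratio to hold.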