Download fake-panel-data.csv from https://ditraglia.com/data. This dataset was simulated according to the one-way error components model described above. It contains six columns: person is a unique person identifier (name), year is a year index (1-5), x and y are the regressor and outcome variable, and epsilon and eta are the error terms. (In real data you wouldn’t have the errors, but this is a simulation!)
Use lm() to regress y on x with “classical” standard errors. Repeat with standard errors clustered by person using lm_robust(). Discuss your results.
Plot y against x along with the regression line from part 1.
Repeat part 2, but use a different color for the points that correspond to each person in the dataset and plot a separate regression line for each person.
What does the plot you made in part 3 suggest? Use the columns epsilon and eta to check your conjecture.
Finally, use lm_robust() to regress y on x and a dummy variable for each person, clustering the standard errors by person. Discuss your results.
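For orientation, the workflow in these exercises can be sketched on a toy panel. This is an illustration rather than a solution: it simulates its own data, so the sample sizes, the coefficients 1 and 0.5, and the correlation between x and eta are all assumptions of the sketch, not the values behind fake-panel-data.csv. It also sets se_type = "stata" for the clustered fits as a convenience, whereas lm_robust() defaults to "CR2" when clusters are supplied. The estimatr and ggplot2 packages are assumed to be installed.

```r
# Toy one-way error components panel, simulated here for illustration only.
# All parameter values below are assumptions of this sketch.
library(estimatr)  # provides lm_robust()
library(ggplot2)

set.seed(1234)
n_persons <- 30
n_years <- 5
n_obs <- n_persons * n_years

person <- rep(paste0("p", 1:n_persons), each = n_years)
year <- rep(1:n_years, times = n_persons)
eta <- rep(rnorm(n_persons), each = n_years)  # person effect, constant over time
epsilon <- rnorm(n_obs)                       # idiosyncratic error
x <- eta + rnorm(n_obs)                       # regressor correlated with eta
y <- 1 + 0.5 * x + eta + epsilon

panel <- data.frame(person, year, x, y, epsilon, eta)

# Pooled regression with classical standard errors
pooled <- lm(y ~ x, data = panel)
print(summary(pooled)$coefficients)

# Same regression, standard errors clustered by person
pooled_cl <- lm_robust(y ~ x, data = panel, clusters = person,
                       se_type = "stata")
print(summary(pooled_cl)$coefficients)

# One color and one fitted line per person
p_by_person <- ggplot(panel, aes(x = x, y = y, color = person)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  theme(legend.position = "none")
print(p_by_person)

# In a simulation the errors are observed, so their relationship
# with the regressor can be checked directly
print(cor(panel$x, panel$eta))
print(cor(panel$x, panel$epsilon))

# Person dummies via factor(), standard errors clustered by person
fe <- lm_robust(y ~ x + factor(person), data = panel, clusters = person,
                se_type = "stata")
print(fe$coefficients["x"])
```

To work on the actual exercise, replace the simulated data frame with the downloaded file, for example by calling read.csv() on your local copy, and compare the coefficient on x across the pooled and dummy-variable fits.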