It’s nearly impossible to overstate the value that economists ascribe to cleverness. Like most obsessions, this one is not altogether healthy.
My general philosophy in life is never to rely on being clever; instead I want to rely on being thorough and having a justifiable workflow.
A theoretical statistician knows all about measure theory but has never seen a measurement whereas the actual use of measure theory by the applied statistician is a set of measure zero.
You don’t need to learn how to code. You just need to be able to tell a computer what to do in a way that it will respond, understand what it’s doing and how to optimize that, and fix it when it’s not working.
This is the course website for Core Empirical Research Methods (core ERM), a 1st-year MPhil course in the Economics Department at the University of Oxford. Core ERM will help you develop the basic skills you’ll need to carry out applied economic research. It will cover a mix of applied econometrics, programming/computing, and research skills. The prerequisites are basic familiarity with programming in some language, not necessarily R, and an introductory course in econometrics at the masters level. If you are interested in auditing this course see Auditing Core ERM below.
Because Core ERM is about doing applied economics, it will not be a traditional lecture course. Students should bring their laptops to lectures so that they can follow along with live demos and work on examples in small groups. While there will still be some lecture-style material, the overall format will be closer to a “lab” in the natural sciences. As such, attendance is mandatory if you are taking this course for credit. Please see Attendance below for more details. GTAs (Graduate Teaching Assistants) will attend each lecture to give you individualized help if you get stuck while working through in-class exercises. See Required Software for details on how to configure your machine for core ERM.
Lecturer: Francis J. DiTraglia
Teaching Assistants (GTAs)
We will not use Canvas for core ERM. Instead, all course materials will be posted on the course website and all other communication will take place on Ed. Please register for the discussion board by following this link. I have enabled self sign-up for all email addresses that end in @ox.ac.uk or @*.ox.ac.uk, so either your college or departmental email address should work. Please do not send email messages to your GTAs or the course instructor; we ask that you use the discussion board instead. If you have a post about course content, we kindly request that you post it publicly (you are free to remain anonymous when posting publicly) so that our answer can benefit the other students in the course. Your classmates may also know the answer and be able to help you faster than we can, so there’s both a private and public benefit to this approach. For personal issues or questions specific to your mini-project, please send us a private message on the discussion board. Keeping all course communication in one place will allow us to spend more time helping you learn and less time on course admin.
All class meetings will take place in the Manor Road Building (MRB).
Weeks 1-8 of Trinity Term, MRB Lecture Theatre. Lecture attendance is required if you are taking this course for credit. (See Attendance for details.)
Weeks 2-9 of Trinity Term in MRB Seminar Room D. Attendance is optional but strongly recommended. These sessions are particularly valuable for troubleshooting code problems for problem sets, getting feedback on your mini-project, and deepening your understanding of challenging concepts.
You can drop in to speak with me in room 2132 of the Manor Road Building during the half hour before each of our lectures, in other words:
Office hours will commence on Thursday May 1st, since presumably there’s nothing much for us to discuss before the course has actually started :)
In this course we will use the R programming language via a front-end called RStudio. Both are freely available on all major platforms. To install them follow these instructions. To smooth out the inevitable start-of-term kinks, during weeks 1 and 2 we will work with RStudio via Posit Cloud. Please sign up for a free account here. This will allow you to get right to work at the start of term even if you encounter problems installing R. Eventually you will need to get R and RStudio working on your own machine, however. The week 3 drop-in surgery is an excellent place to get help with installation issues.
Because core ERM is an interactive, lab-based course, lecture attendance is mandatory. It is also in your best interest. A major part of your assessment is based on problem sets. We will work through many of these together during lectures, but recordings will not be made available during the term. Because the material in core ERM is highly cumulative–each week builds on the last–regular attendance is the easiest and most reliable way to ensure that you gain the skills you will need to pass the course.
Moreover, while I would prefer to rely on the carrot rather than the stick, I will keep track of attendance at lectures in TT 2025. Students who miss more than five lectures without prior authorization will be contacted by the director of graduate studies and the senior tutor of their college. If you are in the UK on a student visa, it is particularly important that you attend regularly, as the government requires me to certify that you have been actively engaged with your course of study during the term. While it would never be my goal to try to get anyone into trouble, I am legally and ethically bound to report your attendance accurately when it is formally requested of me.
This course is pass/fail and will be assessed entirely on the basis of coursework assignments. Before we go any further: yes, it is possible to fail core ERM. See Re-sits for more details. All assignments must be submitted via Inspera. See Inspera Submission Requirements for more details on how to submit. Your coursework assignments come in two parts, each of which will be assessed using the marking criteria detailed below. To pass the course, you must pass both parts of the assessment. The two parts are as follows:
Part A will consist of four problem sets due in TT weeks 2, 4, 6, and 8:
Part B will consist of a “mini-project” of your choice that you will complete between weeks 3 and 9 of term. Your mini-project will be due at noon on Wednesday of TT Week 9. For full details, see the Mini-Project FAQs below. Because you choose the mini-project, you can work on something that is intrinsically interesting to you. Ideally the topic will be relevant to your MPhil thesis: you can kill two birds with one stone. And because you will complete your mini-project during the term, you will have the opportunity to get help and feedback from me and your GTAs at lectures and the weekly drop-in surgeries.
You are allowed, and indeed encouraged, to discuss course problems and assignments with your classmates and GTAs, but you are not allowed to directly copy code or results from another student. The work that you submit for assessment must be your own, even if it incorporates suggestions from your classmates and GTAs.
There are some restrictions on how you are allowed to use large language models (LLMs) in your problem set submissions. In short: you can consult them in the same way that you are free to consult your classmates and GTAs, e.g. as a tool to help you learn R, help debug code, and so on. But you are not allowed to paste in problem set questions and ask for solutions. For example, asking “Can you explain how to filter rows in dplyr?” is acceptable, while asking “How would I solve question 3 from problem set 2?” is not permitted. For the same reason, you are not permitted to use tools that autocomplete code as you type–e.g. GitHub Copilot–when completing problem sets. Generative AI can most likely generate correct solutions to all of my problem set problems, so you may find yourself sorely tempted. There are two reasons why you should not succumb. First, perfectly correct solutions generated by ChatGPT and Claude look sufficiently dissimilar to the examples that I provide in my course materials that it is extremely easy for me to tell that they were AI-generated. Second, if you rely solely on AI, you will never learn to code. And if you never learn to code, you will put yourself out of a job. AI tools substitute for humans with low coding ability; they complement humans with high coding ability.
I insist that you learn to code, but I also insist that you learn to use AI. For this reason, we will help you set up GitHub Copilot and teach you how to use it. On your mini-project you are free to use generative AI however you see fit: there are no restrictions whatsoever. But please bear in mind that any code you submit must adhere to my Marking Criteria.
Most of the course material for core ERM is delivered through a series of online videos with associated exercises and solutions. You are expected to watch these videos at home before our class meetings, work through the short exercises, and check your work against my solutions. This will allow us to use class time to do more interesting and exciting things, including working together on problem set questions and mini-projects.
Problem set questions will be posted here during the term. Please consult the marking criteria and academic integrity policy for more information.
When I first taught this course back in 2022, I started writing a book to accompany it. This turned out to be a tall order, but I did manage to produce ten draft chapters. You can view them at https://empirical-methods.com. Based on my experiences teaching version 1.0 of core ERM, I decided to make a number of changes to the course. While much of the material in my draft book remains relevant, my lecture slides will be the final authority on the course material in the present version of core ERM. I hope to rework the book before next year’s version of core ERM.
Barring a serious personal issue that affects your studies, there is no reason why you should fail core ERM. If you attend class, participate actively, and get help at the drop-in surgeries as needed, you will develop all of the skills needed to complete the course assignments to the appropriate standard. If for some reason you do fail core ERM, you will be given the opportunity to re-sit any failed assignments the next time that core ERM is offered, i.e. in Trinity term of next year. (Remember: you need to pass all four problem sets and your mini-project to pass the course.) Clearly this is something you will want to avoid, so take my advice and do what’s necessary to pass the first time around.
The course mini-project is a small independent project of your own choosing, designed to require roughly the same time commitment as two problem sets. You will complete your project between Weeks 2 and 9 of term and submit it as part of your course assessment.
Your project should be a replication (or partial replication) of a reputable paper in economics or a closely related field. Specifically, you will:
Obtain the original data used in the paper.
Write R code to clean the data.
Reproduce a few key tables and figures from the paper, especially those containing summary statistics and main results.
You are free to choose which parts of the paper to replicate. However, to ensure comparable workload across projects, each submission must include:
At least three tables and/or figures, including:
A table of summary statistics.
The results of the primary analysis.
A robustness check or heterogeneity analysis.
Even if the original paper does not include these elements, your replication must.
We strongly encourage you to choose papers that use microdata (e.g., individual-level survey data like the UK Labour Force Survey). However, if you select a paper using macro data (e.g., national unemployment rates) – where data cleaning is typically simpler – we expect you to replicate at least four tables and/or figures.
On top of showing that you can run the same analysis as the authors, a high-quality replication must also engage critically with the paper and its findings. We ask that your final submission include a 2-4 page introduction that covers the motivation, contribution, and key findings of the paper. Whenever you replicate an analysis, you should provide the estimating equation and comment on the methodology. You should discuss what the estimation actually does (i.e. what is being estimated) and how it does it (i.e. why this is the right specification or method, and how the parameter of interest is identified). Finally, you should comment on your results, evaluate whether the paper is indeed replicable, and suggest specific areas where improvements could be made to the paper.
When selecting a paper, you must adhere to the following five rules:
Each student must choose a different paper.
There must not already be R replication code available online for your chosen paper (code in other languages, such as Stata, is fine).
The necessary data must be available online and free of special access restrictions. You may use data that is not available directly in the replication files, but it must be publicly available and reasonably accessible (e.g. UK labour force data is fine, but Swedish administrative data is not).
The paper must have been published within the last 10 years in a high-quality economics journal. Recommended journals include:
You must receive approval from either me or one of the GTAs before beginning work.
Subject to these rules, you are free to choose any paper you like. We strongly suggest browsing the resources listed below to find suitable options. Kindly note that we will not accept working papers.
If you’re unsure where to start, two excellent resources are:
You can also attend GTA drop-in sessions to receive suggestions.
You are required to produce the following outputs:
Completion of Paper Sign-Up Sheet
Deadline: Friday, Week 3 at noon
You must submit your selected paper for approval on this sign-up sheet.
Initial Report
Deadline: Friday, Week 6 at noon (as part of your Problem Set #3 submission)
A 1–2 page description of your project, including:
A brief summary of the paper (methods, identification, key findings)
A description of the original replication files for the paper (organization, data availability)
A draft of the summary statistics table that you will include in the final submission
Final Submission
Deadline: Wednesday, Week 9 at noon
The full replication report and code files.
Assignments in Core ERM will be graded pass/fail based on five criteria. Criteria 1–3 are all-or-nothing and necessary to pass a given assignment. Criteria 4 and 5 allow for partial marks.
tinytex
Between 80 and 90 students take Core ERM for credit each year, but the MRB lecture theatre seats 120. Provided that there’s space left in the room, any member of the university is most welcome to attend my lectures without asking for permission in advance. I ask only that you respect the following guidelines. First, please sit in the back row if you’re auditing so that I can more easily gauge attendance etc. Second, the Drop-In Surgeries are only for students who are taking the course for credit. Third, Core ERM lectures are fairly interactive: I and the GTAs will circulate to help students who encounter difficulties while working on the exercises. I won’t go so far as to say that we won’t help you if you’re auditing, but we will need to prioritize the students who are taking the course for credit.