For the first CodeChella in the summer of 2020, I had someone make this design to illustrate difference-in-differences ideas. See if you can spot the treatment, the missing Y(0), the 2×2, the no-anticipation and parallel trends assumptions, and the untreated comparison group. You might even find a little covariate imbalance hiding in there.
I mention this because 2020 was the start of a chapter for me. It was when I began to earnestly study the new diff-in-diff literature. I just barely missed getting that material into the Mixtape before it was submitted to Yale, apart from Bacon’s decomposition. It was the summer I did CodeChella over Twitch and 1,500 people came. And it was the year we banded together to work on the JEL paper.
And then today, late 2025. The sequel to my book, Causal Inference: the Remix, had its edits sent in for approval yesterday. There’ll be two new chapters on diff-in-diff. Our paper in JEL was published this year. And now the project seems officially done, since the GitHub repository was completed and posted yesterday. So now it feels a little like the end of a very long chapter of my life.
So with that little bit of nostalgia out of the way: Andrew Baker, Brantly Callaway, Andrew Goodman-Bacon, Pedro Sant’Anna and I published a paper earlier this year, “Difference-in-Differences: A Practitioner’s Guide,” in the Journal of Economic Literature. I looked at my old emails with the previous editor at JEL, and I think we got invited to do this thing in either 2019 or 2020. Anyway, here’s the ungated file.
The paper is pedagogical, meant to be something that someone undertaking their first diff-in-diff could use to help them. So for the application in the paper, we built county-level mortality data linked to a simple binary treatment: whether a county adopted Medicaid expansion under the ACA. We weren’t the first to study that question with DID, but we wanted an application with broad appeal, something clean enough that the design ideas wouldn’t get lost. This application had all the pieces we needed for exposition: covariates, population weights, enough units treated in a single year for a simple 2×2, and differential timing for more complex cases.
The paper is structured around the idea of building blocks like the “2×2” and the average treatment effect for the treatment group by time period. Here’s a bit of what we cover.
- 2×2 DID, the simplest setup: treated and untreated units, before and after. Here we walk through parallel trends, no anticipation, simple first-difference comparisons, regression specifications, all the core essentials.
- 2×T, which uses multiple pre-treatment periods and naturally leads to the event-study design.
- DID with covariates, and the assumptions needed for identification when you include them.
- G×T, the differential-timing design that my coauthors have each shaped in major ways.
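To make the first building block concrete, here is a minimal Python sketch of the 2×2 comparison: the before-and-after change for treated units minus the same change for untreated units. Under parallel trends and no anticipation, that difference identifies the average treatment effect on the treated. The toy numbers and the helper names `cell_mean` and `did_2x2` are made up for illustration; none of this comes from the paper or its replication package.

```python
from statistics import mean

# Hypothetical toy panel: (unit, treated_group, post_period, outcome).
# These values are invented for illustration and are NOT from the paper.
rows = [
    ("a", 1, 0, 10.0), ("a", 1, 1, 13.0),
    ("b", 1, 0, 12.0), ("b", 1, 1, 15.0),
    ("c", 0, 0, 8.0),  ("c", 0, 1, 9.0),
    ("d", 0, 0, 6.0),  ("d", 0, 1, 7.0),
]


def cell_mean(rows, treated, post):
    """Average outcome in one of the four group-by-period cells."""
    return mean(y for _, d, p, y in rows if d == treated and p == post)


def did_2x2(rows):
    """(Treated post - treated pre) minus (untreated post - untreated pre)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 1, 0)
    untreated_change = cell_mean(rows, 0, 1) - cell_mean(rows, 0, 0)
    return treated_change - untreated_change


estimate = did_2x2(rows)
print(estimate)
```

This sketch is unweighted and ignores covariates and inference entirely. The 2×T and G×T designs repeat the same comparison across more periods and more treatment cohorts.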
We also talk about subtleties like population weighting. I think the discussion of covariates is extremely good, and important, and maybe even a topic that gets overshadowed by differential timing, if I’m being honest. Covariates are almost always essential, and so is how you handle them. The fact that I’m so interested in them is probably a reflection of being friends with these guys, plus a project of my own that convinced me they were essential.
Anyway, on to the thing people keep asking us about: the data.
Over the last year, since the paper came out, people have written all five of us with versions of the same question:
“When will the data and code be available so we can replicate the paper?”
Well, that’s what I’m getting at: good news. The full replication package is now live on Pedro’s GitHub repo. It’s thorough, documented, and available in both R and Stata. The code for the replications is also here on Pedro’s website. It’s in HTML and lets you toggle between the R and Stata versions. A lot of heavy lifting went into making this; if only you could have seen how the sausage was made. What a different era it is that something as meticulously put together as this replication package exists. It was sobering to watch how careful and attentive my coauthors were. I’ll take much of this experience with me into everything I do going forward.
Every project teaches you something and stretches you more than you expect. I don’t know if it was this paper or just working alongside these four people for the past six years, but my entire worldview about diff-in-diff, and causal inference more broadly, is different now. They’re smart, hardworking, creative, and genuinely good people. I’m grateful our paths crossed.
Thanks to Steven Durlauf and David Romer for all their help as editors. Their interest in it got it started and brought it over the finish line. It became beautiful thanks to their guidance, for which I’m truly grateful. Thanks to the referees, who helped so much too. And as I said, thanks to these guys for teaching me so much.
