Tuesday, June 9, 2026

Should I Include Covariates in Diff-in-Diff?


I’ve heard the following enough times that it has registered. And it happens among people who are usually fairly seasoned researchers. So both the frequency and the speakers have made me think it’s probably a common enough belief. And that’s this:

If I include covariates, and my diff-in-diff estimates change, then I don’t believe the diff-in-diff estimates.

It comes in many forms, but that’s usually it in a nutshell. And today I want to just write what is probably going to be the first of a few substacks on it, but I’m going to try to be brief, which will require doing a few of these. But first, I flipped a coin 3 times, it came up heads all three times, and therefore this will be paywalled (eventually, below, it will be).

Thanks again for your support! If you’re dying to learn more about the importance of including covariates in diff-in-diff, then consider becoming a paying subscriber! At $5/month, which is the absolute bare minimum Substack allows me to charge, it’s a steal!

It’s well known that diff-in-diff has one key assumption called parallel trends. And if you satisfy it, you don’t need to include any covariates as controls. Let me start with an illustration of what it means to satisfy parallel trends. Our outcome will be earnings, and I’ll compare college-educated workers (our treatment group) with high-school-only workers (our control group). We’ll represent the untreated potential outcome as Y(0) and the treated potential outcome as Y(1), and therefore a treatment effect as Y(1) – Y(0).

First, let’s say that men’s high-school-only earnings grow +10 a year, but women’s high-school-only earnings grow +8 a year (euros, dollars, pounds, anything). We can write this as:

\(Y_{it}(0) = \alpha + 10t \cdot M_i + 8t \cdot (1-M_i) + \varepsilon_{it}\)

where M is a dummy variable equalling 1 if biologically male and 0 if biologically female, alpha is a level constant that could be different for men and women if we wanted, and the epsilon is zero in expectation. Hence when M=1, E[Y(0)] grows at a rate of 10, and when M=0, E[Y(0)] grows at a rate of 8. Notice that this is an outcome model. It states that there is a “return” to being male and a “return” to being female, but that they are not the same.

But subtly, notice also that that return is the same whether you are treated or not. If you are treated, then of course we never see Y(0). We only see Y(1). But that just means that for college-educated workers, Y(0) is counterfactual.

And in this outcome model, we’re saying that high-school-only males have different trends than females: not just different levels (i.e., alpha) but different trends.

Second, let’s say that 75% of our college-educated workers are men and 75% of our high-school-educated workers are men. Now let’s take a first difference for everyone in the sample.

\(\Delta Y_{it}(0) = 8 + 2M_i + \Delta\varepsilon_{it}\)

When we take expectations, we get:

\(E[\Delta Y_{it}(0)] = 8 + 2M_i\)

Note that the alpha dropped out because it was a constant for each person i. So even if we allowed men and women to have different baseline earnings, the first difference wipes them out. It just doesn’t wipe out the effect of sex on trends. That’s the key here.
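For anyone who wants the first difference spelled out term by term, here is a sketch of the algebra, assuming we difference between consecutive periods t-1 and t and that alpha is fixed over time for person i:

\[
\begin{aligned}
\Delta Y_{it}(0) &= Y_{it}(0) - Y_{i,t-1}(0) \\
&= \left[\alpha + 10t \cdot M_i + 8t \cdot (1-M_i) + \varepsilon_{it}\right] - \left[\alpha + 10(t-1) \cdot M_i + 8(t-1) \cdot (1-M_i) + \varepsilon_{i,t-1}\right] \\
&= 10M_i + 8(1-M_i) + \Delta\varepsilon_{it} \\
&= 8 + 2M_i + \Delta\varepsilon_{it}
\end{aligned}
\]

The level term cancels in the second line, while the sex-specific slope survives as the 2M_i term.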

Now, recall I said that the two groups were balanced. 75% of the treatment group was male and 75% of the control group was male. That means we can use that equation to calculate the trend in average earnings for both groups, and since it doesn’t depend on treatment status, the trend will be the same. And it will be 9.5. And that’s because 8 + 2 x 0.75 = 8 + 1.5 = 9.5.

So the two groups are balanced, they both grow at 9.5, and thus the college group and the high school group satisfy unconditional parallel trends and, as a result, you do not need to control for sex in your diff-in-diff. You don’t because every 2×2 is equal to this:
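The same arithmetic can be checked in a few lines of Python. This is only a sketch: the function name `expected_trend` is made up for illustration, and it simply averages 8 + 2M over a group with a given share of men.

```python
def expected_trend(male_share):
    """Average one-period change in E[Y(0)] for a group.

    Averages 8 + 2*M over the group, where male_share is the
    fraction of the group with M = 1.
    """
    return 8 + 2 * male_share


# Both groups are 75% male, as in the example above.
college_trend = expected_trend(0.75)
high_school_trend = expected_trend(0.75)

# The gap between the two untreated trends is the parallel trends bias.
trend_gap = college_trend - high_school_trend

print(college_trend, high_school_trend, trend_gap)
```

Because the male share is the only input, any two groups with the same share get the same untreated trend, whatever their earnings levels are.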

\(2 \times 2 = ATT + PT_{bias}\)

And since we just showed that there is no parallel trends bias, the 2×2 is an unbiased and consistent estimate of the ATT. Done.
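If you want to see this numerically, here is a small simulation sketch using only the Python standard library. Several things in it are my assumptions and not part of the example: the treatment effect of 5 (`ATT`), the baseline level of 20 (`ALPHA`), the standard normal noise, the sample size, and the function names. Only the trends of 10 and 8 and the 75% male share come from the setup above.

```python
import random
from statistics import mean

ATT = 5.0          # hypothetical treatment effect (assumption, for illustration)
ALPHA = 20.0       # hypothetical baseline earnings level (assumption)
MALE_SHARE = 0.75  # same share of men in both groups, as in the example


def simulate_group(n, treated, rng):
    """Return (pre, post) earnings lists for one group at t = 0 and t = 1."""
    n_men = int(n * MALE_SHARE)
    pre, post = [], []
    for i in range(n):
        m = 1 if i < n_men else 0
        trend = 10 * m + 8 * (1 - m)               # sex-specific trend in Y(0)
        y0_pre = ALPHA + rng.gauss(0, 1)           # Y(0) at t = 0
        y0_post = ALPHA + trend + rng.gauss(0, 1)  # Y(0) at t = 1
        pre.append(y0_pre)
        # Treated units show Y(1) = Y(0) + ATT after treatment.
        post.append(y0_post + (ATT if treated else 0.0))
    return pre, post


def did_2x2(n=4000, seed=0):
    """Simple 2x2 diff-in-diff with no covariates."""
    rng = random.Random(seed)
    t_pre, t_post = simulate_group(n, True, rng)
    c_pre, c_post = simulate_group(n, False, rng)
    return (mean(t_post) - mean(t_pre)) - (mean(c_post) - mean(c_pre))


estimate = did_2x2()
print(round(estimate, 2))
```

With the two groups balanced on sex, the estimate should land close to the assumed effect of 5 even though sex never enters the calculation.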
