On a recent exam I asked students to extend the logic of propensity score weighting to handle a treatment that takes on three rather than two values: basically a stripped-down version of Imbens (2000). Nearly everyone figured this out without much trouble, which is good news! At the same time, I noticed some common misconceptions about the all-important selection-on-observables assumption:
\[
\mathbb{E}[Y_0|D,X] = \mathbb{E}[Y_0|X] \quad \text{and} \quad
\mathbb{E}[Y_1|D,X] = \mathbb{E}[Y_1|X]
\]
where \((Y_0, Y_1)\) are the potential outcomes corresponding to a binary treatment \(D\) and \(X\) is a vector of observed covariates. Since more than a handful of students made the same mistakes, it seemed like a good opportunity for a short post.
The following two statements about selection on observables are false:
- Under selection on observables, if I know the value of someone's covariate vector \(X\), then learning her treatment status \(D\) provides no additional information about the average value of her observed outcome \(Y\).
- Selection on observables requires the treatment \(D\) and potential outcomes \((Y_0,Y_1)\) to be conditionally independent given covariates \(X\).
If you've studied treatment effects, pause for a moment and see if you can figure out what's wrong with each of them before reading further.
The first statement:
Under selection on observables, if I know the value of someone's covariate vector \(X\), then learning her treatment status \(D\) provides no additional information about the average value of her observed outcome \(Y\).
is a verbal description of the following conditional mean independence condition:
\[
\mathbb{E}[Y|X,D] = \mathbb{E}[Y|X].
\]
So what's wrong with this equality? The potential outcomes \((Y_0, Y_1)\) and the observed outcome \(Y\) are related according to
\[
Y = Y_0 + D (Y_1 - Y_0).
\]
Taking conditional expectations of both sides and using the selection on observables assumption gives
\[
\begin{aligned}
\mathbb{E}[Y|X,D] &= \mathbb{E}[Y_0|X,D] + D \mathbb{E}[Y_1 - Y_0|D,X] \\
&= \mathbb{E}[Y_0|X] + D \mathbb{E}[Y_1 - Y_0|X].
\end{aligned}
\]
In contrast, conditioning on \(X\) alone gives
\[
\begin{aligned}
\mathbb{E}[Y|X] &= \mathbb{E}[Y_0|X] + \mathbb{E}[D(Y_1 - Y_0)|X] \\
&= \mathbb{E}[Y_0|X] + \mathbb{E}_{D|X}[D\,\mathbb{E}(Y_1 - Y_0|D,X)] \\
&= \mathbb{E}[Y_0|X] + \mathbb{E}_{D|X}[D\,\mathbb{E}(Y_1 - Y_0|X)] \\
&= \mathbb{E}[Y_0|X] + \mathbb{E}(D|X) \cdot \mathbb{E}(Y_1 - Y_0|X)
\end{aligned}
\]
by iterated expectations and the selection on observables assumption, since \(\mathbb{E}(Y_1 - Y_0|X)\) is a measurable function of \(X\). Subtracting these expressions, we find that
\[
\mathbb{E}(Y|X,D) - \mathbb{E}(Y|X) = \left[ D - \mathbb{E}(D|X) \right] \cdot \mathbb{E}(Y_1 - Y_0|X)
\]
so that \(\mathbb{E}(Y|X,D) = \mathbb{E}(Y|X)\) if and only if the RHS equals zero.
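To get a feel for the size of this gap, here is a quick worked example with made-up numbers (they are assumptions for illustration only). Suppose that for some covariate value \(x\) we have \(\mathbb{E}(D|X=x) = 0.3\) and \(\mathbb{E}(Y_1 - Y_0|X=x) = 2\). Then
\[
\begin{aligned}
\mathbb{E}(Y|X=x,D=1) - \mathbb{E}(Y|X=x) &= (1 - 0.3) \cdot 2 = 1.4 \\
\mathbb{E}(Y|X=x,D=0) - \mathbb{E}(Y|X=x) &= (0 - 0.3) \cdot 2 = -0.6.
\end{aligned}
\]
Learning \(D\) shifts the conditional mean of \(Y\) in both cases, and the two shifts average out to zero over \(D\) given \(X=x\): \(0.3 \cdot 1.4 + 0.7 \cdot (-0.6) = 0\).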
So how could the RHS equal zero? One way is if \(D = \mathbb{E}(D|X)\). Since \(D\) is a binary random variable, this would require \(\mathbb{E}(D|X)\) to be a binary random variable as well. But notice that \(\mathbb{E}(D|X) = \mathbb{P}(D=1|X)\) is simply the propensity score \(p(X)\). Because \(X\) is a random variable, so is \(p(X)\). But \(p(X)\) cannot take on the values zero or one. If it did, this would violate the overlap assumption: \(0 < p(X) < 1\).
So we can't have \(D = \mathbb{E}(D|X)\), but what about \(\mathbb{E}(Y_1 - Y_0|X)=0\)? Since \((Y_1 - Y_0)\) is the treatment effect of \(D\), it follows that \(\mathbb{E}(Y_1 - Y_0|X)\) is the conditional average treatment effect \(\text{ATE}(X)\) given \(X\). It's not a contradiction for \(\text{ATE}(X)\) to equal zero, but think about what it would mean: it would require that the average treatment effect for a person with covariates \(X = x\) is exactly zero regardless of \(x\). Moreover, by iterated expectations it would imply that
\[
\text{ATE} = \mathbb{E}(Y_1 - Y_0) = \mathbb{E}_X[\mathbb{E}(Y_1 - Y_0|X)] = \mathbb{E}[\text{ATE}(X)] = 0
\]
so the average treatment effect would also be zero. Again, this isn't a contradiction, but it would definitely be odd to assume that the treatment effect is zero before you even try to estimate it!
To summarize: the first statement above cannot be an implication of selection on observables because it would either require a violation of the overlap assumption, or imply that there is no treatment effect whatsoever. To correct the statement, we simply need to change the last three words:
Under selection on observables, if I know the value of someone's covariate vector \(X\), then learning her treatment status \(D\) provides no additional information about the average values of her potential outcomes \((Y_0, Y_1)\).
This is a correct verbal statement of the mean exclusion restriction \(\mathbb{E}(Y_0|D,X) = \mathbb{E}(Y_0|X)\) and \(\mathbb{E}(Y_1|D,X) = \mathbb{E}(Y_1|X)\).
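The contrast between the false statement and its corrected version can be checked in a small simulation. The sketch below is only an illustration: the binary covariate, the propensity scores 0.3 and 0.7, and the outcome equations are all assumptions of mine rather than anything from the exam. Because both potential outcomes are simulated, we can look at \(Y_0\) for treated units, which is impossible with real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical design: one binary covariate and a propensity score that depends on it.
pscore = {0: 0.3, 1: 0.7}                 # p(x) = P(D = 1 | X = x), strictly inside (0, 1)
x = rng.integers(0, 2, size=n)
d = rng.binomial(1, np.where(x == 1, pscore[1], pscore[0]))

# Noise is drawn without reference to D, so selection on observables holds.
y0 = 1.0 + 2.0 * x + rng.normal(size=n)
effect = 1.0 + 0.5 * x + rng.normal(size=n)   # E(Y1 - Y0 | X = x) = 1 + 0.5 x
y1 = y0 + effect
y = y0 + d * (y1 - y0)                        # observed outcome


def compare(x_val, d_val):
    """Shift in the mean of Y0, shift in the mean of Y, and the formula value."""
    in_x = x == x_val
    in_xd = in_x & (d == d_val)
    shift_y0 = y0[in_xd].mean() - y0[in_x].mean()   # sample E(Y0|X,D) - E(Y0|X)
    shift_y = y[in_xd].mean() - y[in_x].mean()      # sample E(Y|X,D) - E(Y|X)
    formula = (d_val - pscore[x_val]) * (1.0 + 0.5 * x_val)
    return shift_y0, shift_y, formula


for x_val in (0, 1):
    for d_val in (0, 1):
        s0, sy, f = compare(x_val, d_val)
        print(f"x={x_val} d={d_val}  Y0 shift={s0:+.3f}  Y shift={sy:+.3f}  formula={f:+.3f}")
```

In each cell the shift in the mean of \(Y_0\) should be close to zero, while the shift in the mean of the observed outcome should be close to \([D - p(X)] \cdot \mathbb{E}(Y_1 - Y_0|X)\) and clearly different from zero.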
And this leads nicely to the second misconception:
Selection on observables requires the treatment \(D\) and potential outcomes \((Y_0,Y_1)\) to be conditionally independent given covariates \(X\).
To see why this is false, consider an example in which
\[
\begin{aligned}
Y &= (1 - D) \cdot (\alpha_0 + X'\beta_0 + U_0) + D \cdot (\alpha_1 + X'\beta_1 + U_1) \\
U_0|(D,X) &\sim \text{Normal}(0, 1 - D/2) \\
U_1|(D,X) &\sim \text{Normal}(0, 1 + D).
\end{aligned}
\]
Notice that the distributions of \(U_0\) and \(U_1\) given \((D,X)\) depend on \(D\). Now, by iterated expectations,
\[
\begin{aligned}
\mathbb{E}(U_0|X) &= \mathbb{E}_{D|X}[\mathbb{E}(U_0|D,X)] = 0 \\
\mathbb{E}(U_0) &= \mathbb{E}_{X}[\mathbb{E}(U_0|X)] = 0
\end{aligned}
\]
and similarly \(\mathbb{E}(U_1|X) = \mathbb{E}(U_1)=0\). Substituting \(D=0\) and \(D=1\), we can calculate the potential outcomes and average treatment effect as follows
\[
\begin{aligned}
Y_0 &= \alpha_0 + X'\beta_0 + U_0 \\
Y_1 &= \alpha_1 + X'\beta_1 + U_1 \\
\text{ATE} &= \mathbb{E}(Y_1 - Y_0) = (\alpha_1 - \alpha_0) + \mathbb{E}[X'](\beta_1 - \beta_0).
\end{aligned}
\]
It follows that \(D\) is not conditionally independent of \((Y_0, Y_1)\) given \(X\). In particular, the variance of the potential outcomes depends on \(D\) even after conditioning on \(X\):
\[
\begin{aligned}
\text{Var}(Y_0|X,D) &= \text{Var}(U_0|X,D) = 1 - D/2 \\
\text{Var}(Y_1|X,D) &= \text{Var}(U_1|X,D) = 1 + D.
\end{aligned}
\]
In spite of this, the selection on observables assumption still holds:
\[
\begin{aligned}
\mathbb{E}(Y_0|D,X) &= \alpha_0 + X'\beta_0 + \mathbb{E}(U_0|D,X) = \alpha_0 + X'\beta_0 \\
\mathbb{E}(Y_0|X) &= \alpha_0 + X'\beta_0 + \mathbb{E}(U_0|X) = \alpha_0 + X'\beta_0
\end{aligned}
\]
and similarly \(\mathbb{E}(Y_1|D,X) = \mathbb{E}(Y_1|X) = \alpha_1 + X'\beta_1\). While this example is admittedly a bit peculiar, the point is more general: because the average treatment effect is an expectation, identifying it only requires assumptions about conditional means. The second statement is even easier to correct than the first: we need only add a single word:
Selection on observables requires the treatment \(D\) and potential outcomes \((Y_0,Y_1)\) to be conditionally mean independent given covariates \(X\).
Conditional independence implies conditional mean independence, but the converse is false.
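The heteroskedastic example above is easy to simulate. The following sketch is one possible version, under assumptions that are mine alone: a scalar binary covariate, the parameter values \(\alpha_0 = 0, \beta_0 = 1, \alpha_1 = 2, \beta_1 = 3\), and a propensity score of 0.4 or 0.6. It checks that the conditional means of the potential outcomes do not move with \(D\), that their conditional variances do, and that a simple comparison of cell means still recovers the ATE.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

# Hypothetical parameter values, chosen only for illustration.
alpha0, beta0 = 0.0, 1.0
alpha1, beta1 = 2.0, 3.0

x = rng.integers(0, 2, size=n).astype(float)   # scalar binary covariate
d = rng.binomial(1, 0.4 + 0.2 * x)             # propensity score is 0.4 or 0.6

# Error variances depend on D. rng.normal takes a standard deviation, hence the sqrt.
u0 = rng.normal(0.0, np.sqrt(1.0 - d / 2.0))   # Var(U0 | D, X) = 1 - D/2
u1 = rng.normal(0.0, np.sqrt(1.0 + d))         # Var(U1 | D, X) = 1 + D

y0 = alpha0 + beta0 * x + u0
y1 = alpha1 + beta1 * x + u1
y = y0 + d * (y1 - y0)                         # observed outcome


def cell_stats(v, x_val, d_val):
    """Sample mean and variance of v among units with X = x_val and D = d_val."""
    cell = (x == x_val) & (d == d_val)
    return v[cell].mean(), v[cell].var()


for x_val in (0.0, 1.0):
    for d_val in (0, 1):
        m0, v0 = cell_stats(y0, x_val, d_val)
        m1, v1 = cell_stats(y1, x_val, d_val)
        print(f"x={x_val:.0f} d={d_val}  E[Y0]={m0:.3f} Var[Y0]={v0:.3f}  "
              f"E[Y1]={m1:.3f} Var[Y1]={v1:.3f}")

# Average the treated-minus-control gap in observed cell means over the distribution of X.
ate_hat = sum(
    (x == x_val).mean()
    * (y[(x == x_val) & (d == 1)].mean() - y[(x == x_val) & (d == 0)].mean())
    for x_val in (0.0, 1.0)
)
ate_formula = (alpha1 - alpha0) + x.mean() * (beta1 - beta0)
print(f"ATE from cell means: {ate_hat:.3f}, ATE from formula: {ate_formula:.3f}")
```

Within each value of \(x\), the means should agree across \(D\) up to sampling noise while the variances differ, and the two ATE figures should be close. That is the point of the example: conditional mean independence is enough here, even though full conditional independence fails.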
So what's the moral here? First, it's crucial to distinguish between the observed outcome \(Y\) and the potential outcomes \((Y_0, Y_1)\). Second, the various notions of "unrelatedness" between random variables (independence, conditional mean independence, and uncorrelatedness) can be confusing. Be sure to pay attention to exactly which condition is used and why. In a future post, I'll have more to say about the relationships between these notions.
