Using mlexp to estimate endogenous treatment effects in a probit model

March 6, 2026


I use features new to Stata 14.1 to estimate an average treatment effect (ATE) for a probit model with an endogenous treatment. In 14.1, we added new prediction statistics after mlexp that margins can use to estimate an ATE.

I am building on a previous post in which I demonstrated how to use mlexp to estimate the parameters of a probit model with sample selection. Our results match those obtained with biprobit; see [R] biprobit for more details. In a future post, I will use these techniques to estimate treatment-effect parameters not yet available from another Stata command.

Probit model with treatment

In this section, I describe the potential-outcome framework used to define an ATE. For each treatment level, there is an outcome that we would observe if a person were to select that treatment level. When the outcome is binary and there are two treatment levels, we can specify how the potential outcomes \(y_{0i}\) and \(y_{1i}\) are generated from the regressors \({\bf x}_i\) and the error terms \(\epsilon_{0i}\) and \(\epsilon_{1i}\):

\[\begin{eqnarray*}
y_{0i} &=& {\bf 1}({\bf x}_i{\boldsymbol \beta}_0 + \epsilon_{0i} > 0) \cr
y_{1i} &=& {\bf 1}({\bf x}_i{\boldsymbol \beta}_1 + \epsilon_{1i} > 0)
\end{eqnarray*}\]

(Assuming that each error is standard normal, this gives us a bivariate probit model.) The indicator function \({\bf 1}(\cdot)\) outputs 1 when its input is true and 0 otherwise.

The probit model for potential outcomes \(y_{0i}\) and \(y_{1i}\) with treatment \(t_i\) assumes that we observe the outcome

\[\begin{equation}
y_i = (1-t_i) y_{0i} + t_i y_{1i}
\nonumber
\end{equation}\]

So we observe \(y_{1i}\) under the treatment (\(t_{i}=1\)) and \(y_{0i}\) when the treatment is withheld (\(t_{i}=0\)).

The treatment \(t_i\) is determined by regressors \({\bf z}_i\) and standard normal error \(u_i\):

\[\begin{equation}
t_i = {\bf 1}({\bf z}_i{\boldsymbol \psi} + u_i > 0)
\nonumber
\end{equation}\]

Probit model with endogenous treatment

We could estimate the parameters \({\boldsymbol \beta}_0\) and \({\boldsymbol \beta}_1\) using a probit regression on \(y_i\) if \(t_i\) were not related to the unobserved errors \(\epsilon_{0i}\) and \(\epsilon_{1i}\). This may not always be the case. Suppose we modeled whether parents send their children to private school and used private tutoring for the child as a treatment. Unobserved factors that influence private school enrollment may be correlated with the unobserved factors that influence whether private tutoring is given. The treatment would then be correlated with the unobserved errors of the outcome.

We can treat \(t_i\) as endogenous by allowing \(\epsilon_{0i}\) and \(\epsilon_{1i}\) to be correlated with \(u_i\). In this post, we will assume that these correlations are the same. Formally, \(\epsilon_{0i}\), \(\epsilon_{1i}\), and \(u_i\) are trivariate normal with covariance:

\[\begin{equation}
\left[\begin{matrix}
1 & \rho_{01} & \rho_{t} \cr
\rho_{01} & 1 & \rho_{t} \cr
\rho_{t} & \rho_{t} & 1
\end{matrix}\right]
\nonumber
\end{equation}\]

The correlation \(\rho_{01}\) cannot be identified because we never observe both \(y_{0i}\) and \(y_{1i}\). However, identification of \(\rho_{01}\) is not necessary to estimate the other parameters, because we will observe the covariates and outcome in observations from each treatment group.

The log likelihood for observation \(i\) is

\[\begin{eqnarray*}
\ln L_i = & & {\bf 1}(y_i =1 \mbox{ and } t_i = 1) \ln \Phi_2({\bf x}_i{\boldsymbol \beta}_1, {\bf z}_i{\boldsymbol \psi},\rho_t) + \cr
& & {\bf 1}(y_i=0 \mbox{ and } t_i=1)\ln \Phi_2(-{\bf x}_i{\boldsymbol \beta}_1, {\bf z}_i{\boldsymbol \psi},-\rho_t) + \cr
& & {\bf 1}(y_i=1 \mbox{ and } t_i=0) \ln \Phi_2({\bf x}_i{\boldsymbol \beta}_0, -{\bf z}_i{\boldsymbol \psi},-\rho_t) + \cr
& & {\bf 1}(y_i=0 \mbox{ and } t_i = 0)\ln \Phi_2(-{\bf x}_i{\boldsymbol \beta}_0, -{\bf z}_i{\boldsymbol \psi},\rho_t)
\end{eqnarray*}\]

where \(\Phi_2\) is the bivariate normal cumulative distribution function.
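To see where the sign changes in the log likelihood come from, it may help to work through one case. The following is a sketch for \(y_i=0\) and \(t_i=1\), using the treatment index \({\bf z}_i{\boldsymbol \psi}\) from the treatment equation and the fact that \((\epsilon_{1i}, -u_i)\) is bivariate normal with correlation \(-\rho_t\):

\[\begin{eqnarray*}
P(y_i=0, t_i=1) &=& P(\epsilon_{1i} \le -{\bf x}_i{\boldsymbol \beta}_1,\; u_i > -{\bf z}_i{\boldsymbol \psi}) \cr
&=& P(\epsilon_{1i} \le -{\bf x}_i{\boldsymbol \beta}_1,\; -u_i < {\bf z}_i{\boldsymbol \psi}) \cr
&=& \Phi_2(-{\bf x}_i{\boldsymbol \beta}_1, {\bf z}_i{\boldsymbol \psi}, -\rho_t)
\end{eqnarray*}\]

The other three terms follow the same way: negating one of the two errors flips the sign of the correlation, and negating both leaves it unchanged.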

This model is a variation of the bivariate probit model. For an introduction to the bivariate probit model, see Pindyck and Rubinfeld (1998).

The data

We will simulate data from a probit model with an endogenous treatment and then estimate the parameters of the model using mlexp. Then, we will use margins to estimate the ATE. We simulate a random sample of 10,000 observations.


. set seed 3211

. set obs 10000
number of observations (_N) was 0, now 10,000

. gen x = rnormal() + 4

. gen b = rpoisson(1)

. gen z = rnormal()

First, we generate the regressors. The variable \(x\) has a normal distribution with a mean of 4 and variance of 1. It is used as a regressor for the outcome and treatment. The variable \(b\) has a Poisson distribution with a mean of 1 and will be used as a treatment regressor. A standard normal variable \(z\) is also used as a treatment regressor.


. matrix cm = (1, .3,.7 \ .3, 1, .7 \ .7, .7, 1)

. drawnorm ey0 ey1 et, corr(cm)

. gen t = .5*x - .1*b + .4*z - 2.4 + et > 0

. gen y0 = .6*x - .8 + ey0 > 0

. gen y1 = .3*x - 1.2 + ey1 > 0

. gen y = (1-t)*y0 + t*y1

Next, we draw the unobserved errors. The potential outcome and treatment errors will have correlation \(.7\). We generate the errors using the drawnorm command. Finally, the outcome and treatment indicators are created.
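Before fitting the model, it can be useful to check that the simulated data look as intended. The commands below are a minimal sketch that is not part of the original session; they assume the variables created above are in memory.

```stata
* Sample correlations of the simulated errors; compare with the entries of cm
* (.3 between ey0 and ey1, .7 between each of them and et)
correlate ey0 ey1 et

* Observed outcome by treatment status, with column percentages
tabulate y t, column
```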

Estimating the model parameters

Now, we will use mlexp to estimate the parameters of the probit model with an endogenous treatment. As in the previous post, we use the cond() function to calculate different values of the likelihood based on the different values of \(y\) and \(t\). We use the factor-variable operator ibn on \(t\) in equation y to allow for a different intercept at each level of \(t\). An interaction between \(t\) and \(x\) is also specified in equation y. This allows for a different coefficient on \(x\) at each level of \(t\). We also specify vce(robust) so that we can use vce(unconditional) when we use margins later.


. mlexp (ln(cond(t,cond(y,binormal({y: i.t#c.x ibn.t},            ///
>                                  {t: x b z _cons}, {rho}),      ///
>                         binormal(-{y:},{t:}, -{rho})),          ///
>                  cond(y,binormal({y:},-{t:},-{rho}),            ///
>                         binormal(-{y:},-{t:},{rho})))))         ///
>         , vce(robust)

initial:       log pseudolikelihood = -13862.944
alternative:   log pseudolikelihood = -15511.071
rescale:       log pseudolikelihood = -13818.369
rescale eq:    log pseudolikelihood = -10510.488
Iteration 0:   log pseudolikelihood = -10510.488  (not concave)
Iteration 1:   log pseudolikelihood = -10004.946  
Iteration 2:   log pseudolikelihood = -9487.4032  
Iteration 3:   log pseudolikelihood = -9286.0118  
Iteration 4:   log pseudolikelihood =  -9183.901  
Iteration 5:   log pseudolikelihood = -9181.9207  
Iteration 6:   log pseudolikelihood = -9172.0256  
Iteration 7:   log pseudolikelihood = -9170.8198  
Iteration 8:   log pseudolikelihood = -9170.7994  
Iteration 9:   log pseudolikelihood = -9170.7994  

Maximum likelihood estimation

Log pseudolikelihood = -9170.7994               Number of obs     =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y            |
       t#c.x |
          0  |   .5829362   .0223326    26.10   0.000     .5391651    .6267073
          1  |   .2745585   .0259477    10.58   0.000     .2237021     .325415
             |
           t |
          0  |  -.7423227   .0788659    -9.41   0.000     -.896897   -.5877483
          1  |  -1.088765   .1488922    -7.31   0.000    -1.380589   -.7969419
-------------+----------------------------------------------------------------
t            |
           x |   .4900691   .0148391    33.03   0.000     .4609851    .5191532
           b |  -.1086717   .0132481    -8.20   0.000    -.1346375   -.0827059
           z |   .4135792   .0150112    27.55   0.000     .3841579    .4430006
       _cons |  -2.354418   .0640056   -36.78   0.000    -2.479867   -2.228969
-------------+----------------------------------------------------------------
        /rho |   .7146737   .0377255    18.94   0.000     .6407331    .7886143
------------------------------------------------------------------------------

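Our parameter estimates are close to their true values. For example, the estimated coefficients on \(x\) in equation y are \(.58\) and \(.27\) against true values of \(.6\) and \(.3\), and the estimate of \(\rho_t\) is \(.71\) against a true value of \(.7\). Each true value used in the simulation lies within the reported 95% confidence interval.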

Estimating the ATE

The ATE of \(t\) is the expected value of the difference between \(y_{1i}\) and \(y_{0i}\), the average difference between the potential outcomes. Using the law of iterated expectations, we have

\[\begin{eqnarray*}
E(y_{1i}-y_{0i}) &=& E\{E(y_{1i}-y_{0i}|{\bf x}_i)\} \cr
&=& E\{\Phi({\bf x}_i{\boldsymbol \beta}_1)-
\Phi({\bf x}_i{\boldsymbol \beta}_0)\}
\end{eqnarray*}\]

This can be estimated as a predictive margin.
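Spelled out, the inner expectation follows from the standard normal errors, and the predictive margin replaces the outer expectation with a sample average evaluated at the estimates. The hats and the sample size \(N\) are notation introduced here for illustration:

\[\begin{eqnarray*}
E(y_{ji}|{\bf x}_i) &=& P(\epsilon_{ji} > -{\bf x}_i{\boldsymbol \beta}_j) = \Phi({\bf x}_i{\boldsymbol \beta}_j), \quad j=0,1 \cr
\widehat{\rm ATE} &=& \frac{1}{N}\sum_{i=1}^{N}\left\{\Phi({\bf x}_i\widehat{\boldsymbol \beta}_1) - \Phi({\bf x}_i\widehat{\boldsymbol \beta}_0)\right\}
\end{eqnarray*}\]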

Now, we estimate the ATE using margins. We specify the normal probability expression in the expression() option. The xb() term refers to the linear prediction of the first equation, which we can now predict in Stata 14.1. We specify r.t so that margins will take the difference of the expression under \(t=1\) and \(t=0\). We specify vce(unconditional) to obtain standard errors for the population ATE rather than the sample ATE. The contrast(nowald) option is specified to omit the Wald test for the difference.


. margins r.t, expression(normal(xb())) vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : normal(xb())

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           t |
   (1 vs 0)  |  -.4112345   .0248909     -.4600197   -.3624493
--------------------------------------------------------------

We estimate that the ATE of \(t\) on \(y\) is \(-.41\). So taking the treatment decreases the probability of a positive outcome by \(.41\) on average over the population.

We will compare this estimate to the sample difference of \(y_{1}\) and \(y_{0}\).


. gen diff = y1 - y0

. sum diff

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        diff |     10,000      -.4132    .5303715         -1          1

In our sample, the average difference of \(y_{1}\) and \(y_{0}\) is also \(-.41\).
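For comparison, one could also fit a probit that ignores the endogeneity of the treatment. The sketch below is not part of the original session; it assumes the simulated data are still in memory and uses the same intercepts and slopes by treatment level. Because the simulated outcome errors are correlated with the treatment error, one would expect its contrast to differ from the estimate above.

```stata
* Naive comparison: probit of y treating t as exogenous, with a separate
* intercept and a separate coefficient on x at each level of t
probit y ibn.t i.t#c.x, noconstant vce(robust)

* Contrast of predicted probabilities under t=1 versus t=0
margins r.t, vce(unconditional) contrast(nowald)
```

Note that running this replaces the mlexp results in memory, so it is best run after the margins command above.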

Conclusion

I have demonstrated how to use mlexp to estimate the parameters of a model with a complex likelihood function: the probit model with an endogenous treatment. See [R] mlexp for more details about mlexp. I have also demonstrated how to use margins to estimate the ATE for the probit model with an endogenous treatment. See [R] margins for more details about margins.

Reference

Pindyck, R. S., and D. L. Rubinfeld. 1998. Econometric Models and Economic Forecasts. 4th ed. New York: McGraw-Hill.
