Wednesday, March 25, 2026

Fitting ordered probit models with endogenous covariates with Stata’s gsem command


The new command gsem allows us to fit a wide variety of models; among the many possibilities, we can account for endogeneity in different models. As an example, I will fit an ordinal model with endogenous covariates.

 

Parameterizations for an ordinal probit model

 
The ordinal probit model is used to model ordinal dependent variables. In the usual parameterization, we assume that there is an underlying linear regression, which relates an unobserved continuous variable \(y^*\) to the covariates \(x\).

\[y^*_{i} = x_{i}\gamma + u_i\]

The observed dependent variable \(y\) relates to \(y^*\) through a series of cut-points \(-\infty =\kappa_0<\kappa_1<\dots< \kappa_m=+\infty\), as follows:

\[y_{i} = j \mbox{ if } \kappa_{j-1} < y^*_{i} \leq \kappa_j\]

Because the variance of \(u_i\) cannot be identified from the observed data, it is assumed to be equal to 1. However, we can consider a re-scaled parameterization for the same model; a straightforward way of seeing this is by noting that, for any positive number \(M\):

\[\kappa_{j-1} < y^*_{i} \leq \kappa_j \iff
M\kappa_{j-1} < M y^*_{i} \leq M\kappa_j
\]

that is,

\[\kappa_{j-1} < x_i\gamma + u_i \leq \kappa_j \iff
M\kappa_{j-1}< x_i(M\gamma) + Mu_i \leq M\kappa_j
\]

In other words, if the model is identified, it can be represented by multiplying the unobserved variable \(y^*\) by a positive number, and this means that the standard deviation of the residual component, the coefficients, and the cut-points will all be multiplied by this number.
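To see the invariance numerically, here is a minimal sketch with made-up values: the linear index 0.2, the cut-point 0.5, and the scale factor 2 are illustrative assumptions, not estimates from any model. With an error standard deviation of 1, the probability that \(y^*\) falls at or below a cut-point is \(\Phi(\kappa - x\gamma)\); after re-scaling, the error has standard deviation \(M\), so the same probability is \(\Phi((M\kappa - Mx\gamma)/M)\).

```stata
* Hypothetical values, chosen only for illustration
scalar xg    = 0.2
scalar kappa = 0.5
scalar M     = 2

* Original scale: the error standard deviation is 1
display normal(kappa - xg)

* Re-scaled model: index, cut-point, and error standard deviation are all multiplied by M
display normal((M*kappa - M*xg)/M)
```

The two displayed probabilities agree, which is why the data cannot distinguish between the two parameterizations.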

Let me show you an example; I will first fit a standard ordinal probit model, both with oprobit and with gsem. Then, I will use gsem to fit an ordinal probit model where the residual term for the underlying linear regression has a standard deviation equal to 2. I will do this by introducing a latent variable \(L\), with variance 1, and coefficient \(\sqrt 3\). This will be added to the underlying latent residual, with variance 1; then, the ‘new’ residual term will have variance equal to \(1+((\sqrt 3)^2\times Var(L))= 4\), so the standard deviation will be 2. We will see that as a result, the coefficients, as well as the cut-points, will be multiplied by 2.


. sysuse auto, clear
(1978 Automobile Data)

. oprobit rep mpg disp , nolog

Ordered probit regression                         Number of obs   =         69
                                                  LR chi2(2)      =      14.68
                                                  Prob > chi2     =     0.0006
Log likelihood = -86.352646                       Pseudo R2       =     0.0783

------------------------------------------------------------------------------
       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |   .0497185   .0355452     1.40   0.162    -.0199487    .1193858
displacement |  -.0029884   .0021498    -1.39   0.165     -.007202    .0012252
-------------+----------------------------------------------------------------
       /cut1 |  -1.570496   1.146391                      -3.81738    .6763888
       /cut2 |  -.7295982   1.122361                     -2.929386     1.47019
       /cut3 |   .6580529   1.107838                     -1.513269    2.829375
       /cut4 |    1.60884   1.117905                     -.5822132    3.799892
------------------------------------------------------------------------------

. gsem (rep <- mpg disp, oprobit), nolog

Generalized structural equation model             Number of obs   =         69
Log likelihood = -86.352646

--------------------------------------------------------------------------------
               |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
rep78 <-       |
           mpg |   .0497185   .0355452     1.40   0.162    -.0199487    .1193858
  displacement |  -.0029884   .0021498    -1.39   0.165     -.007202    .0012252
---------------+----------------------------------------------------------------
rep78          |
         /cut1 |  -1.570496   1.146391    -1.37   0.171     -3.81738    .6763888
         /cut2 |  -.7295982   1.122361    -0.65   0.516    -2.929386     1.47019
         /cut3 |   .6580529   1.107838     0.59   0.553    -1.513269    2.829375
         /cut4 |    1.60884   1.117905     1.44   0.150    -.5822132    3.799892
--------------------------------------------------------------------------------

. local a = sqrt(3)

. gsem (rep <- mpg disp L@`a'), oprobit var(L@1) nolog

Generalized structural equation model             Number of obs   =         69
Log likelihood = -86.353008

 ( 1)  [rep78]L = 1.732051
 ( 2)  [var(L)]_cons = 1
--------------------------------------------------------------------------------
               |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
rep78 <-       |
           mpg |    .099532     .07113     1.40   0.162    -.0398802    .2389442
  displacement |  -.0059739   .0043002    -1.39   0.165    -.0144022    .0024544
             L |   1.732051  (constrained)
---------------+----------------------------------------------------------------
rep78          |
         /cut1 |  -3.138491   2.293613    -1.37   0.171     -7.63389    1.356907
         /cut2 |  -1.456712   2.245565    -0.65   0.517    -5.857938    2.944513
         /cut3 |   1.318568    2.21653     0.59   0.552     -3.02575    5.662887
         /cut4 |   3.220004   2.236599     1.44   0.150     -1.16365    7.603657
---------------+----------------------------------------------------------------
         var(L)|          1  (constrained)
--------------------------------------------------------------------------------
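As a quick check on the claim that everything is multiplied by 2, the ratios below divide the re-scaled estimates by the standard ones, using values copied from the two gsem tables above. This is only a sketch; the two log likelihoods differ slightly, so the ratios should come out close to 2 rather than exactly 2.

```stata
* Ratios of re-scaled to standard estimates, values copied from the output above
display (.099532)/(.0497185)
display (-.0059739)/(-.0029884)
display (-3.138491)/(-1.570496)
display (3.220004)/(1.60884)
```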

 

Ordinal probit model with endogenous covariates

 
This model is defined analogously to the model fitted by -ivprobit- for probit models with endogenous covariates; we assume an underlying model with two equations,

\[
\begin{eqnarray}
y^*_{1i} =& y_{2i} \beta + x_{1i} \gamma + u_i & \\
y_{2i} =& x_{1i} \pi_1 + x_{2i} \pi_2 + v_i & \,\,\,\,\,\, (1)
\end{eqnarray}
\]

where \(u_i \sim N(0, 1) \), \(v_i\sim N(0,s^2) \), and \(corr(u_i, v_i) = \rho\).

We don’t observe \(y^*_{1i}\); instead, we observe a discrete variable \(y_{1i}\), such that, for a set of cut-points (to be estimated) \(\kappa_0 = -\infty < \kappa_1 < \kappa_2 \dots < \kappa_m = +\infty \),

\[y_{1i} = j \mbox{ if } \kappa_{j-1} < y^*_{1i} \leq \kappa_j \]

 

The parameterization we will use

 
I will re-scale the first equation, preserving the correlation. That is, I will consider the following system:

\[
\begin{eqnarray}
z^*_{1i} =&
y_{2i}b +x_{1i}c + t_i + \alpha L_i & \\
y_{2i} = &x_{1i}\pi_1 + x_{2i}\pi_2 + w_i + \alpha L_i & \,\,\,\,\,\, (2)
\end{eqnarray}
\]

where \(t_i, w_i, L_i\) are independent, \(t_i \sim N(0, 1)\), \(w_i \sim N(0,\sigma^2)\), \(L_i \sim N(0, 1)\), and

\[y_{1i} = j \mbox{ if } \lambda_{j-1} < z^*_{1i} \leq \lambda_j \]

By introducing a latent variable in both equations, I am modeling a correlation between the error terms. The first equation is a re-scaled version of the original equation, that is, \(z^*_1 = My^*_1\),

\[ y_{2i}b +x_{1i}c + t_i + \alpha L_i
= M(y_{2i}\beta) +M x_{1i}\gamma + M u_i \]

This implies that
\[M u_i = t_i + \alpha L_i, \]
where \(Var(u_i) = 1\) and \(Var(t_i + \alpha L_i) = 1 + \alpha^2\), so the scale is \(M = \sqrt{1+\alpha^2} \).

The second equation remains the same; we just express \(v_i\) as \(w_i + \alpha L_i\). Now, after estimating the system (2), we can recover the parameters in (1) as follows:

\[\beta = \frac{1}{\sqrt{1+ \alpha^2}} b\]
\[\gamma = \frac{1}{\sqrt{1+ \alpha^2}} c\]
\[\kappa_j = \frac{1}{\sqrt{1+ \alpha^2}} \lambda_j \]

\[V(v_i) = V(w_i + \alpha L_i) =V(w_i) + \alpha^2\]

\[\rho = Corr(t_i + \alpha L_i, w_i + \alpha L_i) =
\frac{\alpha^2}{\sqrt{1+\alpha^2}\sqrt{V(w_i)+\alpha^2}}\]
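The expression for \(\rho\) follows from the independence of \(t_i\), \(w_i\), and \(L_i\). Written out step by step, using only the definitions in system (2):

```latex
\begin{align*}
Cov(t_i + \alpha L_i,\; w_i + \alpha L_i)
  &= Cov(t_i, w_i) + \alpha\,Cov(t_i, L_i) + \alpha\,Cov(L_i, w_i) + \alpha^2 Var(L_i) \\
  &= 0 + 0 + 0 + \alpha^2 \cdot 1 = \alpha^2, \\
Var(t_i + \alpha L_i) &= 1 + \alpha^2, \qquad
Var(w_i + \alpha L_i) = V(w_i) + \alpha^2, \\
\rho &= \frac{\alpha^2}{\sqrt{1+\alpha^2}\,\sqrt{V(w_i)+\alpha^2}}.
\end{align*}
```

Because \(\alpha^2 \geq 0\), this expression can never be negative, which is where the sign restriction on the correlation comes from.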

Note: This parameterization assumes that the correlation is positive; for negative values of the correlation, \(L\) should be included in the second equation with a negative sign (that is, L@(-a) instead of L@a). When trying to perform the estimation with the wrong sign, the model most likely will not achieve convergence. Otherwise, you will see a coefficient for L that is virtually zero. In Stata 13.1 we have included features that allow you to fit the model without this restriction. However, this time we will use the older parameterization, which will allow you to visualize the different components more easily.

 

Simulating data, and performing the estimation

 


clear
set seed 1357
set obs 10000
* five exogenous covariates
forvalues i = 1(1)5 {
    gen x`i' = 2*rnormal() + _n/1000
}

* errors z1 and z2 with unit variances and correlation .5
mat C = [1,.5 \ .5, 1]
drawnorm z1 z2, cov(C)

* endogenous covariate: y2 = x1 + ... + x5 + z2
gen y2 = 0
forvalues i = 1(1)5 {
    replace y2 = y2 + x`i'
}
replace y2 = y2 + z2

gen y1star = y2 + x1 + x2 + z1
gen xb1 = y2 + x1 + x2

* ordinal outcome, with cut-points at -.8, -.3, .3, and .8
gen y1 = 4
replace y1 = 3 if xb1 + z1 <=.8
replace y1 = 2 if xb1 + z1 <=.3
replace y1 = 1 if xb1 + z1 <=-.3
replace y1 = 0 if xb1 + z1 <=-.8

gsem (y1 <- y2 x1 x2 L@a, oprobit) (y2 <- x1 x2 x3 x4 x5 L@a), var(L@1)

local y1 y1
local y2 y2

local xaux  x1 x2 x3 x4 x5
local xmain  y2 x1 x2

* scale of the first equation: sqrt(1 + alpha^2)
local s2 sqrt(1+_b[`y1':L]^2)
foreach v in `xmain'{
    local trans `trans' (`y1'_`v': _b[`y1':`v']/`s2')
}

foreach v in `xaux' _cons {
    local trans `trans' (`y2'_`v': _b[`y2':`v'])
}

qui tab `y1' if e(sample)
local ncuts = r(r)-1
forvalues i = 1(1) `ncuts'{
    local trans `trans' (cut_`i': _b[`y1'_cut`i':_cons]/`s2')
}

* standard deviation of v: sqrt(V(w) + alpha^2)
local s1 sqrt(  _b[var(e.`y2'):_cons]  +_b[`y1':L]^2)

local trans `trans' (sig_2: `s1')
local trans `trans' (rho_12: _b[`y1':L]^2/(`s1'*`s2'))
nlcom `trans'

 

Results

 
This is the output from gsem:


Generalized structural equation model             Number of obs   =      10000
Log likelihood = -14451.117

 ( 1)  [y1]L - [y2]L = 0
 ( 2)  [var(L)]_cons = 1
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
y1 <-        |
          y2 |   1.379511   .0775028    17.80   0.000     1.227608    1.531414
          x1 |   1.355687   .0851558    15.92   0.000     1.188785    1.522589
          x2 |   1.346323   .0833242    16.16   0.000      1.18301    1.509635
           L |   .7786594   .0479403    16.24   0.000     .6846982    .8726206
-------------+----------------------------------------------------------------
y2 <-        |
          x1 |   .9901353   .0044941   220.32   0.000      .981327    .9989435
          x2 |   1.006836   .0044795   224.76   0.000      .998056    1.015615
          x3 |   1.004249   .0044657   224.88   0.000     .9954963    1.013002
          x4 |   .9976541   .0044783   222.77   0.000     .9888767    1.006431
          x5 |   .9987587   .0044736   223.26   0.000     .9899907    1.007527
           L |   .7786594   .0479403    16.24   0.000     .6846982    .8726206
       _cons |   .0002758   .0192417     0.01   0.989    -.0374372    .0379887
-------------+----------------------------------------------------------------
y1           |
       /cut1 |  -1.131155   .1157771    -9.77   0.000    -1.358074   -.9042358
       /cut2 |  -.5330973   .1079414    -4.94   0.000    -.7446585    -.321536
       /cut3 |   .2722794   .1061315     2.57   0.010     .0642654    .4802933
       /cut4 |     .89394   .1123013     7.96   0.000     .6738334    1.114047
-------------+----------------------------------------------------------------
       var(L)|          1  (constrained)
-------------+----------------------------------------------------------------
    var(e.y2)|   .3823751    .074215                      .2613848    .5593696
------------------------------------------------------------------------------

These are the results we obtain when we transform the values reported by gsem to the original parameterization:


------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       y1_y2 |   1.088455   .0608501    17.89   0.000     .9691909    1.207719
       y1_x1 |   1.069657   .0642069    16.66   0.000      .943814    1.195501
       y1_x2 |   1.062269   .0619939    17.14   0.000      .940763    1.183774
       y2_x1 |   .9901353   .0044941   220.32   0.000      .981327    .9989435
       y2_x2 |   1.006836   .0044795   224.76   0.000      .998056    1.015615
       y2_x3 |   1.004249   .0044657   224.88   0.000     .9954963    1.013002
       y2_x4 |   .9976541   .0044783   222.77   0.000     .9888767    1.006431
       y2_x5 |   .9987587   .0044736   223.26   0.000     .9899907    1.007527
    y2__cons |   .0002758   .0192417     0.01   0.989    -.0374372    .0379887
       cut_1 |   -.892498   .0895971    -9.96   0.000    -1.068105   -.7168909
       cut_2 |  -.4206217   .0841852    -5.00   0.000    -.5856218   -.2556217
       cut_3 |   .2148325   .0843737     2.55   0.011     .0494632    .3802018
       cut_4 |    .705332   .0905974     7.79   0.000     .5277644    .8828997
       sig_2 |   .9943267    .007031   141.42   0.000     .9805462    1.008107
      rho_12 |   .4811176   .0477552    10.07   0.000     .3875191     .574716
------------------------------------------------------------------------------
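The transformation can also be reproduced by hand from the gsem output, which makes the role of each component visible. The sketch below copies the estimates for L and var(e.y2) from the gsem table; the scalar names are hypothetical and chosen only for this illustration.

```stata
* Estimates copied from the gsem output above
scalar alpha_hat = .7786594
scalar var_w_hat = .3823751

* Scale of the first equation and standard deviation of v
scalar scale_hat = sqrt(1 + alpha_hat^2)
scalar sig2_hat  = sqrt(var_w_hat + alpha_hat^2)

* Compare with the rows y1_y2, cut_1, sig_2, and rho_12 in the table above
display 1.379511/scale_hat
display -1.131155/scale_hat
display sig2_hat
display alpha_hat^2/(sig2_hat*scale_hat)
```

The displayed values should match the corresponding rows of the table up to rounding. Unlike nlcom, this gives no standard errors, so it is useful only as a check on the point estimates.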

The estimates are quite close to the values used for the simulation (coefficients equal to 1, cut-points at -.8, -.3, .3, and .8, a standard deviation of 1 for \(v_i\), and a correlation of .5). If you try to perform the estimation with the wrong sign for the coefficient for L, you will get a number that is virtually zero (if you get convergence at all). In this case, the evaluator is telling us that the best value it can find, given the restrictions we have imposed, is zero. If you see such results, you may want to try the opposite sign. If both give a zero coefficient, it means that this is the solution, and there is no endogeneity at all. If one of them is not zero, it means that the non-zero value is the solution. As stated before, in Stata 13.1, the model can be fitted without this restriction.
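For reference, a sketch of the opposite-sign fit might look like the following. It assumes the simulated variables from the code above are still in memory and uses the L@(-a) form mentioned in the earlier note. With L entering the second equation with a negative sign, the covariance between the two error terms becomes \(-\alpha^2\) while both variances are unchanged, so the only change to the transformation is a minus sign in the expression for the correlation.

```stata
* Assumes y1, y2, and x1-x5 from the simulation above are in memory
* L enters the second equation with the opposite sign
gsem (y1 <- y2 x1 x2 L@a, oprobit) (y2 <- x1 x2 x3 x4 x5 L@(-a)), var(L@1)

* Same scale and standard deviation as before; only the sign of rho changes
local s2 sqrt(1 + _b[y1:L]^2)
local s1 sqrt(_b[var(e.y2):_cons] + _b[y1:L]^2)
nlcom (rho_12: -_b[y1:L]^2/(`s1'*`s2'))
```

With the data simulated here, where the true correlation is positive, this is the wrong sign, so the coefficient on L would be expected to be close to zero if the model converges at all.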


