An ordered-probit inverse probability weighted (IPW) estimator

January 15, 2026

teffects ipw uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I use gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.

An ordered-probit IPW estimator

I have some simulated data in which the observed outcome y is the potential outcome corresponding to treatment state 0, 1, or 2. The treatment level t was generated from an ordered probit model with covariates x1 and x2. You can download the data by clicking on opt.dta.

An ordered probit is the first of several steps required to estimate the treatment-level-0 POM.

Example 1: Ordered probit results


. use opt

. oprobit t x1 x2

Iteration 0:   log likelihood = -5168.1477
Iteration 1:   log likelihood = -4332.0156
Iteration 2:   log likelihood = -4316.7593
Iteration 3:   log likelihood = -4316.7225
Iteration 4:   log likelihood = -4316.7225

Ordered probit regression                       Number of obs     =     10,000
                                                LR chi2(2)        =    1702.85
                                                Prob > chi2       =     0.0000
Log likelihood = -4316.7225                     Pseudo R2         =     0.1647

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0435013    18.11   0.000     .7026162    .8731382
          x2 |   1.017705   .0438282    23.22   0.000     .9318036    1.103607
-------------+----------------------------------------------------------------
       /cut1 |   2.084122   .0335056                      2.018452    2.149792
       /cut2 |   2.824316   .0404692                      2.744997    2.903634
------------------------------------------------------------------------------

Now, I estimate the probabilities of each treatment level and use them to obtain the weights needed by the IPW estimator for each POM.

Example 2: Predicted probabilities and weights


. predict double pr0 pr1 pr2, pr

. generate double ipw0 = (t==0)/pr0

. generate double ipw1 = (t==1)/pr1

. generate double ipw2 = (t==2)/pr2
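
As a cross-check on what predict computes here, the three probabilities can be rebuilt by hand from the ordered probit index and cutoffs. This is a minimal sketch that assumes the oprobit results are still the active estimation results and that the cutoffs can be referenced as _b[/cut1] and _b[/cut2]; the names xbhat, pr0_chk, pr1_chk, and pr2_chk are hypothetical.

```stata
* Sketch only: rebuild the treatment-level probabilities by hand.
* Assumes -oprobit t x1 x2- is still the active estimation result.
* xbhat and the *_chk variables are hypothetical helper names.
predict double xbhat, xb
generate double pr0_chk = normal(_b[/cut1] - xbhat)
generate double pr1_chk = normal(_b[/cut2] - xbhat) - normal(_b[/cut1] - xbhat)
generate double pr2_chk = 1 - normal(_b[/cut2] - xbhat)

* Each hand-built probability should agree with predict up to rounding
assert reldif(pr0, pr0_chk) < 1e-8
assert reldif(pr1, pr1_chk) < 1e-8
assert reldif(pr2, pr2_chk) < 1e-8

* Each set of weights should sum to roughly the sample size
total ipw0 ipw1 ipw2
```

The same three expressions in normal() reappear as the denominators of the weighted-mean moment conditions once the weights are estimated jointly with the POMs.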

I use the ipw0 weights to estimate the POM for treatment level 0.

Example 3: Estimating POM for treatment 0


. regress y [pw=ipw0]
(sum of wgt is   9.9798e+03)

Linear regression                               Number of obs     =      8,511
                                                F(0, 8510)        =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.5629

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.105473   .0191744    57.65   0.000     1.067887    1.143059
------------------------------------------------------------------------------

The treatment-level-0 POM is estimated to be 1.11. The standard error reported by regress is not consistent because regress does not know that estimated coefficients were used to compute the weights ipw0.
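
The constant-only weighted regression is just a ratio of weighted sums, which can be verified directly. The sketch below assumes the variable ipw0 created earlier is in memory; wy0, num0, and den0 are hypothetical names.

```stata
* Sketch only: the IPW point estimate as a ratio of weighted sums.
* wy0, num0, and den0 are hypothetical helper names.
generate double wy0 = ipw0*y
quietly summarize wy0
scalar num0 = r(sum)
quietly summarize ipw0
scalar den0 = r(sum)

* Should reproduce the _cons estimate reported by regress
display "POM0 by hand = " num0/den0
```

Observations with t different from 0 have a weight of zero, so they drop out of both sums. This is also why regress reports only the treatment-level-0 observations in its sample size.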

I now estimate the other two POMs using regress.

Example 4: Estimating POMs for treatment levels 1 and 2


. regress y [pw=ipw1]
(sum of wgt is   9.9065e+03)

Linear regression                               Number of obs     =        974
                                                F(0, 973)         =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.6007

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.524924   .0647766    23.54   0.000     1.397806    1.652042
------------------------------------------------------------------------------

. regress y [pw=ipw2]
(sum of wgt is   9.9707e+03)

Linear regression                               Number of obs     =        515
                                                F(0, 514)         =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.6389

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.920994   .1265199    15.18   0.000     1.672434    2.169554
------------------------------------------------------------------------------

These weighted means are consistent for the treatment-level-1 and treatment-level-2 POMs, but the standard errors are not consistent, because regress does not know that the weights were estimated.

Using gmm to solve the multistep estimation problem

Each step of this IPW estimator is defined by moment conditions. Solving all the moment conditions simultaneously removes the multistep estimation problem. In this section, I use gmm to solve all the moment conditions simultaneously.

I begin by using gmm to replicate the oprobit results. The score equations solved by oprobit are essentially moment conditions.

The score equations for the ordered probit model can be expressed as three generalized error functions that are multiplied by instrumental variables to obtain the moment conditions. These generalized error functions are

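\begin{align*}
e_1 &=
(t==0)\frac{-\phi(a_1-xb)}{F(a_1-xb)}
+ (t==1)\frac{-(\phi(a_2-xb)-\phi(a_1-xb))}{F(a_2-xb)-F(a_1-xb)} \\
&\quad + (t==2)\frac{\phi(a_2-xb)}{1-F(a_2-xb)}
\\
e_2 &=
(t==0)\frac{\phi(a_1-xb)}{F(a_1-xb)}
+ (t==1)\frac{-\phi(a_1-xb)}{F(a_2-xb)-F(a_1-xb)}
+ (t==2)0
\\
e_3 &=
(t==0)0
+ (t==1)\frac{\phi(a_2-xb)}{F(a_2-xb)-F(a_1-xb)}
+ (t==2)\frac{-\phi(a_2-xb)}{1-F(a_2-xb)}
\end{align*}

where \(t\) is the treatment level, \(xb\) is the index formed from \(x_1\) and \(x_2\), \(a_1\) and \(a_2\) are the cutoffs, and \(\phi\) and \(F\) are the standard normal density and distribution functions.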

Multiplying \(e_1\) by \(x_1\) and by \(x_2\) gives the two score equations for the coefficients on \(x_1\) and \(x_2\). I treat these as moment conditions in which \(x_1\) and \(x_2\) are the instrumental variables. Multiplying \(e_2\) by 1 gives the score equation that defines the \(a_1\) cutoff. Multiplying \(e_3\) by 1 gives the score equation that defines the \(a_2\) cutoff.
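
One way to see where the generalized errors come from is to differentiate the observation-level log likelihood. The sketch below assumes the standard ordered probit log likelihood, written in terms of the treatment variable \(t\) used in the gmm code; the symbol \(\ln L_i\) is notation introduced here for illustration.

```latex
\begin{align*}
\ln L_i &= (t==0)\ln F(a_1-xb)
 + (t==1)\ln\{F(a_2-xb)-F(a_1-xb)\}
 + (t==2)\ln\{1-F(a_2-xb)\} \\
e_1 &= \frac{\partial \ln L_i}{\partial (xb)}, \qquad
e_2 = \frac{\partial \ln L_i}{\partial a_1}, \qquad
e_3 = \frac{\partial \ln L_i}{\partial a_2}
\end{align*}
```

Because \(\partial F(a-xb)/\partial (xb) = -\phi(a-xb)\), the level-0 term of \(e_1\) is \(-\phi(a_1-xb)/F(a_1-xb)\), while the level-2 term is \(+\phi(a_2-xb)/\{1-F(a_2-xb)\}\) since the two minus signs cancel. These signs match the normalden() and normal() expressions passed to gmm in this post.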

Below, I use gmm to solve these moment conditions.

Example 5: Ordered probit by gmm


. matrix b0 = (.1, .2, .1, .2)

. gmm (e1:
>  (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:}))
> +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/
>           (normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))
>  )
>  (e2:
>  (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:}))
> +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*0
>  )
>  (e3:
>  (t==0)*0
> +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))
>  )
>  ,
>  onestep winitial(identity)
>  instruments(e1: x1 x2, noconstant)
>  instruments(e2: )
>  instruments(e3: ) from(b0)

Step 1
Iteration 0:   GMM criterion Q(b) =  1.1148682
Iteration 1:   GMM criterion Q(b) =  .19813694
Iteration 2:   GMM criterion Q(b) =  .01214783
Iteration 3:   GMM criterion Q(b) =    .000558
Iteration 4:   GMM criterion Q(b) =  1.254e-06
Iteration 5:   GMM criterion Q(b) =  1.962e-12
Iteration 6:   GMM criterion Q(b) =  8.527e-23

note: model is exactly identified

GMM estimation

Number of parameters =   4
Number of moments    =   4
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0434813    18.12   0.000     .7026554     .873099
          x2 |   1.017705   .0427286    23.82   0.000     .9339587    1.101452
-------------+----------------------------------------------------------------
         /a1 |   2.084122   .0332366    62.71   0.000     2.018979    2.149264
         /a2 |   2.824316   .0405148    69.71   0.000     2.744908    2.903723
------------------------------------------------------------------------------
Instruments for equation e1: x1 x2
Instruments for equation e2: _cons
Instruments for equation e3: _cons

. matrix b0 = e(b)

In the code above, the error function e1: and its instruments x1 and x2 define the moment conditions that define the coefficients on x1 and x2. The error function e2: defines the moment condition for the cutoff a1. The error function e3: defines the moment condition for the cutoff a2.

The observations on the generalized error functions are missing when all the parameters are assigned the same value, because \(F(a_2-xb)-F(a_1-xb)= 0\) in this case. I specified starting values using option from() because gmm uses zero as a starting value for each parameter, which makes the generalized error functions missing in this case.
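
The starting-value problem is easy to reproduce at the command line. This short sketch evaluates the level-1 piece of the generalized errors with every parameter set to zero, so that both arguments of normal() are zero.

```stata
* Sketch only: with all parameters at 0, a1 - xb = a2 - xb = 0.
* The denominator of the t==1 terms is then exactly zero
display normal(0) - normal(0)

* Division by zero yields a missing value in Stata, shown as "."
display normalden(0)/(normal(0) - normal(0))
```

The starting values in b0 avoid this by setting the two cutoffs to different values (.1 and .2), which keeps the difference of the two normal() terms strictly positive.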

I stored the point estimates in the matrix b0 to use them as starting values for the ordered probit parameters in example 7.

More details about the syntax of gmm are provided in "Understanding the generalized method of moments (GMM): A simple example", "Using gmm to solve two-step estimation problems", and "Estimating parameters by maximum likelihood and method of moments using mlexp and gmm".

Below is the gmm syntax for estimating the three weighted means, when taking the oprobit parameters as given.

Example 6: Weighted means by gmm


. gmm (e4: ((t==0)/pr0)*(y - {POM0}))      
>     (e5: ((t==1)/pr1)*(y - {POM1}))      
>     (e6: ((t==2)/pr2)*(y - {POM2}))      
>    ,                                     
>    onestep                               
>    winitial(identity)
>    instruments(e4: )
>    instruments(e5: )
>    instruments(e6: )

Step 1
Iteration 0:   GMM criterion Q(b) =  7.1678846
Iteration 1:   GMM criterion Q(b) =  5.602e-27
Iteration 2:   GMM criterion Q(b) =  2.221e-32

note: model is exactly identified

GMM estimation

Number of parameters =   3
Number of moments    =   3
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /POM0 |   1.105473   .0191732    57.66   0.000     1.067894    1.143052
       /POM1 |   1.524924   .0647433    23.55   0.000     1.398029    1.651819
       /POM2 |   1.920994    .126397    15.20   0.000      1.67326    2.168727
------------------------------------------------------------------------------
Instruments for equation e4: _cons
Instruments for equation e5: _cons
Instruments for equation e6: _cons
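
The sample moment conditions behind e4, e5, and e6 can be written out to show why gmm returns the same point estimates as the weighted regressions. In this sketch, \(p_{ij}\) denotes the probability that observation \(i\) receives treatment level \(j\) and \(N\) is the sample size; both symbols are notation introduced here for illustration.

```latex
\begin{align*}
\frac{1}{N}\sum_{i=1}^{N} \frac{(t_i==j)}{p_{ij}}\,(y_i - \mathrm{POM}_j) &= 0
\qquad j = 0, 1, 2 \\
\Longrightarrow\quad
\widehat{\mathrm{POM}}_j &=
\frac{\sum_{i=1}^{N} (t_i==j)\,y_i/p_{ij}}{\sum_{i=1}^{N} (t_i==j)/p_{ij}}
\end{align*}
```

When the \(p_{ij}\) are the predicted variables pr0, pr1, and pr2, they are treated as fixed data and the reported standard errors ignore their sampling variation. Writing them as \(F(a_1-xb)\), \(F(a_2-xb)-F(a_1-xb)\), and \(1-F(a_2-xb)\) makes them functions of the ordered probit parameters, so that stacking these moments with the score equations accounts for the estimated weights.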

Below, I use the ordered probit estimates stored in example 5 as starting values for the ordered probit parameters and 0.1 for the POM parameters. Combining the oprobit conditions and the weighted mean conditions yields

Example 7: Ordered probit IPW using gmm


. matrix b0 = (b0, .1, .1, .1 )

. gmm (e1:
>  (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:}))
> +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/
>         (normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))
>  )
>  (e2:
>  (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:}))
> +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*0
>  )
>  (e3:
>  (t==0)*0
> +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))
> +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))
>  )
>  (e4:
>  ((t==0)/normal({a1}-{xb:}))*(y - {POM0}))
>  (e5:
>  ((t==1)/(normal({a2}-{xb:})-normal({a1}-{xb:})))*(y - {POM1}))
>  (e6:
>  ((t==2)/(1-normal({a2}-{xb:})))*(y - {POM2}))
>  ,
>  onestep winitial(identity)
>  instruments(e1: x1 x2, noconstant)
>  instruments(e2: )
>  instruments(e3: )
>  instruments(e4: )
>  instruments(e5: )
>  instruments(e6: )
>  from(b0)

Step 1
Iteration 0:   GMM criterion Q(b) =  6.2961378
Iteration 1:   GMM criterion Q(b) =  1.668e-20
Iteration 2:   GMM criterion Q(b) =  1.736e-31

note: model is exactly identified

GMM estimation

Number of parameters =   7
Number of moments    =   7
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0434813    18.12   0.000     .7026554     .873099
          x2 |   1.017705   .0427286    23.82   0.000     .9339587    1.101452
-------------+----------------------------------------------------------------
         /a1 |   2.084122   .0332366    62.71   0.000     2.018979    2.149264
         /a2 |   2.824316   .0405148    69.71   0.000     2.744908    2.903723
       /POM0 |   1.105473   .0181701    60.84   0.000      1.06986    1.141086
       /POM1 |   1.524924   .0615369    24.78   0.000     1.404314    1.645534
       /POM2 |   1.920994    .123474    15.56   0.000     1.678989    2.162998
------------------------------------------------------------------------------
Instruments for equation e1: x1 x2
Instruments for equation e2: _cons
Instruments for equation e3: _cons
Instruments for equation e4: _cons
Instruments for equation e5: _cons
Instruments for equation e6: _cons

The point estimates and the standard errors reported by gmm are consistent.
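
Because all seven parameters now share one variance estimate, contrasts of the POMs can be tested directly after gmm. The sketch below assumes the stacked gmm results are the active estimation results and that the POM parameters can be referenced as _b[/POM0], _b[/POM1], and _b[/POM2]; the labels ate10 and ate20 are hypothetical. The teffects ipw line is included only as a comparison with the multinomial-logit weights mentioned at the start of this post.

```stata
* Sketch only: differences in POMs relative to treatment level 0.
* Assumes the stacked gmm results are still active;
* ate10 and ate20 are hypothetical labels.
nlcom (ate10: _b[/POM1] - _b[/POM0]) (ate20: _b[/POM2] - _b[/POM0])

* For comparison: POMs with weights from a multinomial logit instead
teffects ipw (y) (t x1 x2), pomeans
```

The two sets of POM estimates need not agree, because the treatment model differs between them.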

Done and undone

I showed how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I used gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.
