In an earlier post, I illustrated that the probit model and the logit model produce statistically equivalent estimates of marginal effects. In this post, I compare the marginal effect estimates from a linear probability model (linear regression) with the marginal effect estimates from probit and logit models.
My simulations show that when the true model is a probit or a logit, using a linear probability model can produce inconsistent estimates of the marginal effects of interest to researchers. The conclusions hinge on the probit or logit model being the true model.
Simulation results
For all simulations below, I use a sample size of 10,000 and 5,000 replications. The true data-generating processes (DGPs) are constructed using one discrete covariate and one continuous covariate. I study the average effect of a change in the continuous variable on the conditional probability (AME) and the average effect of a change in the discrete covariate on the conditional probability (ATE). I also look at the effect of a change in the continuous variable on the conditional probability, evaluated at the mean values of the covariates (MEM), and the effect of a change in the discrete covariate on the conditional probability, evaluated at the mean values of the covariates (TEM).
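To fix ideas, here is a sketch of these four quantities as they are implied by the simulation code in the next section, written for the probit case with the single index \(x\boldsymbol{\beta}=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}\); the logit case replaces the standard normal density \(\phi\) and distribution \(\Phi\) with their logistic counterparts:
\[
\begin{aligned}
\text{AME} &= \frac{1}{N}\sum_{i=1}^{N}\phi\left(x_{i}\boldsymbol{\beta}\right)\beta_{1}, &
\text{MEM} &= \phi\left(\bar{x}\boldsymbol{\beta}\right)\beta_{1}, \\
\text{ATE} &= \frac{1}{N}\sum_{i=1}^{N}\left[\Phi\left(\beta_{0}+\beta_{1}x_{1i}+\beta_{2}\right)-\Phi\left(\beta_{0}+\beta_{1}x_{1i}\right)\right], &
\text{TEM} &= \Phi\left(\beta_{0}+\beta_{1}\bar{x}_{1}+\beta_{2}\right)-\Phi\left(\beta_{0}+\beta_{1}\bar{x}_{1}\right),
\end{aligned}
\]
where \(\bar{x}_{1}\) and \(\bar{x}\boldsymbol{\beta}\) denote evaluation at the sample means of the covariates.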
In Table 1, I present the results of a simulation in which the true DGP satisfies the assumptions of a logit model. I show the average of the AME and ATE estimates and the 5% rejection rate of the true null hypotheses. I also show an approximate true value of the AME and ATE. I obtain the approximate true values by computing the ATE and AME, at the true values of the coefficients, using a sample of 20 million observations. I provide more details on the simulation in a later section.
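Concretely, the 5% rejection rate reported in the tables is the fraction of the \(R=5{,}000\) replications in which a 5%-level test rejects the true null hypothesis that the estimated effect equals its approximate true value,
\[
\text{rejection rate} = \frac{1}{R}\sum_{r=1}^{R} 1\left\{p_{r} < 0.05\right\},
\]
where \(p_{r}\) is the p-value of that test in replication \(r\). An estimator with reliable inference produces a rejection rate close to .05.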
Table 1: Average Marginal and Treatment Effects: True DGP Logit
| Statistic | Approximate True Value | Logit | Regress (LPM) |
|---|---|---|---|
| AME of x1 | -.084 | -.084 | -.094 |
| 5% Rejection Rate | | .050 | .99 |
| ATE of x2 | .092 | .091 | .091 |
| 5% Rejection Rate | | .058 | .058 |
From Table 1, we see that the logit model estimates are close to the true values and that the rejection rate of the true null hypothesis is close to 5%. For the linear probability model, the rejection rate is 99% for the AME. For the ATE, the rejection rate and point estimates are close to those estimated using a logit.
For the MEM and TEM, we have the following:
Table 2: Marginal and Treatment Effects at Mean Values: True DGP Logit
| Statistic | Approximate True Value | Logit | Regress (LPM) |
|---|---|---|---|
| MEM of x1 | -.099 | -.099 | -.094 |
| 5% Rejection Rate | | .054 | .618 |
| TEM of x2 | .109 | .109 | .092 |
| 5% Rejection Rate | | .062 | .073 |
Again, the logit estimates behave as expected. For the linear probability model, the rejection rate of the true null hypothesis is 62% for the MEM. For the TEM, the rejection rate is 7.3%, and the estimated effect is smaller than the true effect.
For the AME and ATE, when the true DGP is a probit, we have the following:
Table 3: Average Marginal and Treatment Effects: True DGP Probit
| Statistic | Approximate True Value | Probit | Regress (LPM) |
|---|---|---|---|
| AME of x1 | -.094 | -.094 | -.121 |
| 5% Rejection Rate | | .047 | 1 |
| ATE of x2 | .111 | .111 | .111 |
| 5% Rejection Rate | | .065 | .061 |
The probit model estimates are close to the true values, and the rejection rate of the true null hypothesis is close to 5%. For the linear probability model, the rejection rate is 100% for the AME. For the ATE, the rejection rate and point estimates are close to those estimated using a probit.
For the MEM and TEM, we have the following:
Table 4: Marginal and Treatment Effects at Mean Values: True DGP Probit
| Statistic | Approximate True Value | Probit | Regress (LPM) |
|---|---|---|---|
| MEM of x1 | -.121 | -.122 | -.121 |
| 5% Rejection Rate | | .063 | .054 |
| TEM of x2 | .150 | .150 | .110 |
| 5% Rejection Rate | | .059 | .158 |
For the MEM, the probit and linear probability models produce reliable inference. For the TEM, the probit marginal effects behave as expected, but the linear probability model has a rejection rate of 16%, and the point estimates are not close to the true value.
Simulation design
Below is the code I used to generate the data for my simulations. In the first part, lines 6 to 13, I generate outcome variables that satisfy the assumptions of the logit model, y, and the probit model, yp. In the second part, lines 15 to 19, I compute the marginal effects for the logit and probit models. I have a continuous and a discrete covariate. For the discrete covariate, the marginal effect is a treatment effect. In the third part, lines 21 to 29, I compute the marginal effects evaluated at the means. I will use these estimates later to compute approximations to the true values of the effects.
program define mkdata
    syntax, [n(integer 1000)]
    clear
    quietly set obs `n'
    // 1. Generating data from probit, logit, and misspecified
    generate x1 = rchi2(2)-2
    generate x2 = rbeta(4,2)>.2
    generate u  = runiform()
    generate e  = ln(u) - ln(1-u)
    generate ep = rnormal()
    generate xb = .5*(1 - x1 + x2)
    generate y  = xb + e > 0
    generate yp = xb + ep > 0
    // 2. Computing probit & logit marginal and treatment effects
    generate m1  = exp(xb)*(-.5)/(1+exp(xb))^2
    generate m2  = exp(1 -.5*x1)/(1+ exp(1 -.5*x1)) - ///
                   exp(.5 -.5*x1)/(1+ exp(.5 -.5*x1))
    generate m1p = normalden(xb)*(-.5)
    generate m2p = normal(1 -.5*x1) - normal(.5 -.5*x1)
    // 3. Computing marginal and treatment effects at means
    quietly mean x1 x2
    matrix A  = r(table)
    scalar a  = .5 -.5*A[1,1] + .5*A[1,2]
    scalar b1 = 1 -.5*A[1,1]
    scalar b0 = .5 -.5*A[1,1]
    generate mean1  = exp(a)*(-.5)/(1+exp(a))^2
    generate mean2  = exp(b1)/(1+ exp(b1)) - exp(b0)/(1+ exp(b0))
    generate mean1p = normalden(a)*(-.5)
    generate mean2p = normal(b1) - normal(b0)
end
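As a quick check that the program runs (a minimal usage sketch, not part of the simulation code itself), one could generate a single sample and fit the logit model used in the simulations:
mkdata, n(10000)                 // one draw from the DGP with the simulation sample size
logit y x1 i.x2, vce(robust)     // logit fit with robust standard errors
margins, dydx(*)                 // average marginal and treatment effects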
I approximate the true marginal effects using a sample of 20 million observations. This is a reasonable strategy in this case. For example, take the average marginal effect for a continuous covariate, \(x_{k}\), in the case of the probit model:
\[
\frac{1}{N}\sum_{i=1}^N \phi\left(x_{i}\boldsymbol{\beta}\right)\beta_{k}
\]
The expression above is an approximation of \(E\left(\phi\left(x_{i}\boldsymbol{\beta}\right)\beta_{k}\right)\). To obtain this expected value, we would need to integrate over the distribution of all the covariates. This is not practical and would limit my choice of covariates. Instead, I draw a sample of 20 million observations, compute \(\frac{1}{N}\sum_{i=1}^N \phi\left(x_{i}\boldsymbol{\beta}\right)\beta_{k}\), and take it to be the true value. I follow the same logic for the other marginal effects.
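The logs that follow refer to three locals whose definitions are not shown: `L' (the size of the large sample used for the approximation), `N' (the sample size in each replication), and `R' (the number of replications). A minimal sketch of how they might be set, using the values stated in this post:
local L = 20000000   // 20 million observations for the approximate true values
local N = 10000      // sample size used in each simulation replication
local R = 5000       // number of replications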
Below is the code I use to compute the approximate true marginal effects. I draw the 20 million observations, compute the averages that I will use in my simulation, and create locals for each approximate true value.
. mkdata, n(`L')
(2 missing values generated)
. local values "m1 m2 mean1 mean2 m1p m2p mean1p mean2p"
. local means  "mx1 mx2 meanx1 meanx2 mx1p mx2p meanx1p meanx2p"
. local n : word count `values'
.
. forvalues i= 1/`n' {
  2.         local a: word `i' of `values'
  3.         local b: word `i' of `means'
  4.         sum `a', meanonly
  5.         local `b' = r(mean)
  6. }
Now I am ready to run all the simulations that I used to produce the results in the previous section. The code that I used for the simulations of the TEM and the MEM when the true DGP is a logit is given by:
. postfile lpm y1l y1l_r y1lp y1lp_r y2l y2l_r y2lp y2lp_r ///
>         using simslpm, replace
. forvalues i=1/`R' {
  2.         quietly {
  3.                 mkdata, n(`N')
  4.                 logit y x1 i.x2, vce(robust)
  5.                 margins, dydx(*) atmeans post vce(unconditional)
  6.                 local y1l = _b[x1]
  7.                 test _b[x1] = `meanx1'
  8.                 local y1l_r = (r(p)<.05)
  9.                 local y2l = _b[1.x2]
 10.                 test _b[1.x2] = `meanx2'
 11.                 local y2l_r = (r(p)<.05)
 12.                 regress y x1 i.x2, vce(robust)
 13.                 margins, dydx(*) atmeans post vce(unconditional)
 14.                 local y1lp = _b[x1]
 15.                 test _b[x1] = `meanx1'
 16.                 local y1lp_r = (r(p)<.05)
 17.                 local y2lp = _b[1.x2]
 18.                 test _b[1.x2] = `meanx2'
 19.                 local y2lp_r = (r(p)<.05)
 20.                 post lpm (`y1l') (`y1l_r') (`y1lp') (`y1lp_r') ///
>                           (`y2l') (`y2l_r') (`y2lp') (`y2lp_r')
 21.         }
 22. }
. postclose lpm
. use simslpm, clear
. sum
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         y1l |      5,000   -.0985646      .00288  -.1083639  -.0889075
       y1l_r |      5,000       .0544     .226828          0          1
        y1lp |      5,000   -.0939211    .0020038  -.1008612  -.0868043
      y1lp_r |      5,000       .6182    .4858765          0          1
         y2l |      5,000    .1084959     .065586  -.1065291   .3743112
-------------+---------------------------------------------------------
       y2l_r |      5,000       .0618     .240816          0          1
        y2lp |      5,000    .0915894     .055462  -.0975456   .3184061
      y2lp_r |      5,000       .0732    .2604906          0          1
For the results for the AME and the ATE when the true DGP is a logit, I use margins without the atmeans option. The other cases are similar. I use robust standard errors for all computations because my likelihood model is an approximation to the true likelihood, and I use the option vce(unconditional) to account for the fact that I am using two-step M-estimation. See Wooldridge (2010) for more details on two-step M-estimation.
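For reference, here is a minimal sketch of the corresponding loop body for the AME and ATE case, under the assumption that it mirrors the MEM/TEM code above with atmeans dropped and the tests run against the AME and ATE locals (`mx1' and `mx2') instead:
forvalues i=1/`R' {
        quietly {
                mkdata, n(`N')
                logit y x1 i.x2, vce(robust)
                margins, dydx(*) post vce(unconditional)   // no atmeans: average effects
                local y1l = _b[x1]
                test _b[x1] = `mx1'          // AME of x1 against its approximate true value
                local y1l_r = (r(p)<.05)
                local y2l = _b[1.x2]
                test _b[1.x2] = `mx2'        // ATE of x2 against its approximate true value
                local y2l_r = (r(p)<.05)
                // the regress (LPM) block and the post to the results file follow the same pattern
        }
}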
You can obtain the code used to produce these results here.
Conclusion
Using a probit or a logit model yields equivalent marginal effects. I provide evidence that the same cannot be said of the marginal effect estimates of the linear probability model when compared with those of the logit and probit models.
Acknowledgment
This post was inspired by a question posed by Stephen Jenkins after my previous post.
Reference
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.
