Saturday, February 28, 2026

Programming an estimation command in Stata: Allowing for options


I make three improvements to the command that implements the ordinary least-squares (OLS) estimator that I discussed in Programming an estimation command in Stata: Allowing for sample restrictions and factor variables. First, I allow the user to request a robust estimator of the variance-covariance of the estimator (VCE). Second, I allow the user to suppress the constant term. Third, I store the residual degrees of freedom in e(df_r) so that test will use the \(t\) or \(F\) distribution instead of the normal or \(\chi^2\) distribution to compute the \(p\)-value of Wald tests.

This is the ninth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

Allowing for robust standard errors

The syntax of myregress6, which I discussed in Programming an estimation command in Stata: Allowing for sample restrictions and factor variables, is

myregress6 depvar [indepvars] [if] [in]

where the independent variables can be time-series or factor variables. myregress7 has the syntax

myregress7 depvar [indepvars] [if] [in] [, robust ]

By default, myregress7 estimates the VCE assuming that the errors are independently and identically distributed (IID). If the option robust is specified, myregress7 uses the robust estimator of the VCE. See Cameron and Trivedi (2005), Stock and Watson (2010), and Wooldridge (2015) for introductions to OLS; see Programming an estimation command in Stata: Using Stata matrix commands and functions to compute OLS objects for the formulas and Stata matrix implementations. Click on the file name to download any code block. To avoid scrolling, view the code in the do-file editor, or your favorite text editor, to see the line numbers.

Code block 1: myregress7.ado


*! version 7.0.0  30Nov2015
program define myregress7, eclass
    version 14

    syntax varlist(numeric ts fv) [if] [in] [, Robust]
    marksample touse

    gettoken depvar indeps : varlist
    _fv_check_depvar `depvar'

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist' if `touse'
    local N                    = r(N)
    local p                    = colsof(`zpz')
    matrix `xpx'               = `zpz'[2..`p', 2..`p']
    matrix `xpy'               = `zpz'[2..`p', 1]
    matrix `xpxi'              = syminv(`xpx')
    local k                    = `p' - diag0cnt(`xpxi') - 1
    matrix `b'                 = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b' if `touse'
    quietly generate double `res'       = (`depvar' - `xbhat') if `touse'
    quietly generate double `res2'      = (`res')^2 if `touse'
    if "`robust'" == "" {
        quietly summarize `res2' if `touse' , meanonly
        local sum           = r(sum)
        local s2            = `sum'/(`N'-(`k'))
        matrix `V'          = `s2'*`xpxi'
    }
    else {
        tempname M
        quietly matrix accum `M' = `indeps' [iweight=`res2'] if `touse'
        matrix `V'               = (`N'/(`N'-(`k')))*`xpxi'*`M'*`xpxi'
        local vce                   "robust"
        local vcetype               "Robust"
    }
    ereturn post `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `k'
    ereturn local  vce     "`vce'"
    ereturn local  vcetype "`vcetype'"
    ereturn local  cmd     "myregress7"
    ereturn display
end

A user may specify the robust option by typing robust, robus, robu, rob, ro, or r. In other words, r is the minimal abbreviation of the option robust. Line 5 of myregress7 implements this syntax. Specifying robust is optional because Robust is enclosed in the square brackets. r is the minimal abbreviation because the R is in uppercase and the remaining letters are in lowercase.
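The abbreviation rule can be checked in isolation with a throwaway program. The sketch below is not part of the series: showrobust is a hypothetical helper that only parses the option and displays what syntax left in the local macro robust.

```stata
* Hypothetical helper (not from the original post): parse only the option
capture program drop showrobust
program define showrobust
    version 14
    syntax [, Robust]
    display "local macro robust contains: |`robust'|"
end

* Any abbreviation down to r fills the macro with the full word robust
showrobust, r
showrobust, robu
showrobust, robust

* Without the option, the macro is empty, so nothing appears between the bars
showrobust
```

The first three calls should display |robust| and the last should display ||, which is the fact that the if condition in myregress7 relies on.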

If the user specifies robust, or a valid abbreviation thereof, the local macro robust contains the word “robust”; otherwise, the local macro robust is empty. Line 25 uses this fact to determine which VCE should be computed; it specifies that lines 26–31 should be executed if the local macro robust is empty and that lines 32–36 should otherwise be executed. Lines 26–31 compute the IID estimator of the VCE. Lines 32–34 compute the robust estimator of the VCE. Lines 35 and 36 respectively put “robust” and “Robust” into the local macros vce and vcetype.
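For reference, the two estimators computed by those lines can be written out as follows. The notation is my own reading of the code, not taken from the post: \(\mathbf{x}_i\) is the row vector of covariates for observation \(i\) (including the constant), \(\widehat{e}_i\) is the residual, \(N\) is the number of observations, and \(k\) is the number of non-omitted coefficients.

```latex
% IID estimator (lines 26--29 of myregress7.ado)
\[
\widehat{\mathbf{V}}_{\text{IID}} = s^2\,(\mathbf{X}'\mathbf{X})^{-1},
\qquad
s^2 = \frac{1}{N-k}\sum_{i=1}^{N}\widehat{e}_i^{\,2}
\]

% Robust estimator (lines 32--34 of myregress7.ado)
\[
\widehat{\mathbf{V}}_{\text{robust}}
  = \frac{N}{N-k}\,(\mathbf{X}'\mathbf{X})^{-1}\,\mathbf{M}\,(\mathbf{X}'\mathbf{X})^{-1},
\qquad
\mathbf{M} = \sum_{i=1}^{N}\widehat{e}_i^{\,2}\,\mathbf{x}_i'\mathbf{x}_i
\]
```

In the code, xpxi holds \((\mathbf{X}'\mathbf{X})^{-1}\), and the matrix accum call with iweight=res2 forms \(\mathbf{M}\) by weighting each observation's cross product by its squared residual.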

Line 41 puts the contents of the local macro vce into the local macro e(vce), which informs users and postestimation commands which VCE estimator was used. By convention, e(vce) is empty for the IID case. Line 42 puts the contents of the local macro vcetype into the local macro e(vcetype), which is used by ereturn display to correctly label the standard errors as robust.

I now run a regression with robust standard errors.

Example 1: myregress7 with robust standard errors

. sysuse auto
(1978 Automobile Data)

. myregress7 price mpg trunk i.rep78, robust
------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -262.7053   74.75538    -3.51   0.000    -409.2232   -116.1875
       trunk |   41.75706   73.71523     0.57   0.571    -102.7221    186.2362
             |
       rep78 |
          2  |   654.7905   1132.425     0.58   0.563    -1564.721    2874.302
          3  |   1170.606   823.9454     1.42   0.155    -444.2979    2785.509
          4  |   1473.352   650.4118     2.27   0.023     198.5679    2748.135
          5  |   2896.888   937.6981     3.09   0.002     1059.034    4734.743
             |
       _cons |   9726.377   2040.335     4.77   0.000     5727.393    13725.36
------------------------------------------------------------------------------
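One way to sanity-check these results is to compare them with the official regress command. The sketch below assumes that myregress7.ado from code block 1 is saved somewhere on the adopath. Because regress with vce(robust) applies the same \(N/(N-k)\) adjustment, the standard errors should agree; the test statistics are labeled t rather than z because regress stores the residual degrees of freedom.

```stata
sysuse auto, clear

* myregress7.ado from code block 1 is assumed to be on the adopath
myregress7 price mpg trunk i.rep78, robust

* Inspect what the command stored about the VCE
display "e(vce) = |`e(vce)'|   e(vcetype) = |`e(vcetype)'|"

* Official command for comparison: same point estimates and standard errors expected
regress price mpg trunk i.rep78, vce(robust)
```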

Suppressing the constant term

myregress8 has the syntax

myregress8 depvar [indepvars] [if] [in] [, robust noconstant ]

Code block 2: myregress8.ado


*! version 8.0.0  30Nov2015
program define myregress8, eclass
    version 14

    syntax varlist(numeric ts fv) [if] [in] [, Robust noCONStant ]
    marksample touse

    gettoken depvar indeps : varlist
    _fv_check_depvar `depvar'

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist' if `touse' , `constant'
    local N                    = r(N)
    local p                    = colsof(`zpz')
    matrix `xpx'               = `zpz'[2..`p', 2..`p']
    matrix `xpy'               = `zpz'[2..`p', 1]
    matrix `xpxi'              = syminv(`xpx')
    local k                    = `p' - diag0cnt(`xpxi') - 1
    matrix `b'                 = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b' if `touse'
    quietly generate double `res'       = (`depvar' - `xbhat') if `touse'
    quietly generate double `res2'      = (`res')^2 if `touse'
    if "`robust'" == "" {
        quietly summarize `res2' if `touse' , meanonly
        local sum           = r(sum)
        local s2            = `sum'/(`N'-(`k'))
        matrix `V'          = `s2'*`xpxi'
    }
    else {
        tempname M
        quietly matrix accum `M' = `indeps' [iweight=`res2']     ///
            if `touse' , `constant'
        matrix `V'               = (`N'/(`N'-(`k')))*`xpxi'*`M'*`xpxi'
        local vce                   "robust"
        local vcetype               "Robust"
    }
    ereturn post `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `k'
    ereturn local  vce     "`vce'"
    ereturn local  vcetype "`vcetype'"
    ereturn local  cmd     "myregress8"
    ereturn display
end

The syntax command on line 5 puts “noconstant” into the local macro constant if the user types nocons, noconst, noconsta, noconstan, or noconstant; otherwise, the local macro constant is empty. The minimal abbreviation of option noconstant is nocons because the lowercase no is followed by CONStant. Note that specifying the option creates the local macro constant because the no is followed by uppercase letters specifying the minimum abbreviation.
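The naming rule for options that begin with no is easy to get backwards, so here is a small sketch that isolates it. showcons is a hypothetical helper, not part of the series; it parses the option and displays the local macro constant.

```stata
* Hypothetical helper (not from the original post): parse only noconstant
capture program drop showcons
program define showcons
    version 14
    syntax [, noCONStant]
    display "local macro constant contains: |`constant'|"
end

* nocons is the minimal abbreviation; the macro receives the full word
showcons, nocons
showcons, noconstant

* Option omitted: the macro is empty
showcons

* nocon is shorter than the minimal abbreviation and should be rejected
capture noisily showcons, nocon
```

The first two calls should display |noconstant|, which is why passing `constant' straight through to matrix accum works: the macro expands either to the option itself or to nothing.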

To implement the option, I specified what is contained in the local macro constant as an option on the matrix accum command on line 14 and on the matrix accum command spread over lines 33 and 34. The matrix accum command that begins on line 33 is too long for one line. I used /// to comment out the end-of-line character and continue the command on line 34.

I now illustrate the noconstant option.

Example 2: myregress8 with option noconstant

. myregress8 price mpg trunk ibn.rep78, noconstant
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -262.7053   73.49434    -3.57   0.000    -406.7516   -118.6591
       trunk |   41.75706    93.9671     0.44   0.657    -142.4151    225.9292
             |
       rep78 |
          1  |   9726.377   2790.009     3.49   0.000      4258.06    15194.69
          2  |   10381.17   2607.816     3.98   0.000     5269.943    15492.39
          3  |   10896.98   2555.364     4.26   0.000     5888.561     15905.4
          4  |   11199.73    2588.19     4.33   0.000      6126.97    16272.49
          5  |   12623.27   2855.763     4.42   0.000     7026.073    18220.46
------------------------------------------------------------------------------

Using t or F distributions

The output tables reported in examples 1 and 2 use the normal distribution to compute \(p\)-values and confidence intervals, because Wald-based postestimation commands like test and ereturn display use the normal or the \(\chi^2\) distribution unless the residual degrees of freedom are stored in e(df_r).

Code block 3: myregress9.ado


*! version 9.0.0  30Nov2015
program define myregress9, eclass
    version 14

    syntax varlist(numeric ts fv) [if] [in] [, Robust noCONStant ]
    marksample touse

    gettoken depvar indeps : varlist
    _fv_check_depvar `depvar'

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist' if `touse' , `constant'
    local N                    = r(N)
    local p                    = colsof(`zpz')
    matrix `xpx'               = `zpz'[2..`p', 2..`p']
    matrix `xpy'               = `zpz'[2..`p', 1]
    matrix `xpxi'              = syminv(`xpx')
    local k                    = `p' - diag0cnt(`xpxi') - 1
    matrix `b'                 = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b' if `touse'
    quietly generate double `res'       = (`depvar' - `xbhat') if `touse'
    quietly generate double `res2'      = (`res')^2 if `touse'
    if "`robust'" == "" {
        quietly summarize `res2' if `touse' , meanonly
        local sum           = r(sum)
        local s2            = `sum'/(`N'-(`k'))
        matrix `V'          = `s2'*`xpxi'
    }
    else {
        tempname M
        quietly matrix accum `M' = `indeps' [iweight=`res2']     ///
            if `touse' , `constant'
        matrix `V'               = (`N'/(`N'-(`k')))*`xpxi'*`M'*`xpxi'
        local vce                   "robust"
        local vcetype               "Robust"
    }
    ereturn post `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `k'
    ereturn scalar df_r    = `N'-`k'
    ereturn local  vce     "`vce'"
    ereturn local  vcetype "`vcetype'"
    ereturn local  cmd     "myregress9"
    ereturn display
end

Line 42 of myregress9.ado stores the residual degrees of freedom in e(df_r). Example 3 illustrates that ereturn display and test now use the \(t\) and \(F\) distributions.

Example 3: t or F distributions after myregress9

. myregress9 price mpg trunk ibn.rep78, noconstant
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -262.7053   73.49434    -3.57   0.001    -409.6184   -115.7923
       trunk |   41.75706    93.9671     0.44   0.658    -146.0805    229.5946
             |
       rep78 |
          1  |   9726.377   2790.009     3.49   0.001     4149.229    15303.53
          2  |   10381.17   2607.816     3.98   0.000     5168.219    15594.12
          3  |   10896.98   2555.364     4.26   0.000     5788.882    16005.08
          4  |   11199.73    2588.19     4.33   0.000     6026.011    16373.45
          5  |   12623.27   2855.763     4.42   0.000     6914.677    18331.85
------------------------------------------------------------------------------

. test trunk

 ( 1)  trunk = 0

       F(  1,    62) =    0.20
            Prob > F =    0.6583
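To see where the differences between the two tables come from, the statistics for trunk can be recomputed by hand. This is only a sketch: the coefficient, standard error, and the 62 residual degrees of freedom are copied from the output above, and the local macro names are my own.

```stata
* Values copied from the trunk row of the output tables above
local b  = 41.75706
local se = 93.9671
local df = 62
local t  = `b'/`se'

* p-value from the normal distribution (what is used without e(df_r))
display "z-based p-value: " %6.4f 2*normal(-abs(`t'))

* p-value from the t distribution with 62 degrees of freedom
display "t-based p-value: " %6.4f 2*ttail(`df', abs(`t'))

* For a single restriction, the F statistic is the squared t statistic
display "F statistic:     " %6.2f (`t')^2
display "F-based p-value: " %6.4f Ftail(1, `df', (`t')^2)

* 95% confidence limits based on the t critical value
local crit = invttail(`df', 0.025)
display "95% CI: " %9.4f (`b' - `crit'*`se') "  " %9.4f (`b' + `crit'*`se')
```

Up to rounding in the copied inputs, these values should match the trunk rows in examples 2 and 3 and the result of test trunk. The t-based interval is slightly wider because the t critical value with 62 degrees of freedom exceeds the normal critical value.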

Done and undone

I added an option for the robust estimator of the VCE, I added an option to suppress the constant term, and I stored the residual degrees of freedom in e(df_r) so that Wald-based postestimation commands will use \(t\) or \(F\) distributions. I illustrated option parsing by example, but I skipped the general theory and many details. Type help syntax for more details about parsing options using the syntax command.

In the next post, I implement the modern syntax for robust and cluster-robust standard errors.

References

Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and Applications. Cambridge: Cambridge University Press.

Stock, J. H., and M. W. Watson. 2010. Introduction to Econometrics. 3rd ed. Boston, MA: Addison Wesley New York.

Wooldridge, J. M. 2015. Introductory Econometrics: A Modern Approach. 6th ed. Cincinnati, Ohio: South-Western.


