Thursday, February 26, 2026

Programming an estimation command in Stata: Using a subroutine to parse a complex option


I make two improvements to the command that implements the ordinary least-squares (OLS) estimator that I discussed in Programming an estimation command in Stata: Allowing for options. First, I add an option for a cluster-robust estimator of the variance-covariance of the estimator (VCE). Second, I make the command accept the modern syntax for either a robust or a cluster-robust estimator of the VCE. In the process, I use subroutines in my ado-program to facilitate the parsing, and I discuss some advanced parsing tricks.

This is the tenth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

Allowing for a robust or a cluster-robust VCE

The syntax of myregress9, which I discussed in Programming an estimation command in Stata: Allowing for options, is

myregress9 depvar [indepvars] [if] [in] [, robust noconstant]

The syntax of myregress10, which I discuss here, is

myregress10 depvar [indepvars] [if] [in] [, vce(robust | cluster clustervar) noconstant]

By default, myregress10 estimates the VCE assuming that the errors are independently and identically distributed (IID). If the option vce(robust) is specified, myregress10 uses the robust estimator of the VCE. If the option vce(cluster clustervar) is specified, myregress10 uses the cluster-robust estimator of the VCE. See Cameron and Trivedi (2005), Stock and Watson (2010), or Wooldridge (2010, 2015) for introductions to OLS; see Programming an estimation command in Stata: Using Stata matrix commands and functions to compute OLS objects for the formulas and Stata matrix implementations.

I recommend that you click on the file name to download the code for my myregress10.ado. To avoid scrolling, view the code in the do-file editor, or your favorite text editor, to see the line numbers.
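As a point of comparison, the official regress command accepts the same vce() syntax that myregress10 is designed to accept. The sketch below only illustrates the three forms of the option; the auto dataset and the use of rep78 as the cluster variable are my choices for illustration, not part of the original post.

```stata
sysuse auto, clear

regress price mpg trunk                        // default: IID errors
regress price mpg trunk, vce(robust)           // robust VCE
regress price mpg trunk, vce(cluster rep78)    // cluster-robust VCE

* Results that regress stores about the VCE after the last call
display "`e(vce)'  `e(vcetype)'  `e(clustvar)'"
```

Once myregress10.ado is on the ado-path, the same three calls with regress replaced by myregress10 exercise each branch of the code discussed below.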

Code block 1: myregress10.ado


*! version 10.0.0  02Dec2015
program define myregress10, eclass sortpreserve
    version 14

    syntax varlist(numeric ts fv) [if] [in] [, vce(string) noCONStant ]
    marksample touse

    gettoken depvar indeps : varlist
    _fv_check_depvar `depvar'

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2 

    if `"`vce'"' != "" {
        my_vce_parse , vce(`vce') 
        local vcetype     "robust"
        local clustervar  "`r(clustervar)'"
        if "`clustervar'" != "" {
            markout `touse' `clustervar'
            sort `clustervar'
        }
    }

    quietly matrix accum `zpz' = `varlist' if `touse' , `constant'
    local N                    = r(N)
    local p                    = colsof(`zpz')
    matrix `xpx'               = `zpz'[2..`p', 2..`p']
    matrix `xpy'               = `zpz'[2..`p', 1]
    matrix `xpxi'              = syminv(`xpx')
    matrix `b'                 = (`xpxi'*`xpy')'
    local k                    = `p' - diag0cnt(`xpxi') - 1
    quietly matrix score double `xbhat' = `b' if `touse'
    quietly generate double `res'       = (`depvar' - `xbhat') if `touse'
    quietly generate double `res2'      = (`res')^2 if `touse'

    if "`vcetype'" == "robust" {
        if "`clustervar'" == "" {
            tempname M
            quietly matrix accum `M' = `indeps'         ///
                [iweight=`res2'] if `touse' , `constant'
            local fac                = (`N'/(`N'-`k'))
            local df_r               = (`N'-`k')
        }
        else  {
            tempvar idvar
            tempname M
            quietly egen `idvar' = group(`clustervar') if `touse'
            quietly summarize `idvar' if `touse', meanonly
            local Nc   = r(max)
            local fac  = ((`N'-1)/(`N'-`k')*(`Nc'/(`Nc'-1)))
            local df_r = (`Nc'-1)
            matrix opaccum `M' = `indeps' if `touse'     ///
                , group(`clustervar') opvar(`res')
        }
        matrix `V' = (`fac')*`xpxi'*`M'*`xpxi'
        local vce                   "robust"          
        local vcetype               "Robust"          
    }
    else {                            // IID Case
        quietly summarize `res2' if `touse' , meanonly
        local sum           = r(sum)
        local s2            = `sum'/(`N'-`k')
        local df_r          = (`N'-`k')
        matrix `V'          = `s2'*`xpxi'
    }

    ereturn post `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `k'
    ereturn scalar df_r    = `df_r'
    ereturn local  vce     "`vce'"
    ereturn local  vcetype "`vcetype'"
    ereturn local clustvar "`clustervar'"
    ereturn local  cmd     "myregress10"
    ereturn display
end

program define my_vce_parse, rclass
    syntax  [, vce(string) ]

    local case : word count `vce'
    
    if `case' > 2 {
        my_vce_error , typed(`vce')
    }

    local 0 `", `vce'"' 
    syntax  [, Robust CLuster * ]

    if `case' == 2 {
        if "`robust'" == "robust" | "`cluster'" == "" {
            my_vce_error , typed(`vce')
        }

        capture confirm numeric variable `options'
        if _rc {
            my_vce_error , typed(`vce')
        }

        local clustervar "`options'" 
    }
    else {    // case = 1
        if "`robust'" == "" {
            my_vce_error , typed(`vce')
        }

    }

    return clear    
    return local clustervar "`clustervar'" 
end

program define my_vce_error
    syntax , typed(string)

    display `"{red}{bf:vce(`typed')} invalid"'
    error 498
end

The syntax command on line 5 puts whatever the user encloses in vce() into a local macro called vce. For example, if the user types


. myregress10 price mpg trunk , vce(hello there)

the local macro vce will contain “hello there”. If the user does not specify something in the vce() option, the local macro vce will be empty. Line 14 uses this condition to execute lines 15–21 only if the user has specified something in option vce().
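To see this behavior in isolation, here is a minimal sketch. The program show_vce is a hypothetical helper of mine, not part of myregress10.ado; it only reports what syntax placed in the local macro vce.

```stata
capture program drop show_vce
program define show_vce, rclass
    version 14
    // vce(string) accepts anything the user types inside vce()
    syntax varlist(numeric) [, vce(string) ]

    if `"`vce'"' != "" {
        display as text "vce() contains: " as result `"`vce'"'
    }
    else {
        display as text "vce() was not specified"
    }
    return local vce `"`vce'"'
end

sysuse auto, clear
show_vce price mpg trunk, vce(hello there)
show_vce price mpg trunk
```

Because the option is declared as a string, syntax does no validation of its contents; that is why a separate parsing step is needed.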

When the user specifies something in the vce() option, line 15 calls the ado subroutine my_vce_parse to parse what is in the local macro vce. my_vce_parse stores the name of the cluster variable in r(clustervar) and deals with error conditions, as I discuss below. Line 16 stores “robust” in the local macro vcetype, and line 17 stores the contents of r(clustervar), created by my_vce_parse, in the local macro clustervar.

If the user does not specify something in vce(), the local macro vcetype will be empty, and line 36 ensures that myregress10 will compute an IID estimator of the VCE.

Lines 19 and 20 are executed only if the local macro clustervar is not empty. Line 19 updates the touse variable, whose name is stored in the local macro touse, to account for missing values in the cluster variable, whose name is stored in clustervar. Line 20 sorts the dataset in ascending order of the cluster variable. Users do not want estimation commands re-sorting their datasets. On line 2, I specified the sortpreserve option on program define to keep the dataset in the order it was in when the user executed myregress10.

Lines 36–65 compute the requested estimator of the VCE. Recall that the local macro vcetype is empty or contains “robust” and that the local macro clustervar is empty or contains the name of the cluster variable. The if and else statements use the values stored in vcetype and clustervar to execute one of three blocks of code.
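The interaction of markout, sort, and sortpreserve can be checked with a small sketch. The program sort_demo and the variable orig_order are hypothetical names of mine; the program mimics only lines 19 and 20, not the estimator.

```stata
capture program drop sort_demo
program define sort_demo, sortpreserve
    version 14
    syntax varlist(numeric) [if] [in] , CLuster(varname numeric)

    marksample touse               // excludes obs with missing values in varlist
    markout `touse' `cluster'      // also excludes obs with a missing cluster variable
    sort `cluster'                 // changes the sort order inside the program

    quietly count if `touse'
    display as text "observations in sample: " as result r(N)
end

sysuse auto, clear
generate long orig_order = _n
sort_demo price mpg, cluster(rep78)

* sortpreserve restores the original order when the program exits
assert orig_order == _n
```

Without sortpreserve on program define, the final assert would be expected to fail because the data would be left sorted by rep78.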

  1. Lines 38–42 compute a robust estimator of the VCE when vcetype contains “robust” and clustervar is empty.
  2. Lines 45–53 compute a cluster-robust estimator of the VCE when vcetype contains “robust” and clustervar contains the name of the cluster variable.
  3. Lines 60–64 compute an IID estimator of the VCE when vcetype does not contain “robust”.
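For reference, the three blocks correspond to the formulas below, which I have read off the code. The notation is my own: $\widehat{e}_i$ is the residual for observation $i$, $x_i$ is the row vector of covariates, $k$ is the rank computed on line 31, $N_c$ is the number of clusters, and $X_g$ and $\widehat{e}_g$ collect the rows belonging to cluster $g$.

```latex
\begin{align*}
\widehat{V}_{\text{IID}}
  &= s^2\,(X'X)^{-1},
  & s^2 &= \frac{1}{N-k}\sum_{i=1}^{N}\widehat{e}_i^{\,2} \\
\widehat{V}_{\text{robust}}
  &= \frac{N}{N-k}\,(X'X)^{-1} M\,(X'X)^{-1},
  & M &= \sum_{i=1}^{N}\widehat{e}_i^{\,2}\,x_i'x_i \\
\widehat{V}_{\text{cluster}}
  &= \frac{N-1}{N-k}\,\frac{N_c}{N_c-1}\,(X'X)^{-1} M_c\,(X'X)^{-1},
  & M_c &= \sum_{g=1}^{N_c} X_g'\widehat{e}_g\,\widehat{e}_g'X_g
\end{align*}
```

In the code, $(X'X)^{-1}$ is the matrix xpxi, the leading scalar is the local macro fac, and $M$ and $M_c$ are produced by matrix accum with iweights and by matrix opaccum, respectively.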

Line 73 stores the name of the cluster variable in e(clustvar) if the local macro clustervar is not empty.

Lines 78–111 define the rclass ado-subroutine my_vce_parse, which performs two tasks. First, it stores the name of the cluster variable in r(clustervar) when the user specifies vce(cluster clustervar). Second, it finds cases in which the user made a syntax error in vce() and returns an error in such cases.

Putting these parsing details into a subroutine makes the main command much easier to follow. I recommend that you encapsulate details in subroutines.

The ado-subroutine my_vce_parse is local to the ado-command myregress10; the name my_vce_parse is in a namespace local to myregress10, and my_vce_parse can only be executed from within myregress10.

Line 79 uses syntax to store whatever the user specified in the option vce() in the local macro vce. Line 81 puts the number of words in vce into the local macro case. Lines 83–85 call the ado-subroutine my_vce_error, which displays an error message and returns error code 498, when there are more than two words in vce. (Recall that vce should contain either robust or cluster clustervar.)
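The word count extended macro function used on line 81 can be tried directly in a do-file. This is a minimal sketch; the macro names vce1, vce2, and vce3 are mine.

```stata
local vce1 "robust"
local vce2 "cluster rep78"
local vce3 "cluster rep78 extra"

forvalues i = 1/3 {
    // count the space-delimited words in each candidate vce() string
    local case : word count `vce`i''
    display as text `"`vce`i''"' " -> " as result `case' as text " word(s)"
}
```

Only the first two strings have a word count that my_vce_parse lets through; the third is rejected before any further parsing.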

Having ruled out the cases with more than two words, line 87 stores a comma followed by what the local macro vce contains in the local macro 0. Line 88 uses syntax to parse what is in the local macro 0. If the user specified vce(robust), or a valid abbreviation thereof, syntax stores “robust” in the local macro robust; otherwise, the local macro robust is empty. If the user specified vce(cluster something), or a valid abbreviation of cluster, syntax stores “cluster” in the local macro cluster; otherwise, the local macro cluster is empty. The option * causes syntax to put any remaining options into the local macro options. In this case, syntax will store the something in the local macro options.

Remember the trick used in lines 87 and 88. Option parsing is frequently made much easier by storing what a local macro contains in the local macro 0 and using syntax to parse it.
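The trick can be isolated in a few lines. In this sketch, parse_demo is a hypothetical program of mine that repeats lines 87 and 88 and returns what syntax found, without any of the error checking in my_vce_parse.

```stata
capture program drop parse_demo
program define parse_demo, rclass
    version 14
    syntax [, vce(string) ]

    // Re-parse the contents of vce() as if they were options
    local 0 `", `vce'"'
    syntax [, Robust CLuster * ]

    return local robust  "`robust'"
    return local cluster "`cluster'"
    return local rest    `"`options'"'
end

parse_demo, vce(cl rep78)
return list

parse_demo, vce(rob)
return list
```

The capital letters in Robust and CLuster set the minimal abbreviations, so r and cl are accepted; anything that matches neither name lands in the local macro options.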

When there are two words in the local macro vce, lines 91–100 ensure that the first word is “cluster” and that the second word, stored in the local macro options, is the name of a numeric variable. When all is well, line 100 stores the name of this numeric variable in the local macro clustervar. Lines 95–98 use a subtle construction to display a custom error message. Rather than let confirm display an error message, lines 95–98 use capture and an if condition to display our custom error message. In detail, line 95 uses confirm to confirm that the local macro options contains the name of a numeric variable. capture puts the return code produced by confirm in the scalar _rc. When options contains the name of a numeric variable, confirm produces the return code 0 and capture stores 0 in _rc; otherwise, confirm produces a positive return code, and capture stores this positive return code in _rc.
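The capture and _rc construction can be tried on its own. This sketch uses the auto dataset, which is my choice for illustration: rep78 is numeric, make is a string variable, and no_such_var is a made-up name that does not exist in the data.

```stata
sysuse auto, clear

foreach name in rep78 make no_such_var {
    // capture suppresses confirm's own message and records its return code
    capture confirm numeric variable `name'
    if _rc {
        display as error "`name' is not a numeric variable (return code " _rc ")"
    }
    else {
        display as text "`name' is a numeric variable"
    }
}
```

The if _rc test treats any nonzero return code as failure, so the custom message covers both a string variable and a variable that does not exist.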

When all is well, line 109 clears whatever was in r(), and line 110 stores the name of the cluster variable in r(clustervar).

Lines 113–118 define the ado-subroutine my_vce_error, which displays a custom error message. Like my_vce_parse, my_vce_error is local to myregress10.ado.

Done and undone

I added an option for a cluster-robust estimator of the VCE, and I made myregress10 accept the modern syntax for either a robust or a cluster-robust estimator of the VCE. In the process, I used subroutines in myregress10.ado to facilitate the parsing, and I discussed some advanced parsing tricks.

myregress10.ado would have been more difficult to read if I had not used subroutines to simplify the main routine.

Although it may seem that I have covered every possible nuance, I have dealt with only a few. Type help syntax for more details about parsing options using the syntax command.
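The error subroutine is easy to test separately. In this sketch, demo_error is a hypothetical copy of my_vce_error defined in a do-file so that it can be called directly; capture noisily lets the message through while still recording the return code.

```stata
capture program drop demo_error
program define demo_error
    version 14
    syntax , typed(string)

    // Show what the user typed, in red, then exit with return code 498
    display `"{red}{bf:vce(`typed')} invalid"'
    error 498
end

capture noisily demo_error, typed(hello there)
display as text "return code: " as result _rc
```

Because error 498 stops execution, the calling program never reaches the code that follows the call, which is why my_vce_parse can call the subroutine without an explicit exit.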

References

Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and Applications. Cambridge: Cambridge University Press.

Stock, J. H., and M. W. Watson. 2010. Introduction to Econometrics. 3rd ed. Boston, MA: Addison-Wesley.

Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.

Wooldridge, J. M. 2015. Introductory Econometrics: A Modern Approach. 6th ed. Cincinnati, OH: South-Western.


