Monday, February 9, 2026

Programming an estimation command in Stata: Dealing with issue variables in a poisson command utilizing Mata


mypoisson2.ado handles issue variables and computes its Poisson regression leads to Mata. I focus on the code for mypoisson2.ado, which I obtained by including the tactic for dealing with issue variables mentioned in Programming an estimation command in Stata: Dealing with issue variables in optimize() to mypoisson1.ado, mentioned in Programming an estimation command in Stata: A poisson command utilizing Mata.

That is the twenty-first publish within the collection Programming an estimation command in Stata. I like to recommend that you simply begin initially. See Programming an estimation command in Stata: A map to posted entries for a map to all of the posts on this collection.

A Poisson command with Mata computations

mypoisson2 computes Poisson regression leads to Mata. The syntax of the mypoisson2 command is

mypoisson2 depvar indepvars [if] [in] [, noconstant]

the place indepvars can comprise issue variables or time-series variables.

Within the the rest of this publish, I focus on the code for mypoisson2.ado. I like to recommend that you simply click on on the filename to obtain the code. To keep away from scrolling, view the code within the do-file editor, or your favourite textual content editor, to see the road numbers.

Code block 1: mypoisson2.ado


*! model 2.0.0  07Feb2016
program outline mypoisson2, eclass sortpreserve
    model 14

    syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant ]
    marksample touse

    gettoken depvar indepvars : varlist
    _fv_check_depvar `depvar'

    tempname b mo V N rank

    getcinfo `indepvars' , `fixed'
    native  cnames "`r(cnames)'"
    matrix `mo' = r(mo)

    mata: mywork("`depvar'", "`cnames'", "`touse'", "`fixed'", ///
       "`b'", "`V'", "`N'", "`rank'", "`mo'")

    if "`fixed'" == "" {
        native cnames "`cnames' _cons"
    }
    matrix colnames `b' = `cnames'
    matrix colnames `V' = `cnames'
    matrix rownames `V' = `cnames'

    ereturn publish `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `rank'
    ereturn native  cmd     "mypoisson1"

    ereturn show

finish

program getcinfo, rclass
    syntax varlist(ts fv), [ noCONStant ]

    _rmcoll `varlist' , `fixed' broaden
    native cnames `r(varlist)'
    native p : phrase depend `cnames'
    if "`fixed'" == "" {
        native p = `p' + 1
        native cons _cons
    }

    tempname b mo

    matrix `b' = J(1, `p', 0)
    matrix colnames `b' = `cnames' `cons'
    _ms_omit_info `b'
    matrix `mo' = r(omit)

    return native  cnames "`cnames'"
    return matrix mo = `mo'
finish

mata:

void mywork( string scalar depvar,  string scalar indepvars,
             string scalar touse,   string scalar fixed,
             string scalar bname,   string scalar Vname,
             string scalar nname,   string scalar rname,
             string scalar mo)
{

    actual vector y, b
    actual matrix X, V, Ct
    actual scalar n, p, rank

    y = st_data(., depvar, touse)
    n = rows(y)
    X = st_data(., indepvars, touse)
    if (fixed == "") {
        X = X,J(n, 1, 1)
    }
    p = cols(X)

    Ct = makeCt(mo)

    S  = optimize_init()
    optimize_init_argument(S, 1, y)
    optimize_init_argument(S, 2, X)
    optimize_init_evaluator(S, &plleval2())
    optimize_init_params(S, J(1, p, .01))
    optimize_init_constraints(S, Ct)

    b    = optimize(S)
    V    = optimize_result_V_oim(S)
    rank = p - diag0cnt(invsym(V))

    st_matrix(bname, b)
    st_matrix(Vname, V)
    st_numscalar(nname, n)
    st_numscalar(rname, rank)
}

actual matrix makeCt(string scalar mo)
{
    actual vector mo_v
    actual scalar ko, j, p

    mo_v = st_matrix(mo)
    p    = cols(mo_v)
    ko   = sum(mo_v)
    if (ko>0) {
        Ct   = J(0, p, .)
        for(j=1; j<=p; j++) {
            if (mo_v[j]==1) {
                Ct  = Ct  e(j, p)
            }
        }
        Ct = Ct, J(ko, 1, 0)
    }
    else {
        Ct = J(0,p+1,.)
    }

    return(Ct)

}

void plleval2(actual scalar todo, actual vector b,     ///
              actual vector y,    actual matrix X,     ///
              val, grad, hess)
{
    actual vector  xb

    xb = X*b'
   val = sum(-exp(xb) + y:*xb - lnfactorial(y))
}

finish

As with packages that I’ve beforehand mentioned, there may be an ado half and Mata half. Strains 2–56 are the ado half; they outline mypoisson2 and the subroutine getcinfo. Strains 58–133 are the Mata half; they outline the Mata work operate mywork() utilized in mypoisson2, the makeCt operate utilized in mywork(), and the evaluator operate plleval2() utilized in mywork().

The ado-command mypoisson2 has the next elements:

  1. Strains 5–11 parse what the consumer typed, determine the pattern, and create non permanent names for Stata objects used within the computations or returned by our Mata work operate.
  2. Strains 13–15 use the subroutine getcinfo to get details about the user-specified covariates after which retailer this data within the native cnames and a Stata matrix.
  3. Strains 17–18 name the Mata work operate.
  4. Strains 20–30 publish the outcomes returned by the Mata work operate to e().
  5. Line 32 shows the outcomes.

The Mata operate mywork() has the next elements:

  1. Strains 60–65 parse the arguments.
  2. Strains 67–69 declare vectors, matrices, and scalars which can be native to mywork().
  3. Strains 71–90 compute the outcomes.
  4. Strains 92–95 copy the computed outcomes to Stata, utilizing the names that have been handed as arguments.

I now focus on the ado-code in some element, specializing in solely the elements which can be new to mypoisson2.ado.

The subroutine getcinfo encapsulates the computations carried out in examples 3, 4, and 5 in Programming an estimation command in Stata: Dealing with issue variables in optimize(). getcinfo makes use of _rmcoll to determine which covariates should be omitted, shops the names of the covariates to be omitted within the native macro cnames, then makes use of _ms_omit_info to create a vector containing a 1 for omitted variables and a 0 in any other case. getcinfo places cnames into r(cnames) and the vector figuring out the omitted variables into r(mo).

Strains 14–15 retailer the knowledge put into r() by getcinfo within the native macro cnames and the Stata vector whose title is contained within the native macro mo. Strains 23–25 use cnames to place row names on the vector of level estimates and row and column names on the estimated variance–covariance matrix of the estimator (VCE). Line 18 passes the vector to mywork().

I now focus on the Mata code in some element, once more specializing in solely the brand new elements. Line 79 will get the constraint matrix Ct wanted to deal with any omitted variables from makeCt(). Strains 98–121 outline makeCt(), which encapsulates the computations that kind Cm in instance 6 in Programming an estimation command in Stata: Dealing with issue variables in optimize(). Line 86 makes use of optimize_init_constraints() to place Ct within the optimize() object. Ct incorporates a matrix with zero rows when there are not any constraints, and placing a constraint matrix with zero rows into the optimize() object tells optimize() that there are not any constraints.

The output in examples 1 and a couple of confirms that mypoisson2 produces the identical outcomes as poisson when a full set of indicator variables is included in a mannequin with a continuing time period.

Instance 1: mypoisson2 outcomes


. clear all

. use accident3

. mypoisson2 accidents cvalue ibn.children visitors, noconstant
Iteration 0:   f(p) = -843.66874
Iteration 1:   f(p) = -573.50561
Iteration 2:   f(p) = -545.86215
Iteration 3:   f(p) = -545.11765
Iteration 4:   f(p) = -545.10899
Iteration 5:   f(p) = -545.10898
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6582923   .0703823    -9.35   0.000    -.7962391   -.5203456
             |
        children |
          0  |   .7157575    .282144     2.54   0.011     .1627653     1.26875
          1  |  -.9465934   .3111915    -3.04   0.002    -1.556518   -.3366693
          2  |  -.8589336   .3097583    -2.77   0.006    -1.466049   -.2518184
          3  |  -2.518175   .4366261    -5.77   0.000    -3.373947   -1.662404
             |
     visitors |   .1383977   .0307285     4.50   0.000      .078171    .1986243
------------------------------------------------------------------------------

Instance 2: poisson outcomes


. poisson accidents cvalue ibn.children visitors, noconstant

Iteration 0:   log probability = -1250.3959
Iteration 1:   log probability = -553.73534
Iteration 2:   log probability = -545.14915
Iteration 3:   log probability = -545.10902
Iteration 4:   log probability = -545.10898

Poisson regression                              Variety of obs     =        505
                                                Wald chi2(6)      =     285.69
Log probability = -545.10898                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6582924   .0703822    -9.35   0.000     -.796239   -.5203457
             |
        children |
          0  |   .7157576   .2821434     2.54   0.011     .1627666    1.268749
          1  |  -.9465933    .311191    -3.04   0.002    -1.556516   -.3366701
          2  |  -.8589334   .3097578    -2.77   0.006    -1.466048   -.2518192
          3  |   -2.51817    .436625    -5.77   0.000     -3.37394   -1.662401
             |
     visitors |   .1383977   .0307284     4.50   0.000     .0781711    .1986242
------------------------------------------------------------------------------

Completed and undone

I mentioned mypoisson2, which handles issue variables and makes use of Mata to compute Poisson regression outcomes. In my subsequent publish, I add strong and cluster–strong estimators of the VCE.



Related Articles

Latest Articles