Thursday, February 5, 2026

Programming an estimation command in Stata: Making predict work


I make predict work after mypoisson5 by writing an ado-command that computes the predictions and by having mypoisson5 retailer the title of this new ado-command in e(predict). The ado-command that computes predictions utilizing the parameter estimates computed by ado-command mytest ought to be named mytest_p, by conference. Within the subsequent part, I talk about mypoisson5_p, which computes predictions after mypoisson5. In part Storing the title of the prediction command in e(predict), I present that storing the title mypoisson5_p in e(predict) requires solely a one-line change to mypoisson4.ado, which I mentioned in Programming an estimation command in Stata: Including analytical derivatives to a poisson command utilizing Mata.

That is the twenty-fourth put up within the sequence Programming an estimation command in Stata. I like to recommend that you simply begin originally. See Programming an estimation command in Stata: A map to posted entries for a map to all of the posts on this sequence.

An ado-command that computes predictions

The syntax of mypoisson5_p is

mypoisson5_p [type] newvarname [if] [in] [, n xb]

mypoisson5_p computes the anticipated variety of counts when choice n is specified, and it computes the linear predictions when choice xb is specified. n is the default choice, if neither xb or n is specified by the person. Regardless of the syntax diagram, the person could not specify each xb and n.

Now think about the code for this command in code block 1.

mypoisson5_p.ado


*! model 1.0.0  10Mar2016
program outline mypoisson5_p
    model 14

    syntax newvarname [if] [in] , [ xb n ]

    marksample touse, novarlist

    native nopts : phrase rely `xb' `n'
    if `nopts' >1 {
        show "{err}just one statistic could also be specified"
        exit 498
    }

    if `nopts' == 0 {
        native n n
        show "anticipated counts"
    }

    if "`xb'" != "" {
        _predict `typlist' `varlist' if `touse' , xb
    }
    else {
        tempvar xbv
        quietly _predict double `xbv' if `touse' , xb
        generate `typlist' `varlist' = exp(`xbv') if `touse'
    }
finish

Line 5 makes use of syntax to parse the syntax specified within the syntax diagram above. Line 5 specifies that mypoisson5_p requires the title of a brand new variable, that it permits an if or in situation, and that it accepts the choices xb and n. syntax newvarname specifies that the person should specify a reputation for a variable that’s not within the dataset in reminiscence. syntax shops the title of the brand new variable within the native macro varlist. If the person specifies a variable kind along with the variable title, the sort can be saved within the native macro typlist. For instance, if the person specified


. mypoisson5_p double yhat

the native macro varlist would include “yhat” and the native macro typlist would include “double”. If the person doesn’t specify a kind, the native macro typlist comprises nothing.

Line 7 makes use of marksample to create a sample-identification variable whose title can be within the native macro touse. In contrast to the examples in Programming an estimation command in Stata: Permitting for pattern restrictions and issue variables, I specified the choice novarlist on marksample in order that marksample will use solely the user-specified if or in restrictions to create the sample-identification variable and never use the nonexistent observations within the new variable.

The choices xb and n specify which statistic to compute. The syntax command on line 5 permits customers to specify

  1. the xb choice,
  2. the n choice,
  3. each the xb choice and the n choice, or
  4. neither the xb choice nor the n choice.

In case (1), the native macro xb will include “xb” and the native macro n will include nothing. In case (2), the native macro xb will include nothing and the native macro n will include “n”. In case (3), the native macro xb will include “xb” and the native macro n will include “n”. In case (4), the native macro xb will include nothing and the native macro n will include nothing.

The syntax diagram and its dialogue indicate that instances (1), (2), and (4) are legitimate, however that case (3) could be an error. Line 9 places the variety of choices specified by the person within the native macro nopts. The remainder of the code makes use of nopts, xb, and n to deal with instances (1)–(4).

Strains 10–13 deal with case (3) by exiting with a well mannered error message when nopts comprises 2.

Strains 15–18 deal with case (4) by placing “n” within the native macro n when nopts comprises 0.

At this level, we’ve got dealt with instances (3) and (4), and we use xb and n to deal with instances (1) and (2), as a result of both xb just isn’t empty and n is empty, or xb is empty and n just isn’t empty.

Strains 20–22 deal with case (1) by utilizing _predict to compute the xb predictions when the native macro xb just isn’t empty. Be aware that the predictions are computed on the precision specified by the person.

Strains 23–27 deal with case (2) by utilizing _predict to compute xb in a brief variable that’s subsequently used to compute n. Be aware that the short-term variable for xb is at all times computed in double precision and that the n is computed on the precision specified by the person.

Storing the title of the prediction command in e(predict)

To compute the xb statistic, customers kind


. predict double yhat, xb

as an alternative of typing


. mypoisson5_p double yhat, xb

This syntax works as a result of the predict command makes use of the ado-command whose title is saved in e(predict). On line 50 of mypoisson5 in code block 2, I retailer “mypoisson5_p” in e(predict). This addition is the one distinction between mypoisson5.ado in code block 2 and mypoisson4.ado in code block 5 in Programming an estimation command in Stata: Including analytical derivatives to a poisson command utilizing Mata.

Code block 2: mypoisson5.ado


*! model 5.0.0  10Mar2016
program outline mypoisson5, eclass sortpreserve
    model 14

    syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ]
    marksample touse

    _vce_parse `touse' , optlist(Strong) argoptlist(CLuster) : , vce(`vce')
    native vce        "`r(vce)'"
    native clustervar "`r(cluster)'"
    if "`vce'" == "strong" | "`vce'" == "cluster" {
        native vcetype "Strong"
    }
    if "`clustervar'" != "" {
        seize verify numeric variable `clustervar'
        if _rc {
            show in pink "invalid vce() choice"
            show in pink "cluster variable {bf:`clustervar'} is " ///
                "string variable as an alternative of a numeric variable"
            exit(198)
        }
        kind `clustervar'
    }

    gettoken depvar indepvars : varlist
    _fv_check_depvar `depvar'

    tempname b mo V N rank

    getcinfo `indepvars' , `fixed'
    native  cnames "`r(cnames)'"
    matrix `mo' = r(mo)

    mata: mywork("`depvar'", "`cnames'", "`touse'", "`fixed'", ///
       "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'")

    if "`fixed'" == "" {
        native cnames "`cnames' _cons"
    }
    matrix colnames `b' = `cnames'
    matrix colnames `V' = `cnames'
    matrix rownames `V' = `cnames'

    ereturn put up `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `rank'
    ereturn native  vce      "`vce'"
    ereturn native  vcetype  "`vcetype'"
    ereturn native  clustvar "`clustervar'"
    ereturn native  predict "mypoisson5_p"
    ereturn native  cmd     "mypoisson5"

    ereturn show

finish

program getcinfo, rclass
    syntax varlist(ts fv), [ noCONStant ]

    _rmcoll `varlist' , `fixed' develop
    native cnames `r(varlist)'
    native p : phrase rely `cnames'
    if "`fixed'" == "" {
        native p = `p' + 1
        native cons _cons
    }

    tempname b mo

    matrix `b' = J(1, `p', 0)
    matrix colnames `b' = `cnames' `cons'
    _ms_omit_info `b'
    matrix `mo' = r(omit)

    return native  cnames "`cnames'"
    return matrix mo = `mo'
finish

mata:

void mywork( string scalar depvar,  string scalar indepvars,
             string scalar touse,   string scalar fixed,
             string scalar bname,   string scalar Vname,
             string scalar nname,   string scalar rname,
             string scalar mo,
             string scalar vcetype, string scalar clustervar)
{

    actual vector y, b
    actual matrix X, V, Ct
    actual scalar n, p, rank

    y = st_data(., depvar, touse)
    n = rows(y)
    X = st_data(., indepvars, touse)
    if (fixed == "") {
        X = X,J(n, 1, 1)
    }
    p = cols(X)

    Ct = makeCt(mo)

    S  = optimize_init()
    optimize_init_argument(S, 1, y)
    optimize_init_argument(S, 2, X)
    optimize_init_evaluator(S, &plleval3())
    optimize_init_evaluatortype(S, "gf2")
    optimize_init_params(S, J(1, p, .01))
    optimize_init_constraints(S, Ct)

    b    = optimize(S)

    if (vcetype == "strong") {
        V    = optimize_result_V_robust(S)
    }
    else if (vcetype == "cluster") {
        cvar = st_data(., clustervar, touse)
        optimize_init_cluster(S, cvar)
        V    = optimize_result_V_robust(S)
    }
    else {                 // vcetype should IID
        V    = optimize_result_V_oim(S)
    }
    rank = p - diag0cnt(invsym(V))

    st_matrix(bname, b)
    st_matrix(Vname, V)
    st_numscalar(nname, n)
    st_numscalar(rname, rank)
}

actual matrix makeCt(string scalar mo)
{
    actual vector mo_v
    actual scalar ko, j, p

    mo_v = st_matrix(mo)
    p    = cols(mo_v)
    ko   = sum(mo_v)
    if (ko>0) {
        Ct   = J(0, p, .)
        for(j=1; j<=p; j++) {
            if (mo_v[j]==1) {
                Ct  = Ct  e(j, p)
            }
        }
        Ct = Ct, J(ko, 1, 0)
    }
    else {
        Ct = J(0,p+1,.)
    }

    return(Ct)

}

void plleval3(actual scalar todo, actual vector b,     ///
              actual vector y,    actual matrix X,     ///
              val, grad, hess)
{
    actual vector  xb, mu

    xb  = X*b'
    mu  = exp(xb)
    val = (-mu + y:*xb - lnfactorial(y))

    if (todo>=1) {
        grad = (y - mu):*X
    }
    if (todo==2) {
        hess = -quadcross(X, mu, X)
    }
}

finish

Instance 1 illustrates that our implementation works by evaluating the predictions obtained after mypoisson5 with these obtained after poisson.

Instance 1: predict after mypoisson5


. clear all

. use accident3

. quietly poisson accidents cvalue children visitors

. predict double n1
(choice n assumed; predicted variety of occasions)

. quietly mypoisson5 accidents cvalue children visitors

. predict double n2
anticipated counts

. record n1 n2 in 1/5

     +-----------------------+
     |        n1          n2 |
     |-----------------------|
  1. | .15572052   .15572052 |
  2. | .47362502   .47362483 |
  3. | .46432954   .46432946 |
  4. | .84841301   .84841286 |
  5. | .40848207   .40848209 |
     +-----------------------+

Executed and undone

I made predict work after mypoisson5 by writing an ado-command that computes the prediction and by storing the title of this ado-command in e(predict).

In my subsequent put up, I talk about how one can verify {that a} working command continues to be working, a subject often called certification.



Related Articles

Latest Articles