Programming an estimation command in Stata: Handling factor variables in optimize()

February 11, 2026

\(
\newcommand{\xb}{{\bf x}}
\newcommand{\betab}{\boldsymbol{\beta}}\)I discuss a method for handling factor variables when performing nonlinear optimization using optimize(). After illustrating the issue caused by factor variables, I present a method and apply it to an example using optimize().

This is the twentieth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

How poisson handles factor variables

Consider the Poisson regression in which I include a full set of indicator variables created from the categorical variable kids and a constant term.

Example 1: Collinear factor variables


. clear all

. use accident3

. poisson accidents cvalue ibn.kids traffic, coeflegend
note: 3.kids omitted because of collinearity

Iteration 0:   log likelihood = -546.35782
Iteration 1:   log likelihood = -545.11016
Iteration 2:   log likelihood = -545.10898
Iteration 3:   log likelihood = -545.10898

Poisson regression                              Number of obs     =        505
                                                LR chi2(5)        =     361.62
                                                Prob > chi2       =     0.0000
Log likelihood = -545.10898                     Pseudo R2         =     0.2491

------------------------------------------------------------------------------
   accidents |      Coef.  Legend
-------------+----------------------------------------------------------------
      cvalue |  -.6582924  _b[cvalue]
             |
        kids |
          0  |   3.233932  _b[0bn.kids]
          1  |   1.571582  _b[1.kids]
          2  |   1.659241  _b[2.kids]
          3  |          0  _b[3o.kids]
             |
     traffic |   .1383977  _b[traffic]
       _cons |  -2.518175  _b[_cons]
------------------------------------------------------------------------------

The full set of indicator variables is collinear with the constant term. The output shows that no variables were dropped, but the name 3o.kids specifies that 3.kids was omitted. Omitted variables are not dropped; instead, their coefficients are constrained to zero.

Specifying variables as omitted instead of dropping them allows postestimation features such as margins to work properly.

For the case in example 1, poisson is maximizing the log-likelihood function subject to the constraint that \(\beta_{3.kids}=0\). In terms of the parameter vector \(\betab\), I represent this constraint by

\[
\left[\begin{matrix}
0&0&0&0&1&0&0
\end{matrix}\right]
\betab' = 0
\]

where \(\betab=(
\beta_{cvalue},
\beta_{0.kids},
\beta_{1.kids},
\beta_{2.kids},
\beta_{3.kids},
\beta_{traffic},
\beta_{\_cons})\)

In general, I can represent \(q\) linear equality constraints on a \(1\times k\) parameter vector as

\[
{\bf C}\betab' = {\bf c}
\]

where \({\bf C}\) is a \(q\times k\) matrix and \({\bf c}\) is a \(q\times 1\) vector. These constraints are conveniently represented as \(\widetilde{\bf C}=\left[{\bf C},{\bf c}\right]\).
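
As a worked instance that uses only the values already given for example 1, there is \(q=1\) constraint on \(k=7\) parameters, so the stacked matrix would be the \(1\times 8\) row

\[
\widetilde{\bf C}=\left[\begin{matrix}
0&0&0&0&1&0&0&0
\end{matrix}\right]
\]

The first seven entries are the row \({\bf C}\) that picks out \(\beta_{3.kids}\), and the last entry is \({\bf c}=0\).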

I now show how to use optimize() to solve optimization problems subject to linear equality constraints by putting \(\widetilde{\bf C}\) into the optimize() object. In code block 1, I use optimize() to maximize the Poisson log-likelihood function for the problem in example 1. Code block 1 augments example 3 in Programming an estimation command in Stata: Using optimize() to estimate Poisson parameters by using optimize_init_constraints() to impose a linear equality constraint on the coefficient vector.

Code block 1: Linear equality constraints in optimize()


mata:
void plleval3(real scalar todo, real vector b,     ///
              real vector y,    real matrix X,     ///
              val, grad, hess)
{
    real vector  xb

    xb = X*b'
   val = -exp(xb) + y:*xb - lnfactorial(y)
}

y  = st_data(., "accidents")
X  = st_data(., "cvalue ibn.kids traffic")
X  = X,J(rows(X), 1, 1)

C  = e(5, 7)
c  = 0
Ct = C,c

S  = optimize_init()
optimize_init_argument(S, 1, y)
optimize_init_argument(S, 2, X)
optimize_init_evaluator(S, &plleval3())
optimize_init_evaluatortype(S, "gf0")
optimize_init_params(S, J(1, 7, .01))
optimize_init_constraints(S, Ct)

bh = optimize(S)
optimize_result_params(S)
end

Only line 13, lines 16–18, and line 26 differ from the code in example 3 in Programming an estimation command in Stata: Using optimize() to estimate Poisson parameters. Line 13 illustrates that st_data() can create the indicator variables from the factor variable ibn.kids. Lines 16–18 define the constraint matrix \(\widetilde{\bf C}\) for this problem. Line 26 puts \(\widetilde{\bf C}\) into the optimize() object S, which causes optimize() to maximize the Poisson log-likelihood function with evaluator plleval3() subject to the constraints specified in matrix Ct.
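
For reference, here is a sketch of what the evaluator computes, written in notation that I am assuming rather than quoting from the earlier post. The value that plleval3() stores in val for observation \(i\) is the usual Poisson log-likelihood contribution

\[
\ln L_i = -\exp({\bf x}_i\boldsymbol{\beta}') + y_i\,{\bf x}_i\boldsymbol{\beta}' - \ln(y_i!)
\]

where \({\bf x}_i\) is the \(i\)th row of X and \(\boldsymbol{\beta}\) is the row vector b. Because the evaluator type is gf0, optimize() sums these observation-level values to form the objective function, and the constraint in Ct restricts the search to vectors whose fifth element is zero.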

Example 2 illustrates that code block 1 reproduces the point estimates reported in example 1.

Example 2: Linear equality constraints in optimize()


. do pc

. mata:
------------------------------------------------- mata (type end to exit) -----
: void plleval3(real scalar todo, real vector b,     ///
>               real vector y,    real matrix X,     ///
>               val, grad, hess)
> {
>     real vector  xb
>
>     xb = X*b'
>    val = -exp(xb) + y:*xb - lnfactorial(y)
> }
note: argument todo unused
note: argument grad unused
note: argument hess unused

:
: y  = st_data(., "accidents")

: X  = st_data(., "cvalue ibn.kids traffic")

: X  = X,J(rows(X), 1, 1)

:
: C  = e(5, 7)

: c  = 0

: Ct = C,c

:
: S  = optimize_init()

: optimize_init_argument(S, 1, y)

: optimize_init_argument(S, 2, X)

: optimize_init_evaluator(S, &plleval3())

: optimize_init_evaluatortype(S, "gf0")

: optimize_init_params(S, J(1, 7, .01))

: optimize_init_constraints(S, Ct)

:
: bh = optimize(S)
Iteration 0:   f(p) = -845.47138
Iteration 1:   f(p) = -572.68676
Iteration 2:   f(p) = -545.68381
Iteration 3:   f(p) = -545.11241
Iteration 4:   f(p) = -545.10898
Iteration 5:   f(p) = -545.10898
: optimize_result_params(S)
                  1              2              3              4
    +-------------------------------------------------------------
  1 |  -.6582923624    3.233932519    1.571581623     1.65924145
    +-------------------------------------------------------------
                  5              6              7
     ----------------------------------------------+
  1               0      .13839766   -2.518174926  |
     ----------------------------------------------+

: end
-------------------------------------------------------------------------------

.
end of do-file

Code block 1 shows how to use a linear equality constraint to handle collinear variables when we know which variables are omitted. In the code for an estimation command, we must

1. find which variables will be omitted, and
2. create the constraint matrix \(\widetilde{\bf C}\) that imposes the constraints implied by omitting these variables.

Example 3 illustrates that _rmcoll stores in r(varlist) a list of variables that identifies which variables will be omitted, thereby solving problem 1.

Example 3: Using _rmcoll to identify omitted variables


. _rmcoll cvalue ibn.kids traffic, expand
note: 3.kids omitted because of collinearity

. return list

scalars:
          r(k_omitted) =  1

macros:
            r(varlist) : "cvalue 0bn.kids 1.kids 2.kids 3o.kids traffic"

. local cnames "`r(varlist)' _cons"

I specified the option expand so that _rmcoll would expand any factor variables. The expanded variable list in r(varlist) identifies 3.kids as a variable that must be omitted. I then put this expanded variable list, augmented by the name _cons, in the local macro cnames.

Here is an outline for the solution to problem 2 that I present in examples 4–6.

In example 4, I create the Stata vector bt, whose column names are contained in cnames.
In example 5, I use _ms_omit_info to create the Stata vector bto, which indicates which variables will be omitted from bt.
In example 6, I create a Mata matrix specifying the constraints from bto.

Now for the details, beginning with example 4.

Example 4: Putting the coefficient names on a Stata vector


. matrix bt = J(1, 7, 0)

. matrix colnames bt = `cnames'

. matrix list bt

bt[1,7]
                  0.       1.       2.      3o.
     cvalue     kids     kids     kids     kids  traffic    _cons
r1        0        0        0        0        0        0        0

cnames contains the names of the coefficients for this problem, so I create a conformable row vector bt, make cnames the column names on bt, and display bt. The values in bt do not matter; the column names are the important information.

In example 5, _ms_omit_info uses the column names on a Stata vector to create the vector r(omit), which specifies which variables are omitted.

Example 5: Creating a vector that indicates omitted variables


. matrix bt = J(1, 8, 0)

. matrix colnames bt = `cnames'

. _ms_omit_info bt

. return list

scalars:
             r(k_omit) =  1

matrices:
               r(omit) :  1 x 8

. matrix bto = r(omit)

. matrix list bto

bto[1,8]
    c1  c2  c3  c4  c5  c6  c7  c8
r1   0   0   0   0   1   0   0   0

An element of r(omit) is 1 if the corresponding variable is omitted and 0 if it is not omitted. I put a copy of r(omit) in bto.

In example 6, I create a constraint matrix from bto. The loop in example 6 will create the constraint matrix implied by any r(omit) vector created by _ms_omit_info.

Example 6: Creating a constraint matrix from r(omit)


. mata:
------------------------------------------------- mata (type end to exit) -----
: mo = st_matrix("bto")

: ko = sum(mo)

: p  = cols(mo)

: if (ko>0) {
>     Cm   = J(0, p, .)
>     for(j=1; j<=p; j++) {
>         if (mo[j]==1) {
>             Cm  = Cm \ e(j, p)
>         }
>     }
>     Cm = Cm, J(ko, 1, 0)
> }
> else {
>     Cm = J(0,p+1,.)
> }

: "Constraint matrix is "
  Constraint matrix is 

: Cm
       1   2   3   4   5   6   7   8   9
    +-------------------------------------+
  1 |  0   0   0   0   1   0   0   0   0  |
    +-------------------------------------+

: end
-------------------------------------------------------------------------------

After copying bto to the Mata vector mo, I put the number of constraints in the scalar ko and the number of parameters in p. If there are constraints, I initialize Cm to be a matrix with zero rows and p columns, use a for loop to iteratively append a new row corresponding to each constraint identified in mo, and finish by appending a ko \(\times\) 1 column of zeros onto Cm. If there are no constraints, I put a matrix with zero rows and p+1 columns in Cm.

Regardless of whether there are any omitted variables, I can put the Cm matrix created by the method in example 6 into an optimize() object. If there are no omitted variables, Cm will have zero rows, and no constraints will be imposed. If there are omitted variables, Cm will have ko rows, and the constraints for the omitted variables will be imposed.
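
As a hypothetical illustration that is not taken from the accident3 data, suppose p = 5 and r(omit) flags the second and fourth coefficients, so that mo = (0, 1, 0, 1, 0) and ko = 2. Under those assumptions, the loop in example 6 would stack e(2, 5) on top of e(4, 5) and then append a column of zeros, giving

\[
\widetilde{\bf C}=\left[\begin{matrix}
0&1&0&0&0&0\\
0&0&0&1&0&0
\end{matrix}\right]
\]

Each row constrains one omitted coefficient to zero, and the matrix has ko = 2 rows and p+1 = 6 columns.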

Code block 2 combines these pieces into a coherent example.

Code block 2: Putting it all together


clear all
use accident3
local depvar    "accidents"
local indepvars "cvalue ibn.kids traffic"
_rmcoll `indepvars', expand
local cnames "`r(varlist)' _cons"
local p   : word count `cnames'
matrix bt = J(1, `p', 0)
matrix colnames bt = `cnames'
_ms_omit_info bt
matrix bto = r(omit)

mata:
void plleval3(real scalar todo, real vector b,     ///
              real vector y,    real matrix X,     ///
              val, grad, hess)
{
    actual vector  xb

    xb = X*b'
   val = -exp(xb) + y:*xb - lnfactorial(y)
}

y  = st_data(., "`depvar'")
X  = st_data(., "`indepvars'")
X  = X,J(rows(X), 1, 1)

mo = st_matrix("bto")
ko = sum(mo)
p  = cols(mo)
if (ko>0) {
    Ct   = J(0, p, .)
    for(j=1; j<=p; j++) {
        if (mo[j]==1) {
            Ct  = Ct \ e(j, p)
        }
    }
    Ct = Ct, J(ko, 1, 0)
}
else {
    Ct = J(0,p+1,.)
}

S  = optimize_init()
optimize_init_argument(S, 1, y)
optimize_init_argument(S, 2, X)
optimize_init_evaluator(S, &plleval3())
optimize_init_evaluatortype(S, "gf0")
optimize_init_params(S, J(1, 7, .01))
optimize_init_constraints(S, Ct)

bh = optimize(S)
optimize_result_params(S)
end

Lines 1–2 drop all the objects that I created in previous examples and read the accident3 dataset into memory. Lines 3–4 create locals to hold the dependent variable and the independent variables.

Lines 5–6 use _rmcoll to identify which variables should be omitted and put the list of names in the local macro cnames, as in example 3. Lines 7–9 create bt, whose column names specify which variables should be omitted, as in example 4. Lines 10–11 create bto, whose entries specify which variables should be omitted, as in example 5. Lines 28–42 create the constraint matrix Ct from bto, as in example 6.

Line 50 puts Ct into the optimize() object.

Example 7 illustrates that the code in code block 2 reproduces the previously obtained results.

Example 7: Putting it all together


. do pc2

. clear all

. use accident3

. local depvar    "accidents"

. local indepvars "cvalue ibn.kids traffic"

. _rmcoll `indepvars', expand
note: 3.kids omitted because of collinearity

. local cnames "`r(varlist)' _cons"

. local p   : word count `cnames'

. matrix bt = J(1, `p', 0)

. matrix colnames bt = `cnames'

. _ms_omit_info bt

. matrix bto = r(omit)

.
. mata:
------------------------------------------------- mata (type end to exit) -----
: void plleval3(real scalar todo, real vector b,     ///
>               real vector y,    real matrix X,     ///
>               val, grad, hess)
> {
>     real vector  xb
>
>     xb = X*b'
>    val = -exp(xb) + y:*xb - lnfactorial(y)
> }
note: argument todo unused
note: argument grad unused
note: argument hess unused

:
: y  = st_data(., "`depvar'")

: X  = st_data(., "`indepvars'")

: X  = X,J(rows(X), 1, 1)

:
: mo = st_matrix("bto")

: ko = sum(mo)

: p  = cols(mo)

: if (ko>0) {
>     Ct   = J(0, p, .)
>     for(j=1; j<=p; j++) {
>         if (mo[j]==1) {
>             Ct  = Ct \ e(j, p)
>         }
>     }
>     Ct = Ct, J(ko, 1, 0)
> }
> else {
>     Ct = J(0,p+1,.)
> }

:
: S  = optimize_init()

: optimize_init_argument(S, 1, y)

: optimize_init_argument(S, 2, X)

: optimize_init_evaluator(S, &plleval3())

: optimize_init_evaluatortype(S, "gf0")

: optimize_init_params(S, J(1, 7, .01))

: optimize_init_constraints(S, Ct)

:
: bh = optimize(S)
Iteration 0:   f(p) = -845.47138
Iteration 1:   f(p) = -572.68676
Iteration 2:   f(p) = -545.68381
Iteration 3:   f(p) = -545.11241
Iteration 4:   f(p) = -545.10898
Iteration 5:   f(p) = -545.10898

: optimize_result_params(S)
                  1              2              3              4
    +-------------------------------------------------------------
  1 |  -.6582923624    3.233932519    1.571581623     1.65924145
    +-------------------------------------------------------------
                  5              6              7
     ----------------------------------------------+
  1               0      .13839766   -2.518174926  |
     ----------------------------------------------+

: end
-------------------------------------------------------------------------------

.
end of do-file

Done and undone

I discussed a method for handling factor variables when performing nonlinear optimization using optimize(). In my next post, I implement these methods in an estimation command for Poisson regression.
