\(\newcommand{\xb}{{\bf x}}
\newcommand{\betab}{\boldsymbol{\beta}}\)Using analytically computed derivatives can drastically reduce the time required to solve a nonlinear estimation problem. I show how to use analytically computed derivatives with optimize(), and I discuss mypoisson4.ado, which uses these analytically computed derivatives. Only a few lines of mypoisson4.ado differ from the code for mypoisson3.ado, which I discussed in Programming an estimation command in Stata: Allowing for robust or cluster–robust standard errors in a poisson command using Mata.
This is the twenty-third post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.
Analytically computed derivatives for Poisson
The contribution of the \(i\)th observation to the log-likelihood function for the Poisson maximum-likelihood estimator is
$$
L_i = -\exp(\xb_i\betab') + y_i\xb_i\betab' - \ln(y_i!)
$$
The vector of observation-level contributions can be coded in Mata as
xb = X*b'
mu = exp(xb)
val = (-mu + y:*xb - lnfactorial(y))
where X is the matrix of observations on the covariates, b is the row vector of parameters, y is the vector of observations on the dependent variable, mu is the vector of observations on exp(xb)=exp(X*b'), and val is the vector of observation-level contributions.
The gradient for the \(i\)th observation is
$$
g_i = \left(y_i - \exp(\xb_i\betab')\right)\xb_i
$$
The matrix of all the observation-level gradients can be coded in Mata as (y-mu):*X.
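This expression is just the derivative of the observation-level log likelihood; differentiating \(L_i\) with respect to \(\betab\) gives

$$
\frac{\partial L_i}{\partial \betab}
= -\exp(\xb_i\betab')\xb_i + y_i\xb_i
= \left(y_i - \exp(\xb_i\betab')\right)\xb_i
= g_i
$$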
The sum of the Hessians calculated at each observation \(i\) is
$$
H = -\sum_{i=1}^N \exp(\xb_i\betab')\xb_i'\xb_i
$$
which can be coded in Mata as -quadcross(X, mu, X).
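Two facts are packed into that one-liner. Differentiating \(g_i\) once more with respect to \(\betab\) gives the observation-level Hessian \(-\exp(\xb_i\betab')\xb_i'\xb_i\), and quadcross(X, mu, X) forms X'diag(mu)X in quad precision, which is exactly the sum of those terms. The sketch below, which uses a few lines of hypothetical toy data rather than accident3.dta, checks the one-liner against an explicit loop over the observations.

mata:
// toy data (hypothetical), only to check the algebra
X  = (1, 2 \ 3, 1 \ 0, 4)           // 3 observations on 2 covariates
b  = (.1, .2)                       // row vector of parameters
mu = exp(X*b')                      // vector of exp(x_i*b')

H1 = -quadcross(X, mu, X)           // the one-line Hessian

H2 = J(2, 2, 0)                     // explicit sum over observations
for (i=1; i<=rows(X); i++) {
    H2 = H2 - mu[i]*X[i,.]'X[i,.]
}

mreldif(H1, H2)                     // essentially zero
end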
Using analytically computed gradients in optimize()
The code in dex1.do implements the observation-level gradients in the evaluator function plleval3(), which optimize() uses in dowork() to maximize the Poisson log-likelihood function for the given data.
mata:
void plleval3(real scalar todo, real vector b,   ///
              real vector y, real matrix X,      ///
              val, grad, hess)
{
        real vector  xb, mu

        xb  = X*b'
        mu  = exp(xb)
        val = (-mu + y:*xb - lnfactorial(y))

        if (todo>=1) {
                grad = (y - mu):*X
        }
}

void dowork( )
{
        real vector  y, b
        real matrix  X
        real scalar  n, p
        transmorphic S

        y = st_data(., "accidents")
        X = st_data(., "cvalue kids traffic ")
        n = rows(y)
        X = X, J(n, 1, 1)
        p = cols(X)

        S = optimize_init()
        optimize_init_argument(S, 1, y)
        optimize_init_argument(S, 2, X)
        optimize_init_evaluator(S, &plleval3())
        optimize_init_evaluatortype(S, "gf1debug")
        optimize_init_params(S, J(1, p, .01))

        b = optimize(S)
}

dowork()
end
Lines 2–16 define the evaluator function plleval3(), which stores the observation-level contributions to the log likelihood in val and the observation-level gradients in grad. grad is only calculated when todo>=1.
optimize() uses todo to tell the evaluator function what it needs. At some points in the optimization process, optimize() needs only the value of the objective function, which optimize() communicates to the evaluator by setting todo=0. At other points in the optimization process, optimize() needs the value of the objective function and the gradient, which optimize() communicates to the evaluator by setting todo=1. At still other points in the optimization process, optimize() needs the value of the objective function, the gradient, and the Hessian, which optimize() communicates to the evaluator by setting todo=2. An evaluator function that calculates the gradient analytically must compute it when todo=1 or todo=2. Coding >= instead of == on line 13 is important.
Lines 18–40 define dowork(), which implements a call to optimize() to maximize the Poisson log-likelihood function for these data. Line 35 differs from the examples that I previously discussed; it sets the evaluator type to gf1debug. This evaluator type has two parts: gf1 and debug. gf1 specifies that the evaluator return observation-level contributions to the objective function and that it return a matrix of observation-level gradients when todo==1 or todo==2. Appending debug to gf1 tells optimize() to produce a report comparing the analytically computed derivatives with those computed numerically by optimize() and to use the numerically computed derivatives for the optimization.
Example 1 illustrates the derivative-comparison report.
Example 1: gf1debug output
. clear all

. use accident3

. do dex1

. mata:
------------------------------------------------- mata (type end to exit) ------
:
: void plleval3(real scalar todo, real vector b,   ///
>               real vector y, real matrix X,      ///
>               val, grad, hess)
> {
>         real vector  xb, mu
>
>         xb  = X*b'
>         mu  = exp(xb)
>         val = (-mu + y:*xb - lnfactorial(y))
>
>         if (todo>=1) {
>                 grad = (y - mu):*X
>         }
> }
note: argument hess unused
:
: void dowork( )
> {
>         real vector  y, b
>         real matrix  X
>         real scalar  n, p
>         transmorphic S
>
>         y = st_data(., "accidents")
>         X = st_data(., "cvalue kids traffic ")
>         n = rows(y)
>         X = X, J(n, 1, 1)
>         p = cols(X)
>
>         S = optimize_init()
>         optimize_init_argument(S, 1, y)
>         optimize_init_argument(S, 2, X)
>         optimize_init_evaluator(S, &plleval3())
>         optimize_init_evaluatortype(S, "gf1debug")
>         optimize_init_params(S, J(1, p, .01))
>
>         b = optimize(S)
>
> }
note: variable b set but not used
:
: dowork()
gf1debug: Begin derivative-comparison report ----------------------------------
gf1debug: mreldif(gradient vectors) = 9.91e-07
gf1debug: Warning: evaluator did not compute Hessian matrix
gf1debug: End derivative-comparison report ------------------------------------
Iteration 0: f(p) = -851.18669
gf1debug: Begin derivative-comparison report ----------------------------------
gf1debug: mreldif(gradient vectors) = 2.06e-10
gf1debug: Warning: evaluator did not compute Hessian matrix
gf1debug: End derivative-comparison report ------------------------------------
Iteration 1: f(p) = -556.66874
gf1debug: Begin derivative-comparison report ----------------------------------
gf1debug: mreldif(gradient vectors) = 1.59e-07
gf1debug: Warning: evaluator did not compute Hessian matrix
gf1debug: End derivative-comparison report ------------------------------------
Iteration 2: f(p) = -555.81731
gf1debug: Begin derivative-comparison report ----------------------------------
gf1debug: mreldif(gradient vectors) = .0000267
gf1debug: Warning: evaluator did not compute Hessian matrix
gf1debug: End derivative-comparison report ------------------------------------
Iteration 3: f(p) = -555.81538
gf1debug: Begin derivative-comparison report ----------------------------------
gf1debug: mreldif(gradient vectors) = .0000272
gf1debug: Warning: evaluator did not compute Hessian matrix
gf1debug: End derivative-comparison report ------------------------------------
Iteration 4: f(p) = -555.81538
:
: end
--------------------------------------------------------------------------------
.
end of do-file
For each iteration, mreldif(gradient vectors) reports the maximum relative difference between the analytically and numerically computed derivatives. Away from the optimum, a correctly coded analytical gradient will yield an mreldif of e-08 or smaller. The numerically computed gradients are imperfect approximations to the true gradients, and e-08 is about the best we can reliably hope for when using double-precision numbers. Use the mreldif reports from iterations away from the optimum. Because the gradient is nearly zero at the optimum, the mreldif calculation produces an oversized difference for iterations near the optimum.
In the example at hand, the mreldif calculations of 9.91e-07, 2.06e-10, and 1.59e-07 for iterations 0, 1, and 2 indicate that the analytically computed derivatives are correct.
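As a reminder of how the comparison is scaled, mreldif() is based on reldif(x,y) = |x-y|/(|y|+1), with the maximum taken over all elements; the +1 in the denominator keeps the measure well behaved when elements of the gradient are near zero. The numbers below are hypothetical and only illustrate the calculation.

mata:
// hypothetical "analytic" and "numeric" gradient vectors
g_analytic = (1.0000002, -2.5, 1e-08)
g_numeric  = (1.0000001, -2.5, 3e-08)

mreldif(g_analytic, g_numeric)
max(abs(g_analytic - g_numeric) :/ (abs(g_numeric) :+ 1))   // same value
end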
The code in dex2.do differs from that in dex1.do by specifying the evaluator type to be gf1 instead of gf1debug on line 37. A gf1 evaluator type differs from a gf1debug evaluator type in that it uses the analytically computed gradients in the optimization, the numerical gradients are not computed, and there are no derivative-comparison reports.
mata:
mata drop plleval3() dowork()

void plleval3(real scalar todo, real vector b,   ///
              real vector y, real matrix X,      ///
              val, grad, hess)
{
        real vector  xb, mu

        xb  = X*b'
        mu  = exp(xb)
        val = (-mu + y:*xb - lnfactorial(y))

        if (todo>=1) {
                grad = (y - mu):*X
        }
}

void dowork( )
{
        real vector  y, b
        real matrix  X
        real scalar  n, p
        transmorphic S

        y = st_data(., "accidents")
        X = st_data(., "cvalue kids traffic ")
        n = rows(y)
        X = X, J(n, 1, 1)
        p = cols(X)

        S = optimize_init()
        optimize_init_argument(S, 1, y)
        optimize_init_argument(S, 2, X)
        optimize_init_evaluator(S, &plleval3())
        optimize_init_evaluatortype(S, "gf1")
        optimize_init_params(S, J(1, p, .01))

        b = optimize(S)
}

dowork()
end
Example 2 illustrates the output.
Example 2: gf1 output
. do dex2

. mata:
------------------------------------------------- mata (type end to exit) ------
:
: mata drop plleval3() dowork()
:
: void plleval3(real scalar todo, real vector b,   ///
>               real vector y, real matrix X,      ///
>               val, grad, hess)
> {
>         real vector  xb, mu
>
>         xb  = X*b'
>         mu  = exp(xb)
>         val = (-mu + y:*xb - lnfactorial(y))
>
>         if (todo>=1) {
>                 grad = (y - mu):*X
>         }
> }
note: argument hess unused
:
: void dowork( )
> {
>         real vector  y, b
>         real matrix  X
>         real scalar  n, p
>         transmorphic S
>
>         y = st_data(., "accidents")
>         X = st_data(., "cvalue kids traffic ")
>         n = rows(y)
>         X = X, J(n, 1, 1)
>         p = cols(X)
>
>         S = optimize_init()
>         optimize_init_argument(S, 1, y)
>         optimize_init_argument(S, 2, X)
>         optimize_init_evaluator(S, &plleval3())
>         optimize_init_evaluatortype(S, "gf1")
>         optimize_init_params(S, J(1, p, .01))
>
>         b = optimize(S)
>
> }
note: variable b set but not used
:
: dowork()
Iteration 0: f(p) = -851.18669
Iteration 1: f(p) = -556.66855
Iteration 2: f(p) = -555.81731
Iteration 3: f(p) = -555.81538
Iteration 4: f(p) = -555.81538
:
: end
--------------------------------------------------------------------------------
.
end of do-file
Using an analytically computed Hessian in optimize()
The code in dex3.do adds the sum of the observation-level Hessians to the evaluator function plleval3() used by optimize() in dowork().
mata:
mata drop plleval3() dowork()

void plleval3(real scalar todo, real vector b,   ///
              real vector y, real matrix X,      ///
              val, grad, hess)
{
        real vector  xb, mu

        xb  = X*b'
        mu  = exp(xb)
        val = (-mu + y:*xb - lnfactorial(y))

        if (todo>=1) {
                grad = (y - mu):*X
        }
        if (todo==2) {
                hess = -quadcross(X, mu, X)
        }
}

void dowork( )
{
        real vector  y, b
        real matrix  X
        real scalar  n, p
        transmorphic S

        y = st_data(., "accidents")
        X = st_data(., "cvalue kids traffic ")
        n = rows(y)
        X = X, J(n, 1, 1)
        p = cols(X)

        S = optimize_init()
        optimize_init_argument(S, 1, y)
        optimize_init_argument(S, 2, X)
        optimize_init_evaluator(S, &plleval3())
        optimize_init_evaluatortype(S, "gf2debug")
        optimize_init_params(S, J(1, p, .01))

        b = optimize(S)
}

dowork()
end
Lines 18–20 are new to dex3.do, and they compute the Hessian when todo==2. Line 41 in dex3.do specifies a gf2debug evaluator type instead of the gf1 evaluator type specified on line 37 of dex2.do.
The gf2debug evaluator type is a second-derivative version of the gf1debug evaluator type; it specifies that the evaluator return observation-level contributions to the objective function, that it return a matrix of observation-level gradients when todo==1 or todo==2, and that it return a matrix containing the sum of the observation-level Hessians when todo==2. The gf2debug evaluator type also specifies that optimize() produce a derivative-comparison report for the gradient and the Hessian and that optimize() use the numerically computed derivatives for the optimization.
Example 3 illustrates the output.
Example 3: gf2debug output
. do dex3

. mata:
------------------------------------------------- mata (type end to exit) ------
:
: mata drop plleval3() dowork()
:
: void plleval3(real scalar todo, real vector b,   ///
>               real vector y, real matrix X,      ///
>               val, grad, hess)
> {
>         real vector  xb, mu
>
>         xb  = X*b'
>         mu  = exp(xb)
>         val = (-mu + y:*xb - lnfactorial(y))
>
>         if (todo>=1) {
>                 grad = (y - mu):*X
>         }
>         if (todo==2) {
>                 hess = -quadcross(X, mu, X)
>         }
>
> }
:
: void dowork( )
> {
>         real vector  y, b
>         real matrix  X
>         real scalar  n, p
>         transmorphic S
>
>         y = st_data(., "accidents")
>         X = st_data(., "cvalue kids traffic ")
>         n = rows(y)
>         X = X, J(n, 1, 1)
>         p = cols(X)
>
>         S = optimize_init()
>         optimize_init_argument(S, 1, y)
>         optimize_init_argument(S, 2, X)
>         optimize_init_evaluator(S, &plleval3())
>         optimize_init_evaluatortype(S, "gf2debug")
>         optimize_init_params(S, J(1, p, .01))
>
>         b = optimize(S)
>
> }
note: variable b set but not used
:
: dowork()
gf2debug: Begin derivative-comparison report ----------------------------------
gf2debug: mreldif(gradient vectors) = 9.91e-07
gf2debug: mreldif(Hessian matrices) = 1.53e-06
gf2debug: End derivative-comparison report ------------------------------------
Iteration 0: f(p) = -851.18669
gf2debug: Begin derivative-comparison report ----------------------------------
gf2debug: mreldif(gradient vectors) = 2.06e-10
gf2debug: mreldif(Hessian matrices) = .0001703
gf2debug: End derivative-comparison report ------------------------------------
Iteration 1: f(p) = -556.66874
gf2debug: Begin derivative-comparison report ----------------------------------
gf2debug: mreldif(gradient vectors) = 1.59e-07
gf2debug: mreldif(Hessian matrices) = 5.42e-07
gf2debug: End derivative-comparison report ------------------------------------
Iteration 2: f(p) = -555.81731
gf2debug: Begin derivative-comparison report ----------------------------------
gf2debug: mreldif(gradient vectors) = .0000267
gf2debug: mreldif(Hessian matrices) = 2.45e-07
gf2debug: End derivative-comparison report ------------------------------------
Iteration 3: f(p) = -555.81538
gf2debug: Begin derivative-comparison report ----------------------------------
gf2debug: mreldif(gradient vectors) = .0000272
gf2debug: mreldif(Hessian matrices) = 2.46e-07
gf2debug: End derivative-comparison report ------------------------------------
Iteration 4: f(p) = -555.81538
:
: end
--------------------------------------------------------------------------------
.
end of do-file
Unlike the mreldif calculations for the gradient, I look closely at the mreldif calculations for the Hessian near the optimum, because the Hessian must be full rank at the optimum. In this example, the mreldif calculations near the optimum are on the order of e-07, indicating a correctly coded analytical Hessian.
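If you want to look at the converged Hessian directly rather than rely only on the debug reports, a few lines like the following could be added inside dowork() right after the call to optimize(). This is a sketch, not part of the original do-files; it assumes S still holds the converged problem.

        // inside dowork(), after  b = optimize(S)
        H    = optimize_result_Hessian(S)    // analytic Hessian at the optimum
        p    = cols(H)
        rank = p - diag0cnt(invsym(-H))      // same style of rank check used below
        printf("Hessian has full rank: %g\n", rank == p)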
Now consider dex4.do, which differs from dex3.do in that line 40 specifies a gf2 evaluator type instead of a gf2debug evaluator type. A gf2 evaluator type is the analogue of a gf1 evaluator type for both first and second derivatives. A gf2 evaluator type differs from a gf2debug evaluator type in that it uses the analytically computed gradients and the analytically computed Hessian in the optimization, the numerical derivatives are not computed, and there are no derivative-comparison reports.
mata:
mata drop plleval3() dowork()

void plleval3(real scalar todo, real vector b,   ///
              real vector y, real matrix X,      ///
              val, grad, hess)
{
        real vector  xb, mu

        xb  = X*b'
        mu  = exp(xb)
        val = (-mu + y:*xb - lnfactorial(y))

        if (todo>=1) {
                grad = (y - mu):*X
        }
        if (todo==2) {
                hess = -quadcross(X, mu, X)
        }
}

void dowork( )
{
        real vector  y, b
        real matrix  X
        real scalar  n, p
        transmorphic S

        y = st_data(., "accidents")
        X = st_data(., "cvalue kids traffic ")
        n = rows(y)
        X = X, J(n, 1, 1)
        p = cols(X)

        S = optimize_init()
        optimize_init_argument(S, 1, y)
        optimize_init_argument(S, 2, X)
        optimize_init_evaluator(S, &plleval3())
        optimize_init_evaluatortype(S, "gf2")
        optimize_init_params(S, J(1, p, .01))

        b = optimize(S)
}

dowork()
end
Example 4 illustrates the output.
Example 4: gf2 output
. do dex4

. mata:
------------------------------------------------- mata (type end to exit) ------
:
: mata drop plleval3() dowork()
:
: void plleval3(real scalar todo, real vector b,   ///
>               real vector y, real matrix X,      ///
>               val, grad, hess)
> {
>         real vector  xb, mu
>
>         xb  = X*b'
>         mu  = exp(xb)
>         val = (-mu + y:*xb - lnfactorial(y))
>
>         if (todo>=1) {
>                 grad = (y - mu):*X
>         }
>         if (todo==2) {
>                 hess = -quadcross(X, mu, X)
>         }
>
> }
:
: void dowork( )
> {
>         real vector  y, b
>         real matrix  X
>         real scalar  n, p
>         transmorphic S
>
>         y = st_data(., "accidents")
>         X = st_data(., "cvalue kids traffic ")
>         n = rows(y)
>         X = X, J(n, 1, 1)
>         p = cols(X)
>
>         S = optimize_init()
>         optimize_init_argument(S, 1, y)
>         optimize_init_argument(S, 2, X)
>         optimize_init_evaluator(S, &plleval3())
>         optimize_init_evaluatortype(S, "gf2")
>         optimize_init_params(S, J(1, p, .01))
>
>         b = optimize(S)
>
> }
note: variable b set but not used
:
: dowork()
Iteration 0: f(p) = -851.18669
Iteration 1: f(p) = -556.66855
Iteration 2: f(p) = -555.81731
Iteration 3: f(p) = -555.81538
Iteration 4: f(p) = -555.81538
:
: end
--------------------------------------------------------------------------------
.
end of do-file
Including analytical derivatives in the command
mypoisson4 is like mypoisson3, except that it computes the derivatives analytically. In the remainder of this post, I briefly discuss the code for mypoisson4.ado.
*! version 4.0.0 28Feb2016
program define mypoisson4, eclass sortpreserve
        version 14

        syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ]
        marksample touse

        _vce_parse `touse' , optlist(Robust) argoptlist(CLuster) : , vce(`vce')
        local vce        "`r(vce)'"
        local clustervar "`r(cluster)'"
        if "`vce'" == "robust" | "`vce'" == "cluster" {
                local vcetype "Robust"
        }
        if "`clustervar'" != "" {
                capture confirm numeric variable `clustervar'
                if _rc {
                        display in red "invalid vce() option"
                        display in red "cluster variable {bf:`clustervar'} is " ///
                            "string variable instead of a numeric variable"
                        exit(198)
                }
                sort `clustervar'
        }

        gettoken depvar indepvars : varlist
        _fv_check_depvar `depvar'

        tempname b mo V N rank

        getcinfo `indepvars' , `constant'
        local cnames "`r(cnames)'"
        matrix `mo'  = r(mo)

        mata: mywork("`depvar'", "`cnames'", "`touse'", "`constant'", ///
            "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'")

        if "`constant'" == "" {
                local cnames "`cnames' _cons"
        }
        matrix colnames `b' = `cnames'
        matrix colnames `V' = `cnames'
        matrix rownames `V' = `cnames'

        ereturn post `b' `V', esample(`touse') buildfvinfo
        ereturn scalar N        = `N'
        ereturn scalar rank     = `rank'
        ereturn local  vce       "`vce'"
        ereturn local  vcetype   "`vcetype'"
        ereturn local  clustvar  "`clustervar'"
        ereturn local  cmd       "mypoisson4"

        ereturn display
end

program getcinfo, rclass
        syntax varlist(ts fv), [ noCONStant ]

        _rmcoll `varlist' , `constant' expand
        local cnames `r(varlist)'
        local p : word count `cnames'
        if "`constant'" == "" {
                local p    = `p' + 1
                local cons _cons
        }

        tempname b mo

        matrix `b' = J(1, `p', 0)
        matrix colnames `b' = `cnames' `cons'
        _ms_omit_info `b'
        matrix `mo' = r(omit)

        return local  cnames "`cnames'"
        return matrix mo     = `mo'
end

mata:

void mywork( string scalar depvar,  string scalar indepvars,
             string scalar touse,   string scalar constant,
             string scalar bname,   string scalar Vname,
             string scalar nname,   string scalar rname,
             string scalar mo,
             string scalar vcetype, string scalar clustervar)
{
        real vector y, b
        real matrix X, V, Ct
        real scalar n, p, rank

        y = st_data(., depvar, touse)
        n = rows(y)
        X = st_data(., indepvars, touse)
        if (constant == "") {
                X = X, J(n, 1, 1)
        }
        p  = cols(X)
        Ct = makeCt(mo)

        S = optimize_init()
        optimize_init_argument(S, 1, y)
        optimize_init_argument(S, 2, X)
        optimize_init_evaluator(S, &plleval3())
        optimize_init_evaluatortype(S, "gf2")
        optimize_init_params(S, J(1, p, .01))
        optimize_init_constraints(S, Ct)
        b = optimize(S)
        if (vcetype == "robust") {
                V = optimize_result_V_robust(S)
        }
        else if (vcetype == "cluster") {
                cvar = st_data(., clustervar, touse)
                optimize_init_cluster(S, cvar)
                V = optimize_result_V_robust(S)
        }
        else {                            // vcetype must be IID
                V = optimize_result_V_oim(S)
        }
        rank = p - diag0cnt(invsym(V))

        st_matrix(bname, b)
        st_matrix(Vname, V)
        st_numscalar(nname, n)
        st_numscalar(rname, rank)
}

real matrix makeCt(string scalar mo)
{
        real vector mo_v
        real scalar ko, j, p

        mo_v = st_matrix(mo)
        p    = cols(mo_v)
        ko   = sum(mo_v)
        if (ko>0) {
                Ct = J(0, p, .)
                for(j=1; j<=p; j++) {
                        if (mo_v[j]==1) {
                                Ct = Ct \ e(j, p)
                        }
                }
                Ct = Ct, J(ko, 1, 0)
        }
        else {
                Ct = J(0, p+1, .)
        }
        return(Ct)
}

void plleval3(real scalar todo, real vector b,   ///
              real vector y, real matrix X,      ///
              val, grad, hess)
{
        real vector  xb, mu

        xb  = X*b'
        mu  = exp(xb)
        val = (-mu + y:*xb - lnfactorial(y))

        if (todo>=1) {
                grad = (y - mu):*X
        }
        if (todo==2) {
                hess = -quadcross(X, mu, X)
        }
}
end
Only a few lines of mypoisson4.ado differ from their counterparts in mypoisson3.ado. Line 106 of mypoisson4.ado specifies a gf2 evaluator type, whereas line 106 of mypoisson3.ado specifies a gf0 evaluator type. Lines 166–171 in mypoisson4.ado compute the gradient and the Hessian analytically, and they have no counterparts in mypoisson3.ado.
The output in examples 5 and 6 confirms that mypoisson4 produces the same results as poisson when the option vce(cluster id) is specified.
Example 5: mypoisson4 results

. mypoisson4 accidents cvalue kids traffic , vce(cluster id)
Iteration 0: f(p) = -851.18669
Iteration 1: f(p) = -556.66855
Iteration 2: f(p) = -555.81731
Iteration 3: f(p) = -555.81538
Iteration 4: f(p) = -555.81538
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6558871   .1125223    -5.83   0.000    -.8764267   -.4353475
        kids |  -1.009017   .1805639    -5.59   0.000    -1.362916   -.6551182
     traffic |   .1467115    .092712     1.58   0.114    -.0350008    .3284237
       _cons |   .5743541   .6238015     0.92   0.357    -.6482744    1.796983
------------------------------------------------------------------------------
Example 6: poisson results

. poisson accidents cvalue kids traffic , vce(cluster id)
Iteration 0: log pseudolikelihood = -555.86605
Iteration 1: log pseudolikelihood = -555.8154
Iteration 2: log pseudolikelihood = -555.81538

Poisson regression                              Number of obs   =        505
                                                Wald chi2(3)    =     103.53
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -555.81538               Pseudo R2       =     0.2343

                                  (Std. Err. adjusted for 285 clusters in id)
------------------------------------------------------------------------------
             |               Robust
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6558871   .1125223    -5.83   0.000    -.8764266   -.4353475
        kids |  -1.009017   .1805639    -5.59   0.000    -1.362915   -.6551181
     traffic |   .1467115    .092712     1.58   0.114    -.0350008    .3284237
       _cons |    .574354   .6238015     0.92   0.357    -.6482744    1.796982
------------------------------------------------------------------------------
Done and undone
I showed how to compute derivatives analytically when using optimize(), and I included analytically computed derivatives in mypoisson4.ado. In my next post, I show how to make predict work after mypoisson4.
