Wednesday, April 22, 2026

Positive log-likelihood values happen – The Stata Blog


From time to time, we get a question from a user puzzled about getting a positive log likelihood for a certain estimation. We get so used to seeing negative log-likelihood values all the time that we may wonder what caused them to be positive.

First, let me point out that there is nothing wrong with a positive log likelihood.

The likelihood is the product of the density evaluated at the observations. Usually, the density takes values that are smaller than one, so its logarithm will be negative. However, this is not true for every distribution.
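Written out, and assuming independent observations, the relationship looks like this. The symbols (f for the density, θ for the parameters, y_i for the observations) are notation introduced here for illustration only.

```latex
\ln L(\theta) \;=\; \ln \prod_{i=1}^{n} f(y_i;\theta) \;=\; \sum_{i=1}^{n} \ln f(y_i;\theta),
\qquad
\ln f(y_i;\theta) > 0 \iff f(y_i;\theta) > 1 .
```

A density only has to integrate to one. Nothing prevents it from exceeding one at individual points, and every such point contributes a positive term to the sum.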

For example, let’s think of the density of a normal distribution with a small standard deviation, let’s say 0.1.


. di normalden(0,0,.1)
3.9894228

This density concentrates most of its area near zero, and therefore takes large values around this point. Naturally, the logarithm of this value will be positive.


. di log(3.9894228)
1.3836466
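As a quick check of where that number comes from: the normal density peaks at its mean, where it equals 1/(sigma*sqrt(2*pi)). The sketch below only uses the built-in normalden() function and the _pi constant. The cutoff it prints is a property of the normal density, not something specific to this example.

```stata
* Peak of the normal density at its mean: 1/(sigma*sqrt(2*pi))
display 1/(.1*sqrt(2*_pi))

* Largest sigma for which the peak still reaches 1 (about .3989)
display 1/sqrt(2*_pi)

* At that sigma the log density at the mean is zero, up to rounding
display ln(normalden(0, 0, 1/sqrt(2*_pi)))
```

The first command reproduces the value returned by normalden(0,0,.1). For any standard deviation below the cutoff, the density exceeds one in a neighborhood of the mean.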

In model estimation, the situation is a bit more complex. When you fit a model to a dataset, the log likelihood will be evaluated at every observation. Some of these evaluations may turn out to be positive, and some may turn out to be negative. The sum of all of them is reported. Let me show you an example.

I’ll start by simulating a dataset appropriate for a linear model.


clear
program drop _all
set seed 1357
set obs 100
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 2*x1 + 3*x2 +1 + .06*rnormal()

I’ll borrow the code for mynormal_lf from the book Maximum Likelihood Estimation with Stata (W. Gould, J. Pitblado, and B. Poi, 2010, Stata Press) in order to fit my model via maximum likelihood.


program mynormal_lf
        version 11.1
        args lnf mu lnsigma
        quietly replace `lnf' = ln(normalden($ML_y1,`mu',exp(`lnsigma')))
end

ml model lf  mynormal_lf  (y = x1 x2) (lnsigma:)
ml max, nolog

The following table will be displayed:


.   ml max, nolog

                                                  Number of obs   =        100
                                                  Wald chi2(2)    =  456919.97
Log likelihood =  152.37127                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |   
          x1 |   1.995834    .005117   390.04   0.000     1.985805    2.005863
          x2 |   3.014579   .0059332   508.08   0.000      3.00295    3.026208
       _cons |   .9990202   .0052961   188.63   0.000       .98864      1.0094
-------------+----------------------------------------------------------------
lnsigma      |  
       _cons |  -2.942651   .0707107   -41.62   0.000    -3.081242   -2.804061
------------------------------------------------------------------------------

We can see that the estimates are close enough to our original parameters, and also that the log likelihood is positive.

We can obtain the log likelihood for each observation by substituting the estimates in the log-likelihood formula:


. predict double xb

. gen double lnf = ln(normalden(y, xb, exp([lnsigma]_b[_cons])))

. summ lnf, detail

                             lnf
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -1.360689      -1.574499
 5%    -.0729971       -1.14688
10%     .4198644      -.3653152       Obs                 100
25%     1.327405      -.2917259       Sum of Wgt.         100

50%     1.868804                      Mean           1.523713
                        Largest       Std. Dev.      .7287953
75%     1.995713       2.023528
90%     2.016385       2.023544       Variance       .5311426
95%     2.021751       2.023676       Skewness      -2.035996
99%     2.023691       2.023706       Kurtosis       7.114586

. di r(sum)
152.37127

. gen f = exp(lnf)

. summ f, detail

                              f
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .2623688       .2071112
 5%     .9296673       .3176263
10%      1.52623       .6939778       Obs                 100
25%     3.771652       .7469733       Sum of Wgt.         100

50%     6.480548                      Mean           5.448205
                        Largest       Std. Dev.      2.266741
75%     7.357449       7.564968
90%      7.51112        7.56509       Variance       5.138117
95%     7.551539       7.566087       Skewness      -.8968159
99%     7.566199        7.56631       Kurtosis       2.431257

We can see that some values for the log likelihood are negative, but most are positive, and that the sum is the value we already know. In the same way, most of the values of the likelihood are greater than one.
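The largest values in both summaries can be cross-checked against the estimated lnsigma of -2.942651. The worked bound below is a sketch. The notation (x_i for the row of regressors, a hat for estimates) is introduced here, and the decimals are rounded.

```latex
\ln f(y_i)
  \;=\; -\ln\hat\sigma \;-\; \tfrac{1}{2}\ln(2\pi) \;-\; \frac{(y_i - x_i\hat\beta)^2}{2\hat\sigma^2}
  \;\le\; -\ln\hat\sigma \;-\; \tfrac{1}{2}\ln(2\pi)
  \;\approx\; 2.942651 - 0.918939 \;=\; 2.023712 ,
\qquad
f(y_i) \;\le\; e^{2.023712} \;\approx\; 7.566 .
```

This agrees with the largest reported values, 2.023706 for lnf and 7.56631 for f, which sit just under the bound. It also suggests that an observation contributes a negative term only when its residual is larger than roughly two estimated standard deviations in absolute value.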

As an exercise, try the commands above with a bigger variance, say, 1. Now the density will be flatter, and there will be no values greater than one.
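One way to run the exercise is sketched below. It reuses the simulation and the mynormal_lf program from above, with the error standard deviation set to 1 so that the variance is also 1. The final count command is an addition for illustration. It should report zero whenever the estimated sigma stays above about 0.4, which is very likely here but depends on the simulated sample.

```stata
clear
program drop _all
set seed 1357
set obs 100
gen x1 = rnormal()
gen x2 = rnormal()
* Error standard deviation is now 1 instead of .06
gen y = 2*x1 + 3*x2 + 1 + rnormal()

* Same evaluator as above: log of the normal density at each observation
program mynormal_lf
        version 11.1
        args lnf mu lnsigma
        quietly replace `lnf' = ln(normalden($ML_y1,`mu',exp(`lnsigma')))
end

ml model lf mynormal_lf (y = x1 x2) (lnsigma:)
ml max, nolog

* Per-observation log likelihood evaluated at the estimates
predict double xb
gen double lnf = ln(normalden(y, xb, exp([lnsigma]_b[_cons])))
summarize lnf
display r(sum)

* Number of observations with a positive contribution
count if lnf > 0
```

The sum displayed after summarize should match the log likelihood reported by ml max, which in this case is expected to be negative.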

In short, if you have a positive log likelihood, there is nothing wrong with that, but if you check your dispersion parameters, you will find they are small.


