Every so often, we get a question from a user puzzled about obtaining a positive log likelihood for a certain estimation. We get so used to seeing negative log-likelihood values all the time that we may wonder what caused them to be positive.
First, let me point out that there is nothing wrong with a positive log likelihood.
The likelihood is the product of the density evaluated at the observations. Usually, the density takes values that are smaller than one, so its logarithm will be negative. However, this is not true for every distribution.
For example, let's consider the density of a normal distribution with a small standard deviation, say, 0.1.
. di normalden(0,0,.1)
3.9894228
This density concentrates a large area around zero, and therefore takes large values around this point. Naturally, the logarithm of this value will be positive.
. di log(3.9894228)
1.3836466
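The same arithmetic can be reproduced outside Stata. Here is a small Python sketch, where `normalden` is a hand-rolled stand-in for Stata's function of the same name: the density of a normal with standard deviation 0.1, evaluated at its mean, exceeds one, so its log is positive.

```python
import math

def normalden(x, mu, sigma):
    # Hand-rolled normal density, mirroring Stata's normalden(x, mu, sigma)
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

d = normalden(0, 0, 0.1)
print(round(d, 7))            # 3.9894228, greater than one
print(round(math.log(d), 7))  # 1.3836466, so the log is positive
```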
In model estimation, the situation is a little more complex. When you fit a model to a dataset, the log likelihood is evaluated at every observation. Some of these evaluations may turn out to be positive, and some may turn out to be negative. The sum of all of them is reported. Let me show you an example.
I will start by simulating a dataset appropriate for a linear model.
clear
program drop _all
set seed 1357
set obs 100
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 2*x1 + 3*x2 + 1 + .06*rnormal()
I will borrow the code for mynormal_lf from the book Maximum Likelihood Estimation with Stata (W. Gould, J. Pitblado, and B. Poi, 2010, Stata Press) in order to fit my model via maximum likelihood.
program mynormal_lf
        version 11.1
        args lnf mu lnsigma
        quietly replace `lnf' = ln(normalden($ML_y1,`mu',exp(`lnsigma')))
end
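The evaluator above simply computes the log of the normal density at each observation. As a rough Python sketch of the same computation (the simulated data and the use of the true parameters stand in for Stata's seed and ML estimates, so the exact numbers will differ):

```python
import math
import random

def lnf(y, mu, lnsigma):
    # Per-observation log likelihood for the normal model, as in mynormal_lf
    sigma = math.exp(lnsigma)
    z = (y - mu) / sigma
    return -math.log(sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z

# Simulate data analogous to the Stata example
random.seed(1357)
x1 = [random.gauss(0, 1) for _ in range(100)]
x2 = [random.gauss(0, 1) for _ in range(100)]
y = [2*a + 3*b + 1 + 0.06*random.gauss(0, 1) for a, b in zip(x1, x2)]

# Evaluate at the true parameters instead of running a maximizer
loglik = sum(lnf(yi, 2*a + 3*b + 1, math.log(0.06))
             for yi, a, b in zip(y, x1, x2))
print(loglik > 0)  # True: with such a small sigma, the total is positive
```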
ml model lf mynormal_lf (y = x1 x2) (lnsigma:)
ml max, nolog
The following table will be displayed:
. ml max, nolog
Number of obs     =        100
Wald chi2(2)      =  456919.97
Log likelihood =  152.37127                       Prob > chi2       =     0.0000
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1 |
x1 | 1.995834 .005117 390.04 0.000 1.985805 2.005863
x2 | 3.014579 .0059332 508.08 0.000 3.00295 3.026208
_cons | .9990202 .0052961 188.63 0.000 .98864 1.0094
-------------+----------------------------------------------------------------
lnsigma |
_cons | -2.942651 .0707107 -41.62 0.000 -3.081242 -2.804061
------------------------------------------------------------------------------
We can see that the estimates are close enough to our original parameters, and also that the log likelihood is positive.
We can obtain the log likelihood for each observation by substituting the estimates into the log-likelihood formula:
. predict double xb
. gen double lnf = ln(normalden(y, xb, exp([lnsigma]_b[_cons])))
. summ lnf, detail
lnf
-------------------------------------------------------------
Percentiles Smallest
1% -1.360689 -1.574499
5% -.0729971 -1.14688
10% .4198644 -.3653152 Obs 100
25% 1.327405 -.2917259 Sum of Wgt. 100
50% 1.868804 Mean 1.523713
Largest Std. Dev. .7287953
75% 1.995713 2.023528
90% 2.016385 2.023544 Variance .5311426
95% 2.021751 2.023676 Skewness -2.035996
99% 2.023691 2.023706 Kurtosis 7.114586
. di r(sum)
152.37127
. gen f = exp(lnf)
. summ f, detail
f
-------------------------------------------------------------
Percentiles Smallest
1% .2623688 .2071112
5% .9296673 .3176263
10% 1.52623 .6939778 Obs 100
25% 3.771652 .7469733 Sum of Wgt. 100
50% 6.480548 Mean 5.448205
Largest Std. Dev. 2.266741
75% 7.357449 7.564968
90% 7.51112 7.56509 Variance 5.138117
95% 7.551539 7.566087 Skewness -.8968159
99% 7.566199 7.56631 Kurtosis 2.431257
We can see that some values of the log likelihood are negative, but most are positive, and that the sum is the value we already know. In the same manner, most of the values of the likelihood are greater than one.
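The connection between the two tables is just the exponential: since exp is monotone, a likelihood value is greater than one exactly when the corresponding log likelihood is positive. A quick Python illustration, using made-up values in the same range as the percentiles above:

```python
import math

# Illustrative per-observation log likelihoods, similar in range to the table above
lnf_values = [-1.36, -0.07, 0.42, 1.87, 2.02]
f_values = [math.exp(v) for v in lnf_values]

print([round(f, 2) for f in f_values])
# f > 1 exactly when lnf > 0
print([f > 1 for f in f_values])  # [False, False, True, True, True]
```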
As an exercise, try the commands above with a bigger variance, say, 1. Now the density will be flatter, and there will be no values greater than one.
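A quick check of why the exercise works out that way: the normal density is maximized at the mean, where it equals 1/(sigma*sqrt(2*pi)). With a standard deviation of 1, that peak is already below one, so every log density is negative.

```python
import math

# Peak of the normal density with sigma = 1: 1/(sigma*sqrt(2*pi))
sigma = 1.0
peak = 1 / (sigma * math.sqrt(2 * math.pi))
print(round(peak, 4))      # 0.3989: the density never exceeds one
print(math.log(peak) < 0)  # True: every log density is negative
```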
In short, if you have a positive log likelihood, there is nothing wrong with that, but if you check your dispersion parameters, you will find that they are small.
