xtabond cheat sheet – The Stata Blog

March 4, 2026

Random-effects and fixed-effects panel-data models do not allow me to use observable information from previous periods in my model. They are static. Dynamic panel-data models use current and past information. For instance, I may model current health outcomes as a function of past health outcomes (a sensible modeling assumption) and of past observable and unobservable characteristics.

Today I will provide information that will help you interpret the estimation and postestimation results from Stata’s Arellano–Bond estimator xtabond, the most common linear dynamic panel-data estimator.

The instruments and the regressors

We have fictional data for 1,000 people from 1991 to 2000. The outcome of interest is earnings (earnings), and the explanatory variables are years of schooling (educ) and an indicator for marital status (married). Below, we fit an Arellano–Bond model using xtabond.


. xtabond earnings married educ, vce(robust)

Arellano-Bond dynamic panel-data estimation     Number of obs     =      8,000
Group variable: id                              Number of groups  =      1,000
Time variable: year
                                                Obs per group:
                                                              min =          8
                                                              avg =          8
                                                              max =          8

Number of instruments =     39                  Wald chi2(3)      =    3113.63
                                                Prob > chi2       =     0.0000
One-step results
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
    earnings |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    earnings |
         L1. |   .2008311   .0036375    55.21   0.000     .1937018    .2079604
             |
     married |   1.057667   .1006091    10.51   0.000     .8604764    1.254857
        educ |    .057551   .0045863    12.55   0.000     .0485619      .06654
       _cons |   .2645702   .0805474     3.28   0.001     .1067002    .4224403
------------------------------------------------------------------------------
Instruments for differenced equation
        GMM-type: L(2/.).earnings
        Standard: D.married D.educ
Instruments for level equation
        Standard: _cons

A few elements in the output table are different from what one would expect. The output includes a coefficient for the lagged value of the dependent variable that we did not specify in the command. Why?

In the Arellano–Bond framework, the value of the dependent variable in the previous period is a predictor for the current value of the dependent variable. Stata includes the value of the dependent variable in the previous period for us. Another noteworthy aspect of the table is the mention of 39 instruments in the header. This is followed by a footnote that refers to GMM-type and standard instruments. Here a bit of math will help us understand what is going on.

The relationship of interest is given by

\[\begin{equation*}
y_{it} = x_{it}'\beta_1 + y_{i(t-1)}\beta_2 + \alpha_i + \varepsilon_{it}
\end{equation*}\]

In the equation above, \(y_{it}\) is the outcome of interest for individual \(i\) at time \(t\), \(x_{it}\) is a set of regressors that may include past values, \(y_{i(t-1)}\) is the value of the outcome in the previous period, \(\alpha_i\) is a time-invariant unobservable, and \(\varepsilon_{it}\) is a time-varying unobservable.

As in the fixed-effects framework, we assume the time-invariant unobserved component is related to the regressors. When unobservables and observables are correlated, we have an endogeneity problem that yields inconsistent parameter estimates if we use a conventional linear panel-data estimator. One solution is taking first-differences of the relationship of interest. However, the strategy of taking first-differences does not work on its own. Why?

\[\begin{eqnarray*}
\Delta y_{it} &=& \Delta x_{it}'\beta_1 + \Delta y_{i(t-1)}\beta_2 + \Delta \varepsilon_{it} \\
E\left( \Delta y_{i(t-1)} \Delta \varepsilon_{it} \right) &\neq & 0
\end{eqnarray*}\]

In the first equation above, we got rid of \(\alpha_i\), which is correlated with our regressors, but we generated a new endogeneity problem. The second equation above illustrates that one of our regressors is related to our unobservables. The solution is instrumental variables. Which instrumental variables? Arellano–Bond suggest the second lags of the dependent variable and all the feasible lags thereafter. This generates the set of moment conditions defined by

\[\begin{eqnarray*}
E\left( \Delta y_{i(t-2)} \Delta \varepsilon_{it} \right) &=& 0 \\
E\left( \Delta y_{i(t-3)} \Delta \varepsilon_{it} \right) &=& 0 \\
\ldots & & \\
E\left( \Delta y_{i(t-j)} \Delta \varepsilon_{it} \right) &=& 0
\end{eqnarray*}\]
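
To see why the first lag fails as an instrument while deeper lags do not, it helps to write out the covariances. The derivation below is only a sketch under assumptions the text does not state: \(\varepsilon_{it}\) is serially uncorrelated with constant variance \(\sigma^2_{\varepsilon}\) (a symbol introduced here for illustration), and \(\varepsilon_{it}\) is uncorrelated with \(\alpha_i\), with \(x_{it}\), and with everything dated before \(t\).

```latex
\begin{eqnarray*}
E\left( \Delta y_{i(t-1)} \Delta \varepsilon_{it} \right)
  &=& E\left[ \left( y_{i(t-1)} - y_{i(t-2)} \right)
              \left( \varepsilon_{it} - \varepsilon_{i(t-1)} \right) \right] \\
  &=& -E\left( y_{i(t-1)} \varepsilon_{i(t-1)} \right)
   = -\sigma^2_{\varepsilon} \neq 0 \\
E\left( \Delta y_{i(t-2)} \Delta \varepsilon_{it} \right)
  &=& E\left[ \left( y_{i(t-2)} - y_{i(t-3)} \right)
              \left( \varepsilon_{it} - \varepsilon_{i(t-1)} \right) \right] = 0
\end{eqnarray*}
```

The only surviving term in the first expression comes from \(\varepsilon_{i(t-1)}\), which enters \(y_{i(t-1)}\) with a coefficient of one. Outcomes dated \(t-2\) or earlier are determined before either shock in \(\Delta \varepsilon_{it}\) is realized, which is why they can serve as instruments under these assumptions.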

In our example, we have 10 time periods, which yield the following set of instruments. Here \(y_{s}\) denotes the level of the dependent variable in period \(s\), and the differenced equation for period \(t\) uses the levels dated \(t-2\) and earlier:

\[\begin{eqnarray*}
t&=10& \quad y_{1}, y_{2}, y_{3}, y_{4}, y_{5}, y_{6}, y_{7}, y_{8} \\
t&=9& \quad y_{1}, y_{2}, y_{3}, y_{4}, y_{5}, y_{6}, y_{7} \\
t&=8& \quad y_{1}, y_{2}, y_{3}, y_{4}, y_{5}, y_{6} \\
t&=7& \quad y_{1}, y_{2}, y_{3}, y_{4}, y_{5} \\
t&=6& \quad y_{1}, y_{2}, y_{3}, y_{4} \\
t&=5& \quad y_{1}, y_{2}, y_{3} \\
t&=4& \quad y_{1}, y_{2} \\
t&=3& \quad y_{1}
\end{eqnarray*}\]

This gives us 36 instruments, which are what the table calls GMM-type instruments. GMM has been explored in the blog post Estimating parameters by maximum likelihood and method of moments using mlexp and gmm, and we will talk about it in a later post. The other three instruments are given by the first difference of the regressors educ and married and the constant. This is no different from two-stage least squares, where we include the exogenous variables as part of our instrument list.
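
The instrument count can be reproduced with a few lines of arithmetic. The Python sketch below is not part of Stata; the function name and its arguments are hypothetical, and it assumes a balanced panel, one lag of the dependent variable, and every available lag from the second onward used as an instrument.

```python
def count_instruments(n_periods, n_standard_diff, has_constant=True):
    """Count Arellano-Bond-style instruments for one lag of the outcome.

    Assumes a balanced panel, all lags from 2 onward used as GMM-type
    instruments, and one instrument per differenced exogenous regressor.
    """
    # The first usable differenced equation is t = 3, which has one valid
    # lagged level (period 1). The equation for period t has t - 2 of them.
    gmm_type = sum(t - 2 for t in range(3, n_periods + 1))
    standard = n_standard_diff + (1 if has_constant else 0)
    return gmm_type, standard, gmm_type + standard


# 10 periods (1991 to 2000), two differenced regressors (married, educ), a constant
gmm_type, standard, total = count_instruments(n_periods=10, n_standard_diff=2)
print(gmm_type, standard, total)  # 36 3 39
```

The total of 39 matches the number of instruments reported in the header of the estimation output.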

Testing for serial correlation

The key for the instrument set in Arellano–Bond to work is that

\[\begin{equation}
E\left( \Delta y_{i(t-j)} \Delta \varepsilon_{it} \right) = 0 \quad j \geq 2
\end{equation}\]

We can test these conditions in Stata using estat abond. In essence, the differenced time-varying unobserved component should be unrelated to the second lag of the dependent variable and the lags thereafter. If this is not the case, we are back to the initial problem, endogeneity. Again, a bit of math will help us understand what is going on.

All is well if

\[\begin{equation}
\Delta \varepsilon_{it} = \Delta \nu_{it}
\end{equation}\]

Here \(\nu_{it}\) is a serially uncorrelated shock, so the differenced unobservable is serially correlated of order 1 but not of order 2 or beyond.

But we are in trouble if

\[\begin{equation}
\Delta \varepsilon_{it} = \Delta \nu_{it} + \Delta \nu_{i(t-1)}
\end{equation}\]

The second lag of the dependent variable will be related to the differenced time-varying component \(\Delta \varepsilon_{it}\). Another way of saying this is that the differenced time-varying unobserved component is serially correlated with an order greater than 1.
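
The two cases can be illustrated with a small simulation. The Python sketch below is only an illustration under an added assumption: the shocks \(\nu\) are drawn as independent standard normal values for a single long series rather than a panel. All names in it are hypothetical.

```python
import random


def autocorr(series, lag):
    """Sample autocorrelation of a list of numbers at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[k] - mean) * (series[k - lag] - mean)
              for k in range(lag, n))
    return cov / var


rng = random.Random(12345)  # fixed seed so the run is reproducible
nu = [rng.gauss(0.0, 1.0) for _ in range(200_000)]  # assumed i.i.d. shocks

# First difference of the shocks
d_nu = [nu[k] - nu[k - 1] for k in range(1, len(nu))]

# "All is well": the differenced error equals the differenced shock.
good = d_nu
# "Trouble": the differenced error also includes last period's differenced shock.
bad = [d_nu[k] + d_nu[k - 1] for k in range(1, len(d_nu))]

good_r1, good_r2 = autocorr(good, 1), autocorr(good, 2)
bad_r1, bad_r2 = autocorr(bad, 1), autocorr(bad, 2)
print(f"all is well: order 1 = {good_r1:.3f}, order 2 = {good_r2:.3f}")
print(f"trouble:     order 1 = {bad_r1:.3f}, order 2 = {bad_r2:.3f}")
```

In the first case the order-1 autocorrelation should be close to -0.5 and the order-2 autocorrelation close to zero. In the second case the sum collapses to \(\nu_{t} - \nu_{t-2}\), so the order-2 autocorrelation should be close to -0.5, which is the pattern that invalidates the second lag as an instrument.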

estat abond provides a test for the serial correlation structure. For the example above,


. estat abond

Arellano-Bond test for zero autocorrelation in first-differenced errors
  +-----------------------+
  |Order |  z     Prob > z|
  |------+----------------|
  |   1  |-22.975  0.0000 |
  |   2  |-.36763  0.7132 |
  +-----------------------+
   H0: no autocorrelation

We reject no autocorrelation of order 1 and cannot reject no autocorrelation of order 2. This is the pattern we expect when the Arellano–Bond model assumptions are satisfied. If this were not the case, we would have to look for different instruments. Essentially, we would have to fit a different dynamic model. This is what the xtdpd command allows us to do, but it is beyond the scope of this post.
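
The decision rule can be sketched in a few lines. The Python code below is not Stata output and not Stata’s implementation. It assumes that the reported Prob > z is a two-sided standard normal p-value, which is consistent with the numbers in the table, and it uses a 5% significance level chosen only for illustration.

```python
import math


def two_sided_normal_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# z statistics copied from the estat abond table
z_by_order = {1: -22.975, 2: -0.36763}
p_by_order = {order: two_sided_normal_p(z) for order, z in z_by_order.items()}

alpha = 0.05  # illustrative significance level, not taken from the post
reject_order1 = p_by_order[1] < alpha
reject_order2 = p_by_order[2] < alpha

# Expected pattern: reject at order 1, fail to reject at order 2.
pattern_ok = reject_order1 and not reject_order2
print(p_by_order, pattern_ok)
```

The computed order-2 p-value lands very close to the reported 0.7132; any small gap comes from the z statistic being rounded in the table.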

Parting words

Dynamic panel-data models provide a useful research framework. In this post, I touched on the interpretation of a few estimation and postestimation results from xtabond to help you understand your output.
