\(\newcommand{\epsilonb}{\boldsymbol{\epsilon}}
\newcommand{\ebi}{\boldsymbol{\epsilon}_i}
\newcommand{\Sigmab}{\boldsymbol{\Sigma}}
\newcommand{\betab}{\boldsymbol{\beta}}
\newcommand{\eb}{{\bf e}}
\newcommand{\xb}{{\bf x}}
\newcommand{\xbit}{{\bf x}_{it}}
\newcommand{\xbi}{{\bf x}_{i}}
\newcommand{\zb}{{\bf z}}
\newcommand{\zbi}{{\bf z}_i}
\newcommand{\wb}{{\bf w}}
\newcommand{\yb}{{\bf y}}
\newcommand{\ub}{{\bf u}}
\newcommand{\Xb}{{\bf X}}
\newcommand{\Mb}{{\bf M}}
\newcommand{\Xtb}{\tilde{\bf X}}
\newcommand{\Wb}{{\bf W}}
\newcommand{\Vb}{{\bf V}}\)

I present the formulas for computing the ordinary least-squares (OLS) estimator and show how to compute them in Mata. This post is a Mata version of Programming an estimation command in Stata: Using Stata matrix commands and functions to compute OLS objects. I discuss the formulas and the computation of independence-based standard errors, robust standard errors, and cluster-robust standard errors.
This is the fourteenth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.
OLS formulas
Recall that the OLS point estimates are given by
\[
\widehat{\betab} =
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\left( \sum_{i=1}^N \xb_i' y_i \right)
\]
where \(\xb_i\) is the \(1\times k\) vector of independent variables, \(y_i\) is the dependent variable for each of the \(N\) sample observations, and the model for \(y_i\) is
\[
y_i = \xb_i\betab' + \epsilon_i
\]
If the \(\epsilon_i\) are independently and identically distributed (IID), we estimate the variance-covariance matrix of the estimator (VCE) by
\[
\widehat{\Vb} = \widehat{s}
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\]
where \(\widehat{s} = 1/(N-k)\sum_{i=1}^N e_i^2\) and \(e_i=y_i-\xb_i\widehat{\betab}\). See Cameron and Trivedi (2005), Stock and Watson (2010), or Wooldridge (2015) for introductions to OLS.
Mata implementation
I compute the OLS point estimates in Mata in example 1.
Example 1: Computing OLS point estimates in Mata
. sysuse auto
(1978 Automobile Data)

. mata:
------------------------------------------------- mata (type end to exit) ------
: y    = st_data(., "price")
: X    = st_data(., "mpg trunk")
: n    = rows(X)
: X    = X,J(n,1,1)
: XpX  = quadcross(X, X)
: XpXi = invsym(XpX)
: b    = XpXi*quadcross(X, y)
: end
--------------------------------------------------------------------------------
I used st_data() to put a copy of the observations on price into the Mata vector y and to put a copy of the observations on mpg and trunk into the Mata matrix X. I used rows(X) to put the number of observations into n. After adding a column of ones onto X for the constant term, I used quadcross() to calculate \(\Xb'\Xb\) in quad precision. After using invsym() to calculate the inverse of the symmetric matrix XpX, storing it in XpXi, I calculated the point estimates from the OLS formula.
In example 1, I computed the OLS point estimates after forming the cross products. As discussed in Lange (2010, chapter 7), I could compute more accurate estimates using a QR decomposition; type help mf_qrd for details about computing QR decompositions in Mata. By computing the cross products in quad precision, I obtained point estimates that are almost as accurate as those obtainable from a QR decomposition in double precision, but that is a topic for another post.
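For comparison, here is a minimal sketch of the QR-based computation, assuming the y and X from example 1 are still in memory; the name b_qr and the mreldif() check are my additions, not part of the original session.

: // Solve the least-squares problem via a QR decomposition
: b_qr = qrsolve(X, y)
: // Report the maximum relative difference from the cross-product estimates
: mreldif(b_qr, b)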
Here are the point estimates I computed in Mata and the comparable results from regress.
Example 2: Results from Mata and regress
. mata: b'
                  1              2              3
    +----------------------------------------------+
  1 |  -220.1648801    43.55851009    10254.94983  |
    +----------------------------------------------+
. regress price mpg trunk

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     10.14
       Model |   141126459         2  70563229.4   Prob > F        =    0.0001
    Residual |   493938937        71  6956886.44   R-squared       =    0.2222
-------------+----------------------------------   Adj R-squared   =    0.2003
       Total |   635065396        73  8699525.97   Root MSE        =    2637.6

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -220.1649   65.59262    -3.36   0.001    -350.9529    -89.3769
       trunk |   43.55851   88.71884     0.49   0.625    -133.3418    220.4589
       _cons |   10254.95   2349.084     4.37   0.000      5571.01    14938.89
------------------------------------------------------------------------------
Given the OLS point estimates, I can now compute the IID estimator of the VCE.
Example 3: Computing the IID VCE
. mata:
------------------------------------------------- mata (type end to exit) ------
: e  = y - X*b
: e2 = e:^2
: k  = cols(X)
: V  = (quadsum(e2)/(n-k))*XpXi
: sqrt(diagonal(V))'
                 1             2             3
    +-------------------------------------------+
  1 |  65.59262431   88.71884015    2349.08381  |
    +-------------------------------------------+

: end
--------------------------------------------------------------------------------
I put the residuals into the Mata vector e, which I then squared element-wise. I used cols(X) to put the number of covariates into k. I used quadsum() to compute the sum of the squared residuals in quad precision when computing V, an IID estimator of the VCE. The standard errors displayed by sqrt(diagonal(V)) are the same as those displayed by regress in example 2.
Robust standard errors
The frequently used robust estimator of the VCE is given by
\[
\widehat{V}_{robust}=\frac{N}{N-k}
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\Mb
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\]
where
\[\Mb=\sum_{i=1}^N \widehat{e}_i^2\xb_i'\xb_i\]
See Cameron and Trivedi (2005), Stock and Watson (2010), or Wooldridge (2015) for derivations and discussions.
Example 4 implements this estimator in Mata.
Example 4: A robust VCE
. mata:
------------------------------------------------- mata (type end to exit) ------
: M = quadcross(X, e2, X)
: V = (n/(n-k))*XpXi*M*XpXi
: sqrt(diagonal(V))'
                 1             2             3
    +-------------------------------------------+
  1 |  72.45387946   71.45370224   2430.640607  |
    +-------------------------------------------+

: end
--------------------------------------------------------------------------------
Using quadcross(X, e2, X) to compute M is more accurate and faster than looping over the observations. The accuracy comes from the quad precision provided by quadcross(). The speed comes from performing the loops in compiled C code instead of compiled Mata code. Mata is fast, but C is faster, because C imposes much more structure and because C is compiled using much more platform-specific information than Mata.
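To make the comparison concrete, here is a sketch of the observation-level loop that quadcross(X, e2, X) replaces, assuming the X, e2, n, and k from the previous examples are still in memory; the name Mloop and the mreldif() check are my additions, not part of the original session.

: // Accumulate M = sum_i e_i^2 * x_i'x_i one observation at a time
: Mloop = J(k, k, 0)
: for(i=1; i<=n; i++) {
>         Mloop = Mloop + e2[i]*X[i,.]'*X[i,.]
> }
: // The looped version agrees with quadcross() up to rounding
: mreldif(Mloop, M)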
quadcross() is also faster because it has been parallelized, like many Mata functions. For example, a call to quadcross() from Stata/MP with 2 processors will run about twice as fast as a call to quadcross() from Stata/SE when there are many rows in X. A detailed discussion of the performance increases provided by Mata in Stata/MP is a subject for another post.
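A rough sketch of how one might see this, using Mata's timer functions on a simulated matrix; the 100,000-by-20 matrix XL is invented for illustration and is not part of the original session.

: // Simulate a large matrix and time quadcross() on it
: XL = runiform(100000, 20)
: timer_clear()
: timer_on(1)
: (void) quadcross(XL, XL)
: timer_off(1)
: timer()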
I now verify that my computations match those reported by regress.
Example 5: Comparing computations of the robust VCE
. regress price mpg trunk, vce(robust)

Linear regression                               Number of obs     =         74
                                                F(2, 71)          =      11.59
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2222
                                                Root MSE          =     2637.6

------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -220.1649   72.45388    -3.04   0.003    -364.6338   -75.69595
       trunk |   43.55851    71.4537     0.61   0.544    -98.91613    186.0331
       _cons |   10254.95   2430.641     4.22   0.000      5408.39    15101.51
------------------------------------------------------------------------------
Cluster-robust standard errors
The cluster-robust estimator of the VCE is frequently used when the data have a group structure, also known as a panel structure or a longitudinal structure. This VCE accounts for the within-group correlation of the errors, and it is given by
\[
\widehat{V}_{cluster}=\frac{N-1}{N-k}\frac{g}{g-1}
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\Mb_c
\left( \sum_{i=1}^N \xb_i'\xb_i \right)^{-1}
\]
where
\[
\Mb_c=\sum_{j=1}^g
\Xb_j'
(\widehat{\eb}_j \widehat{\eb}_j')
\Xb_j
\]
\(\Xb_j\) is the \(n_j\times k\) matrix of observations on \(\xb_i\) in group \(j\), \(\widehat{\eb}_j\) is the \(n_j\times 1\) vector of residuals in group \(j\), and \(g\) is the number of groups. See Cameron and Trivedi (2005), Wooldridge (2010), and [R] regress for derivations and discussions.
Computing \(\Mb_c\) requires sorting the data by the group variable. I use rep78, with its missing values replaced by 6, as the group variable in my example. In example 6, I sort the dataset in Stata, put a copy of the observations on the modified rep78 into the column vector id, and recompute the OLS objects that I need. I could have sorted the dataset in Mata, but I usually sort in Stata, so that is what I illustrate; a sketch of the Mata alternative follows. Type help mf_sort for details on sorting in Mata. In a real program, I would not need to recompute everything; I do so here because I did not want to discuss the group variable or sorting the dataset until I discussed cluster-robust standard errors.
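If I did sort in Mata instead, a minimal sketch would use sort(), which orders the rows of a matrix on the columns given in its second argument; the matrix D and this particular variable list are my invention for illustration.

: // Read the group variable first, then sort the rows of D on column 1
: D = sort(st_data(., "rep78 price mpg trunk"), 1)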
Example 6: Setup for computing M
. replace rep78=6 if missing(rep78)
(5 real changes made)

. sort rep78

. mata:
------------------------------------------------- mata (type end to exit) ------
: id   = st_data(., "rep78")
: y    = st_data(., "price")
: X    = st_data(., "mpg trunk")
: n    = rows(X)
: X    = X,J(n,1,1)
: k    = cols(X)
: XpX  = quadcross(X, X)
: XpXi = invsym(XpX)
: b    = XpXi*quadcross(X, y)
: e    = y - X*b
: end
--------------------------------------------------------------------------------
The Mata function panelsetup(Q, p) returns a matrix describing the group structure of the data when Q is sorted by the group variable in its pth column. I illustrate this function in example 7.
Example 7: panelsetup()
. list rep78 if rep78<3, sepby(rep78)
+-------+
| rep78 |
|-------|
1. | 1 |
2. | 1 |
|-------|
3. | 2 |
4. | 2 |
5. | 2 |
6. | 2 |
7. | 2 |
8. | 2 |
9. | 2 |
10. | 2 |
+-------+
. mata:
------------------------------------------------- mata (type end to exit) ------
: info = panelsetup(id, 1)
: info
         1    2
    +-----------+
  1 |   1    2  |
  2 |   3   10  |
  3 |  11   40  |
  4 |  41   58  |
  5 |  59   69  |
  6 |  70   74  |
    +-----------+

: end
--------------------------------------------------------------------------------
I begin by listing the group variable, rep78, in Stata for the first two groups. I then use panelsetup() to create info, which has one row for each group, with the first column containing the first row of that group and the second column containing the last row of that group. I display info to illustrate what it contains. The first row of info specifies that the first group begins in row 1 and ends in row 2, which matches the results produced by list. The second row of info specifies that the second group begins in row 3 and ends in row 10, which also matches the results produced by list.
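As an aside, info also makes derived quantities easy to compute; for example, the number of observations in each group is the difference of its two columns plus one. This one-liner is my addition, not part of the original session.

: // Group sizes: last row minus first row, plus one, for each group
: info[., 2] - info[., 1] :+ 1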
Having created info, I can use it and panelsubmatrix() to compute \(\Mb_c\).
Example 8: A cluster-robust VCE
. mata:
------------------------------------------------- mata (type end to exit) ------
: nc = rows(info)
: M  = J(k, k, 0)
: for(i=1; i<=nc; i++) {
>         xi = panelsubmatrix(X, i, info)
>         ei = panelsubmatrix(e, i, info)
>         M  = M + xi'*(ei*ei')*xi
> }
: V = ((n-1)/(n-k))*(nc/(nc-1))*XpXi*M*XpXi
: sqrt(diagonal(V))'
                 1             2             3
    +-------------------------------------------+
  1 |  93.28127184   58.89644366   2448.547376  |
    +-------------------------------------------+

: end
--------------------------------------------------------------------------------
After storing the number of groups in nc, I created an initial M as a k \(\times\) k matrix of zeros. For each group, I used panelsubmatrix() to extract the covariates for that group from X, I used panelsubmatrix() to extract the residuals for that group from e, and I added that group's contribution to M. After looping over the groups, I computed V and displayed the standard errors.
I now verify that my computations match those reported by regress.
Example 9: Comparing computations of the cluster-robust VCE
. regress price mpg trunk, vce(cluster rep78)

Linear regression                               Number of obs     =         74
                                                F(2, 5)           =       9.54
                                                Prob > F          =     0.0196
                                                R-squared         =     0.2222
                                                Root MSE          =     2637.6

                                  (Std. Err. adjusted for 6 clusters in rep78)
------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -220.1649   93.28127    -2.36   0.065     -459.952    19.62226
       trunk |   43.55851   58.89644     0.74   0.493    -107.8396    194.9566
       _cons |   10254.95   2448.547     4.19   0.009     3960.758    16549.14
------------------------------------------------------------------------------
Done and undone
I reviewed the formulas that underlie the OLS estimator and showed how to compute them in Mata. In the next two posts, I write an ado-command that implements these formulas.
References
Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and Applications. Cambridge: Cambridge University Press.
Lange, K. 2010. Numerical Analysis for Statisticians. 2nd ed. New York: Springer.
Stock, J. H., and M. W. Watson. 2010. Introduction to Econometrics. 3rd ed. Boston, MA: Addison-Wesley.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.
Wooldridge, J. M. 2015. Introductory Econometrics: A Modern Approach. 6th ed. Cincinnati, OH: South-Western.
