Sunday, February 22, 2026

Programming an estimation command in Stata: Mata functions


I show how to write a function in Mata, the matrix programming language that is part of Stata. This post uses concepts introduced in Programming an estimation command in Stata: Mata 101.

This is the twelfth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

Mata functions

Commands do work in Stata. Functions do work in Mata. Commands operate on Stata objects, like variables, and users specify options to alter the behavior. Mata functions accept arguments, operate on the arguments, and may return a result or alter the value of an argument to contain a result.

Consider myadd() defined below.

Code block 1: myadd()


mata:
function myadd(X, Y)
{
    A = X + Y
    return(A)
}
end

myadd() accepts two arguments, X and Y, places the sum of X and Y into A, and returns A. For instance,

Example 1: Defining and using a function


. mata:
------------------------------------------------- mata (type end to exit) ------
: function myadd(X, Y)
> {
>     A = X + Y
>     return(A)
> }

: C = J(3, 3, 4)

: D = I(3)

: W = myadd(C,D)

: W
[symmetric]
       1   2   3
    +-------------+
  1 |  5          |
  2 |  4   5      |
  3 |  4   4   5  |
    +-------------+

: end
--------------------------------------------------------------------------------

After defining myadd(), I create the matrices C and D, and I pass C and D into myadd(), which returns their sum. Mata assigns the returned sum to W, which I display. Note that inside the function myadd(), C and D are respectively known as X and Y.

The A created in myadd() can only be accessed inside myadd(), and it does not conflict with an A defined outside myadd(); that is, A is local to the function myadd(). I now illustrate that A is local to myadd().

Example 2: A is local to myadd()


. mata:
------------------------------------------------- mata (type end to exit) ------
: A = J(3, 3, 4)

: A
[symmetric]
       1   2   3
    +-------------+
  1 |  4          |
  2 |  4   4      |
  3 |  4   4   4  |
    +-------------+

: W = myadd(A,D)

: A
[symmetric]
       1   2   3
    +-------------+
  1 |  4          |
  2 |  4   4      |
  3 |  4   4   4  |
    +-------------+

: end
--------------------------------------------------------------------------------

Having illustrated that the A defined inside myadd() is local to myadd(), I should point out that the C and D matrices I defined in example 1 are in global Mata memory. As in ado-programs, we do not want to use fixed names in global Mata memory in our programs so that we do not destroy the users' data. Fortunately, this problem is easily solved by writing Mata functions that write their results out to Stata and do not return results. I will provide detailed discussions of this solution in the commands that I develop in subsequent posts.
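As a rough sketch of that idea, the function below stores its result in a Stata matrix with st_matrix() rather than returning it, so the caller does not need a fixed name in global Mata memory to hold the result. The function name myadd_st() and the result name r(sum) are hypothetical choices made for this illustration; the commands developed in later posts may organize this differently.

```mata
mata:
// myadd_st() is a hypothetical name used only for this sketch
void myadd_st(real matrix X, real matrix Y)
{
    // store X + Y in the Stata matrix r(sum) instead of returning it
    st_matrix("r(sum)", X + Y)
}

// pass expressions directly, so no global Mata matrices are created
myadd_st(J(2, 2, 4), I(2))

// read the stored result back from Stata to display it
st_matrix("r(sum)")
end
```

The function is declared void because it returns nothing; its only effect is the matrix it leaves behind in Stata.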

When a Mata function changes the value of an argument inside the function, that changes the value of that argument outside the function; in other words, arguments are passed by address. Mata functions can compute more than one result by storing these results in arguments. For example, sumproduct() returns both the sum and the element-wise product of two matrices.

Code block 2: sumproduct()


function sumproduct(X, Y, S, P)
{
	S = X +  Y
	P = X :* Y
	return
}

sumproduct() returns the sum of the arguments X and Y in the argument S and the element-wise product in P.

Example 3: Returning results in arguments


. mata:
------------------------------------------------- mata (type end to exit) ------
: function sumproduct(X, Y, S, P)
> {
>         S = X +  Y
>         P = X :* Y
>         return
> }

: A = I(3)

: B = rowshape(1::9, 3)

: A
[symmetric]
       1   2   3
    +-------------+
  1 |  1          |
  2 |  0   1      |
  3 |  0   0   1  |
    +-------------+

: B
       1   2   3
    +-------------+
  1 |  1   2   3  |
  2 |  4   5   6  |
  3 |  7   8   9  |
    +-------------+

: W=.

: W
  .

: Z=.

: Z
  .

: sumproduct(A, B, W, Z)

: W
        1    2    3
    +----------------+
  1 |   2    2    3  |
  2 |   4    6    6  |
  3 |   7    8   10  |
    +----------------+

: Z
[symmetric]
       1   2   3
    +-------------+
  1 |  1          |
  2 |  0   5      |
  3 |  0   0   9  |
    +-------------+

: end
--------------------------------------------------------------------------------

After defining sumproduct(), I use I() to create A and use rowshape() to create B. I then create W and Z; each is a scalar missing value. I must create W and Z before I pass them as arguments; otherwise, I would be referencing arguments that do not exist. After calling sumproduct(), I display W and Z to illustrate that they now contain the sum and element-wise product of A and B, which sumproduct() knows as X and Y.
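Passing by address also means that a function can overwrite one of its inputs. The sketch below is only an illustration of that behavior; addone() and M are hypothetical names that do not appear elsewhere in this post.

```mata
mata:
// addone() is a hypothetical function used only for this sketch
void addone(X)
{
    // overwrite the argument; the caller's matrix changes as well
    X = X :+ 1
}

M = J(2, 2, 0)
addone(M)

// M was modified by the call, even though addone() returns nothing
M
end
```

Because the change happens to the caller's M, it is worth making clear in the function's declaration or documentation which arguments are overwritten.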

In myadd() and sumproduct(), I did not specify what type of thing each argument must be, nor did I specify what type of thing each function would return. In other words, I used implicit declarations. Implicit declarations are easier to type than explicit declarations, but they make error messages and code less informative. I highly recommend explicitly declaring returns, arguments, and local variables to make your code and error messages more readable.

myadd2() is a version of myadd() that explicitly declares the type of thing returned, the type of thing that each argument must be, and the type that each local-to-the-function thing must be.

Code block 3: myadd2(): Explicit declarations


mata:
numeric matrix myadd2(numeric matrix X, numeric matrix Y)
{
    numeric matrix A

    A = X + Y
    return(A)
}
end

myadd2() returns a numeric matrix that it constructs by adding the numeric matrix X to the numeric matrix Y. The local-to-the-function object A is also a numeric matrix. A numeric matrix is either a real matrix or a complex matrix; it cannot be a string matrix.
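To see what the numeric declaration allows, the sketch below passes real and complex inputs to a copy of the same function and uses eltype() to report the element type of each result. The name myadd3() is hypothetical; it is used here only so that the sketch stands on its own without redefining myadd2().

```mata
mata:
// myadd3() is a hypothetical copy of myadd2() used only for this sketch
numeric matrix myadd3(numeric matrix X, numeric matrix Y)
{
    numeric matrix A

    A = X + Y
    return(A)
}

// real plus real
eltype(myadd3(I(2), I(2)))

// real plus complex; 1i is the imaginary unit
eltype(myadd3(I(2), J(2, 2, 1i)))
end
```

Both calls are accepted because real and complex matrices each satisfy the numeric declaration.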

Explicitly declaring returns, arguments, and local variables makes the code more informative. I immediately see that myadd2() does not work with string matrices, but this property is buried in the code for myadd().

I cannot sufficiently stress the importance of writing easy-to-read code. Reading other people's code is an essential part of programming. It is instructional, and it allows you to adopt the solutions that others have found or implemented. If you are new to programming, you may not yet realize that after a few months, reading your own code is like reading someone else's code. Even if you never give your code to anyone else, it is essential that you write easy-to-read code so that you can read it at a later date.

Explicit declarations also make some error messages easier to track down. In examples 4 and 5, I pass a string matrix to myadd() and to myadd2(), respectively.

Example 4: Passing a string matrix to myadd()


. mata:
------------------------------------------------- mata (type end to exit) ------
: B = I(3)

: C = J(3,3,"hello")

: myadd(B,C)
                 myadd():  3250  type mismatch
                 <istmt>:     -  function returned error
(0 lines skipped)
--------------------------------------------------------------------------------
r(3250);

Example 5: Passing a string matrix to myadd2()


. mata:
------------------------------------------------- mata (type end to exit) ------
: B = I(3)

: C = J(3,3,"hello")

: numeric matrix myadd2(numeric matrix X, numeric matrix Y)
> {
>     numeric matrix A
> 
>     A = X + Y
>     return(A)
> }

: myadd2(B,C)
                myadd2():  3251  C[3,3] found where real or complex required
                 <istmt>:     -  function returned error
(0 lines skipped)
--------------------------------------------------------------------------------
r(3251);

end of do-file

The error message in example 4 indicates that somewhere in myadd(), an operator or a function could not perform an operation on two objects because their types were not compatible. Do not be deluded by the simplicity of myadd(). Tracking down a type mismatch in real code can be difficult.

In contrast, the error message in example 5 says that the matrix C we passed to myadd2() is neither a real nor a complex matrix, as the argument of myadd2() requires. Looking at the code and the error message immediately informs me that the problem is that I passed a string matrix to a function that requires a numeric matrix.

Explicit declarations are so highly recommended that Mata has a setting to require them, as illustrated below.

Example 6: Turning on matastrict


. mata: mata set matastrict on

Setting matastrict to on causes the Mata compiler to require that return and local variables be explicitly declared. By default, matastrict is set to off, in which case return and local variables may be implicitly declared.
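As a small sketch of code written to satisfy this setting, the function below declares its return type, its argument, and its one local variable. The function mymean() is a hypothetical example that is not part of this series.

```mata
mata:
mata set matastrict on

// mymean() is a hypothetical function used only for this sketch
real scalar mymean(real colvector x)
{
    // the local variable n is declared before it is used
    real scalar n

    n = rows(x)
    return(sum(x) / n)
}

mymean((1 \ 2 \ 3))
end
```

Under the rule just described, deleting the declaration of n would cause the compiler to reject mymean() while matastrict is on.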

When matastrict is set to on, arguments are not required to be explicitly declared because some arguments hold results whose input and output types could differ. Consider makereal() defined and used in example 7.

Example 7: Changing an argument's type


. mata:
------------------------------------------------- mata (type end to exit) ------
: void makereal(A)
> {
>         A = substr(A, 11,1) 
>         A = strtoreal(A)
> }

: A = J(2,2, "Number is 4")

: A
[symmetric]
                 1             2
    +-----------------------------+
  1 |  Number is 4                |
  2 |  Number is 4   Number is 4  |
    +-----------------------------+

: makereal(A)

: A + I(2)
[symmetric]
       1   2
    +---------+
  1 |  5      |
  2 |  4   5  |
    +---------+

: end
--------------------------------------------------------------------------------

The declaration of makereal() specifies that makereal() returns nothing because void comes before the name of the function. Even though matastrict is set to on, I did not declare what type of thing A must be. The two executable lines of makereal() clarify that A must be a string on input and that A will be real on output, which I subsequently illustrate.

I use the feature that arguments may be implicitly declared to make my code easier to read. Many of the Mata functions that I write replace arguments with results. I explicitly declare arguments that are inputs, and I implicitly declare arguments that contain outputs. Consider sumproduct2().

Code block 4: sumproduct2(): Explicit declarations of inputs but not outputs


void sumproduct2(real matrix X, real matrix Y, S, P)
{
	S = X +  Y
	P = X :* Y
	return
}

sumproduct2() returns nothing because void comes before the function name. The first argument X is a real matrix, the second argument Y is a real matrix, the third argument S is implicitly declared, and the fourth argument P is implicitly declared. My coding convention that inputs are explicitly declared and that outputs are implicitly declared immediately informs me that X and Y are inputs but that S and P are outputs. That X and Y are inputs and that S and P are outputs is illustrated in example 8.

Example 8: Explicitly declaring inputs but not outputs


. mata:
------------------------------------------------- mata (type end to exit) ------
: void sumproduct2(real matrix X, real matrix Y, S, P)
> {
>         S = X +  Y
>         P = X :* Y
>         return
> }

: A = I(2)

: B = rowshape(1::4, 2)

: C = .

: D = .

: sumproduct2(A, B, C, D)

: C
       1   2
    +---------+
  1 |  2   2  |
  2 |  3   5  |
    +---------+

: D
[symmetric]
       1   2
    +---------+
  1 |  1      |
  2 |  0   4  |
    +---------+

: end
--------------------------------------------------------------------------------

Done and undone

I showed how to write a function in Mata and discussed declarations in some detail. Type help m2_declarations for many more details.

In my next post, I use Mata functions to perform the computations for a simple estimation command.


