I use the syntax command to improve the command that implements the ordinary least-squares (OLS) estimator that I discussed in Programming an estimation command in Stata: A first command for OLS. I show how to require that all variables be numeric variables and how to make the command accept time-series operated variables.
This is the seventh post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.
Stata syntax and the syntax command
The myregress2 command described in Programming an estimation command in Stata: A first command for OLS has the syntax
myregress2 depvar [indepvars]
This syntax requires that the dependent variable be specified because depvar is not enclosed in square brackets. The independent variables are optional because indepvars is enclosed in square brackets. Type
. help language
for an introduction to reading Stata syntax diagrams.
This syntax is implemented by the syntax command in line 5 of myregress2.ado, which I discussed at length in Programming an estimation command in Stata: A first command for OLS. The user must specify a list of variable names because varlist is not enclosed in square brackets. The syntax of the syntax command follows the rules of a syntax diagram.
*! version 2.0.0  26Oct2015
program define myregress2, eclass
    version 14

    syntax varlist
    gettoken depvar : varlist

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist'
    local p : word count `varlist'
    local p = `p' + 1
    matrix `xpx'  = `zpz'[2..`p', 2..`p']
    matrix `xpy'  = `zpz'[2..`p', 1]
    matrix `xpxi' = syminv(`xpx')
    matrix `b'    = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b'
    quietly generate double `res'  = (`depvar' - `xbhat')
    quietly generate double `res2' = (`res')^2
    quietly summarize `res2'
    local N   = r(N)
    local sum = r(sum)
    local s2  = `sum'/(`N'-(`p'-1))
    matrix `V' = `s2'*`xpxi'
    ereturn post `b' `V'
    ereturn local cmd "myregress2"
    ereturn display
end
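The rule that square brackets mark an optional element can be seen in isolation with two small programs. The sketch below is illustrative only: showreq and showopt are hypothetical names that are not part of this series, and the behavior noted in the comments is what I would expect from the documented rules for syntax.

```stata
* Hypothetical helper programs (not from the series) that contrast a
* required varlist with an optional one.
capture program drop showreq
program define showreq
    version 14
    syntax varlist                  // no brackets: varlist is required
    display "varlist is `varlist'"
end

capture program drop showopt
program define showopt
    version 14
    syntax [varlist]                // brackets: varlist is optional
    display "varlist is `varlist'"
end

sysuse auto, clear
showreq price mpg                   // varlist holds the two names typed
capture noisily showreq             // syntax stops with "varlist required"
showopt                             // an omitted optional varlist defaults to all variables
```

Writing syntax [varlist(default=none)] instead would leave the macro empty when the user types no variables, which is often the more useful behavior for an estimation command.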
Example 1 illustrates that myregress2 runs the requested regression when I specify a varlist.
Example 1: myregress2 with specified variables
. sysuse auto
(1978 Automobile Data)

. myregress2 price mpg trunk
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -220.1649   65.59262    -3.36   0.001    -348.7241    -91.6057
       trunk |   43.55851   88.71884     0.49   0.623    -130.3272    217.4442
       _cons |   10254.95   2349.084     4.37   0.000      5650.83    14859.07
------------------------------------------------------------------------------
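One way to check these results is to compare them with those from regress. The sketch below assumes that myregress2.ado is saved somewhere on the adopath; the matrix names b_my, V_my, b_reg, and V_reg are my own choices for illustration.

```stata
* Assumes myregress2.ado (listed above) is saved on the adopath.
sysuse auto, clear

myregress2 price mpg trunk
matrix b_my = e(b)
matrix V_my = e(V)

regress price mpg trunk
matrix b_reg = e(b)
matrix V_reg = e(V)

* mreldif() returns the largest relative difference between two matrices
display "max rel. diff. in b: " mreldif(b_my, b_reg)
display "max rel. diff. in V: " mreldif(V_my, V_reg)
```

Both differences should be close to zero, up to numerical precision. The test statistics are labeled z in the myregress2 output and t in the regress output, so the reported p-values and confidence intervals can differ slightly even when the point estimates and standard errors agree.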
Example 2 illustrates that the syntax command displays an error message and stops execution when I do not specify a varlist. I use set trace on to see each line of code and the output it produces.
Example 2: myregress2 with no varlist
. set trace on

. myregress2
--------------------------------------------------------- begin myregress2 --
- version 14
- syntax varlist
varlist required
----------------------------------------------------------- end myregress2 --
r(100);
Example 3 illustrates that the syntax command checks that the specified variables are in the current dataset. syntax throws an error because DoesNotExist is not a variable in the current dataset.
Example 3: myregress2 with a variable not in this dataset
. set trace on

. myregress2 price mpg trunk DoesNotExist
--------------------------------------------------------- begin myregress2 --
- version 14
- syntax varlist
variable DoesNotExist not found
----------------------------------------------------------- end myregress2 --
r(111);

end of do-file

r(111);
Because the syntax command on line 5 does not restrict the specified variables to be numeric, I get the no observations error in example 4 instead of an error indicating the actual problem, which is the string variable make.
Example 4: myregress2 with a string variable
. describe make

              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and Model

. myregress2 price mpg trunk make
no observations
r(2000);

end of do-file

r(2000);
On line 5 of myregress3, I change varlist to accept only numeric variables. This change produces a more informative error message when I try to include a string variable in the regression.
*! version 3.0.0  30Oct2015
program define myregress3, eclass
    version 14

    syntax varlist(numeric)
    gettoken depvar : varlist

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist'
    local p : word count `varlist'
    local p = `p' + 1
    matrix `xpx'  = `zpz'[2..`p', 2..`p']
    matrix `xpy'  = `zpz'[2..`p', 1]
    matrix `xpxi' = syminv(`xpx')
    matrix `b'    = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b'
    quietly generate double `res'  = (`depvar' - `xbhat')
    quietly generate double `res2' = (`res')^2
    quietly summarize `res2'
    local N   = r(N)
    local sum = r(sum)
    local s2  = `sum'/(`N'-(`p'-1))
    matrix `V' = `s2'*`xpxi'
    ereturn post `b' `V'
    ereturn local cmd "myregress3"
    ereturn display
end
Example 5: myregress3 with a string variable
. set trace on

. myregress3 price mpg trunk make
--------------------------------------------------------- begin myregress3 --
- version 14
- syntax varlist(numeric)
string variables not allowed in varlist;
make is a string variable
----------------------------------------------------------- end myregress3 --
r(109);

end of do-file

r(109);
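When the trace is not needed, the error can be inspected without stopping a do-file by wrapping the call in capture noisily and reading the return code from _rc. This is only a sketch, and it assumes that myregress3.ado is saved on the adopath.

```stata
* Assumes myregress3.ado (listed above) is saved on the adopath.
set trace off
sysuse auto, clear

* capture noisily shows the error message but lets the do-file continue
capture noisily myregress3 price mpg trunk make

* _rc holds the return code of the captured command (109 in example 5)
display "return code: " _rc
```

Because the check happens inside syntax, the command stops before any matrices or temporary variables are created.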
On line 5 of myregress4, I modify the varlist to accept time-series (ts) variables. The syntax command puts time-series variables in a canonical form that is stored in the local macro varlist, as illustrated by the display on line 6, whose output appears in example 6.
*! version 4.0.0  31Oct2015
program define myregress4, eclass
    version 14

    syntax varlist(numeric ts)
    display "varlist is `varlist'"
    gettoken depvar : varlist

    tempname zpz xpx xpy xpxi b V
    tempvar  xbhat res res2

    quietly matrix accum `zpz' = `varlist'
    local p : word count `varlist'
    local p = `p' + 1
    matrix `xpx'  = `zpz'[2..`p', 2..`p']
    matrix `xpy'  = `zpz'[2..`p', 1]
    matrix `xpxi' = syminv(`xpx')
    matrix `b'    = (`xpxi'*`xpy')'
    quietly matrix score double `xbhat' = `b'
    quietly generate double `res'  = (`depvar' - `xbhat')
    quietly generate double `res2' = (`res')^2
    quietly summarize `res2'
    local N   = r(N)
    local sum = r(sum)
    local s2  = `sum'/(`N'-(`p'-1))
    matrix `V' = `s2'*`xpxi'
    ereturn post `b' `V'
    ereturn local cmd "myregress4"
    ereturn display
end
Example 6: myregress4 with time-series variables
. sysuse gnp96
. myregress4 L(0/3).gnp
varlist is gnp96 L.gnp96 L2.gnp96 L3.gnp96
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       gnp96 |
         L1. |   1.277086   .0860652    14.84   0.000     1.108402    1.445771
         L2. |   -.135549   .1407719    -0.96   0.336    -.4114568    .1403588
         L3. |  -.1368326   .0871645    -1.57   0.116    -.3076719    .0340067
             |
       _cons |   -2.94825   14.36785    -0.21   0.837    -31.10871    25.21221
------------------------------------------------------------------------------
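The expansion of a time-series varlist can also be explored outside of a program with tsunab, which stores the expanded list in a local macro. The sketch below is an illustration under two assumptions: the macro name lags is my own choice, and myregress3.ado is saved on the adopath so that the effect of omitting ts can be seen.

```stata
sysuse gnp96, clear

* tsunab expands a varlist that contains time-series operators
tsunab lags : L(0/3).gnp96
display "expanded list: `lags'"

* Without ts in varlist(), syntax should reject time-series operators.
* Assumes myregress3.ado (listed above) is saved on the adopath.
capture noisily myregress3 L(0/3).gnp96
display "return code: " _rc
```

The data must be tsset for time-series operators to be interpreted, which is already the case for gnp96 as used in example 6.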
Done and undone
I used the syntax command to improve how myregress2 handles the variables specified by the user. I showed how to require that all variables be numeric variables and how to make the command accept time-series operated variables. In the next post, I show how to make the command allow for sample restrictions, how to handle missing values, how to allow for factor-operated variables, and how to deal with perfectly collinear variables.
