Time-series data, such as financial data, often have known gaps because there are no observations on days such as weekends or holidays. Using regular Stata datetime formats with time-series data that have gaps can result in misleading analysis. Rather than treating these gaps as missing values, we should adjust our calculations appropriately. I illustrate a convenient way to work with irregularly spaced dates by using Stata's business calendars.
In nasdaq.dta, I have daily data on the NASDAQ index from February 5, 1971 to March 23, 2015 that I downloaded from the St. Louis Federal Reserve Economic Database (FRED).
. use http://www.stata.com/data/nasdaq
. describe
Contains data from http://www.stata.com/data/nasdaq.dta
  obs:        11,132
 vars:             2                          29 Jan 2016 16:21
 size:       155,848
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
date            str10   %10s                  Daily date
index           float   %9.0g                 NASDAQ Composite Index (1971=100)
-------------------------------------------------------------------------------
Sorted by:
date is the time variable in our data; it is stored as a string ordered as year, month, and day. I use the date() function to convert the string daily date to a Stata numeric date and store the values in mydate. To find out more about converting string dates to numeric dates, you can read A tour of datetime in Stata.
. generate mydate = date(date,"YMD")
. format %td mydate
I tsset these data with mydate as the time variable and then list the first five observations, along with the first lag of index.
. tsset mydate
        time variable:  mydate, 05feb1971 to 23mar2015, but with gaps
                delta:  1 day
. list date mydate index l.index in 1/5
     +------------------------------------------+
     |                                        L.|
     |       date      mydate    index    index |
     |------------------------------------------|
  1. | 1971-02-05   05feb1971      100        . |
  2. | 1971-02-08   08feb1971   100.84        . |
  3. | 1971-02-09   09feb1971   100.76   100.84 |
  4. | 1971-02-10   10feb1971   100.69   100.76 |
  5. | 1971-02-11   11feb1971   101.45   100.69 |
     +------------------------------------------+
The first observation on l.index is missing; I expect this because there are no observations prior to the first observation on index. However, the second observation on l.index is also missing. As you may have already noticed, the dates are irregularly spaced in my dataset: the first observation corresponds to a Friday and the second observation to a Monday.
I get missing data in this case because mydate is a regular date, and tsset-ing by a regular date treats all weekends and other holidays as if they are missing in the dataset instead of ignoring them in calculations. To avoid the problem of gaps inherent in business data, I can create a business calendar. Business calendars specify which dates are omitted. For daily financial data, a business calendar specifies the weekends and holidays for which the markets were closed.
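A quick way to see the weekend gap for yourself is sketched below. It assumes the data are still tsset by mydate as above; weekday is a hypothetical helper variable that is dropped at the end so that later listings are unaffected.

```stata
* dow() returns the day of week for a %td date: 0 = Sunday, ..., 6 = Saturday
generate byte weekday = dow(mydate)
list date mydate weekday in 1/5

* Count the observations whose first lag is missing under the regular calendar
count if missing(L.index)

* Remove the helper variable so the dataset is unchanged
drop weekday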
Creating business calendars
Business calendars are defined in files named calname.stbcal. You can create your own calendars, use the ones provided by StataCorp, or obtain them directly from other users or via the SSC. Calendars can also be created automatically from the current dataset using the bcal create command.
Every stbcal-file requires you to specify the following four things:
- the version of Stata being used
- the range of the calendar
- the center date of the calendar
- the dates to be omitted
I begin by creating nasdaq.stbcal, which will omit Saturdays and Sundays of every month. I do this using the Do-file editor, but you can use any text editor.
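For the automatic route, a minimal sketch might look like the following. The calendar name nasdaq_auto is hypothetical, and the sketch assumes mydate is the %td variable created earlier. bcal create infers the omitted dates from the gaps it finds in the data, so the resulting calendar is only as complete as the dataset itself.

```stata
* Build nasdaq_auto.stbcal in the current directory from the dates observed in mydate
* (nasdaq_auto is a made-up name; replace overwrites an existing file of that name)
bcal create nasdaq_auto, from(mydate) purpose("Calendar inferred from nasdaq.dta") replace

* Summarize the calendar that was just created
bcal describe nasdaq_auto
```

The rest of this post writes the calendar by hand instead, which makes each omit rule explicit.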
version 14.1
purpose "Converting daily financial data into business calendar dates"
dateformat dmy
range 05feb1971 23mar2015
centerdate 05feb1971
omit dayofweek (Sa Su)
The first line specifies the current version of Stata I am using. The second line is optional, but the text typed there is displayed when I type bcal describe nasdaq and is good for record keeping when I have multiple calendars. Line 3 specifies the display date format and is also optional. Line 4 specifies the range of dates in the dataset.
Line 5 sets the center date of the calendar to 05feb1971. I picked the first date in the sample, but I could have picked any date in the range specified for the business calendar. The centerdate does not have to be the actual midpoint of the sample; it is simply the date that is encoded as 0. For example, Stata's default %td calendar uses 01jan1960 as its center.
The last statement omits the weekends of every month. Later, I will show several variations of the omit command to omit other holidays. Once I have a business calendar, I can use it to convert regular dates to business dates, share the file with colleagues, and make further changes to my calendar.
Using a business calendar
. bcal load nasdaq
loading ./nasdaq.stbcal ...
  1. version 14.1
  2. purpose "Converting daily financial data into business calendar dates"
  3. dateformat dmy
  4. range 05feb1971 23mar2015
  5. centerdate 05feb1971
  6. omit dayofweek (Sa Su)
(calendar loaded successfully)
. generate bcaldate = bofd("nasdaq",mydate)
. assert !missing(bcaldate) if !missing(mydate)
To create business dates using bofd(), I specified two arguments: the name of the business calendar and the name of the variable containing regular dates. The assert statement verifies that all dates recorded in mydate appear in the business calendar. This is a way of checking that I created my calendar for the complete date range, because the bofd() function returns a missing value when mydate does not appear on the specified calendar.
Business dates have a specific display format, %tbcalname, which in my case is %tbnasdaq. To display business dates in a Stata date format, I apply this format to bcaldate just as I would for a regular date.
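If the assert ever fails, the offending dates can be inspected directly. The lines below are a small sketch that assumes the nasdaq calendar is loaded and bcaldate was generated as above.

```stata
* List regular dates that did not map to a business date (none are expected here)
list date mydate if missing(bcaldate) & !missing(mydate)

* Review the calendar's purpose, range, and center date
bcal describe nasdaq
```

A date shows up in the first listing when it falls outside the calendar's range or on a day that the calendar omits.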
. format %tbnasdaq bcaldate
. list in 1/5
     +---------------------------------------------+
     |       date    index      mydate    bcaldate |
     |---------------------------------------------|
  1. | 1971-02-05      100   05feb1971   05feb1971 |
  2. | 1971-02-08   100.84   08feb1971   08feb1971 |
  3. | 1971-02-09   100.76   09feb1971   09feb1971 |
  4. | 1971-02-10   100.69   10feb1971   10feb1971 |
  5. | 1971-02-11   101.45   11feb1971   11feb1971 |
     +---------------------------------------------+
Although mydate and bcaldate look similar, they have different encodings. Now, I can tsset on the business date bcaldate and list the first five observations with the lag of index recalculated.
. tsset bcaldate
        time variable:  bcaldate, 05feb1971 to 23mar2015, but with gaps
                delta:  1 day
. list bcaldate index l.index in 1/5
     +-----------------------------+
     |                           L.|
     |  bcaldate    index    index |
     |-----------------------------|
  1. | 05feb1971      100        . |
  2. | 08feb1971   100.84      100 |
  3. | 09feb1971   100.76   100.84 |
  4. | 10feb1971   100.69   100.76 |
  5. | 11feb1971   101.45   100.69 |
     +-----------------------------+
As expected, the issue of gaps due to weekends is now resolved. Because I have a calendar that excludes Saturdays and Sundays, bcaldate skipped the weekend between 05feb1971 and 08feb1971 when calculating the lagged index value and will do the same for any subsequent weekends in the data.
Excluding specific dates
So far I have not excluded gaps in the data due to other major holidays, such as Thanksgiving and Christmas. Stata has several variations on the omit command that let you exclude specific dates. For example, I use the omit command to omit the Thanksgiving holiday (the fourth Thursday of November in the U.S.) by adding the following statement to my business calendar.
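The same applies to any calculation built on time-series operators. As an illustration, the sketch below computes daily log returns; logret is a hypothetical variable name, and the variable is dropped afterward so that later listings are unaffected.

```stata
* Daily log return in percent; L.index is now the previous business day
generate double logret = 100 * ln(index / L.index)
summarize logret

* Remove the illustration variable so the dataset is unchanged
drop logret
```

With a weekend-only calendar, the return on the first trading day after a holiday is still missing, because the holiday remains a gap in the business-date sequence.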
omit dowinmonth +4 Th of Nov
dowinmonth stands for day of week in month, and +4 Th of Nov refers to the fourth Thursday of November. This rule is applied to every year in the data.
Another major holiday is Christmas, with the NASDAQ closed on the 25th of December every year. I can omit this holiday in the calendar as
omit date 25dec*
The * in the statement above indicates that December 25 should be omitted for every year in my nasdaq calendar.
This rule alone is incomplete because the 25th may fall on a weekend, in which case the holiday is observed on the preceding Friday or the following Monday. To capture these cases, I add the following statements:
omit date 25dec* and (-1) if dow(Sa)
omit date 25dec* and (+1) if dow(Su)
The first statement omits December 24 if Christmas is on a Saturday, and the second statement omits December 26 if Christmas is on a Sunday.
Encodings
I mentioned earlier that the encodings of the regular date mydate and the business date bcaldate are different. To see the encodings of my date variables, I apply a numerical format and list the first five observations.
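Putting the pieces together, nasdaq.stbcal with the holiday rules from this section might read as follows. This is a sketch that only combines the statements shown above; it does not cover every market closure, and the comment lines are my own additions.

```stata
version 14.1
purpose "Converting daily financial data into business calendar dates"
dateformat dmy
range 05feb1971 23mar2015
centerdate 05feb1971
* Weekends
omit dayofweek (Sa Su)
* Thanksgiving: fourth Thursday of November
omit dowinmonth +4 Th of Nov
* Christmas, plus the observed Friday or Monday when the 25th falls on a weekend
omit date 25dec*
omit date 25dec* and (-1) if dow(Sa)
omit date 25dec* and (+1) if dow(Su)
```

After editing the file, I would run bcal load nasdaq again and regenerate bcaldate with bofd(), because changing the omitted dates changes the encoding of every later business date.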
. format %8.0g mydate bcaldate
. list in 1/5
     +-----------------------------------------+
     |       date    index   mydate   bcaldate |
     |-----------------------------------------|
  1. | 1971-02-05      100     4053          0 |
  2. | 1971-02-08   100.84     4056          1 |
  3. | 1971-02-09   100.76     4057          2 |
  4. | 1971-02-10   100.69     4058          3 |
  5. | 1971-02-11   101.45     4059          4 |
     +-----------------------------------------+
The variable bcaldate starts at 0 because 05feb1971 is the centerdate in my calendar nasdaq.stbcal. The business date encoding is consecutive without gaps, which is why lags and other time-series operators yield correct values.
Summary
Using regular dates instead of business dates with time-series data may be misleading when there are gaps in the data. In this post, I showed a convenient way to work with business dates by creating a business calendar. Once I loaded a calendar file into Stata, I created business dates using the bofd() function. I also showed some variations of the omit command used in business calendars to accommodate specific gaps due to different holidays.
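The mapping also works in the other direction through dofb(), the counterpart of bofd(). The sketch below assumes the weekend-only nasdaq calendar is loaded; mydate_back is a hypothetical helper variable that is dropped at the end.

```stata
* Convert business dates back to regular %td dates and confirm the round trip
generate long mydate_back = dofb(bcaldate, "nasdaq")
assert mydate_back == mydate

* Encode a single regular date by hand; 08feb1971 is the first business day after the center date
display bofd("nasdaq", td(08feb1971))

* Remove the helper variable so the dataset is unchanged
drop mydate_back
```

Given the listing above, the display command should print 1.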
