Monday, February 23, 2026

A tour of datetime in Stata


Converting a string date

Stata has a wide array of tools to work with dates. You may have dates in years, months, or even milliseconds. In this post, I will provide a brief tour of working with dates that will help you get started using all of Stata’s tools.

When you load a dataset, you will notice that every variable has a display format. For date variables, the display format is %td for daily dates, %tm for monthly dates, and so on. Let’s load the wpi1 dataset as an example.


. webuse wpi1

. describe

Contains data from http://www.stata-press.com/data/r14/wpi1.dta
  obs:           124                          
 vars:             3                          28 Nov 2014 10:31
 size:         1,240                          
-------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
-------------------------------------------------------------------------------
wpi             float   %9.0g                 U.S. Wholesale Price Index
t               int     %tq                   quarterly date
ln_wpi          float   %9.0g                 
-------------------------------------------------------------------------------
Sorted by: t

The display format for the variable t is %tq, which indicates a quarterly date. At this point, we can use tsset to declare the data as time series and continue our statistical endeavors. In reality, however, date variables are often stored in string or numeric formats that are not directly interpretable by Stata. Below, we use functions for translating dates stored in string and numeric formats into a Stata date format. For details about converting dates from other software, see “Using dates and times from other software”.

The two most widely used functions for translating string dates are date() and clock(). date() converts daily dates into the number of days elapsed since 01 Jan 1960. Similarly, clock() converts dates with time stamps into the number of milliseconds elapsed since 01 Jan 1960 00:00:00.000. For dates before 01 Jan 1960, Stata assigns negative numbers as the number of elapsed days. For example, 31 Dec 1959 is -1 days elapsed since 01 Jan 1960.
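The elapsed-value idea can be checked directly with display, which evaluates a function on a literal string instead of a variable. This is a minimal sketch for a do-file; the date strings are illustrative and are not part of the example dataset used below.

```stata
* date() returns elapsed days relative to 01 Jan 1960
* expected results: 0, -1, and 1
display date("01Jan1960", "DMY")
display date("31Dec1959", "DMY")
display date("02Jan1960", "DMY")

* clock() returns elapsed milliseconds, so one second past the base is 1000
display clock("01Jan1960 00:00:01", "DMYhms")
```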

Each of these translation functions takes two required arguments: the name of the string variable and a string, called the mask, that specifies the order of the date and time components. Let’s look at an example.


. clear

. input str18 mydate

                 mydate
  1. "20151001"
  2. "15-10-01"
  3. "102015"
  4. "01Oct2015 20:10"
  5. "14:10:05"
  6. end

I created a string variable mydate with five different types of dates as observations. The first observation, “20151001”, is ordered as year, month, and day, and the corresponding mask is “YMD”. I can translate this string date using date(mydate,"YMD") and store the translated date in the variable newdate.


. quietly generate double newdate = date(mydate,"YMD") in 1

I recommend using the storage type double for newdate to minimize loss of precision. This is particularly important for clock time dates, as we will see later. Let’s list what Stata stored in newdate.


. list mydate newdate in 1

     +--------------------+
     |   mydate   newdate |
     |--------------------|
  1. | 20151001     20362 |
     +--------------------+

The numeric value of 01 October 2015 is equivalent to 20,362 days elapsed since 01 January 1960. At this point, we have successfully translated a string date into a numeric date. We can resume our analysis with newdate as our date variable. In a later section, I will change the display format using format to make it readable.

The next observation, “15-10-01”, is ordered the same way as the first except for the missing century and the presence of hyphens. To translate dates with a two-digit year, we need to tell Stata what century the year component refers to. In this case, we can enter the century by specifying “20YMD” as the mask. We can obtain the corresponding numeric date by typing


. quietly replace newdate = date(mydate,"20YMD") in 2
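As a quick check of the two-digit-year rule, the sketch below evaluates the same string three ways in a do-file. The third form uses date()'s optional third argument, the largest year a two-digit year may resolve to; the value 2050 is an arbitrary choice made for this illustration.

```stata
* No century information: the two-digit year cannot be resolved,
* so the result is missing (.)
display date("15-10-01", "YMD")

* Century supplied in the mask: expected result 20362
display date("15-10-01", "20YMD")

* Century implied by a top year of 2050: expected result 20362
display date("15-10-01", "YMD", 2050)
```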

Note that we did not account for the hyphens when converting the date above because Stata ignores any punctuation in string dates. The third observation, “102015”, refers to a month and year. Because the day component does not exist, we specify only “MY” as the mask.


. quietly replace newdate = date(mydate,"MY") in 3

. list mydate newdate in 3

     +------------------+
     | mydate   newdate |
     |------------------|
  3. | 102015     20362 |
     +------------------+

Although we did not specify the day component, the date() function with the “MY” mask still converted the string date to the number of days elapsed. This is because Stata assumes a default value of 1 for the day component, so we inadvertently translated 01 Oct 2015 when in fact the date existed only as Oct 2015 in our data.
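If a true monthly date is wanted rather than a daily date pinned to the first of the month, one option is to convert the daily value with mofd(), which maps elapsed days to elapsed months. The sketch below is self-contained; the variable names dailydate and monthdate are made up for this illustration.

```stata
clear
input str18 mydate
"102015"
end

* Daily date: the day component defaults to 1
generate double dailydate = date(mydate, "MY")
format %td dailydate

* Monthly date: months elapsed since 1960m1
generate int monthdate = mofd(dailydate)
format %tm monthdate

* dailydate displays as 01oct2015 and monthdate as 2015m10
list
```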

The fourth observation, “01Oct2015 20:10”, has an hour and a minute time stamp. In this case, we use the clock() function instead of date(). The corresponding mask is “DMYhm”.


. quietly replace newdate = clock(mydate,"DMYhm") in 4

. list mydate newdate in 4

     +-----------------------------+
     |          mydate     newdate |
     |-----------------------------|
  4. | 01Oct2015 20:10   1.759e+12 |
     +-----------------------------+

The numeric value in newdate is on the order of 10^12 milliseconds elapsed since 01 Jan 1960 00:00:00.000.
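Values of this size are the reason for storing clock times as double. A float cannot represent every integer near 10^12, so the stored value is rounded to a nearby representable number. The sketch below stores the same time stamp both ways; the variable names t_double and t_float are hypothetical.

```stata
clear
input str18 mydate
"01Oct2015 20:10"
end

* Same translation, two storage types
generate double t_double = clock(mydate, "DMYhm")
generate float  t_float  = clock(mydate, "DMYhm")

* Show both as datetimes; t_float is rounded to the nearest value
* a float can hold, so its displayed time can drift from 20:10:00
format %tc t_double t_float
list
```

At this magnitude, adjacent float values are about 131 seconds apart, so the rounding error can approach a minute.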

We can also ignore certain components in the string by using the # symbol for those components. For example, we can specify “DMYh#” as the mask to ignore the minute component. However, if we omit the # symbol and just use “DMYh” instead, we will obtain a missing value. Stata will not assume a default value of 00 for the minute component as it did earlier for the day. The reason is that Stata always expects the mask to account for the full string in the observation.

Finally, the last observation, “14:10:05”, is in hour, minute, and second form. As you may have guessed, the corresponding mask is simply “hms”. We can convert it to a numeric value using clock(mydate,"hms").
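As a small sketch of the time-only case, the lines below evaluate the same string in a do-file. When the string carries no date part, clock() measures the time from the base date, so the result is simply the number of milliseconds since midnight.

```stata
* (14*3600 + 10*60 + 5) * 1000 = 51005000 milliseconds
display clock("14:10:05", "hms")

* The same value shown as a datetime: 01jan1960 14:10:05
display %tc clock("14:10:05", "hms")
```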

I have used only five variations of string dates in this post. There are many more Stata functions that fit almost any need. Also, there are numerous translation functions, such as weekly() and monthly(), that I have not mentioned.

Displaying dates

We prefer working with readable dates instead of elapsed dates and times. Once we have converted to a numeric date, we can simply use format to change the display format. Let’s look at an example.


. clear

. input str18 mydate

                 mydate
  1. "01Jan2015"
  2. "02Jan2015"
  3. "03Jan2015"
  4. "04Jan2015"
  5. "05Jan2015"
  6. end

The variable mydate contains daily dates with day, month, and year components. As you may have guessed, we use the date() function with the mask “DMY” to translate mydate into a numeric date.


. generate double newdate = date(mydate,"DMY")

. list

     +---------------------+
     |    mydate   newdate |
     |---------------------|
  1. | 01Jan2015     20089 |
  2. | 02Jan2015     20090 |
  3. | 03Jan2015     20091 |
  4. | 04Jan2015     20092 |
  5. | 05Jan2015     20093 |
     +---------------------+

I stored the numeric dates in the variable newdate. The observations are stored as the number of elapsed days. To change the display format, we use the format command.


. format %td newdate

. list

     +-----------------------+
     |    mydate     newdate |
     |-----------------------|
  1. | 01Jan2015   01jan2015 |
  2. | 02Jan2015   02jan2015 |
  3. | 03Jan2015   03jan2015 |
  4. | 04Jan2015   04jan2015 |
  5. | 05Jan2015   05jan2015 |
     +-----------------------+

%td refers to the format for displaying daily dates. We can now tsset our data for time-series or panel-data analysis.
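The %td format also accepts detail codes that control how the date is written. As a rough sketch, the lines below rebuild a shortened version of the dataset, display the dates in year-month-day order, and then declare the time variable; the particular format string is just one possible choice.

```stata
clear
input str18 mydate
"01Jan2015"
"02Jan2015"
"03Jan2015"
end

generate double newdate = date(mydate, "DMY")

* CCYY = four-digit year, NN = two-digit month, DD = two-digit day
format %tdCCYY-NN-DD newdate

* newdate now displays as 2015-01-01, 2015-01-02, 2015-01-03
list

* Declare newdate as the time variable; the %td format marks it as daily
tsset newdate
```

Changing the format changes only how the values are shown; the stored elapsed-day numbers stay the same.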

Building dates from numeric components

Sometimes, dates exist as individual numeric components. In Stata, we can combine these individual components to obtain the desired date. Consider the following dataset as an example:


. clear

. input month day year hour minute

         month        day       year       hour     minute
  1. 01 10 2015 10 05
  2. 02 10 2015 05 10
  3. 03 15 2015 20 15
  4. 04 20 2015 12 20
  5. 05 05 2015 02 25
  6. end

I created five variables, namely month, day, year, hour, and minute. Let’s begin by combining the individual month, day, and year components in the variable newdate. Just as it provides functions for translating string dates into numeric dates, Stata provides a different set of functions for combining individual numeric date components. For example, we use the function mdy() to combine the month, day, and year components.


. generate double newdate = mdy(month,day,year)

. format %td newdate

. list newdate

     +-----------+
     |   newdate |
     |-----------|
  1. | 10jan2015 |
  2. | 10feb2015 |
  3. | 15mar2015 |
  4. | 20apr2015 |
  5. | 05may2015 |
     +-----------+

The arguments for mdy() are the individual month, day, and year components. Suppose we would like to add the hours and minutes to the existing date variable newdate. We use the function dhms(newdate,hour,minute,1), which takes date, hour, minute, and second components as arguments. Because seconds do not exist in the data, we pass 1 as a placeholder value for the seconds component in dhms(), which is why every time stamp below ends in :01.


. quietly replace newdate = dhms(newdate,hour,minute,1)

. format %tc newdate

. list newdate

     +--------------------+
     |            newdate |
     |--------------------|
  1. | 10jan2015 10:05:01 |
  2. | 10feb2015 05:10:01 |
  3. | 15mar2015 20:15:01 |
  4. | 20apr2015 12:20:01 |
  5. | 05may2015 02:25:01 |
     +--------------------+

We used %tc because we want to display a datetime format instead of a date format. As a final example, I will create a monthly date variable by combining the year and month using the ym() function and use %tm as the display format for monthly dates.


. quietly replace newdate = ym(year,month)

. format %tm newdate

. list newdate

     +---------+
     | newdate |
     |---------|
  1. |  2015m1 |
  2. |  2015m2 |
  3. |  2015m3 |
  4. |  2015m4 |
  5. |  2015m5 |
     +---------+

There are other useful functions for combining individual components that I have not discussed here, and you can read about them in the Stata manuals. But the basic idea remains the same. Once you know the individual date and time components, you can pick any of the datetime functions to combine them.
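As one illustration of those other functions, the sketch below builds a datetime in a single step with mdyhms() and then goes the other way, pulling the date and the hour back out with dofc() and hh(). The dataset is a shortened copy of the one above, the seconds are set to 0 by assumption, and the variable names stamp, dateonly, and hr are made up for this example.

```stata
clear
input month day year hour minute
1 10 2015 10 5
2 10 2015 5 10
end

* Combine all components at once; seconds are assumed to be 0
generate double stamp = mdyhms(month, day, year, hour, minute, 0)
format %tc stamp

* Recover the daily date from the datetime
generate int dateonly = dofc(stamp)
format %td dateonly

* Extract the hour component from the datetime
generate byte hr = hh(stamp)

list stamp dateonly hr
```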

Summary

We began our tour by noting that date variables have a specific display format, such as %td for daily dates. However, date variables in raw data are often stored as strings. We converted such string dates to numeric dates using the date() and clock() functions. Stata stores numeric dates as the number of elapsed days since 01 Jan 1960 for date() and the number of elapsed milliseconds since 01 Jan 1960 00:00:00.000 for clock(). We obtained a readable date by using the format command with %td for daily dates, %tc for datetimes, and %tm for monthly dates. Finally, we also combined individual numeric date and time components to form the desired date variable.


