Sunday, April 5, 2026

The Penultimate Guide to Precision


There have recently been occasional questions on precision and storage types on Statalist despite all that I have written on the subject, much of it posted on this blog. I take that as evidence that I have yet to produce a useful, readable piece that addresses all the questions researchers have.

So I want to try again. This time I’ll try to write the ultimate piece on the subject, making it as short and snappy as possible, and addressing every popular question of which I am aware (including some I haven’t addressed before), and doing all that without making you wade with me into all the messy details, which I know I have a tendency to do.

I am hopeful that from now on, every question that appears on Statalist that even remotely touches on the subject will be answered with a link back to this page. If I succeed, I will place this in the Stata manuals and get it indexed online in Stata so that users can find it the instant they have questions.

What follows is intended to provide everything scientific researchers need to know to judge the effect of storage precision on their work, to know what can go wrong, and to prevent that. I don’t want to raise expectations too much, however, so I will entitle it …

  • Contents

    1. Numeric types
    2. Floating-point types
    3. Integer types
    4. Integer precision
    5. Floating-point precision
    6. Advice concerning 0.1, 0.2, …
    7. Advice concerning exact data, such as currency data
    8. Advice for programmers
    9. How to interpret %21x format (if you care)
    10. Also see

  • Numeric types

    1.1 Stata provides five numeric types for storing variables, three of them integer types and two of them floating point.

    1.2 The floating-point types are float and double.

    1.3 The integer types are byte, int, and long.

    1.4 Stata uses these five types for the storage of data.

    1.5 Stata makes all calculations in double precision (and sometimes quad precision) regardless of the type used to store the data.

  • Floating-point types

    2.1 Stata provides two IEEE 754-2008 floating-point types: float and double.

    2.2 float variables are stored in 4 bytes.

    2.3 double variables are stored in 8 bytes.

    2.4 The ranges of float and double variables are

         Storage
         type             minimum                maximum
         -----------------------------------------------------
         float     -3.40282346639e+38       1.70141173319e+38
         double    -1.79769313486e+308      8.98846567431e+307
         -----------------------------------------------------
         In addition, float and double can record missing values
         ., .a, .b, ..., .z.

    The above values are approximations. For those familiar with %21x floating-point hexadecimal format, the exact values are

         Storage
         type                   minimum                maximum
         -------------------------------------------------------
         float   -1.fffffe0000000X+07f     +1.fffffe0000000X+07e
         double  -1.fffffffffffffX+3ff     +1.fffffffffffffX+3fe
         -------------------------------------------------------

    Said differently, and less precisely, float values are in the open interval (-2^128, 2^127), and double values are in the open interval (-2^1024, 2^1023). This is less precise because the intervals shown in the tables are closed intervals.
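    The relationship between the float maximum and 2^127 can be checked directly. This is a minimal sketch that assumes hexadecimal literals such as 1.fffffex+7e are accepted as input, as described in (9.5); no output is shown because it was not run here.

```stata
* the largest float, typed in %21x notation and shown in base 10
display %20.0g 1.fffffex+7e

* 2^127 in %21x; the float maximum sits just below it
display %21x 2^127

* comparison of the two values (1 means true)
display 1.fffffex+7e < 2^127
```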

  • Integer types

    3.1 Stata provides three integer storage formats: byte, int, and long. They are 1 byte, 2 bytes, and 4 bytes, respectively.

    3.2 Integers may also be stored in Stata’s IEEE 754-2008 floating-point storage formats float and double.

    3.3 Integer values may be stored precisely over the ranges

         Storage
         type                   minimum                 maximum
         ------------------------------------------------------
         byte                      -127                     100
         int                    -32,767                  32,740
         long            -2,147,483,647           2,147,483,620
         ------------------------------------------------------
         float              -16,777,216              16,777,216
         double  -9,007,199,254,740,992   9,007,199,254,740,992
         ------------------------------------------------------
         In addition, all storage types can record missing values
         ., .a, .b, ..., .z.

    The overall ranges of float and double were shown in (2.4) and are wider than the ranges for them shown here. The ranges shown here are the subsets of the overall ranges over which no rounding of integer values occurs.
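    The limits in (3.3) can also be read from Stata itself. The sketch below assumes the c(minbyte), c(maxbyte), c(minint), c(maxint), c(minlong), and c(maxlong) system values listed by creturn list; the float and double limits are simply 2^24 and 2^53.

```stata
* integer storage limits as reported by Stata
display c(minbyte) "  " c(maxbyte)
display c(minint)  "  " c(maxint)
display %12.0f c(minlong) "  " %12.0f c(maxlong)

* largest integers that float and double store without rounding
display %21.0fc 2^24
display %21.0fc 2^53
```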

  • Integer precision

    4.1 (Automatic promotion.) For the integer storage types (byte, int, and long), numbers outside the ranges listed in (3.3) would be stored as missing (.) except that storage types are promoted automatically. As necessary, Stata promotes bytes to ints, ints to longs, and longs to doubles. Even if a variable is a byte, the effective range is still [-9,007,199,254,740,992, 9,007,199,254,740,992] in the sense that you could change a value of a byte variable to a large value and that value would be stored correctly; the variable that was a byte would, as if by magic, change its type to int, long, or double if that were necessary.
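    Automatic promotion is easy to watch in a scratch dataset. A minimal sketch; the variable name b is only for illustration, and the exact wording of the notes Stata prints may vary by release.

```stata
clear
set obs 1
generate byte b = 1

* 1000 does not fit in a byte, so the storage type is promoted
replace b = 1000
describe b

* a value beyond the long range forces promotion to double
replace b = 3000000000
describe b
```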

    4.2 (Data input.) Automatic promotion (4.1) applies after the data are input/read/imported/copied into Stata. When first reading, importing, copying, or creating data, it is your responsibility to choose appropriate storage types. Be aware that Stata’s default storage type is float, so if you have large integers, it is usually necessary to specify explicitly the types you wish to use.

    If you are unsure of the type to specify for your integer variables, specify double. After reading the data, you can use compress to demote storage types. compress never results in a loss of precision.
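    The "specify double, then compress" advice might look like this in practice. A sketch under the assumption that the data are created with generate rather than read from a file; id is a hypothetical variable name.

```stata
clear
set obs 3

* start with the safest type when unsure
generate double id = 16777217 + _n

* compress demotes the type as far as it can without losing precision
compress
describe id
list id
```

    The values here exceed the float integer range, so compress should settle on long rather than float.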

    4.3 Note that you can use the floating-point types float and double to store integer data.

    4.3.1 Integers outside the range [-2,147,483,647, 2,147,483,620] must be stored as doubles if they are to be precisely recorded.

    4.3.2 Integers can be stored as float, but avoid doing that unless you are certain they will be inside the range [-16,777,216, 16,777,216] not just when you initially read, import, or copy them into Stata, but subsequently as you make transformations.

    4.3.3 If you read your integer data as floats, and assuming they are within the allowed range, we recommend that you change them to an integer type. You can do that simply by typing compress. We make that recommendation so that your integer variables will benefit from the automatic promotion described in (4.1).

    4.4 Let us show what can go wrong if you do not follow our advice in (4.3). For the floating-point types (float and double), integer values outside the ranges listed in (3.3) are rounded.

    Consider a float variable, and remember that the integer range for floats is [-16,777,216, 16,777,216]. If you tried to store a value outside the range in the variable (say, 16,777,221) and if you checked afterward, you would discover that what was actually stored was 16,777,220! Here are some other examples of rounding:

         desired value                           stored (rounded)
         to store            true value             float value
         ------------------------------------------------------
         maximum             16,777,216              16,777,216
         maximum+1           16,777,217              16,777,216
         ------------------------------------------------------
         maximum+2           16,777,218              16,777,218
         ------------------------------------------------------
         maximum+3           16,777,219              16,777,220
         maximum+4           16,777,220              16,777,220
         maximum+5           16,777,221              16,777,220
         ------------------------------------------------------
         maximum+6           16,777,222              16,777,222
         ------------------------------------------------------
         maximum+7           16,777,223              16,777,224
         maximum+8           16,777,224              16,777,224
         maximum+9           16,777,225              16,777,224
         ------------------------------------------------------
         maximum+10          16,777,226              16,777,226
         ------------------------------------------------------

    When you store large integers in float variables, values will be rounded and no mention will be made of that fact.

    And that is why we say that if you have integer data that must be recorded precisely and if the values might be large (outside the range ±16,777,216), do not use float. Use long or use double; or just use the compress command and let automatic promotion handle the problem for you.
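    The rounding in (4.4) can be reproduced without creating any variables, because the float() function rounds its argument to float precision. A minimal sketch; the expected values in the comments are taken from the table in (4.4).

```stata
* float() returns its argument rounded to float precision
display %12.0fc float(16777216)   // 16,777,216 (exact)
display %12.0fc float(16777217)   // 16,777,216
display %12.0fc float(16777221)   // 16,777,220
display %12.0fc float(16777223)   // 16,777,224
```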

    4.5 Unlike byte, int, and long, float and double variables are not promoted to preserve integer precision.

    Float values are not promoted because, well, they are not. Actually, there is a deep reason, but it has to do with the use of float variables for their real purpose, which is to store non-integer values.

    Double values are not promoted because there is nothing to promote them to. Double is Stata’s most precise storage type. The largest integer value Stata can store precisely is 9,007,199,254,740,992 and the smallest is -9,007,199,254,740,992.

    Integer values outside the range for doubles round in the same way that float values round, except at absolutely larger values.

  • Floating-point precision

    5.1 The smallest, nonzero value that can be stored in float and double is

         Storage
         type      value          value in %21x         value in base 10
         -----------------------------------------------------------------
         float     ±2^-127    ±1.0000000000000X-07f   ±5.877471754111e-039
         double    ±2^-1022   ±1.0000000000000X-3fe   ±2.225073858507e-308
         -----------------------------------------------------------------

    We include the value shown in the third column, the value in %21x, for those who know how to read it. It is described in (9), but it is unimportant. We are merely emphasizing that these are the smallest values for properly normalized numbers.

    5.2 The smallest value of epsilon such that 1+epsilon ≠ 1 is

         Storage
         type      epsilon       epsilon in %21x        epsilon in base 10
         -----------------------------------------------------------------
         float      ±2^-23     ±1.0000000000000X-017    ±1.19209289551e-07
         double     ±2^-52     ±1.0000000000000X-034    ±2.22044604925e-16
         -----------------------------------------------------------------

    Epsilon is the distance from 1 to the next number on the floating-point number line. The corresponding unit roundoff error is u = ±epsilon/2. The unit roundoff error is the maximum relative roundoff error that is introduced by the floating-point number storage scheme.

    The smallest value of epsilon such that x+epsilon ≠ x is approximately |x|*epsilon, and the corresponding unit roundoff error is ±|x|*epsilon/2.
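    The epsilon values can be confirmed interactively. This sketch assumes the c(epsfloat) and c(epsdouble) system values reported by creturn list; the comparisons rely only on Stata evaluating expressions in double precision, as stated in (1.5).

```stata
* epsilon for each type, in %21x
display %21x c(epsfloat)
display %21x c(epsdouble)

* in double precision, adding 2^-52 to 1 changes it; adding 2^-53 does not
display (1 + 2^(-52)) != 1
display (1 + 2^(-53)) != 1

* the same comparison at float precision, using float() to round
display float(1 + 2^(-23)) != 1
display float(1 + 2^(-24)) != 1
```

    Each pair should print 1 (true) and then 0 (false).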

    5.3 The precision of the floating-point types is, depending on how you want to measure it,

         Measurement                           float              double
         ----------------------------------------------------------------
         # of binary digits                       23                  52
         # of base 10 digits (approximate)         7                  16

         Relative precision                   ±2^-24              ±2^-53
         ... in base 10 (approximate)      ±5.96e-08           ±1.11e-16
         ----------------------------------------------------------------

    Relative precision is defined as

                           |x - x_as_stored|
                  ± max   ------------------
                     x            x

    performed using infinite precision arithmetic, x chosen from the subset of reals between the minimum and maximum values that can be stored. It is worth appreciating that relative precision is a worst-case relative error over all possible numbers that can be stored. Relative precision is identical to roundoff error, but perhaps this definition is easier to appreciate.

    5.4 Stata never makes calculations in float precision, even if the data are stored as float.

    Stata makes double-precision calculations regardless of how the numeric data are stored. In some cases, Stata internally uses quad precision, which provides approximately 32 decimal digits of precision. If the result of the calculation is being stored back into a variable in the dataset, then the double (or quad) result is rounded as necessary to be stored.

    5.5 (False precision.) Double precision is 536,870,912 times more accurate than float precision. You may worry that float precision is inadequate to accurately record your data.

    Little in this world is measured to a relative accuracy of ±2^-24, the accuracy provided by float precision.

    Ms. Smith, it is reported, made $112,293 this year. Do you believe that is recorded to an accuracy of ±2^-24*112,293, or approximately ±0.7 cents?

    David was born on 21jan1952, so on 27mar2012 he was 21,981 days old, or 60.18 years old. Recorded in float precision, the precision is ±60.18*2^-24, or roughly ±1.89 minutes.

    Joe reported that he drives 12,234 miles per year. Do you believe that Joe’s report is accurate to ±12,234*2^-24, equivalent to ±3.85 feet?

    A sample of 102,400 people reported that they drove, in total, 1,252,761,600 miles last year. Is that accurate to ±74.7 miles (float precision)? If it is, each of them is reporting with an accuracy of roughly ±3.85 feet.

    The distance from the Earth to the moon is often reported as 384,401 kilometers. Recorded as a float, the precision is ±384,401*2^-24, or ±23 meters, or ±0.023 kilometers. Because the number was not reported as 384,401.000, one would assume float precision would be accurate to record that result. In fact, float precision is more than sufficiently accurate to record the distance because the distance from the Earth to the moon varies from 356,400 to 406,700 kilometers, some 50,300 kilometers. The distance would have been better reported as 384,401 ±25,150 kilometers. At best, the measurement 384,401 has relative accuracy of ±0.033 (it is accurate to roughly two digits).

    Nonetheless, a few things have been measured with more than float accuracy, and they stand out as crowning accomplishments of mankind. Use double as required.
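    The error bounds quoted in (5.5) are just the value times 2^-24, converted to a convenient unit. A sketch of the arithmetic; the conversion factors (365.25 days per year, 5,280 feet per mile) are assumptions made here, not figures from the text.

```stata
* salary: error in cents (about 0.7)
display 112293 * 2^(-24) * 100

* age in years: error in minutes (about 1.89)
display 60.18 * 2^(-24) * 365.25 * 24 * 60

* miles driven: error in feet (about 3.85)
display 12234 * 2^(-24) * 5280

* Earth-moon distance in kilometers: error in meters (about 23)
display 384401 * 2^(-24) * 1000
```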

  • Advice concerning 0.1, 0.2, …

    6.1 Stata uses base 2, binary. Popular numbers such as 0.1, 0.2, 100.21, and so on, have no exact binary representation in a finite number of binary digits. There are a few exceptions, such as 0.5 and 0.25, but not many.

    6.2 If you create a float variable containing 1.1 and list it, it will list as 1.1 but that is only because Stata’s default display format is %9.0g. If you changed that format to %16.0g, the result would appear as 1.1000000238419.

    This scares some users. If this scares you, go back and read (5.5) False Precision. The relative error is still a modest ±2^-24. The number 1.1000000238419 is likely a perfectly acceptable approximation to 1.1 because the 1.1 was never measured to an accuracy of less than ±2^-24 anyway.

    6.3 One reason perfectly acceptable approximations to 1.1 such as 1.1000000238419 may bother you is that you cannot select observations containing 1.1 by typing if x==1.1 if x is a float variable. You cannot because the 1.1 on the right is interpreted as double precision 1.1. To select the observations, you have to type if x==float(1.1).
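    The comparison problem in (6.3) can be seen in a one-observation dataset. A minimal sketch; x is a scratch variable.

```stata
clear
set obs 1
generate float x = 1.1

* the literal 1.1 is double precision, so nothing matches
count if x == 1.1

* rounding the literal to float precision makes the comparison succeed
count if x == float(1.1)
```

    The first count should report 0 and the second 1.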

    6.4 If this bothers you, record the data as doubles. It is best to do this at the point when you read the original data or when you make the original calculation. The number will then appear to be 1.1. It will not really be 1.1, but it will have less relative error, namely, ±2^-53.

    6.5 If you originally read the data and stored them as floats, it is still sometimes possible to recover the double-precision accuracy just as if you had originally read the data into doubles. You can do this if you know how many decimal digits were recorded after the decimal point and if the values are within a certain range.

    If there was one digit after the decimal point and if the data are in the range [-1,048,576, 1,048,576], which means the values could be -1,048,576, -1,048,575.9, …, -1, 0, 1, …, 1,048,575.9, 1,048,576, then typing

    . gen double y = round(x*10)/10

    will recover the full double-precision result. Stored in y will be the number in double precision just as if you had originally read it that way.

    It is not possible, however, to recover the original result if x is outside the range ±1,048,576 because the float variable contains too little information.

    You can do something similar when there are two, three, or more decimal digits:

         # digits to
         right of
         decimal pt.   range     command
         -----------------------------------------------------------------
             1      ±1,048,576   gen double y = round(x*10)/10
             2      ±  131,072   gen double y = round(x*100)/100
             3      ±   16,384   gen double y = round(x*1000)/1000
             4      ±    1,024   gen double y = round(x*10000)/10000
             5      ±      128   gen double y = round(x*100000)/100000
             6      ±       16   gen double y = round(x*1000000)/1000000
             7      ±        1   gen double y = round(x*10000000)/10000000
         -----------------------------------------------------------------

    Range is the range of x over which command will produce correct results. For instance, range = ±16 in the next-to-the-last line means that the values recorded in x must be -16 ≤ x ≤ 16.
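    The recovery described in (6.5) can be checked with %21x, which shows every stored bit. A minimal sketch for the one-decimal-digit case; x and y are scratch variables.

```stata
clear
set obs 1
generate float x = 1.1

* rebuild the value in double precision from the float
generate double y = round(x*10)/10

* x carries float precision only; y matches the double literal 1.1
display %21x x[1]
display %21x y[1]
display %21x 1.1
assert y == 1.1
```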

  • Advice concerning exact data, such as currency data

    7.1 Yes, there are exact data in this world. Such data are usually counts of something or are currency data, which you can think of as counts of pennies ($0.01) or the smallest unit in whatever currency you are using.

    7.2 Just because the data are exact does not mean you need exact answers. It may still be that calculated answers are adequate if the data are recorded to a relative accuracy of ±2^-24 (float). For most analyses, even of currency data, this is often adequate. The U.S. deficit in 2011 was $1.5 trillion. Stored as a float, this number has a (maximum) error of ±2^-24*1.5e+12 = ±$89,406.97. It would be difficult to imagine that ±$89,406.97 would affect any government decision maker dealing with the full $1.5 trillion.

    7.3 That said, you sometimes do need to make exact calculations. Banks tracking their accounts need exact amounts. It is not enough to say to account holders that we have your money within a few pennies, dollars, or hundreds of dollars.

    In that case, the currency data should be converted to integers (pennies) and stored as integers, and then processed as described in (4). Assuming the dollar-and-cent amounts were read into doubles, you can convert them into pennies by typing

    . replace x = round(x*100)
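    Why pennies rather than dollars and cents: sums of decimal fractions are not exact in binary, while sums of integers are. A minimal sketch using display only.

```stata
* dollars and cents: 0.10 + 0.20 is not exactly 0.30 in binary
display 0.1 + 0.2 == 0.3

* pennies: integers in range add exactly
display 10 + 20 == 30

* a product such as 1.1*100 can land slightly off the integer
display %21x 1.1*100
display %21x 110
```

    The first comparison should print 0 and the second 1; rounding after multiplying by 100 removes the stray bits that the last two lines make visible.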

    7.4 If you mistakenly read the currency data as a float, you do not have to re-read the data if the dollar amounts are between ±$131,072. You can type

    . gen double x_in_pennies = round(x*100)

    This works only if x is between ±131,072.

  • Advice for programmers

    8.1 Stata does all calculations in double (and sometimes quad) precision.

    Float precision may be adequate for recording most data, but float precision is inadequate for performing calculations. That is why Stata does all calculations in double precision. Float precision is also inadequate for storing the results of intermediate calculations.

    There is only one situation in which you need to exercise caution: if you create variables in the data containing intermediate results. Be sure to create all such variables as doubles.
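    In a do-file or program, that advice amounts to naming the type explicitly whenever an intermediate variable is created. A sketch using the auto dataset shipped with Stata; the variable name ratio is hypothetical.

```stata
sysuse auto, clear

* intermediate results are created as doubles, not the default float
generate double ratio = price/weight
summarize ratio

* alternatively, make double the default type for new variables
set type double
```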

    8.2 The same quad-precision routines StataCorp uses are available to you in Mata; see the manual entries [M-5] mean, [M-5] sum, [M-5] runningsum, and [M-5] quadcross. Use them as you judge necessary.
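    As an illustration of when quad precision might matter, the sketch below sums a vector whose large terms cancel. It assumes Mata's sum() and quadsum() functions as documented in [M-5] sum; no output is shown because the double-precision result depends on the order of accumulation.

```stata
mata:
// two huge values that cancel, plus a small value that is easily lost
x = (1e16 \ 1 \ -1e16)

// ordinary double-precision accumulation
sum(x)

// quad-precision accumulation
quadsum(x)
end
```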

  • How to interpret %21x format (if you care)

    9.1 Stata has a display format that will display IEEE 754-2008 floating-point numbers in their full binary glory but in a readable way. You probably don’t care; if so, skip this section.

    9.2 IEEE 754-2008 floating-point numbers are stored as a pair of numbers (a, b) that are given the interpretation

    z = a * 2^b

    where -2 < a < 2. In double precision, a is recorded with 52 binary digits. In float precision, a is recorded with 23 binary digits. For example, the number 2 is recorded in double precision as

    a = +1.0000000000000000000000000000000000000000000000000000
    b = +1

    The value of pi is recorded as

    a = +1.1001001000011111101101010100010001000010110100011000
    b = +1

    9.3 %21x presents a and b in base 16. The double-precision value of 2 is shown in %21x format as

    +1.0000000000000X+001

    and the value of pi is shown as

    +1.921fb54442d18X+001

    In the case of pi, the interpretation is

    a = +1.921fb54442d18 (base 16)
    b = +001             (base 16)

    Reading this requires practice. It helps to remember that one-half corresponds to 0.8 (base 16). Thus, we can see that a is slightly larger than 1.5 (base 10) and b = 1 (base 10), so _pi is something over 1.5*2^1 = 3.

    The number 100,000 in %21x is

    +1.86a0000000000X+010

    which is to say

    a = +1.86a0000000000 (base 16)
    b = +010             (base 16)

    We see that a is slightly over 1.5 (base 10), and b is 16 (base 10), so 100,000 is something over 1.5*2^16 = 98,304.

    9.4 %21x faithfully presents how the computer thinks of the number. For instance, we can easily see that the nice number 1.1 (base 10) is, in binary, a number with many digits to the right of the binary point:

    . display %21x 1.1
    +1.199999999999aX+000

    We can also see why 1.1 stored as a float is different from 1.1 stored as a double:

    . display %21x float(1.1)
    +1.19999a0000000X+000

    Float precision assigns fewer digits to the mantissa than does double precision, and 1.1 (base 10) in base 16 is a repeating hexadecimal.

    9.5 %21x can be used as an input format as well as an output format. For instance, Stata understands

    . gen x = 1.86ax+10

    Stored in x will be 100,000 (base 10).
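    The round trip between base 10 and %21x can be tried with display alone. A minimal sketch; the expected results in the comments come from the values quoted in (9.3) and (9.5).

```stata
* base 10 to %21x
display %21x 100000              // +1.86a0000000000X+010

* %21x back to base 10
display %12.0fc 1.86ax+10        // 100,000

* one-half: mantissa 1.0, exponent -1
display %21x 0.5
```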

    9.6 StataCorp has seen too many competent scientific programmers who, needing a perturbance for later use in their program, code something like

    epsilon = 1e-8

    It is worth examining that number:

    . display %21x 1e-8
    +1.5798ee2308c3aX-01b

    That is an ugly number that can only lead to the introduction of roundoff error in their program. A far better number would be

    epsilon = 1.0x-1b

    Stata and Mata understand the above statement because %21x may be used as input as well as output. Naturally, 1.0x-1b looks just like what it is,

    . display %21x 1.0x-1b
    +1.0000000000000X-01b

    and all those pretty zeros will reduce numerical roundoff error.

    In base 10, the pretty 1.0x-1b looks like

    . display %20.0g 1.0x-1b
    7.4505805969238e-09

    and that number may not look pretty to you, but you are not a base-2 digital computer.

    Perhaps the programmer feels that epsilon really needs to be closer to 1e-8. In %21x, we see that 1e-8 is +1.5798ee2308c3aX-01b, so if we want to get closer, perhaps we use

    epsilon = 1.6x-1b

    9.7 %21x was invented by StataCorp.

  • Also see

    If you wish to learn more, see

    How to read the %21x format

    How to read the %21x format, part 2

    Precision (yet again), Part I

    Precision (yet again), Part II


