Tuesday, January 13, 2026

Creating Excel tables with putexcel, half 2: Macro, image, matrix, and system expressions


In my final submit, I confirmed how you can use putexcel to jot down easy expressions to Microsoft Excel and format the ensuing textual content and cells. As we speak, I need to present you how you can write extra complicated expressions resembling macros, graphs, and matrices. I’ll even present you how you can write formulation to Excel to create calculated cells. These are vital steps towards our aim of automating the creation of stories in Excel.

Earlier than we start the examples, I need to level out that I’m going to create a single Excel file referred to as putexcel2.xlsx and write every instance to a separate sheet inside that file. So the putexcel set instructions will appear like this:


putexcel set putexcel2.xlsx, sheet(example1) change
putexcel set putexcel2.xlsx, sheet(example2) modify
...
putexcel set putexcel2.xlsx, sheet(example8) modify

I may create a separate Excel file for every instance, however I would favor to have a single file with a number of sheets. It’s possible you’ll want to use the same strategy when creating massive stories.

You’ll be able to obtain the dataset used within the examples by typing webuse nhanes2. I’m utilizing a subset of the information and have relabeled a few of the variables so don’t be involved in case your numbers don’t match mine precisely.

Instance 1: Writing returned outcomes to Excel

Many Stata instructions return scalars, macros, and matrices. We are able to write every of those to an Excel desk utilizing putexcel. For instance, I would want to write the imply of age to Excel. I can sort return listing after summarize age to see a listing of returned outcomes. The imply is saved within the scalar r(imply).


. summarize age

    Variable |        Obs        Imply    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         age |      1,266    48.44076    16.98858         20         74

. return listing

scalars:
                  r(N) =  1266
              r(sum_w) =  1266
               r(imply) =  48.44075829383886
                r(Var) =  288.6119026656426
                 r(sd) =  16.98858153777539
                r(min) =  20
                r(max) =  74
                r(sum) =  61326

I start by setting the vacation spot to the sheet “example1” within the file putexcel2.xlxs. I can then write the expression “Imply Age = ” to cell A1 and the expression `r(imply)’ to cell B1. Notice that r(imply) is enclosed in single quotes. This tells Stata that I need to write the worth of r(imply) to cell B2.


. putexcel set putexcel2.xlsx, sheet(example1) change
Notice: file might be changed when the primary putexcel command is issued

. putexcel A1 = "Imply Age = "
file putexcel2.xlsx saved

. putexcel B1 = `r(imply)'
file putexcel2.xlsx saved

I can open the ensuing Excel file and see that the imply was efficiently written to cell B2.

Instance 2: Formatting numbers with Excel codecs

I can specify the show format of a quantity utilizing Excel codecs within the nformat() choice. For instance, I may show the imply with three decimal locations utilizing the choice nformat(“#.###”).


. putexcel set putexcel2.xlsx, sheet(example2) modify

. putexcel A1 = "Imply Age = "
file putexcel2.xlsx saved

. putexcel B1 = `r(imply)', nformat("#.###")
file putexcel2.xlsx saved

graph1

You’ll be able to view an entire description of Excel’s formatting choices within the Choices part of [P] putexcel superior.

Instance 3: Formatting numbers with Stata codecs

Stata contains many shortcut format codes that can be utilized with nformat(). For instance, I may show the imply with two decimal locations utilizing the choice number_d2.


. putexcel set putexcel2.xlsx, sheet(example3) modify

. putexcel A1 = "Imply Age = "
file putexcel2.xlsx saved

. putexcel B1 = `r(imply)', nformat(number_d2)
file putexcel2.xlsx saved

graph1

You’ll be able to view an entire listing of codes for numeric codecs within the Appendix of [P] putexcel.

Instance 4: Formatting numbers with the string() perform

I may additionally report the imply and customary deviation in a single cell. This may be completed in two steps. First, I retailer r(imply) and r(sd) to the native macros meanage and sdage, respectively. The string() perform permits me to specify one decimal place. Second, I create an area macro named meansd that mixes meanage and sdage right into a single expression. Notice that I have to use double quotes within the command putexcel B1 = “`meansd’” as a result of meansd is a string.


. // put the imply of age in a cell utilizing the string() perform
. putexcel set putexcel2.xlsx, sheet(example4) modify

. putexcel A1 = "Imply (SD) = "
file putexcel2.xlsx saved

. native meanage = string(`r(imply)',"%9.1f")

. native sdage = string(`r(sd)',"%9.1f")

. native meansd = "`meanage' (`sdage')"

. putexcel B1 = "`meansd'"
file putexcel2.xlsx saved

graph1

Instance 5: Including a graph to an Excel file

I would want to embody a histogram of age in my Excel file. First, I create the histogram and use graph export to avoid wasting the graph to a .png file.


. histogram age
(bin=31, begin=20, width=1.7419355)

graph1


. graph export age.png, change
(file age.png written in PNG format)

Then, I can put the graph within the spreadsheet utilizing the expression image(age.png)


. putexcel set putexcel2.xlsx, sheet(example5) modify

. putexcel A1 = image(age.png)
file putexcel2.xlsx saved

graph1

Instance 6: Writing a matrix to Excel

Some Stata instructions return matrices. For instance, I can use tabstat to calculate descriptive statistics for a listing of variables. The save choice tells tabstat to avoid wasting the outcomes to a matrix.


. tabstat age peak weight sbp dbp, stat(n imply sd min max) save

   stats |       age    peak    weight       sbp       dbp
---------+--------------------------------------------------
       N |      1266      1266      1266      1267      1267
    imply |  48.44076  167.1308  72.17252  130.6843  80.54854
      sd |  16.98858   9.61487  16.28185  24.34302  13.99187
     min |        20   144.199     39.12        65        35
     max |        74   193.398    175.88       254       150
------------------------------------------------------------

After I sort return listing, I see that tabstat returned the matrix r(StatTotal).


. return listing

matrices:
          r(StatTotal) :  5 x 5

. matlist r(StatTotal)

             |       age     peak     weight        sbp        dbp 
-------------+-------------------------------------------------------
           N |      1266       1266       1266       1267       1267 
        imply |  48.44076   167.1308   72.17252   130.6843   80.54854 
          sd |  16.98858    9.61487   16.28185   24.34302   13.99187 
         min |        20    144.199      39.12         65         35 
         max |        74    193.398     175.88        254        150 

I would favor to create an Excel desk that appears just like the output from summarize with variables in rows and statistics in columns. So I create a matrix referred to as outcomes, which is the same as the transpose of r(StatTotal).


. matrix outcomes = r(StatTotal)'

. matlist outcomes

             |         N       imply         sd        min        max 
-------------+-------------------------------------------------------
         age |      1266   48.44076   16.98858         20         74 
      peak |      1266   167.1308    9.61487    144.199    193.398 
      weight |      1266   72.17252   16.28185      39.12     175.88 
         sbp |      1267   130.6843   24.34302         65        254 
         dbp |      1267   80.54854   13.99187         35        150 

Then, I write the matrix outcomes to Excel utilizing the expression matrix(outcomes). I may use the expression matrix(r(StatTotal)’) slightly than creating a brand new matrix, however I needed to indicate you the transposed matrix earlier than writing it to Excel. The upper-left nook of the matrix might be positioned in cell A1 of the Excel desk. The names choice tells putexcel to jot down the row and column names to Excel together with the matrix. The nformat(number_d2) choice tells putexcel to show the matrix with two decimal locations.


. putexcel set putexcel2.xlsx, sheet(example6) modify

. putexcel A1 = matrix(outcomes), names nformat(number_d2)
file putexcel2.xlsx saved

graph1

Subsequent, I want to format the Excel desk to make it look extra like a desk of outcomes slightly than a matrix. I can change the format of a cell with out writing something to the cell. I may even format a spread of cells utilizing the syntax ul:br, the place ul is the upper-left nook of the cell vary and br is the bottom-right cell.

The pattern sizes for every variable displayed in cells B2:B6 don’t want two decimal locations. I can show cells B2:B6 with no decimals utilizing the nformat(quantity) choice. The choice overwritefmt tells putexcel to overwrite the present cell format.


. putexcel B2:B6, nformat(quantity) overwritefmt
file putexcel2.xlsx saved

Subsequent, I format the cells A1:A6. The proper choice aligns the textual content to the appropriate facet of the cells, and the border(proper) choice provides a border to the appropriate facet of the cells.


. putexcel A1:A6, proper border(proper) overwritefmt
file putexcel2.xlsx saved

I format cells A1:F1 equally. The hcenter choice facilities the labels horizontally, and the border(backside) choice provides a border to the underside of cells A1:F1.


. putexcel A1:F1, hcenter daring border(backside) overwritefmt
file putexcel2.xlsx saved

Lastly, I show the numbers in cells B2:F1 utilizing a daring font to make the desk look much like the output of summarize.


. putexcel B2:F6, daring
file putexcel2.xlsx saved

The formatted Excel desk seems to be like this:

graph1

Instance 7: Writing regression coefficients to Excel

Most Stata regression instructions return the desk of coefficients in a matrix named r(desk). For instance, I may match the next linear regression mannequin with regress.


. regress sbp age intercourse 

      Supply |       SS           df       MS      Variety of obs   =     1,266
-------------+----------------------------------   F(2, 1263)      =    236.23
       Mannequin |  204030.521         2  102015.261   Prob > F        =    0.0000
    Residual |  545432.459     1,263  431.854678   R-squared       =    0.2722
-------------+----------------------------------   Adj R-squared   =    0.2711
       Whole |   749462.98     1,265  592.460854   Root MSE        =    20.781

------------------------------------------------------------------------------
         sbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7328247   .0343946    21.31   0.000     .6653479    .8003014
         intercourse |   5.292591   1.170872     4.52   0.000     2.995522     7.58966
       _cons |    92.6976   1.852992    50.03   0.000     89.06232    96.33289
------------------------------------------------------------------------------

and sort matlist r(desk) to see the coefficient matrix.


. matlist r(desk)

             |       age        intercourse      _cons 
-------------+---------------------------------
           b |  .7328247   5.292591    92.6976 
          se |  .0343946   1.170872   1.852992 
           t |  21.30641   4.520211   50.02589 
      pvalue |  2.65e-86   6.76e-06   6.9e-302 
          ll |  .6653479   2.995522   89.06232 
          ul |  .8003014    7.58966   96.33289 
          df |      1263       1263       1263 
        crit |  1.961844   1.961844   1.961844 
       eform |         0          0          0 

r(desk) comprises the rows df, crit, and eform, which aren’t displayed within the regression output. I want to reproduce the desk of coefficients in my Excel desk, so I retailer r(desk) to the matrix outcomes, extract the primary six rows of outcomes, after which transpose outcomes.


. matrix outcomes = r(desk)

. matrix outcomes = outcomes[1..6,1...]'

. matlist outcomes

             |         b         se          t     pvalue         ll         ul 
-------------+------------------------------------------------------------------
         age |  .7328247   .0343946   21.30641   2.65e-86   .6653479   .8003014 
         intercourse |  5.292591   1.170872   4.520211   6.76e-06   2.995522    7.58966 
       _cons |   92.6976   1.852992   50.02589   6.9e-302   89.06232   96.33289 

Now, I can write outcomes to my Excel file.


. putexcel set putexcel2.xlsx, sheet(example8) modify

. putexcel A1 = matrix(outcomes), names nformat(number_d2) hcenter
file putexcel2.xlsx saved

graph1

A number of changes to the font and cell codecs make my matrix look extra like a desk of coefficients.


. putexcel B2:G4, daring overwritefmt
file putexcel2.xlsx saved

. putexcel A1:A4, proper border(proper) overwritefmt
file putexcel2.xlsx saved

. putexcel A1:G1, hcenter border(backside) overwritefmt
file putexcel2.xlsx saved

graph1

Instance 8: Writing cross-tabulations to Excel

We are able to additionally write the outcomes from tabulate to Excel utilizing matrices. The matcell() choice shops the cell counts from tabulate in a matrix. For instance, I can retailer the outcomes of the tabulate command beneath within the matrix cellcounts.


. tabulate intercourse race, matcell(cellcounts)

           |               Race
       Intercourse |     Black      Different      White |     Whole
-----------+---------------------------------+----------
    Feminine |       101         12        563 |       676 
      Male |        75         10        506 |       591 
-----------+---------------------------------+----------
     Whole |       176         22      1,069 |     1,267 


. matlist cellcounts

             |        c1         c2         c3 
-------------+---------------------------------
          r1 |       101         12        563 
          r2 |        75         10        506 

I can rename the rows of cellcounts with the worth labels from intercourse utilizing the next steps. First, intercourse is saved as a numeric variable, so I can use decode to create a string variable referred to as sex_s. I may omit this step if intercourse was saved as a string variable. Subsequent, I exploit levelsof to retailer the degrees of sex_s to the native macro sexlabels. Then, I exploit matrix rownames to label the rows of cellcounts with the labels saved sexlabels.


. decode intercourse, generate(sex_s)

. levelsof sex_s, native(sexlabels)
`"Feminine"' `"Male"'

. matrix rownames cellcounts = `sexlabels'

. matlist cellcounts

             |        c1         c2         c3 
-------------+---------------------------------
      Feminine |       101         12        563 
        Male |        75         10        506 

I can use the identical steps to rename the columns of cellcounts with the worth labels of race.


. decode race, generate(race_s)

. levelsof race_s, native(racelabels)
`"Black"' `"Different"' `"White"'

. matrix colnames cellcounts = `racelabels'

. matlist cellcounts

             |     Black      Different      White 
-------------+---------------------------------
      Feminine |       101         12        563 
        Male |        75         10        506 

I can then write cellcounts to Excel identical to I did within the final two examples.

 
. putexcel set putexcel2.xlsx, sheet(example9) modify

. putexcel A1 = matrix(cellcounts), names hcenter
file putexcel2.xlsx saved

graph1

This strategy labored, however there aren’t any row and column totals in my Excel desk. A technique so as to add them is to place calculated cells within the desk utilizing system() expressions. For instance, the primary line of putexcel beneath places the Excel perform SUM(B2:D2) in cell E2. This calculates the full for the primary row of the desk. The putexcel command beneath places formulation within the desk to calculate the row, column, and total totals.


. putexcel E2 = system(=SUM(B2:D2))      ///
>          E3 = system(=SUM(B3:D3))      ///
>          B4 = system(=SUM(B2:B3))      ///
>          C4 = system(=SUM(C2:C3))      ///
>          D4 = system(=SUM(D2:D3))      ///
>          E4 = system(=SUM(B2:D3)), hcenter
file putexcel2.xlsx saved

graph1

I can visually differentiate the row and column totals from the cells by including labels and borders. I also can show the numbers with a daring font.


. putexcel A4 = "Whole" E1 = "Whole", hcenter
file putexcel2.xlsx saved

. putexcel A1:A4 D1:D4, border(proper) overwritefmt
file putexcel2.xlsx saved

. putexcel A1:E1 A3:E3, border(backside) overwritefmt
file putexcel2.xlsx saved

. putexcel B2:E4, daring
file putexcel2.xlsx saved

graph1

The Excel desk seems to be much like the output from tabulate.

Re-creating Stata output in Excel tables is straightforward with putexcel utilizing returned scalars, macros, and matrices. Examples 1-7 generalize nicely for arbitrary variables, however I hardcoded the row and column totals for the variables intercourse and race in instance 8.

Subsequent time, I’ll present you how you can generalize these examples for arbitrary variables and put extra eleborate expressions within the desk cells.



Related Articles

Latest Articles