Tuesday, October 28, 2025

The R Formula Cheatsheet | econometrics.blog


R’s formula syntax is extremely powerful but can be confusing for beginners.
This post is a quick reference covering all of the symbols that have a “special” meaning inside an R formula: ~, +, ., -, 1, :, *, ^, and I().
You may never use some of these in practice, but it’s good to know that they exist.
It was many years before I realized that I could simply type y ~ x * z instead of the lengthier y ~ x + z + x:z, for example.
While R formulas crop up in a variety of places, they’re probably most familiar as the first argument of lm().
For this reason, my verbal explanations assume a simple linear regression setting in which we hope to predict y using a number of regressors x, z, and w.
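
As a quick check of the shorthand mentioned above, the following sketch fits both versions of the interaction model and compares the estimates. The data are simulated purely for illustration; the sample size, seed, and coefficient values are assumptions, not anything taken from a real data set.

```r
# Simulated data: names and coefficient values are illustrative assumptions.
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n), z = rnorm(n))
dat$y <- 1 + 2 * dat$x - dat$z + 0.5 * dat$x * dat$z + rnorm(n)

# The short form and the long form describe the same model.
fit_short <- lm(y ~ x * z, data = dat)
fit_long  <- lm(y ~ x + z + x:z, data = dat)

# Both fits report the same four terms: (Intercept), x, z, x:z
names(coef(fit_short))

# TRUE when the two sets of estimates agree
all.equal(coef(fit_short), coef(fit_long))
```

Because lm() expands x * z into x + z + x:z before fitting, the two calls should differ only in how the formula is printed.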

| Symbol | Purpose | Example | In words |
|---|---|---|---|
| ~ | separate LHS and RHS of formula | y ~ x | regress y on x |
| + | add variable to a formula | y ~ x + z | regress y on x and z |
| . | denotes “everything else” | y ~ . | regress y on all other variables in a data frame |
| - | remove variable from a formula | y ~ . - x | regress y on all other variables except x |
| 1 | denotes intercept | y ~ x - 1 | regress y on x without an intercept |
| : | construct interaction term | y ~ x + z + x:z | regress y on x, z, and the product x times z |
| * | shorthand for levels plus interaction | y ~ x * z | regress y on x, z, and the product x times z |
| ^ | higher-order interactions | y ~ (x + z + w)^3 | regress y on x, z, w, all two-way interactions, and the three-way interaction |
| I() | “as-is”: override special meanings of other symbols from this table | y ~ x + I(x^2) | regress y on x and x squared |
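
One way to see how R reads the symbols listed above is to ask terms() which terms a formula expands to, without fitting any model. The sketch below does this with a small helper; the name expand_terms and the toy data frame dat are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: list the terms a formula expands to and
# report whether it keeps an intercept.
expand_terms <- function(f, data = NULL) {
  tt <- terms(f, data = data)
  list(labels = attr(tt, "term.labels"),
       intercept = attr(tt, "intercept") == 1)
}

# Toy data frame, needed only so that "." has something to expand to.
dat <- data.frame(y = 1:5,
                  x = c(2, 4, 1, 3, 5),
                  z = c(1, 0, 1, 0, 1),
                  w = c(3, 1, 4, 1, 5))

expand_terms(y ~ x * z)$labels              # "x" "z" "x:z"
expand_terms(y ~ (x + z + w)^3)$labels      # main effects, two-way, then x:z:w
expand_terms(y ~ ., data = dat)$labels      # "x" "z" "w"
expand_terms(y ~ . - x, data = dat)$labels  # "z" "w"
expand_terms(y ~ x - 1)$intercept           # FALSE
expand_terms(y ~ x + I(x^2))$labels         # "x" "I(x^2)"

# Without I(), ^ keeps its formula meaning, so x^2 collapses to plain x.
expand_terms(y ~ x + x^2)$labels            # "x"
```

The last two calls show why I() matters: inside a formula, x^2 on its own is treated as an interaction of x with itself rather than as arithmetic, so the squared term silently disappears unless it is wrapped in I().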
