R’s formula syntax is extremely powerful but can be confusing for beginners.
This post is a quick reference covering all of the symbols that have a “special” meaning inside an R formula: ~, +, ., -, 1, :, *, ^, and I().
You may never use some of these in practice, but it’s good to know that they exist.
It was many years before I realized that I could simply type y ~ x * z instead of the lengthier y ~ x + z + x:z, for example.
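
If you want to convince yourself that the two spellings really are the same model, here is a minimal sketch using `terms()` from base R. The object names `short` and `long` are just my own labels for illustration.

```r
# Expand each formula into its terms; no data frame is needed for this.
short <- terms(y ~ x * z)
long  <- terms(y ~ x + z + x:z)

# Print the term labels that each formula expands to.
print(attr(short, "term.labels"))
print(attr(long, "term.labels"))

# TRUE when both formulas expand to the same set of terms.
print(identical(attr(short, "term.labels"), attr(long, "term.labels")))
```

Both formulas should expand to the two main effects plus the interaction, so passing either one to lm() fits the same regression.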
While R formulas crop up in a variety of places, they’re probably most familiar as the first argument of lm().
For this reason, my verbal explanations assume a simple linear regression setting in which we hope to predict y using a number of regressors x, z, and w.
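
To make that setting concrete, here is a rough sketch with simulated data. The data frame `dat`, the seed, and the coefficients used to generate y are all made up for illustration; only the shape of the lm() call matters.

```r
set.seed(1234)  # arbitrary seed, only so the sketch is reproducible
n <- 100

# Hypothetical regressors x, z, and w.
dat <- data.frame(x = rnorm(n), z = rnorm(n), w = rnorm(n))

# Made-up data-generating process for the outcome y.
dat$y <- 1 + 2 * dat$x - dat$z + 0.5 * dat$w + rnorm(n)

# The formula is the first argument of lm(); data says where to find the variables.
fit <- lm(y ~ x + z + w, data = dat)
print(coef(fit))
```

The fitted object has one coefficient for the intercept and one for each regressor named on the right-hand side of the formula.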
| Symbol | Purpose | Example | In words |
|--------|---------|---------|----------|
| ~ | separate LHS and RHS of formula | y ~ x | regress y on x |
| + | add variable to a formula | y ~ x + z | regress y on x and z |
| . | denotes “everything else” | y ~ . | regress y on all other variables in a data frame |
| - | remove variable from a formula | y ~ . - x | regress y on all other variables except x |
| 1 | denotes intercept | y ~ x - 1 | regress y on x without an intercept |
| : | construct interaction term | y ~ x + z + x:z | regress y on x, z, and the product x times z |
| * | shorthand for levels plus interaction | y ~ x * z | regress y on x, z, and the product x times z |
| ^ | higher-order interactions | y ~ (x + z + w)^3 | regress y on x, z, w, all two-way interactions, and the three-way interaction |
| I() | “as-is”: override special meanings of other symbols from this table | y ~ x + I(x^2) | regress y on x and x squared |
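
One way to check several rows of the table is to fit each formula and look at the names of the estimated coefficients. The sketch below assumes a simulated data frame `dat` containing only x, z, w, and y. The helper `coef_names()` is a hypothetical convenience function of my own, not part of R.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n <- 50

# Hypothetical data; y is pure noise because only the coefficient names matter.
dat <- data.frame(x = rnorm(n), z = rnorm(n), w = rnorm(n))
dat$y <- rnorm(n)

# Hypothetical helper: fit the model and return the coefficient names.
coef_names <- function(f) names(coef(lm(f, data = dat)))

print(coef_names(y ~ .))              # "." expands to every other column of dat
print(coef_names(y ~ . - x))          # the same set of regressors with x removed
print(coef_names(y ~ x - 1))          # "- 1" drops the intercept
print(coef_names(y ~ (x + z + w)^3))  # main effects plus two- and three-way interactions
print(coef_names(y ~ x + I(x^2)))     # I() makes ^ mean ordinary exponentiation
print(coef_names(y ~ x + x^2))        # without I(), x^2 is formula syntax and reduces to x
```

The last two lines show why I() matters: inside a formula, ^ means "interactions up to this order" rather than "raise to a power", so x^2 on its own adds nothing beyond x.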
