1 R and RStudio
1.1 R as a toolkit
- Scriptability \(\rightarrow\) R
- Literate programming (code, narrative, output in one place) \(\rightarrow\) R Markdown
- Version control \(\rightarrow\) Git / GitHub
1.1.1 Why R and RStudio?
1.1.2 Some R basics
- You will load packages at the start of every new R session.
- “Base” R comes with tons of useful built-in functions. It also provides all the tools necessary for you to write your own functions.
- However, many of R’s best data science functions and tools come from external packages written by other users.
- R parallelizes easily across as many cores as you have available. For free.
- Compare the cost of a Stata/MP license, never mind the fact that you effectively pay per core…
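To make the parallelization point concrete, here is a minimal sketch using the parallel package that ships with base R. The function name slow_square and the choice of two cores are illustrative assumptions, not part of the original material.

```r
library(parallel)  # ships with base R, no installation needed

# Forking is not available on Windows, so fall back to one core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Hypothetical stand-in for an expensive computation
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Apply the function to each element, spreading the work across cores
res <- mclapply(1:8, slow_square, mc.cores = n_cores)
unlist(res)
```

On a machine with more cores, parallel::detectCores() reports how many are available and mc.cores can be raised accordingly.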
1.2 R code examples
1.2.1 Linear regression
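The summary output below is consistent with a call along these lines, reconstructed from its Call: line. The object name fit is an assumption; cars is a dataset built into R.

```r
# Regress stopping distance on speed using the built-in cars data
fit <- lm(dist ~ 1 + speed, data = cars)

# Print coefficients, standard errors, and fit statistics
summary(fit)
```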
##
## Call:
## lm(formula = dist ~ 1 + speed, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -29.069  -9.525  -2.272   9.215  43.201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
1.2.3 ggplot2
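A minimal ggplot2 sketch, assuming the package is installed. The choice of the built-in cars data and of a scatter plot with a fitted line is illustrative and not necessarily the example originally shown here.

```r
# Only run if ggplot2 is available (it is not part of base R)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Scatter plot of stopping distance against speed, with a linear fit
  p <- ggplot(cars, aes(x = speed, y = dist)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +
    labs(x = "Speed (mph)", y = "Stopping distance (ft)")

  print(p)
}
```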
1.2.4 gganimate
1.3 R vs. RStudio
- R is a statistical programming language
- RStudio is a convenient interface for R (an integrated development environment, IDE)
- At its simplest:
- R is like a car’s engine
- RStudio is like a car’s dashboard
1.4 R vs. R packages
R packages extend the functionality of R by providing additional functions, data, and documentation.
They are written by a worldwide community of R users and can be downloaded at no cost.
1.5 R packages
CRAN (the Comprehensive R Archive Network): The central package repository, whose maintainers check that packages fulfill certain standards.
Mirror: A location on the web from which R packages can be downloaded. Because many thousands of people download packages daily, the load is distributed across different machines. Pick one that is geographically close to you.
R base/recommended packages: The base installation of R ships with a bunch of default packages. In addition, there are some more packages listed as “recommended”.
“base” packages are managed by the R core team and are only updated with each R release.
Packages listed as “recommended” are widely used and have a long history in the R community.
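Listings like the ones below can be produced with installed.packages(), which accepts a priority filter. This is a sketch; the object names base_pkgs and rec_pkgs are assumptions, and the exact rows depend on your installation.

```r
# Packages that ship with every R installation
base_pkgs <- installed.packages(priority = "base")[, c("Package", "Priority"), drop = FALSE]
base_pkgs <- data.frame(base_pkgs, row.names = NULL)
base_pkgs

# Packages flagged as "recommended" (may be absent in minimal installations)
rec_pkgs <- installed.packages(priority = "recommended")[, c("Package", "Priority"), drop = FALSE]
rec_pkgs <- data.frame(rec_pkgs, row.names = NULL)
rec_pkgs
```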
##     Package Priority
## 1      base     base
## 2  compiler     base
## 3  datasets     base
## 4  graphics     base
## 5 grDevices     base
## 6      grid     base
## 7   methods     base
## 8  parallel     base
##       Package    Priority
## 1        boot recommended
## 2       class recommended
## 3     cluster recommended
## 4   codetools recommended
## 5     foreign recommended
## 6  KernSmooth recommended
## 7     lattice recommended
## 8        MASS recommended
## 9      Matrix recommended
## 10       mgcv recommended
## [ reached 'max' / getOption("max.print") -- omitted 2 rows ]
1.6 .Rprofile
File in your home directory
~/.Rprofile
Executed at the start of every R session
Useful for setting global options and for loading frequently used packages
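As an illustration, a ~/.Rprofile might contain something like the following. The specific options chosen here are assumptions for the sake of the example, not recommendations from the original material.

```r
# Example contents of ~/.Rprofile (plain R code, run at startup)

options(
  # Default CRAN mirror, so install.packages() does not ask every time
  repos = c(CRAN = "https://cloud.r-project.org"),
  # Warn when `$` partially matches a list or data frame name
  warnPartialMatchDollar = TRUE
)

# Print a short note in interactive sessions only
if (interactive()) {
  message("Loaded settings from .Rprofile")
}
```

Because the file is ordinary R code, anything placed in it affects every session, so it is usually best kept short.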
1.7 .Renviron
File in your home directory
~/.Renviron
Used to set environment variables
Used to store access tokens (GitHub, CI provider) and other settings such as C++ flags
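Unlike .Rprofile, this file holds plain KEY=value lines rather than R code. The sketch below writes a temporary file in that format and reads it back, to show how such variables reach R. The variable name MY_API_TOKEN and its value are hypothetical.

```r
# Write a throwaway file in .Renviron format (one KEY=value pair per line)
env_file <- tempfile(fileext = ".Renviron")
writeLines("MY_API_TOKEN=abc123", env_file)

# R reads ~/.Renviron automatically at startup; here we load the file by hand
readRenviron(env_file)

# Environment variables are then available via Sys.getenv()
token <- Sys.getenv("MY_API_TOKEN")
token
```

Keeping tokens in .Renviron rather than in scripts means they do not end up in version control by accident.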
1.8 RStudio
\(\rightarrow\) Exists to boost your productivity
\(\rightarrow\) Change the defaults to your liking so you actually can be productive
\(\rightarrow\) Keybindings = productivity
Since RStudio v1.3 a portable JSON settings file exists.
If you want to have sane settings without much hassle, you can execute the following R code: source("https://bit.ly/rstudio-pat")
This code will change/overwrite your existing RStudio settings and:
- set custom keybindings
- move the console panel to the top-right (by default bottom-left)
- enable/disable some core settings for a better overall experience
R scripts (source code) are written in the Source pane (Editor).
(Source of all following RStudio screenshots: https://github.com/edrubin/EC525S19)
You can use the menubar or ⇧+⌘+N / ⇧+CTRL+N to create new R scripts.
To execute commands from your R script, use ⌘+Enter / CTRL+Enter.
RStudio will execute the command in the console.
You can see the new object in the Environment pane.
The History tab records your old commands.
The Files pane is the file explorer.
The Plots pane/tab shows… plots.
Packages shows installed packages and whether they are loaded.
The Help tab shows help documentation (also accessible via ? followed by a function name, e.g. ?lm).
Finally, you can customize the actual layout.
1.9 RStudio addins
RStudio can be further enhanced by so-called “addins”. These are clickable snippets that execute certain actions in RStudio.
They aim to make repetitive tasks easier and to save you time. There is an addin called addinslist which lists all available addins. It can be installed as a normal package from CRAN:
install.packages("addinslist")
To have an addin available in RStudio after installation, RStudio needs to be restarted.
1.10 RStudio projects
Without a project, you will need to define long file paths which only exist on your machine.
With a project, R automatically references the project’s folder as the current working directory.
From there on, you can use relative paths to point to files.
Double-plus bonus: The here package extends the RStudio project philosophy further and also helps when you are not using RStudio (e.g. on the command line).
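A short sketch of the difference. The folder and file names (data, raw.csv, the absolute path) are hypothetical, and the here call only runs if that package is installed.

```r
# Without a project: an absolute path that only exists on one machine
# dat <- read.csv("/Users/alice/projects/thesis/data/raw.csv")

# With a project: a path relative to the project folder
data_path <- file.path("data", "raw.csv")

# With the here package: the path is resolved from the project root,
# regardless of the current working directory
if (requireNamespace("here", quietly = TRUE)) {
  data_path <- here::here("data", "raw.csv")
}

data_path
```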
1.11 Alternatives to RStudio
Use R directly in the terminal via radian (an enhanced R console)
R is also supported in other general-purpose editors and IDEs (VS Code, Sublime Text, Atom, Vim, etc.)