# 1 R and RStudio

## 1.1 R as a toolkit

- Scriptability \(\rightarrow\) R
- Literate programming (code, narrative, output in one place) \(\rightarrow\) R Markdown
- Version control \(\rightarrow\) Git / GitHub

### 1.1.1 Why R and RStudio?

### 1.1.2 Some R basics

- You will load packages at the **start of every new R session**.
- “Base” R comes with tons of useful built-in functions. It also provides all the tools necessary for you to write your own functions.
- However, many of R’s best data science functions and tools come from external packages written by other users.
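
As a rough sketch of that workflow: a package is installed once per machine but loaded in every session. MASS is used here only because it ships with standard R installations as a "recommended" package; that choice, and the guarded install step, are illustrative assumptions rather than part of the original material.

```r
# Install once per machine (needs an internet connection);
# skipped if the package is already available
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")
}

# Load at the start of every new R session
library(MASS)

# Functions from the package can now be called by name
fractions(0.75)
```

Without the `library()` call, the same function is still reachable as `MASS::fractions()`.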

- R parallelizes easily across as many cores as you have available. For free.
- Compare the cost of a Stata/MP license, never mind the fact that you effectively pay per core…
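
A minimal sketch of what this can look like, using the `parallel` package that ships with R. The function `slow_square()`, the input `1:8`, and the choice of two workers are made up for illustration; real workloads and speed-ups will differ.

```r
library(parallel)

# A deliberately slow toy function (hypothetical workload)
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Start two worker processes; detectCores() reports how many cores exist
n_workers <- 2
cl <- makeCluster(n_workers)

# Same interface as lapply(), but the work is spread over the workers
res_par <- parLapply(cl, 1:8, slow_square)

# Always shut the workers down when done
stopCluster(cl)

# Sequential version for comparison
res_seq <- lapply(1:8, slow_square)
identical(res_par, res_seq)
```

The socket cluster created by `makeCluster()` works on all platforms; on macOS and Linux, `mclapply()` from the same package is a shorter alternative.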

## 1.2 R code examples

### 1.2.1 Linear regression
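
The summary shown in this section matches what R prints for a simple regression on the built-in `cars` data. A minimal sketch of code that would produce it, assuming the formula shown in the `Call:` line; the object name `fit` is arbitrary:

```r
# Regress stopping distance on speed using the built-in cars data
fit <- lm(dist ~ 1 + speed, data = cars)

# Print coefficients, standard errors, and fit statistics
summary(fit)
```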

```
##
## Call:
## lm(formula = dist ~ 1 + speed, data = cars)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -29.069  -9.525  -2.272   9.215  43.201
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -17.5791     6.7584  -2.601   0.0123 *
## speed         3.9324     0.4155   9.464 1.49e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15.38 on 48 degrees of freedom
## Multiple R-squared: 0.6511, Adjusted R-squared: 0.6438
## F-statistic: 89.57 on 1 and 48 DF, p-value: 1.49e-12
```

### 1.2.3 ggplot2

```
library(ggplot2)
library(gapminder) ## For the gapminder data
ggplot(
  data = gapminder,
  mapping = aes(x = gdpPercap, y = lifeExp)
) +
  geom_point()
```

### 1.2.4 gganimate

## 1.3 R vs. RStudio

- R is a statistical **programming language**
- RStudio is a convenient interface for R (an **integrated development environment**, IDE)
- At its simplest:
  - R is like a car’s engine
  - RStudio is like a car’s dashboard

## 1.4 R vs. R packages

R packages **extend** the functionality of R by providing additional functions, data, and documentation. They are written by a worldwide community of R users and can be downloaded at no cost.

## 1.5 R packages

- **CRAN** (the Comprehensive R Archive Network): The central package repository, maintained by a group of people who check that packages fulfill certain standards.
- **Mirror**: A location on the web from which R packages can be downloaded. Because many thousands of people download packages daily, the load is distributed across different machines. Pick one that is geographically close to you.
- **R base/recommended packages**: The base installation of R ships with a bunch of default packages. In addition, there are some more packages listed as “recommended”.

“Base” packages are maintained by the R Core Team and are only updated with each R release.

Packages listed as “recommended” are widely used and have a long history in the R community.

```
##     Package Priority
## 1      base     base
## 2  compiler     base
## 3  datasets     base
## 4  graphics     base
## 5 grDevices     base
## 6      grid     base
## 7   methods     base
## 8  parallel     base
```

```
##       Package    Priority
## 1        boot recommended
## 2       class recommended
## 3     cluster recommended
## 4   codetools recommended
## 5     foreign recommended
## 6  KernSmooth recommended
## 7     lattice recommended
## 8        MASS recommended
## 9      Matrix recommended
## 10       mgcv recommended
##  [ reached 'max' / getOption("max.print") -- omitted 2 rows ]
```
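
The code behind these two listings is not shown. One plausible way to produce tables of this shape is to filter the result of `installed.packages()` by its `Priority` column. This is a sketch under that assumption; the object names `ip`, `pkgs`, `base_pkgs`, and `recommended_pkgs` are made up here.

```r
# Character matrix with one row per installed package
ip <- installed.packages()

# Keep only the two columns of interest, with plain integer row names
pkgs <- data.frame(
  Package  = unname(ip[, "Package"]),
  Priority = unname(ip[, "Priority"]),
  stringsAsFactors = FALSE
)

# Priority is NA for ordinary add-on packages, so %in% is used
# instead of == to avoid NA rows in the result
base_pkgs        <- pkgs[pkgs$Priority %in% "base", ]
recommended_pkgs <- pkgs[pkgs$Priority %in% "recommended", ]

base_pkgs
recommended_pkgs
```

The exact rows depend on the R version and on which packages are installed locally.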

## 1.6 .Rprofile

- File in your home directory: `~/.Rprofile`
- Executed at the start of every R session
- Useful for setting global options and for loading frequently used packages
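
A minimal sketch of what a `~/.Rprofile` might contain. The file is ordinary R code. The specific options and the use of the usethis package are illustrative assumptions, not recommendations from the original material.

```r
# ~/.Rprofile: plain R code that runs at session startup

# Global options: choose a CRAN mirror and print fewer digits
options(
  repos  = c(CRAN = "https://cloud.r-project.org"),
  digits = 4
)

# Attach a frequently used package, but only in interactive sessions
# and only if it is installed (usethis is just an example)
if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  library(usethis)
}
```

Keeping package loading behind `interactive()` avoids scripts silently depending on packages that only your profile attaches.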

## 1.7 .Renviron

- File in your home directory: `~/.Renviron`
- Used to set environment variables
- Commonly used to store access tokens (GitHub, CI providers) and similar settings such as C++ flags
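
`.Renviron` holds one `NAME=value` pair per line, and R code reads the values with `Sys.getenv()`. A small sketch of the reading side: the variable name `MY_API_TOKEN` and its value are made up, and `Sys.setenv()` stands in for the line that would normally live in the file.

```r
# Stand-in for a line "MY_API_TOKEN=not-a-real-token" in ~/.Renviron
# (hypothetical name and value, set here so the example runs anywhere)
Sys.setenv(MY_API_TOKEN = "not-a-real-token")

# Read the variable; unset = NA makes a missing value easy to detect
token <- Sys.getenv("MY_API_TOKEN", unset = NA)

if (is.na(token)) {
  stop("MY_API_TOKEN is not set; add it to ~/.Renviron and restart R")
}

# Use the token without printing it
nchar(token)
```

Reading secrets this way keeps them out of scripts that end up under version control.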

## 1.8 RStudio

\(\rightarrow\) Exists to **boost** your productivity

\(\rightarrow\) Change the defaults to your liking so you *actually* can be **productive**

\(\rightarrow\) Keybindings = productivity

Since RStudio v1.3 a portable JSON settings file exists.

If you want to have sane settings without much hassle, you can execute the following R code: `source("https://bit.ly/rstudio-pat")`

This code will change/overwrite your existing RStudio settings and

- set custom keybindings
- move the console panel to the top-right (by default bottom-left)
- enable/disable some core settings to have a better overall experience

R scripts (source code) are written in the *Source* pane (Editor).

(Source of all following RStudio screenshots: https://github.com/edrubin/EC525S19)

You can use the menubar or ⇧+⌘+N / ⇧+CTRL+N to create new R scripts.

To execute commands from your R script, use ⌘+Enter / CTRL+Enter.

RStudio will execute the command in the console.

You can see the new object in the *Environment* pane.
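
For instance, running a short script like the sketch below line by line sends each command to the console. The objects it creates then appear in the *Environment* pane. The names `x` and `x_mean` are arbitrary.

```r
# Create a numeric vector; it shows up as `x` in the Environment pane
x <- c(2, 4, 6)

# Store a derived value as a second object
x_mean <- mean(x)

# ls() lists the same objects the Environment pane displays
ls()
```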

The *History* tab records your old commands.

The *Files* pane is the file explorer.

The *Plots* pane/tab shows… plots.

*Packages* shows installed packages and whether they are *loaded*.

The *Help* tab shows help documentation (also accessible via `?`).
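
A few ways to reach the same documentation from code, shown as a small sketch. `lm` and `mean` are used only as familiar examples, and the object name `lm_help` is arbitrary.

```r
# Both forms open the same help page
?lm
help("lm")

# help() returns an object; printing it is what displays the page
lm_help <- help("lm")
class(lm_help)

# Run the examples listed at the bottom of a help page
example("mean")
```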

Finally, you can customize the actual layout.

## 1.9 RStudio addins

RStudio can be further enhanced by so-called “addins”. These are clickable snippets that execute certain actions in RStudio.

They aim to make repetitive tasks easier and to save you time. There is an addin called addinslist which lists all available addins. It can be installed as a normal package from CRAN:

`install.packages("addinslist")`

To have an addin available in RStudio after installation, RStudio needs to be restarted.

## 1.10 RStudio projects

Without a project, you will need to define **long** file paths which **only exist on your machine**.

`sample_df <- read.csv("/Users/<yourname>/somewhere/on/this/machine/sample.csv")`

With a project, R automatically references the project’s folder as the current working directory.

From there on, you can use *relative paths* to point to files.

`sample_df <- read.csv("sample.csv")`

**Double-plus bonus**: The *here* package takes the *RStudio project* philosophy even further and also helps when you are not using RStudio (e.g. on the command line).
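
The effect of a project on file paths can be imitated in plain R. In the sketch below, a temporary folder stands in for the project directory and `setwd()` stands in for what RStudio does when a project is opened. The folder name `demo-project` and the data are made up.

```r
# A temporary folder plays the role of the project directory
project_dir <- file.path(tempdir(), "demo-project")
dir.create(project_dir, showWarnings = FALSE)

# RStudio sets the working directory like this when a project is opened
old_wd <- setwd(project_dir)

# Relative paths now resolve against the project folder
write.csv(
  data.frame(id = 1:3, value = c(10, 20, 30)),
  "sample.csv",
  row.names = FALSE
)
sample_df <- read.csv("sample.csv")

# With the here package installed, the same file could be addressed as
# read.csv(here::here("sample.csv")), whatever the working directory is

# Restore the previous working directory
setwd(old_wd)
```

Because nothing in the script mentions a user-specific path, it runs unchanged on another machine once the project folder is copied over.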

## 1.11 Alternatives to RStudio

- R can be used directly in the terminal via radian (an optimized R console)
- R is supported in other general-purpose editors and IDEs (VS Code, Sublime Text, Atom, Vim, etc.)