# Lab 1 -- Introduction

This lab is loosely based on [notes by Dr. Leemis](http://www.math.wm.edu/~leemis/r.pdf). If you're not familiar with R, I highly recommend working through them.

## Notebook basics

Code notebooks allow interweaving of

1. rich-text input (html, markdown, etc.)
2. code input
3. code output

In [None]:
print("hello darkness my old friend")

![sound of silence](https://upload.wikimedia.org/wikipedia/commons/e/e5/SimonandGarfunkel.jpg)

We can write notebooks in a couple of different environments; two popular ones are

1. [jupyter](https://jupyter.org/install)

2. [rstudio](https://rstudio.com/products/rstudio/)

Either can be used to write code in [R](https://www.r-project.org/).

You can also use Jupyter-like environments via [jupyterhub.wm.edu](https://jupyterhub.wm.edu) or [colab.to/r](https://colab.to/r).

## R as a calculator

We can use R as a calculator:

In [None]:
1 + 1

In [None]:
1 + 2 * 5

R also has some built-in constants, such as `pi`; Euler's number is available as `exp(1)`:

In [None]:
pi
exp(1)

There are also special infinite values, `Inf` and `-Inf`, and `NaN` ("not a number") for undefined results such as `0 / 0`:

In [None]:
1 / 0
-1 / 0

In [None]:
0 / 0

To denote the absence of any object we have `NULL`:

In [None]:
NULL

To denote a missing value we have `NA`:

In [None]:
NA
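
These special values can be told apart with R's built-in predicate functions. A minimal sketch; the variable name `vals` is only illustrative and is not part of the original notes:

```r
# A vector holding one ordinary number and the special values from above
vals <- c(1, Inf, -Inf, NaN, NA)

is.finite(vals)    # TRUE only for the ordinary number
is.infinite(vals)  # TRUE for Inf and -Inf
is.nan(vals)       # TRUE only for NaN
is.na(vals)        # TRUE for both NaN and NA

# NULL is different: it is an empty object with length zero
is.null(NULL)
length(NULL)
```

Note that `is.na` treats `NaN` as missing too, while `is.nan` does not treat `NA` as `NaN`.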

# Variable assignment and simple objects

One can assign variables with either `=` or `<-`:

In [None]:
x <- 1
x

In [None]:
x = 1
x

## Vectors

Vectors are made with the `c` function:

In [None]:
x <- c(5, 3, 7)
x

We get elements with `[]`; indexing starts at 1:

In [None]:
x[1]

Negative indices select all but those elements:

In [None]:
x[c(-1, -2)]

We can make a vector out of any basic type, for example logical values:

In [None]:
truths <- c(TRUE, FALSE, TRUE)
truths

If we subset by a logical vector, R selects the elements where it is `TRUE`:

In [None]:
x

In [None]:
truths

In [None]:
x[truths]
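
In practice the logical vector usually comes from a comparison rather than being typed by hand. A small sketch of that pattern; the name `big` is made up for illustration:

```r
x <- c(5, 3, 7)

big <- x > 4   # element-wise comparison gives a logical vector
big
x[big]         # keeps the elements where big is TRUE
x[x > 4]       # the same thing in one step
which(x > 4)   # positions of the TRUE entries
```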

We can make consecutive integers using `:`

In [None]:
1:10

or, for other step sizes, the `seq` function:

In [None]:
seq(1, 20, by = 3)
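
A few related ways to build sequences, shown as a quick sketch using only base R functions:

```r
seq(0, 1, length.out = 5)  # five evenly spaced values from 0 to 1
seq_len(4)                 # the integers 1 2 3 4
10:1                       # `:` also counts down
rev(1:3)                   # reverse an existing vector
```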

## Matrices

We can make a matrix using the `matrix` function:

In [None]:
X <- matrix(1:25, nrow = 5, byrow = TRUE)
X

or the `array` function, which takes the dimensions as a vector:

In [None]:
Y <- array(25:49, c(5, 5))
Y
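
Both functions fill column by column unless told otherwise, which is why `byrow = TRUE` matters. A small sketch with illustrative names `A`, `B`, and `C`:

```r
A <- matrix(1:6, nrow = 2)                # default: filled column by column
B <- matrix(1:6, nrow = 2, byrow = TRUE)  # filled row by row
C <- array(1:6, c(2, 3))                  # array() also fills by column

A
B
all(A == C)  # TRUE: same layout as the default matrix()
dim(B)       # number of rows and columns
t(B)         # transpose
```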

Matrix multiplication is done with the operator `%*%`:

In [None]:
X %*% Y
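
It is easy to confuse `%*%` with `*`, which multiplies element by element. A small sketch on a 2 x 2 example (the names `P` and `Q` are illustrative):

```r
P <- matrix(1:4, nrow = 2)  # columns are (1, 2) and (3, 4)
Q <- diag(2)                # 2 x 2 identity matrix

P * P      # element-wise: each entry squared
P %*% P    # matrix product: rows times columns
P %*% Q    # multiplying by the identity returns P
```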

I can extract or assign individual elements with `[row, column]`:

In [None]:
X[1, 2]

In [None]:
X[1, 2] <- 3
X

or entire rows/columns by leaving one index empty:

In [None]:
X[, 1]

In [None]:
X[1, ]
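
Extracting a single row or column returns a plain vector. If the result should stay a matrix, `drop = FALSE` keeps the dimensions. A sketch with an illustrative matrix `M`:

```r
M <- matrix(1:6, nrow = 2)

row1 <- M[1, ]                    # plain vector: dimensions are dropped
dim(row1)                         # NULL
row1_mat <- M[1, , drop = FALSE]  # still a 1 x 3 matrix
dim(row1_mat)
M[, c(1, 3)]                      # several columns at once
```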

# Flow Control

`if` statements are written as follows:

In [None]:
A <- 5
if (A == 0) {
  print("A=0")
} else if (A == 1) {
  print("A=1")
} else if (A == 2) {
  print("A=2")
} else {
  print("A is not 0, 1 or 2")
}
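
An `if` statement tests a single condition. To choose a value for every element of a vector, base R offers the vectorized `ifelse`. A minimal sketch; `a` and `labels` are illustrative names:

```r
a <- c(0, 1, 2, 5)

# ifelse() applies the test to every element of a
labels <- ifelse(a == 0, "zero", "nonzero")
labels

# if () expects a single TRUE or FALSE, so reduce a vector first
if (any(a > 4)) {
  print("at least one element is greater than 4")
}
```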

We can also write `for` loops

In [None]:
for (i in 1:10) {
  print(i)
}

and `while` loops

In [None]:
i <- 1
while (i <= 10) {
  print(i)
  i <- i + 1
}
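
Loops are typically used to accumulate a result. As a sketch, here is a sum of squares computed with a loop, next to the vectorized form that R code often prefers (the name `total` is illustrative):

```r
# Sum the squares of 1 to 10 with an explicit loop
total <- 0
for (i in 1:10) {
  total <- total + i^2
}
total

# The same computation using vectorized arithmetic
sum((1:10)^2)
```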

# Major data types

Numeric:

In [None]:
c(1, 2, 3)

In [None]:
class(c(1, 2, 3))

Strings (class `character`):

In [None]:
c("hello", "darkness", "my", "old", "friend")

In [None]:
class(c("hello", "darkness", "my", "old", "friend"))

Booleans (class `logical`):

In [None]:
c(TRUE, FALSE)

In [None]:
class(c(TRUE, FALSE))
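
Because a vector holds a single type, mixing types inside `c` converts everything to the most general one. A short sketch of this and of explicit conversion, using only base R:

```r
class(1L)          # "integer": the L suffix marks a whole number
class(1)           # "numeric"
class(c(1, "a"))   # "character": the number is converted to a string
class(c(1, TRUE))  # "numeric": TRUE becomes 1

as.numeric("3.14")         # convert a string to a number
as.character(42)           # convert a number to a string
sum(c(TRUE, FALSE, TRUE))  # logicals count as 1 and 0
```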

Factors, for discrete (categorical) variables:

In [None]:
fctr <- as.factor(c("hello", "darkness"))

In [None]:
fctr

In [None]:
class(fctr)

In [None]:
as.numeric(fctr)

In [None]:
fctr * 2
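
Arithmetic such as `fctr * 2` returns `NA` with a warning, because a factor stores integer codes that point to labels (its levels) rather than numbers. A sketch of how to inspect this, using a made-up factor `f` whose labels happen to look like numbers:

```r
f <- factor(c("10", "20", "10"))

levels(f)                    # the distinct categories: "10" "20"
as.numeric(f)                # integer codes: 1 2 1
as.numeric(as.character(f))  # the original numbers: 10 20 10
table(f)                     # counts per level
```

Converting through `as.character` first is the usual way to recover numeric labels.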

# Data frames, matrices, and tibbles

All elements of a matrix have to be the same type:

In [None]:
X <- array(0, c(5, 5))
X

In [None]:
X <- array(LETTERS[1:25], c(5, 5))
X

So if I have a character matrix `X` and I assign the number `5` to the first column, what happens?

In [None]:
X[, 1] <- rep(5, 5)

In [None]:
X

In [None]:
X[, 1]

This is not the number `5`; it's the character string `"5"`.
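
To do arithmetic on such values, they have to be converted back explicitly. A small sketch with an illustrative matrix `Z`:

```r
Z <- array(LETTERS[1:4], c(2, 2))
Z[, 1] <- c(5, 5)       # the numbers are stored as strings

class(Z[, 1])           # "character"
as.numeric(Z[, 1]) + 1  # convert explicitly before doing arithmetic
```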

The proper way to have mixed types is to use a `data.frame`, where each column can have its own type:

In [None]:
# stringsAsFactors = FALSE keeps y as character in R versions before 4.0
df <- data.frame(x = 1:5, y = LETTERS[1:5], stringsAsFactors = FALSE)
df

I can access the elements as with matrices:

In [None]:
df[1, 1]

In [None]:
df[1, 2]

or by column name:

In [None]:
df$x

In [None]:
df$y

In [None]:
df[["x"]]

In [None]:
df[["y"]]
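
A few other base R tools are handy for inspecting and modifying a data frame. This is a sketch on a small made-up data frame `d`, not the lab's `df`:

```r
d <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

str(d)            # compact overview: one line per column
sapply(d, class)  # class of each column
nrow(d)           # number of rows
ncol(d)           # number of columns
d[d$x > 1, ]      # rows where x is greater than 1
d$z <- d$x * 2    # add a new column
d
```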

There is also an updated, *better* version of data frames called `tibbles`: [link](https://tibble.tidyverse.org/)

In [None]:
library("tibble")

In [None]:
tbl <- as_tibble(df)

In [None]:
tbl
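
One practical difference is how single-column subsetting behaves. The sketch below assumes the `tibble` package is installed; `d` and `tb` are illustrative names:

```r
library(tibble)

d <- data.frame(x = 1:3, y = c("a", "b", "c"))
tb <- as_tibble(d)

class(d[, 1])   # a data frame drops to a plain vector
class(tb[, 1])  # a tibble stays a tibble
tb[[1]]         # use [[ ]] or $ to pull out the column itself
```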

There is a really nice ecosystem of packages for data analysis in `R` called the `tidyverse`. It's probably one of the nicest data science ecosystems. However, there is a bit of a learning curve, and the syntax can be somewhat more advanced. You can learn more [here](https://www.tidyverse.org/). I will be using these packages pretty extensively.

In [None]:
library("dplyr")

In [None]:
tbl %>% sample_n(5)

In [None]:
tbl <- tbl %>% mutate(z = x^2)
tbl %>% head()

In [None]:
tbl %>% select(x)

In [None]:
tbl %>% filter(y > "B")

In [None]:
tbl %>%
  filter(y > "B") %>%
  summarize(mean = mean(z))
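
The pipe `%>%` passes the result on its left as the first argument of the function on its right, so a pipeline is just a more readable way to write nested calls. A sketch assuming `dplyr` is installed; the table is rebuilt here so the example is self-contained, and `piped` and `nested` are illustrative names:

```r
library(dplyr)

tbl <- tibble(x = 1:5, y = LETTERS[1:5]) %>% mutate(z = x^2)

# Piped version, read top to bottom
piped <- tbl %>%
  filter(y > "B") %>%
  summarize(mean = mean(z))

# Equivalent nested calls, read inside out
nested <- summarize(filter(tbl, y > "B"), mean = mean(z))

piped
all.equal(piped$mean, nested$mean)
```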