- R is a free, open-source program.
- It is used to do statistics in both academia and industry.
- There are many online resources for R, for both beginners and experts.
- It is a good skill to put on your CV.
R can be used as a fancy calculator:
2+2
## [1] 4
sin(3.14)
## [1] 0.001592653
exp(1)
## [1] 2.718282
log(2.71)
## [1] 0.9969486
pi
## [1] 3.141593
Variables can be assigned in the following way:
x <- 1+2+3+4+5+6
x*2
## [1] 42
R is designed to store data as vectors \(x = (x_1, x_2, \ldots, x_n)\), that is, lists of numbers.
y <- c(1,2,3,4,5,6,7,8,9)
y
## [1] 1 2 3 4 5 6 7 8 9
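As a small sketch of other ways to build and inspect a vector, the colon operator and seq generate regular sequences, length counts the entries, and square brackets pick out a single entry. The names y2 and odds are introduced here only for illustration.

```r
# Illustrative names only: y2 and odds do not appear elsewhere in these notes.
y2 <- 1:9                     # same entries as c(1,2,3,4,5,6,7,8,9)
odds <- seq(1, 9, by = 2)     # 1 3 5 7 9
length(y2)                    # number of entries: 9
y2[3]                         # third entry (R indexes from 1): 3
```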
R has many built-in operations that are useful for statistics:
x <- c(1,2,3,4,5,6,7,8,9,10)
x
## [1] 1 2 3 4 5 6 7 8 9 10
sum(x)
## [1] 55
mean(x)
## [1] 5.5
sd(x)
## [1] 3.02765
var(x)
## [1] 9.166667
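As a check on what var and sd compute, the sketch below recomputes them by hand. It relies on the fact that var returns the sample variance, which divides by \(n-1\) rather than \(n\). The names n, v, and s are introduced here only for illustration.

```r
x <- c(1,2,3,4,5,6,7,8,9,10)
n <- length(x)
# Sample variance: squared deviations from the mean, divided by n - 1
v <- sum((x - mean(x))^2) / (n - 1)
s <- sqrt(v)                  # the standard deviation is its square root
all.equal(v, var(x))          # TRUE
all.equal(s, sd(x))           # TRUE
```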
R performs certain operations componentwise:
x <- c(1,2,3,4,5)
y <- c(6,7,8,9,10)
z <- x+y
z
## [1] 7 9 11 13 15
sin(z)
## [1] 0.6569866 0.4121185 -0.9999902 0.4201670 0.6502878
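The same componentwise rule applies to the other arithmetic operators, and a single number is combined with every entry of a vector. A minimal sketch, using the same vectors as above:

```r
x <- c(1,2,3,4,5)
y <- c(6,7,8,9,10)
x * y        # entrywise product: 6 14 24 36 50
x^2          # each entry squared: 1 4 9 16 25
2 * x + 1    # the numbers 2 and 1 are applied to every entry: 3 5 7 9 11
sum(x * y)   # dot product of x and y: 130
```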
It is often necessary to add or delete data from a vector:
x <- c(0.1, 0.2, 0.3, 0.4, 0.5)
x <- x[-5]
x
## [1] 0.1 0.2 0.3 0.4
x <- c(x, 5.5)
x
## [1] 0.1 0.2 0.3 0.4 5.5
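A negative index drops the entry at that position, and c glues new entries onto the end. As a sketch of a few related idioms, several entries can be dropped at once, entries can be kept by a condition, and the base function append inserts a value at a chosen position. The name w is used here only for illustration.

```r
x <- c(0.1, 0.2, 0.3, 0.4, 0.5)
x[-c(1, 2)]                      # drop the first two entries: 0.3 0.4 0.5
x[x > 0.25]                      # keep only entries larger than 0.25: 0.3 0.4 0.5
w <- append(x, 9.9, after = 2)   # insert 9.9 after the second entry
w                                # 0.1 0.2 9.9 0.3 0.4 0.5
```

None of these calls changes x itself; the result has to be assigned, as with w, to be kept.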
R can generate random samples from many standard distributions:
x <- rbinom(12, 1, 0.5)
x
## [1] 0 1 1 0 0 0 0 0 0 0 1 0
z <- sample(6, 12, replace=TRUE)
z
## [1] 2 3 3 6 5 2 6 4 6 5 4 4
x <- rnorm(10, 5, 1)
x
## [1] 5.397613 4.621272 4.500198 3.865283 5.204967 5.673333 3.231157 3.856156
## [9] 5.348239 3.087446
z <- runif(10, min=-1, max=1)
z
## [1] 0.5121779 0.8852102 0.4035998 0.8742291 -0.3314065 -0.3301936
## [7] 0.6630193 -0.6573704 -0.3727413 -0.8491085
z <- rexp(10, 2)
z
## [1] 0.243909000 0.846406905 0.894939437 0.075192600 0.939676119 1.263858433
## [7] 1.060914981 0.554971271 0.663786483 0.001729772
z <- rpois(7, 1)
print(z)
## [1] 2 1 3 2 2 1 0
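The calls above draw from, in order, a Bernoulli distribution with success probability 0.5, the integers 1 to 6 sampled with replacement, a normal distribution with mean 5 and standard deviation 1, a uniform distribution on \([-1, 1]\), an exponential distribution with rate 2, and a Poisson distribution with mean 1. Because the draws are random, the output changes on every run. The sketch below shows one way to make a run repeatable with set.seed, and compares a large sample mean with the theoretical mean \(1/2\) of the exponential distribution with rate 2. The seed value and the names a, b, and big are arbitrary choices made for this illustration.

```r
set.seed(2024)             # arbitrary seed, chosen only for illustration
a <- rnorm(5, 5, 1)
set.seed(2024)             # resetting the seed replays the same draws
b <- rnorm(5, 5, 1)
identical(a, b)            # TRUE

# With many draws the sample mean settles near the theoretical mean
big <- rexp(100000, 2)     # exponential with rate 2, so the mean is 1/2
mean(big)
```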
A histogram of simulated data can be drawn with hist:
x <- rexp(100, 1)
hist(x, prob=TRUE)
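With prob=TRUE the bars of the histogram are scaled so that their total area is 1, which makes the picture comparable to a probability density. The sketch below checks this numerically without drawing anything, by asking hist for the bin information only. The names h and areas are introduced here only for illustration.

```r
x <- rexp(100, 1)
h <- hist(x, plot = FALSE)             # compute the bins without drawing
areas <- h$density * diff(h$breaks)    # bar height times bar width
sum(areas)                             # total area is 1, up to rounding
```

In an interactive session, calling curve(dexp(x, 1), add = TRUE) right after hist(x, prob=TRUE) overlays the density of the exponential distribution with rate 1 for comparison.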
Functions can be defined in the following way:
sincos <- function(x){
  z <- sin(x) + cos(x)
  z
}
sincos(1)
## [1] 1.381773
By running simulations in R, approximate the average number of rolls of a fair die it takes until you see a 6 for the first time.
numrolls <- function(){
  n <- 0
  x <- 0
  while (x < 6){
    x <- sample(6, 1, replace=TRUE)
    n <- n+1
  }
  n
}
mean(replicate(1000, numrolls()))
## [1] 5.915
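The simulated value can be compared with the exact answer. The number of rolls up to and including the first 6 has a geometric distribution with success probability \(p = 1/6\), whose mean is \(1/p = 6\). The sketch below draws the roll counts directly with rgeom instead of looping; note that rgeom counts the failures before the first success, so 1 is added for the roll that shows the 6. The names p and rolls are introduced here only for illustration.

```r
p <- 1/6
1 / p                            # exact expected number of rolls: 6
# rgeom counts failures before the first success, so add 1 for the successful roll
rolls <- rgeom(100000, p) + 1
mean(rolls)                      # should be close to 6
```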
We illustrate how to write a for loop in the context of prime numbers.
Define a function that tells you whether a positive integer is prime or not.
Let \(n\) be an integer.
An integer \(d\) is a divisor of \(n\) if there exists an integer \(c\) such that \(n = cd\).
An integer \(n \geq 2\) is prime if its only positive divisors are \(1\) and \(n\).
R has a built-in remainder operator, %%, which for a nonnegative integer \(a\) and a positive integer \(b\) outputs the remainder, in the sense of elementary school, when \(a\) is divided by \(b\).
Using the remainder operator we define the isprime function, and use it to list the prime numbers up to 500.
25%%5
## [1] 0
26%%5
## [1] 1
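The remainder operator has a companion, %/%, which gives the integer quotient, so that \(a = bq + r\) with \(q\) the quotient and \(r\) the remainder. The sketch below checks this identity and also uses %% to list divisors, since \(d\) divides \(n\) exactly when the remainder of \(n\) divided by \(d\) is 0. The names a, b, q, r, and divisors are introduced here only for illustration.

```r
a <- 26
b <- 5
q <- a %/% b           # integer quotient: 5
r <- a %% b            # remainder: 1
a == b * q + r         # TRUE

# d divides n exactly when n %% d is 0 (divisors is a name used only here)
divisors <- function(n) which(n %% 1:n == 0)
divisors(12)           # 1 2 3 4 6 12
divisors(13)           # 1 13, so 13 is prime
```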
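# Returns 1 if n is prime and 0 otherwise
isprime <- function(n){
  x <- 1
  # x holds the latest remainder; once some i divides n, x becomes 0 and stays 0
  for (i in 2:(n-1)){
    if (x > 0) {
      x <- n %% i
    }
  }
  # The range 2:(n-1) counts downwards when n is 1 or 2, so handle these directly
  if (n == 1) {x <- 0}
  if (n == 2) {x <- 1}
  x
}
isprime(101)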
## [1] 1
x <- 2
for (i in 3:500){
  if (isprime(i) == 1){
    x <- c(x, i)
  }
}
print(x)
## [1] 2 3 5 7 11 13 17 19 23 29 31
## [12] 37 41 43 47 53 59 61 67 71 73 79
## [23] 83 89 97 101 103 107 109 113 127 131 137
## [34] 139 149 151 157 163 167 173 179 181 191 193
## [45] 197 199 211 223 227 229 233 239 241 251 257
## [56] 263 269 271 277 281 283 293 307 311 313 317
## [67] 331 337 347 349 353 359 367 373 379 383 389
## [78] 397 401 409 419 421 431 433 439 443 449 457
## [89] 461 463 467 479 487 491 499
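The test above tries every candidate divisor from 2 to \(n-1\). A common shortcut, sketched below as an alternative rather than as the method used in these notes, is to stop at \(\sqrt{n}\): if \(n = cd\) with \(c \leq d\), then \(c \leq \sqrt{n}\), so a composite number always has a divisor in that range. The sketch also applies %% to a whole vector of candidates at once instead of looping. The names isprime2 and primes are hypothetical, and isprime2 returns TRUE or FALSE rather than 1 or 0.

```r
# isprime2 is a hypothetical name; it returns TRUE/FALSE rather than 1/0
isprime2 <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)               # 2 and 3 are prime
  # a composite n has a divisor d with 2 <= d <= sqrt(n)
  all(n %% 2:floor(sqrt(n)) != 0)
}
primes <- Filter(isprime2, 1:500)       # keep the integers for which isprime2 is TRUE
length(primes)                          # 95 primes up to 500
```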