rm(list = ls()) # clean-up workspace

Announcement

R basics

styles

(reading assignment)

Checkout

Note: Unlike most other programming languages, R allows “.” in variable and function names.

this.is.a.variable <- 7360
typeof(this.is.a.variable)
## [1] "double"
this.is.a.variable
## [1] 7360

First functions to learn

symbol   use
?        get documentation
str()    show structure

test.str <- 1:6
str(test.str)
##  int [1:6] 1 2 3 4 5 6
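
As a small illustration of the point above, str() can be applied to other kinds of objects as well. This is only a sketch; the values are made up for the example and no objects are created.

```r
str(c(1.5, 2, 3))              # a double ("num") vector
str(c("a", "b"))               # a character ("chr") vector
str(list(id = 1L, tag = "x"))  # a list: one line per component
```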

Arithmetic

R can perform all the basic mathematical computations.

symbol      use
+           addition
-           subtraction
*           multiplication
/           division
^           power
%%          modulo (remainder)
sign()      sign of the value: 1 (positive), 0 (zero), or -1 (negative)
abs()       absolute value
exp()       exponential function
log()       natural logarithm
log10()     base-10 logarithm, log(x, base = 10)
log2()      base-2 logarithm, log(x, base = 2)
sqrt()      square root
round()     round to the nearest value
floor()     round down
ceiling()   round up
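
The operators and functions in the table can be tried directly at the console. The following is a minimal sketch with values chosen only for illustration; nothing is assigned, so the workspace is left unchanged.

```r
# Basic arithmetic operators
7 + 3        # addition: 10
7 - 3        # subtraction: 4
7 * 3        # multiplication: 21
7 / 2        # division: 3.5
7^2          # power: 49
7 %% 3       # modulo (remainder): 1
(-7) %% 3    # 2: the result takes the sign of the divisor

# The functions are vectorized: they act on every element of a vector
sign(c(-2.5, 0, 2.5))       # -1 0 1
abs(c(-2.5, 0, 2.5))        # 2.5 0.0 2.5
floor(c(-2.5, 0, 2.5))      # -3 0 2
ceiling(c(-2.5, 0, 2.5))    # -2 0 3
round(3.14159, digits = 2)  # 3.14

log(exp(1))  # the natural logarithm undoes exp(): 1
log10(1000)  # 3
log2(8)      # 3
sqrt(16)     # 4
```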

Comparison (logical operators)

symbol             use
!=                 not equal
==                 equal
>                  greater
>=                 greater or equal
<                  smaller
<=                 smaller or equal
is.na()            is the value “Not Available” (missing)?
complete.cases()   returns a logical vector specifying which observations/rows have no missing values
is.finite()        is the value finite?
all()              are all values in a logical vector true?
any()              is any value in a logical vector true?

test.vec <- 73:68
test.vec
## [1] 73 72 71 70 69 68
test.vec < 70
## [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
test.vec > 70
## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE
test.vec[3] <- NA
test.vec
## [1] 73 72 NA 70 69 68
is.na(test.vec)
## [1] FALSE FALSE  TRUE FALSE FALSE FALSE
complete.cases(test.vec)
## [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
all(is.na(test.vec))
## [1] FALSE
any(is.na(test.vec))
## [1] TRUE

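Now let’s test the accuracy of doubles in R. Recall that double precision gives approximately \(\log_{10}(2^{52}) \approx 16\) significant decimal digits of precision. The comparisons below show the point at which a small quantity stops changing the result; this point depends on the magnitude of the number it is subtracted from.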

test.exponent <- -(7:18)
10^test.exponent == 0
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
1 - 10^test.exponent == 1
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
7360 - 10^test.exponent == 7360
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
73600 - 10^test.exponent == 73600
##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
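
A practical consequence of the limited precision shown above is that doubles should rarely be compared with “==”. The sketch below uses the base R constant .Machine$double.eps and the base function all.equal(); the numbers 0.1, 0.2 and 0.3 are just a common illustration and do not come from the notes.

```r
# Machine epsilon: the gap between 1 and the next larger double
.Machine$double.eps == 2^-52         # TRUE

# 0.1 + 0.2 is not stored exactly as 0.3
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: compares up to a small tolerance
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE() here.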

Other operators

%in%, match

test.vec
## [1] 73 72 NA 70 69 68
66 %in% test.vec
## [1] FALSE
match(66, test.vec, nomatch = 0)
## [1] 0
70 %in% test.vec
## [1] TRUE
match(70, test.vec, nomatch = 0)
## [1] 4
match(70, test.vec, nomatch = 0) > 0 # the implementation of %in%
## [1] TRUE
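
Both “%in%” and “match” also accept a vector on the left-hand side. A short sketch, using literal vectors (the second one has the same contents as test.vec above) so that no new objects are created:

```r
c(73, 66, 70) %in% c(73, 72, NA, 70, 69, 68)     # TRUE FALSE TRUE
match(c(73, 66, 70), c(73, 72, NA, 70, 69, 68))  # 1 NA 4 (NA means "not found")

# Unlike ==, match() treats NA as a value that can be found
NA %in% c(73, 72, NA, 70, 69, 68)                # TRUE
NA == NA                                         # NA
```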

Objects

You can create an R object to save results of a computation or other command.

Example 1

x <- 3 + 5
x
## [1] 8
  • In most languages, assignment passes the value from right to left (e.g. with “=”). R, however, allows both directions (“<-” and “->”), which is actually bad! In this course, we encourage the use of “<-” or “=”. Some people prefer “=” over “<-” because “<-” can accidentally be broken into the two operators “<” and “-”, which turns the assignment into a comparison.

Example 2

x < - 3 + 5
## [1] FALSE
x
## [1] 8
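
For comparison, the three assignment forms that R accepts can be put side by side. This is only a sketch: the names a.left, a.equal and a.right are hypothetical and are removed again at the end so that they do not show up in the workspace.

```r
a.left <- 3 + 5      # leftward assignment, the conventional form
a.equal = 3 + 5      # "=" also assigns at the top level
3 + 5 -> a.right     # rightward assignment is legal but rarely used

identical(a.left, a.equal)   # TRUE
identical(a.left, a.right)   # TRUE

rm(a.left, a.equal, a.right) # remove the illustrative objects again
```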
  • For naming conventions, stick with either “.” or “_” (refer to the style guide).

Example 3

sum.result <- x + 5
sum.result
## [1] 13
  • Important: many names are already taken by built-in R functions. Make sure that you don’t override them.

Example 4

sum(2:5)
## [1] 14
sum
## function (..., na.rm = FALSE)  .Primitive("sum")
sum <- 3 + 4 + 5
sum(5:8)
## [1] 26
sum
## [1] 12
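
If a built-in name has been masked by accident, it can be recovered. The sketch below uses “max” instead of “sum” so that the session above is not disturbed; it relies only on base R behavior (function lookup in calls, the base:: prefix, and rm()).

```r
max <- 10          # masks the name "max" with a number (illustrative only)
max(1:3)           # 3: in a call, R skips non-function objects named max
max                # 10: as a plain name, the new object is found first

base::max(1:3)     # 3: the namespace prefix always reaches the built-in
rm(max)            # delete the masking object
is.function(max)   # TRUE: the name refers to the built-in again
```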
  • R is case-sensitive. “Math.7360” is different from “math.7360”.

  • Use “assign” and “get” to create or retrieve an object by its name, given as a string.

Example 5

assign("test.vec.duplicate", test.vec)
test.vec
## [1] 73 72 NA 70 69 68
test.vec.duplicate
## [1] 73 72 NA 70 69 68
test.vec[3] <- 71

get("test.vec")
## [1] 73 72 71 70 69 68
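
Working with names as strings is mostly useful when the names are built programmatically. A minimal sketch, assuming a hypothetical “demo.obj.” prefix; the objects are removed at the end so the workspace stays as it was.

```r
# Build object names from strings and assign a value to each
for (i in 1:3) {
  assign(paste0("demo.obj.", i), i^2)
}
get("demo.obj.2")                        # 4
exists("demo.obj.3")                     # TRUE
exists("demo.obj.4")                     # FALSE

rm(i, list = paste0("demo.obj.", 1:3))   # clean up the illustrative objects
```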

Locating and deleting objects:

The commands “objects()” and “ls()” will provide a list of every object that you’ve created in a session.

objects()
## [1] "sum"                "sum.result"         "test.exponent"     
## [4] "test.str"           "test.vec"           "test.vec.duplicate"
## [7] "this.is.a.variable" "x"
ls()
## [1] "sum"                "sum.result"         "test.exponent"     
## [4] "test.str"           "test.vec"           "test.vec.duplicate"
## [7] "this.is.a.variable" "x"

The “rm()” and “remove()” commands let you delete objects. Tip: always clean up your workspace as the first command of a script.

# rm(list=ls())  # clean up workspace
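
Objects can also be removed selectively instead of all at once. A short sketch with hypothetical scratch objects tmp.a and tmp.b:

```r
tmp.a <- 1                     # hypothetical scratch objects
tmp.b <- 2
exists("tmp.a")                # TRUE

rm(tmp.a)                      # remove one object by name
rm(list = c("tmp.b"))          # remove objects named in a character vector
exists("tmp.a")                # FALSE
exists("tmp.b")                # FALSE

ls(pattern = "^tmp\\.")        # list only objects whose names start with "tmp."
```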

Vectors

Many commands in R generate a vector of output, rather than a single number.

The “c()” command creates a vector from the specified elements.

Example 1

c(7, 3, 6, 0)
## [1] 7 3 6 0
c(73:60)
##  [1] 73 72 71 70 69 68 67 66 65 64 63 62 61 60
c(7:3, 6:0)
##  [1] 7 6 5 4 3 6 5 4 3 2 1 0
c(rep(7:3, 6), 0)
##  [1] 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 0

Example 2

The command “seq()” creates a sequence of numbers.

seq(7)
## [1] 1 2 3 4 5 6 7
seq(3, 70, by = 6)
##  [1]  3  9 15 21 27 33 39 45 51 57 63 69
seq(3, 70, length.out = 6)
## [1]  3.0 16.4 29.8 43.2 56.6 70.0
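
Two relatives of seq() from base R, seq_len() and seq_along(), are often handy when building index vectors. The sketch below is illustrative; n.items is a hypothetical count.

```r
seq_len(5)                       # 1 2 3 4 5
seq_along(c("a", "b", "c"))      # 1 2 3
seq(10, 1, by = -3)              # 10 7 4 1

# seq_len() is safer than 1:n when n may be 0
n.items <- 0                     # hypothetical count
1:n.items                        # 1 0  (counts down, usually not intended)
seq_len(n.items)                 # integer(0)
```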

Operations on vectors

Use brackets to select elements of a vector.

x <- 73:60
x[2]
## [1] 72
x[2:5]
## [1] 72 71 70 69
x[-(2:5)]
##  [1] 73 68 67 66 65 64 63 62 61 60
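
Besides positions, brackets also accept a logical vector, which ties indexing to the comparison operators. A minimal sketch, using the hypothetical name scores for a vector with the same values as x:

```r
scores <- 73:60                  # same values as x, under a hypothetical name
scores[scores > 70]              # 73 72 71
scores[scores %% 2 == 0]         # 72 70 68 66 64 62 60
which(scores == 68)              # 6: position of the match
scores[c(1, length(scores))]     # 73 60: first and last element
```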

Elements can also be accessed by name, which stays correct even when the column/row order changes.

y <- 1:3
names(y) <- c("do", "re", "mi")
y[3]
## mi 
##  3
y["mi"]
## mi 
##  3
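
To see why access by name is robust to reordering, consider the following sketch. The names notes and notes.rev are hypothetical; rev() is the base R function that reverses a vector.

```r
notes <- c(do = 1, re = 2, mi = 3)   # names can be set directly in c()
notes.rev <- rev(notes)              # reordered: mi, re, do

notes.rev[3]                         # do = 1: position 3 now holds a different element
notes.rev["mi"]                      # mi = 3: the name still finds the same element
notes[c("do", "mi")]                 # several elements can be selected by name
```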

Matrix

The “matrix()” command creates a matrix from the given set of values.

matrix.example <- matrix(rnorm(100), nrow = 10, ncol = 10, byrow = TRUE)
matrix.example
##             [,1]          [,2]       [,3]        [,4]        [,5]       [,6]
##  [1,]  0.2487614 -1.3283204760 -0.3261691 -2.29128182 -1.12003362 -0.7339829
##  [2,]  0.4467889  1.4244049895 -1.3744598 -0.61844468 -0.09760008  0.1450305
##  [3,]  0.5565741  0.5613392990 -1.0462043 -0.16772428  0.66317840  0.1717412
##  [4,]  0.4781100  0.1031285257 -1.3827341  0.52628434 -1.03155402  0.1887497
##  [5,]  0.6615452 -1.3978513407 -0.6098668 -0.64822571  0.73282632 -0.6848463
##  [6,] -0.2767702 -0.0413636658  0.2877840 -0.53158161  2.18101752  0.1049041
##  [7,]  2.0772488  0.0005798236  0.6622772 -0.29926697 -0.44139444 -0.6727295
##  [8,] -1.5433887 -1.3868296873 -1.0937415 -1.05320024 -1.02114233  0.2773030
##  [9,] -1.1489334  3.0698628891 -1.4326127  0.63103254 -2.14939521  0.2847929
## [10,] -0.3303552  0.2787351466  0.3609868 -0.00640998 -0.14099100  0.4735913
##              [,7]        [,8]       [,9]       [,10]
##  [1,]  0.88670128  1.07644234  0.5691316  0.11664369
##  [2,] -0.42085126 -0.35566720 -1.2787175  0.62591946
##  [3,] -0.24182343  2.50147677 -0.8938938 -0.82472030
##  [4,] -0.15644894 -0.92356896  0.5796492 -0.83197244
##  [5,]  0.97749167 -0.34432111 -0.5241366  0.72904207
##  [6,]  0.09955405  1.31901689 -1.0952569 -1.86012755
##  [7,]  0.74249468 -0.22862938 -1.4529933  0.88682884
##  [8,] -1.49211883 -0.75581818 -1.1892348 -0.06324179
##  [9,]  1.68691575 -0.09244263 -3.3025809 -0.21126535
## [10,] -0.26314783  0.14841914  0.1975597  1.21253209
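
A smaller matrix makes the effect of byrow and the indexing rules easier to see. This is a sketch with hypothetical object names; the last lines use set.seed() from base R to show how random draws such as rnorm() can be made reproducible (the seed value itself is arbitrary).

```r
by.col <- matrix(1:6, nrow = 2, ncol = 3)                # default: fill column by column
by.row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # fill row by row

by.col[, 1]      # 1 2: first column
by.row[1, ]      # 1 2 3: first row
by.row[, 2]      # 2 5: second column
by.row[2, 3]     # 6: element in row 2, column 3

# Same seed, same random numbers
set.seed(7360)
first.draw <- rnorm(3)
set.seed(7360)
second.draw <- rnorm(3)
identical(first.draw, second.draw)   # TRUE
```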

R commands on vector/matrix

command        usage
sum()          sum over the elements of a vector/matrix
mean()         compute the average value
sort()         sort all elements of a vector/matrix
min(), max()   min and max values of a vector/matrix
length()       number of elements in a vector/matrix
summary()      returns the min, Q1, median, mean, Q3, and max values of a vector (column by column for a matrix)
dim()          dimensions of a matrix
cbind()        combine vector, matrix, or data-frame arguments by columns
rbind()        combine vector, matrix, or data-frame arguments by rows
names()        get or set the names of an object
colnames()     get or set the column names of a matrix-like object
rownames()     get or set the row names of a matrix-like object

sum(matrix.example)
## [1] -15.27799
mean(matrix.example)
## [1] -0.1527799
sort(matrix.example)
##   [1] -3.3025808825 -2.2912818250 -2.1493952084 -1.8601275488 -1.5433886763
##   [6] -1.4921188288 -1.4529933136 -1.4326126933 -1.3978513407 -1.3868296873
##  [11] -1.3827341216 -1.3744598370 -1.3283204760 -1.2787174790 -1.1892347988
##  [16] -1.1489333947 -1.1200336205 -1.0952568874 -1.0937415249 -1.0532002370
##  [21] -1.0462043173 -1.0315540166 -1.0211423305 -0.9235689563 -0.8938938457
##  [26] -0.8319724445 -0.8247203017 -0.7558181823 -0.7339829079 -0.6848462825
##  [31] -0.6727294922 -0.6482257080 -0.6184446776 -0.6098668487 -0.5315816116
##  [36] -0.5241366303 -0.4413944380 -0.4208512571 -0.3556671975 -0.3443211058
##  [41] -0.3303551592 -0.3261690684 -0.2992669743 -0.2767701549 -0.2631478332
##  [46] -0.2418234349 -0.2286293806 -0.2112653520 -0.1677242763 -0.1564489355
##  [51] -0.1409909978 -0.0976000753 -0.0924426281 -0.0632417911 -0.0413636658
##  [56] -0.0064099797  0.0005798236  0.0995540482  0.1031285257  0.1049041025
##  [61]  0.1166436913  0.1450304900  0.1484191371  0.1717412016  0.1887496990
##  [66]  0.1975597151  0.2487613818  0.2773029635  0.2787351466  0.2847928839
##  [71]  0.2877839979  0.3609868474  0.4467888704  0.4735913266  0.4781100209
##  [76]  0.5262843355  0.5565741240  0.5613392990  0.5691315636  0.5796492106
##  [81]  0.6259194576  0.6310325401  0.6615452108  0.6622771599  0.6631783990
##  [86]  0.7290420698  0.7328263249  0.7424946763  0.8867012761  0.8868288379
##  [91]  0.9774916666  1.0764423401  1.2125320919  1.3190168881  1.4244049895
##  [96]  1.6869157466  2.0772487856  2.1810175237  2.5014767706  3.0698628891
summary(matrix.example)
##        V1                V2                 V3                V4          
##  Min.   :-1.5434   Min.   :-1.39785   Min.   :-1.4326   Min.   :-2.29128  
##  1st Qu.:-0.3170   1st Qu.:-1.00658   1st Qu.:-1.3043   1st Qu.:-0.64078  
##  Median : 0.3478   Median : 0.05185   Median :-0.8280   Median :-0.41542  
##  Mean   : 0.1170   Mean   : 0.12837   Mean   :-0.5955   Mean   :-0.44588  
##  3rd Qu.: 0.5370   3rd Qu.: 0.49069   3rd Qu.: 0.1343   3rd Qu.:-0.04674  
##  Max.   : 2.0772   Max.   : 3.06986   Max.   : 0.6623   Max.   : 0.63103  
##        V5                V6                 V7                 V8         
##  Min.   :-2.1494   Min.   :-0.73398   Min.   :-1.49212   Min.   :-0.9236  
##  1st Qu.:-1.0290   1st Qu.:-0.47832   1st Qu.:-0.25782   1st Qu.:-0.3528  
##  Median :-0.2912   Median : 0.15839   Median :-0.02845   Median :-0.1605  
##  Mean   :-0.2425   Mean   :-0.04454   Mean   : 0.18188   Mean   : 0.2345  
##  3rd Qu.: 0.4730   3rd Qu.: 0.25516   3rd Qu.: 0.85065   3rd Qu.: 0.8444  
##  Max.   : 2.1810   Max.   : 0.47359   Max.   : 1.68692   Max.   : 2.5015  
##        V9                V10          
##  Min.   :-3.30258   Min.   :-1.86013  
##  1st Qu.:-1.25635   1st Qu.:-0.67136  
##  Median :-0.99457   Median : 0.02670  
##  Mean   :-0.83905   Mean   :-0.02204  
##  3rd Qu.: 0.01714   3rd Qu.: 0.70326  
##  Max.   : 0.57965   Max.   : 1.21253
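
The remaining commands from the table can be tried on a small matrix. A minimal sketch with hypothetical object names (small, wider, taller):

```r
small <- matrix(1:6, nrow = 2)          # 2 x 3 matrix
dim(small)                              # 2 3
length(small)                           # 6
c(min(small), max(small))               # 1 6

wider  <- cbind(small, c(7, 8))         # append a column: 2 x 4
taller <- rbind(small, c(10, 20, 30))   # append a row: 3 x 3
dim(wider)                              # 2 4
dim(taller)                             # 3 3

colnames(small) <- c("a", "b", "c")
rownames(small) <- c("r1", "r2")
small["r2", "b"]                        # 4

summary(as.vector(small))               # summarize all elements, not each column
```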

Exercise: Write a command that generates a random permutation of the numbers 1 to 5 and saves it to an object.

Control flow

These are the basic control-flow constructs of the R language. They function in much the same way as control statements in any Algol-like language (Algol is short for “Algorithmic Language”). They are all reserved words.

keyword   usage
if        if(cond) expr
if-else   if(cond) cons.expr else alt.expr
for       for(var in seq) expr
while     while(cond) expr
break     breaks out of a for or while loop
next      halts the processing of the current iteration and advances the looping index
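
The constructs in the table can be sketched as follows. All variable names and values are hypothetical and serve only to show each keyword once.

```r
# if / else: classify a single number
value <- -3
if (value > 0) {
  label <- "positive"
} else if (value < 0) {
  label <- "negative"
} else {
  label <- "zero"
}
label                             # "negative"

# for with next and break: add up odd numbers, stop once past 7
odd.sum <- 0
for (k in 1:10) {
  if (k %% 2 == 0) next           # skip even numbers
  if (k > 7) break                # leave the loop entirely
  odd.sum <- odd.sum + k
}
odd.sum                           # 1 + 3 + 5 + 7 = 16

# while: halve a value until it drops below 1
steps <- 0
remaining <- 100
while (remaining >= 1) {
  remaining <- remaining / 2
  steps <- steps + 1
}
steps                             # 7
```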

Define a function

DoNothing <- function() {
  return(invisible(NULL))
}
DoNothing()
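
A function usually takes arguments and returns a value. The following sketch defines a hypothetical helper, Rescale, with one required argument and one argument that has a default value; it is an illustration, not part of the course code.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1]
Rescale <- function(v, na.rm = TRUE) {
  lo <- min(v, na.rm = na.rm)     # smallest value, ignoring NA by default
  hi <- max(v, na.rm = na.rm)     # largest value, ignoring NA by default
  return((v - lo) / (hi - lo))
}
Rescale(c(60, 70, 80))            # 0.0 0.5 1.0
Rescale(c(60, NA, 80))            # 0 NA 1
```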

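In general, try to avoid explicit loops in R and vectorize your code instead. If you have to loop, try a for loop first, because it runs over a fixed sequence and therefore always ends. A while loop can be dangerous: if its condition never becomes false, it runs forever, and you should not rely on R to detect this for you.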

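DoBadThing <- function() {
  result <- NULL
  while(TRUE) {  # the condition is always TRUE, so the loop never ends
    result <- c(result, rnorm(100))  # result keeps growing until memory runs out
  }
  return(result)  # never reached
}
# DoBadThing()  # left commented out on purpose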
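
To make the advice on vectorization concrete, here is the same computation written once with a loop and once vectorized. This is a sketch with hypothetical names (values, squares.loop, squares.vec).

```r
values <- 1:1000

# Loop version: preallocate the result, then fill it element by element
squares.loop <- numeric(length(values))
for (i in seq_along(values)) {
  squares.loop[i] <- values[i]^2
}

# Vectorized version: one expression, no explicit loop
squares.vec <- values^2

all.equal(squares.loop, squares.vec)   # TRUE
sum(squares.vec)                       # 333833500
```

Note that the loop version preallocates its result with numeric() instead of growing it with c() in every iteration, which is what DoBadThing does.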