Announcement

Course Project

Project ideas/Dataset resources

Brief Description components

  • Introduce the dataset (data type, origin, etc.). Explain why you chose the dataset. List some questions you want to explore with the dataset.

Mid-term report components

  • Include the brief description with modifications if needed

  • Give an abstract on your plan

    • Which analyses you plan to perform to answer your questions
  • Current progress and future plan

Final report components

  • Introduce the dataset. Explain why you chose it. Explain what questions you want to ask and explore using the dataset.

  • Analysis. Explain the statistical methods that you use for analyzing the dataset. Explain what you have done to generate the results (make your analysis reproducible).

  • Results. Illustrate your results. Use figures and tables to improve readability.

  • Discussion. This is the place to share almost anything you like: difficulties you met in the analysis, what you learned from the analysis, and possible future directions.

Consequences of computer storage / arithmetic

Programming Languages

Compiled versus interpreted languages.

More about computer languages

R basics

styles

(reading assignment)

Check out Google’s R Style Guide, the style guide in Advanced R, and the tidyverse style guide.

Arithmetic

R can perform all the basic mathematical computations.

symbol      use
+           addition
-           subtraction
*           multiplication
/           division
^           power
%%          modulus (remainder)
exp()       exponential function
log()       natural logarithm
sqrt()      square root
round()     rounding
floor()     rounding down (floor)
ceiling()   rounding up (ceiling)
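
As a quick illustration of the operators and functions listed above, the following sketch evaluates each one on small numbers. The specific values are chosen only for illustration and are not part of the course material.

```r
# Basic arithmetic operators
7 + 3            # addition: 10
7 - 3            # subtraction: 4
7 * 3            # multiplication: 21
7 / 2            # division: 3.5
7 ^ 2            # power: 49
7 %% 3           # modulus (remainder after division): 1

# Common mathematical functions
exp(1)           # e^1, about 2.718282
log(exp(2))      # the natural logarithm undoes exp(): 2
sqrt(49)         # 7
round(3.567, 2)  # round to 2 decimal places: 3.57
floor(3.7)       # largest integer not above 3.7: 3
ceiling(3.2)     # smallest integer not below 3.2: 4
```

Note that “log()” is the natural logarithm by default; “log10()” and “log2()” give base-10 and base-2 logarithms.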

Objects

You can create an R object to save results of a computation or other command.

Example 1

x <- 3 + 5
x
## [1] 8
  • In most languages, assignment passes the value into the object from right to left (e.g. with “=”). However, R allows both directions, “<-” and “->” (which is actually bad!). In this course, we encourage the use of “<-” or “=”. Some people prefer “=” over “<-” because a stray space breaks “<-” into two operators, “< -”, which R reads as a comparison with a negative number.

Example 2

x < - 3 + 5
## [1] FALSE
x
## [1] 8
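
To make the assignment forms discussed above concrete, here is a minimal sketch that puts them side by side. The object names “y” and “z” are illustrative and do not appear elsewhere in these notes.

```r
x <- 3 + 5    # leftward assignment with "<-"
y = 3 + 5     # leftward assignment with "="
3 + 5 -> z    # rightward assignment: legal in R, but easy to misread

x             # 8
y             # 8
z             # 8

x < -3        # with a space this is a comparison (is x less than -3?): FALSE
x             # still 8, because nothing was assigned on the previous line
```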
  • For naming conventions, stick with either “.” or “_” as the word separator (refer to the style guide).

Example 3

sum.result <- x + 5
sum.result
## [1] 13
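  • Important: many names are already taken by built-in R functions. Make sure that you don’t overwrite them. When you assign a value to such a name, your object masks the built-in one; R still finds the function when the name is used in a call, but the same name now means two different things.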

Example 4

sum(2:5)
## [1] 14
sum
## function (..., na.rm = FALSE)  .Primitive("sum")
sum <- 3 + 4 + 5
sum(5:8)
## [1] 26
sum
## [1] 12
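
If a built-in name has been masked as in Example 4, one way to recover is sketched below. It assumes the masking object lives in your workspace, so deleting it makes the name refer only to the built-in function again.

```r
sum <- 3 + 4 + 5   # creates a workspace object that masks the name "sum"
sum                # 12
base::sum(5:8)     # 26; explicitly request the function from the base package
rm(sum)            # delete the workspace object
sum(5:8)           # 26
is.function(sum)   # TRUE; "sum" refers only to the built-in function again
```

Running “rm(sum)” also removes "sum" from the object listings returned by “objects()” and “ls()”.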
  • R is case-sensitive. “Math.7360” is different from “math.7360”.

Locating and deleting objects:

The commands “objects()” and “ls()” will provide a list of every object that you’ve created in a session.

objects()
## [1] "sum"        "sum.result" "x"
ls()
## [1] "sum"        "sum.result" "x"

The “rm()” and “remove()” commands let you delete objects (tip: always clean up your workspace as the first command).

rm(list=ls())  # clean up workspace
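
Besides clearing everything, “rm()” can delete selected objects only. The sketch below uses three hypothetical objects (“height”, “weight”, “scratch”) that are not part of these notes.

```r
height  <- 170
weight  <- 65
scratch <- 0

rm(scratch)                          # remove a single object by name
exists("scratch", inherits = FALSE)  # FALSE

rm(list = c("height", "weight"))     # remove several objects named in a character vector
exists("height", inherits = FALSE)   # FALSE
```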

Vectors

Many commands in R generate a vector of output, rather than a single number.

The “c()” command creates a vector by combining the elements you specify.

Example 1

c(7, 3, 6, 0)
## [1] 7 3 6 0
c(73:60)
##  [1] 73 72 71 70 69 68 67 66 65 64 63 62 61 60
c(7:3, 6:0)
##  [1] 7 6 5 4 3 6 5 4 3 2 1 0
c(rep(7:3, 6), 0)
##  [1] 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 0

Example 2 The command “seq()” creates a sequence of numbers.

seq(7)
## [1] 1 2 3 4 5 6 7
seq(3, 70, by = 6)
##  [1]  3  9 15 21 27 33 39 45 51 57 63 69
seq(3, 70, length = 6)
## [1]  3.0 16.4 29.8 43.2 56.6 70.0
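
A few variations on “seq()” and “rep()” may help clarify the arguments used in the examples above. This is a small sketch; the particular numbers are chosen only for illustration.

```r
seq(3, 70, length.out = 6)  # full argument name; "length" above works by partial matching
seq_len(7)                  # 1 2 3 4 5 6 7
seq(10, 1, by = -3)         # a decreasing sequence: 10 7 4 1

rep(7:3, times = 2)         # repeat the whole vector: 7 6 5 4 3 7 6 5 4 3
rep(7:3, each = 2)          # repeat each element:     7 7 6 6 5 5 4 4 3 3
```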

Operations on vectors

Use brackets to select elements of a vector. A negative index drops the corresponding elements.

x <- 73:60
x[2]
## [1] 72
x[2:5]
## [1] 72 71 70 69
x[-(2:5)]
##  [1] 73 68 67 66 65 64 63 62 61 60
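
Brackets also accept a vector of positions or a logical condition. The following sketch continues with the same “x”; the threshold 70 is an arbitrary choice for illustration.

```r
x <- 73:60
x[c(1, length(x))]  # first and last elements: 73 60
x[x > 70]           # elements satisfying a condition: 73 72 71
x[-1]               # everything except the first element
```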

Elements can also be accessed by name, which stays correct even if the order of the elements (or of the columns/rows) changes.

y <- 1:3
names(y) <- c("do", "re", "mi")
y[3]
## mi 
##  3
y["mi"]
## mi 
##  3
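
To see why access by name is safer, the sketch below reverses the vector: the positional index now picks a different element, while the name still finds the same one. The object “y.rev” is introduced here only for illustration.

```r
y <- c(do = 1, re = 2, mi = 3)  # names can also be given when the vector is created
y.rev <- rev(y)                 # reversed order: mi, re, do

y.rev[3]                        # position 3 is now the element named "do" (value 1)
y.rev["mi"]                     # the name still selects the element "mi" (value 3)
y[c("do", "mi")]                # several elements can be selected by name
```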

R commands on vectors

command        usage
sum()          sum over elements in vector
mean()         compute average value
sort()         sort elements in a vector
min(), max()   min and max values of a vector
length()       length of a vector
summary()      returns the min, Q1, median, mean, Q3, and max values of a vector
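
The commands in the table can be tried on a small vector. This sketch reuses the values from the first “c()” example.

```r
x <- c(7, 3, 6, 0)

sum(x)       # 16
mean(x)      # 4
sort(x)      # 0 3 6 7 (increasing order by default)
min(x)       # 0
max(x)       # 7
length(x)    # 4
summary(x)   # min, Q1, median, mean, Q3, and max in one call
```

Use “sort(x, decreasing = TRUE)” to sort from largest to smallest.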

Exercise: Write a command to generate a random permutation of the integers from 1 to 5 and save it to an object.