Tutorial 1

Uploaded by

Jessica Kristy
Tutorial 1

Introduction to R
What is R?
• R is a powerful language and environment for statistical computing
and graphics. It is widely used as an educational language and
research tool.
• Free
• A lot of help online;
• Can be understood (hopefully) by people without any programming
experience
• Open-source (Thousands of packages online, and still growing)
• We prefer to use R in combination with the RStudio interface, which
has an organized layout and several extra options.
Install R and RStudio on your PC
1. Install R
2. Install RStudio
Or use Posit Cloud (formerly RStudio Cloud) in your browser
https://posit.cloud/
Posit Cloud lets you access RStudio right in your browser, with no installation or complex configuration required.
1. Sign up with your email address
2. Verify your email
3. Create a New RStudio Project
RStudio Layout

• Top left: editor window (also called script window)
• Bottom left: console window
• Top right: workspace / history window
• Bottom right: files / plots / packages / help window
Install and Load Packages in R
• install.packages("XXXX") # install a new package; replace XXXX with the package name
• library(XXXX) # load an installed package; replace XXXX with the package name
Note: Text in red is code; grey text is a placeholder to replace. Text after
the pound sign "#" on the same line is a comment.
Set the working directory
• setwd("c:/User/your location") # or setwd("~/your location") on a Mac
• setwd(choose.dir()) # choose the directory manually (choose.dir() is available on Windows only)
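A minimal sketch of checking and changing the working directory. It uses tempdir() purely as a stand-in target so that it runs on any machine; in practice you would pass the path of your own project folder.

```r
# Remember where we started so we can return afterwards
old_dir <- getwd()

# tempdir() is only a stand-in for your own project folder
setwd(tempdir())
getwd()          # now shows the temporary directory

# List the files in the current working directory
list.files()

# Go back to the original directory
setwd(old_dir)
```

Storing the original directory first makes it easy to undo the change at the end of a script.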
Basic Expressions: Operators
• +, -, *, /, ^, sqrt
• 1+1
• 2-1
• 3*2
• 6/3
• 3^3
• sqrt(9)
• sqrt((1+7)*8/5-9)
• "hello world" # strings need quotes
• 3 < 4 # some expressions return a “logical value”: TRUE or FALSE
Operators in R

Arithmetic Operators
Operator     Description         Example
= (or <-)    Assign a value      a = 1+2
+            Add                 x+y
-            Subtract            x-y
*            Multiply            x*y
/            Divide              x/y
** or ^      Exponentiation      x^y or x**y
%%           Modulus             x%%y
%/%          Integer division    x%/%y

Logical Operators
Operator     Description                       Example
<, >         Less, greater than                x<y
<=, >=       Less, greater than or equal to    x>=y
==           Equal to                          x==y
!=           Not equal to                      x!=y
!            Not                               !x
|            Or                                x|y
&            And                               x&y
isTRUE()     Test if true                      isTRUE(x==y)
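The less familiar operators in the table can be tried out with two small numbers. This is only an illustrative sketch; the values 7 and 2 are arbitrary.

```r
x <- 7
y <- 2

x %% y      # remainder of 7 divided by 2: 1
x %/% y     # whole-number part of 7 divided by 2: 3
x ^ y       # 7 to the power of 2: 49

# Comparisons return TRUE or FALSE, and can be combined
x > y & y > 0    # both conditions must hold: TRUE
x < y | y == 2   # at least one condition must hold: TRUE
!(x == y)        # negation of FALSE: TRUE

# The quotient and remainder rebuild the original number
(x %/% y) * y + (x %% y) == x   # TRUE
```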
Variables
• x <- 85 # <- assigns a value
• x/2 # use x in expressions
• x <- x*2 + 48 # assigning to x again overwrites its previous value
• x <- "hello world" # assign a string value to x
• Height <- 180
• Weight <- 50
• Height*Weight
Functions
• You call a function by typing its name, followed by one or more
arguments to that function in parentheses
• sum(1,2,4,7) # summation
• rep("penny", times=3) # repeat "penny" 3 times
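A short sketch of how arguments are passed to functions. It uses only base R functions; round() is brought in here as an extra example and is not part of the original slide.

```r
# Arguments can be matched by position or by name
rep("penny", 3)            # positional: the second argument is 'times'
rep("penny", times = 3)    # same call with the argument named

# Some arguments have defaults that you can override
round(3.14159)             # default is 0 decimal places
round(3.14159, digits = 2) # keep two decimal places

# Functions can be nested: the inner result feeds the outer call
sqrt(sum(1, 2, 4, 9))      # the sum is 16, so the square root is 4
```

Naming arguments makes a call easier to read, especially when a function takes several of them.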
Useful Tips
• R is case sensitive
• help(rep) # or ?rep
• Always use # to add comments to your code
• Quick-start tutorials for R: Quick-R; R Tutor
• More help online: Stack Exchange; R-bloggers
Data Types and Structures
Data Types
• Three general types of data
• Strings
• "Why, hi there"
• Numbers
• TRUE/FALSE
• Missing data (NA). Note that NA is written without quotes (NA, not "NA")
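The types listed above can be checked in the console with class(), and is.na() shows the difference between a missing value and the text "NA". A minimal sketch:

```r
class("Why, hi there")   # "character"
class(3.5)               # "numeric"
class(TRUE)              # "logical"

# NA without quotes is a missing value; "NA" in quotes is ordinary text
is.na(NA)                # TRUE
is.na("NA")              # FALSE
```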
Data Structures
• Vectors
• Matrices
• Data frames
• Lists

Source: Kabacoff (2011)


Vectors
• A vector is a sequence of values (numeric, logical, or string)
• A vector's values can be numbers, characters, or logical values, as long as they are all the same type
• Define a vector
• x <- c() # empty vector
• x <- c(1,2,4,7) # the c function creates a vector by combining a list of values
• y <- c("a", "b", "c", "d") # vector of characters
• y <- c(1,2,"hi") # what is the data type of this vector?
• typeof(y)
• V <- seq(5,10,0.5)
• V <- c(1:10)
• Vector Subsetting
• y[2] # access the second value of y
• y[1:3] # access the first to the third value of y
• y[c(1,3)] # access the first and the third value of y
• y[3] <- "cat" # assign a new value to the third value in y
• y[5:6] <- c("likes", "me") # assign multiple consecutive values
• Vector Math
• V {+,-,*,/} 1 # any of V+1, V-1, V*1, V/1
• V+V
• V*V
• sqrt(V)
• sum(V)
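The vector examples raise two points worth trying out: what happens when types are mixed, and how arithmetic is applied. The sketch below uses small made-up vectors; the name z is introduced here only for illustration.

```r
# Mixing numbers and text turns every element into text
y <- c(1, 2, "hi")
typeof(y)            # "character"
y                    # "1" "2" "hi"

# Arithmetic works element by element
V <- c(1, 2, 3, 4)
V + 1                # 2 3 4 5
V * V                # 1 4 9 16
sum(V)               # 10

# Assigning past the end extends the vector; skipped slots become NA
z <- c("a", "b", "c")
z[5] <- "e"
z                    # "a" "b" "c" NA "e"
length(z)            # 5
```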
Matrices
• A matrix is a collection of data elements arranged in a two-dimensional rectangular layout; all elements should be of the same type.
• Define a matrix
• m <- matrix()
• matrix(1,5,5) # create a 5*5 matrix with all values in the matrix equal to 1
• m <- matrix(1:12, nrow=3, ncol=4) # create a 3*4 matrix, with values 1:12
• V <- c(1:12)
• m <- matrix(V,3,4) # V must be capitalised here because R is case sensitive
• Matrix subsetting
• m[2,3] # get the value of row 2, column 3
• m[2,3] <- 0 # assign a new value to a cell in the matrix
• m[1,]
• m[,2]
• m[,2:3]
• m[,c(1,3)]
• m[c(1,2),c(1,3)]
• Matrix Math
• m {+,-,*,/} 1 # any of m+1, m-1, m*1, m/1
• m+m
• cbind(m,m)
• rbind(m,m)
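A small sketch of how matrix() fills its cells and what cbind() and rbind() do to the dimensions. The byrow argument and the name m_row are not on the original slide and are shown here only for comparison.

```r
# By default matrix() fills column by column
m <- matrix(1:12, nrow = 3, ncol = 4)
m[2, 3]              # row 2, column 3: 8

# byrow = TRUE fills row by row instead
m_row <- matrix(1:12, nrow = 3, ncol = 4, byrow = TRUE)
m_row[2, 3]          # 7

dim(m)               # 3 4
dim(cbind(m, m))     # 3 8 (columns placed side by side)
dim(rbind(m, m))     # 6 4 (rows stacked)
```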
Data Frames
• A data frame is a data set that includes multiple types of data, such as numeric and string
• Define a data frame
• Weights <- c(1:8)
• Prices <- c(2:9)
• Types <- c(T,F,F,F,T,F,T,F)
• Data <- data.frame(Weights, Prices, Types)
• Data frame subsetting
• Data[1,2]
• Data$Prices
• Data frame math
• Data$Weights*Data$Prices
• Data$new <- Data$Weights*Data$Prices # save the new score as a new variable in the data frame
• mean(Data$Weights)
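The data frame steps above can be run together as one script. The calls to str(), head() and the logical row filter are additions for inspection and are not on the original slide.

```r
Weights <- c(1:8)
Prices  <- c(2:9)
Types   <- c(T, F, F, F, T, F, T, F)
Data    <- data.frame(Weights, Prices, Types)

str(Data)                  # one line per column with its type
Data[1, 2]                 # row 1, column 2: 2
Data[Data$Types, ]         # keep only rows where Types is TRUE
nrow(Data[Data$Types, ])   # 3

Data$new <- Data$Weights * Data$Prices
head(Data, 3)              # first three rows, including the new column
mean(Data$Weights)         # 4.5
```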
Lists
• In R, lists act as containers.
• Unlike atomic vectors, the contents of a list are not restricted to a single
mode and can encompass any mixture of data types.
• Define a list
• v1 <- c(1,6,7,8)
• v2 <- c(2,4)
• m <- matrix(1,2,4)
• L <- list (v1, v2, m)
• List Access
• L[[1]]
• Lists are extremely useful inside functions. You can "staple" together lots of
different kinds of results into a single object that a function can return.
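As a sketch of the "staple results together" idea, the function below returns several results in one list. The function name describe and its element names are hypothetical and chosen only for illustration.

```r
# Hypothetical helper, named here only for illustration
describe <- function(x) {
  list(values = x, total = sum(x), average = mean(x))
}

result <- describe(c(1, 6, 7, 8))
result$total          # 22
result[["average"]]   # 5.5
names(result)         # "values" "total" "average"

# Single brackets return a smaller list; double brackets return the element itself
class(result["total"])     # "list"
class(result[["total"]])   # "numeric"
```

Naming the list elements lets you pull out each result with $ instead of remembering its position.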
Data Management
Manual Data Input
• age <- c(20,25,30)
• gender <- c("male","female","male")
• score <- c(65,75,85)
• newdata <- data.frame(age,gender,score) # create a new data frame with the function data.frame
• newdata$midterm <- c(7,8,9) # add a new variable named midterm to the data frame
• newdata$sum <- newdata$score + newdata$midterm # create a new variable based on existing variables in the data frame
Import and Export Data
Data Type                    Import                                   Export
TXT, CSV                     read.table(file="")                      write.table()
                             read.csv(file="")                        write.csv()
SAV, DTA                     library(foreign)
                             read.spss(file="", to.data.frame = T)
                             read.dta(file="")
Save & load x.Rdata files    load("data.Rdata")                       save(data, file="data.Rdata")
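A minimal round trip through the CSV and .Rdata functions in the table. The file paths come from tempfile() so the sketch runs anywhere; in practice you would use your own file names. The small data frame is the one from the manual data input example.

```r
newdata <- data.frame(age = c(20, 25, 30),
                      gender = c("male", "female", "male"),
                      score = c(65, 75, 85))

# tempfile() gives a throwaway path; use your own file name in practice
csv_path <- tempfile(fileext = ".csv")
write.csv(newdata, file = csv_path, row.names = FALSE)  # do not write row numbers
from_csv <- read.csv(file = csv_path)
from_csv

# save() and load() store and restore R objects under their original names
rdata_path <- tempfile(fileext = ".Rdata")
save(newdata, file = rdata_path)
rm(newdata)          # remove the object ...
load(rdata_path)     # ... and bring it back under the same name
newdata
```

Setting row.names = FALSE avoids an extra unnamed column of row numbers when the CSV file is read back in.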
Viewing Data
• View(mtcars)
• names(mtcars) # list the variable names in mtcars
• mtcars$mpg # access a certain variable of mtcars
Rename a column in a data frame
• names(mtcars) #check the existing column names
• names(mtcars)[1] <- "brand" # rename by index in the names vector
Recode Data
• Recode disp into a new variable, rank
• mtcars$rank[mtcars$disp <= 160] <- "L"
• mtcars$rank[mtcars$disp > 160 & mtcars$disp <= 300] <- "M"
• mtcars$rank[mtcars$disp > 300] <- "H"
• mtcars$rank
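An alternative way to do the same recoding is the base R function cut(), shown here as a sketch rather than as the method used on the slide. It works on a copy named cars (a name chosen for illustration) and produces a factor instead of a character vector.

```r
cars <- mtcars   # work on a copy of the built-in data set

# cut() assigns each value to an interval; intervals are closed on the right,
# so the groups are disp <= 160, 160 < disp <= 300, and disp > 300
cars$rank <- cut(cars$disp,
                 breaks = c(-Inf, 160, 300, Inf),
                 labels = c("L", "M", "H"))

table(cars$rank)   # how many cars fall in each group
```

With cut() the break points sit in one place, which makes them easier to change than three separate assignment lines.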
Subsetting Datasets
• Selecting/keeping variables
• newdata1 <- mtcars[ ,c(1, 3)] # keep the first and third columns
• newdata1 <- mtcars[ ,c("brand", "rank")] # keep the columns named brand and rank
• Dropping variables
• newdata2 <- mtcars[ ,c(-2:-5)] # drop the second to the fifth column in the data frame
• Selecting observations
• newdata3 <- mtcars[which(mtcars$rank == "H"),]
• newdata4 <- mtcars[which(mtcars$wt > 4),]
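The same kind of selection can also be written with a plain logical condition or with the base R function subset(). This sketch uses a fresh copy of mtcars named cars, so the column names are the original ones. One difference to keep in mind: which() drops rows where the condition is NA, while a bare logical index keeps them as NA rows; mtcars has no missing values, so the results agree here.

```r
cars <- mtcars   # fresh copy of the built-in data set

# Logical indexing: rows where the condition is TRUE, columns chosen by name
heavy <- cars[cars$wt > 4, c("mpg", "wt")]

# subset() expresses the same selection with column names used directly
heavy2 <- subset(cars, wt > 4, select = c(mpg, wt))

heavy
nrow(heavy) == nrow(heavy2)   # both approaches keep the same rows
```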
Handling Missing Values
• Specify missing values before analysis
• y <- c(1, 2, 3, NA) # NA must be in capitals
• sum(y) # returns NA, because there is a missing value in the vector
• sum(y, na.rm=TRUE) # na.rm=TRUE removes NA values before summing
• help(sum)
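A short sketch of finding and handling missing values in the vector y from above. is.na() and mean() are added here for illustration alongside sum().

```r
y <- c(1, 2, 3, NA)

is.na(y)                # FALSE FALSE FALSE TRUE
sum(is.na(y))           # number of missing values: 1
sum(y)                  # NA
sum(y, na.rm = TRUE)    # 6
mean(y, na.rm = TRUE)   # 2

# Drop the missing values explicitly
y[!is.na(y)]            # 1 2 3
```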
