Basics of R
R Variables
name <- "John"
age <- 40
name # output "John"
age # output 40
In R, we can use both = and <- as assignment operators.
print(name) # print the value of the name
for (x in 1:10) {
  print(x)
}
Concatenate Elements
You can also concatenate, or join, two or more elements, by using the paste() function.
To combine both text and a variable, R uses comma (,):
text <- "awesome"
paste("R is", text)
text1 <- "R is"
text2 <- "awesome"
paste(text1, text2)
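paste() puts a single space between its arguments by default. The following is a minimal sketch of how the separator can be changed, using the sep argument and the paste0() function from base R; the variable names are only illustrative.

```r
text <- "awesome"

# Default behaviour: elements are joined with a single space
with_space <- paste("R is", text)

# sep sets a custom separator between the elements
with_dash <- paste("R", "is", text, sep = "-")

# paste0() joins the elements with no separator at all
no_sep <- paste0("R", "is", text)

print(with_space)
print(with_dash)
print(no_sep)
```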
Multiple Variables
# Assign the same value to multiple variables in one line
var1 <- var2 <- var3 <- "Orange"
Variable Names
• A variable name must start with a letter or a period (.) and can be a combination of letters, digits, periods (.) and underscores (_)
• If it starts with a period (.), it cannot be followed by a digit
• A variable name cannot start with a number or underscore (_)
• Variable names are case-sensitive (age, Age and AGE are three different variables)
• Reserved words cannot be used as variables (TRUE, FALSE, NULL, if...)
# Legal variable names:
myvar <- "John"
my_var <- "John"
myVar <- "John"
MYVAR <- "John"
myvar2 <- "John"
.myvar <- "John"
# Illegal variable names:
2myvar <- "John"
my-var <- "John"
my var <- "John"
_my_var <- "John"
my_v@ar <- "John"
TRUE <- "John"
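The naming rules above can be checked programmatically. This is a rough sketch that relies on base R's make.names(), which rewrites anything that is not a syntactically valid name; if the result is unchanged, the name was already valid. The helper is_valid_name is a hypothetical name introduced only for this illustration.

```r
# Hypothetical helper: TRUE when `name` is already a syntactically valid name.
# make.names() rewrites invalid names, so an unchanged result means "valid".
is_valid_name <- function(name) {
  make.names(name) == name
}

candidates <- c("myvar", "my_var", ".myvar", "2myvar", "my-var", "_my_var", "TRUE")
valid <- is_valid_name(candidates)
names(valid) <- candidates

print(valid)
```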
Data Types
• In R, variables do not need to be declared with any particular type, and can even change
type after they have been set.
• Basic data types in R can be divided into the following types:
• numeric - (10.5, 55, 787)
• integer - (1L, 55L, 100L, where the letter "L" declares this as an integer)
• complex - (9 + 3i, where "i" is the imaginary part)
• character (a.k.a. string) - ("k", "R is exciting", "FALSE", "11.5")
• logical (a.k.a. boolean) - (TRUE or FALSE)
We can use the class() function to check the data type of a variable
# numeric
x <- 10.5
class(x)
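As a small illustration of the points above, the sketch below reassigns one variable to a value of each basic type and records what class() reports. The variable names are only illustrative.

```r
x <- 10.5                 # numeric
class_numeric <- class(x)

x <- 1000L                # integer (note the L suffix)
class_integer <- class(x)

x <- 9 + 3i               # complex
class_complex <- class(x)

x <- "R is exciting"      # character: the same variable has changed type
class_character <- class(x)

x <- TRUE                 # logical
class_logical <- class(x)

print(c(class_numeric, class_integer, class_complex, class_character, class_logical))
```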
Type Conversion
You can convert from one type to another with the following functions:
• as.numeric()
• as.integer()
• as.complex()
x <- 1L # integer
y <- 2 # numeric
# convert from integer to numeric:
a <- as.numeric(x)
# convert from numeric to integer:
b <- as.integer(y)
# print values of x and y
x
y
# print the class name of a and b
class(a)
class(b)
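Conversions can change the value as well as the class. A brief sketch, with illustrative variable names, of what the three conversion functions do to a few sample values:

```r
# as.integer() drops the fractional part (it truncates, it does not round)
truncated <- as.integer(2.9)

# as.numeric() can parse a number stored as text
parsed <- as.numeric("11.5")

# as.complex() adds a zero imaginary part
as_cplx <- as.complex(1L)

print(truncated)
print(parsed)
print(as_cplx)
```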
String Literals
"hello" is the same as 'hello'
str <- "This is the first line.
This is the second line.
And this is the third line.
"
str
[1] "This is the first line.\nThis is the second line.\nAnd this is the third line.\n"
str <- "This is the first line.
This is the second line.
And this is the third line.
"
cat(str)
nchar(str)
Escape characters
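A backslash (\) introduces an escape character inside a string. The sketch below is only an illustration with made-up strings: it shows an escaped double quote, a tab and a newline, and the difference between printing a string and writing it with cat().

```r
quoted <- "He said \"hello\""   # \" puts a double quote inside a double-quoted string
tabbed <- "a\tb"                # \t is a tab
two_lines <- "first\nsecond"    # \n is a line break

print(quoted)        # print() shows the escape sequences as typed
cat(quoted, "\n")    # cat() writes the characters they stand for
cat(two_lines, "\n")

# Each escape sequence counts as a single character
n_tabbed <- nchar(tabbed)
n_two_lines <- nchar(two_lines)
```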
Operators
Conditional statements
a <- 200
b <- 33
c <- 500
if (a > b & c > a) {
  print("Both conditions are true")
} else if (a > b | a > c) {
  print("At least one of the conditions is true")
}
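The same logic can be wrapped in a function with a final else branch. This is a sketch only: describe_values is a hypothetical name, and it uses && and ||, the operators R provides for combining single TRUE/FALSE conditions inside if statements.

```r
# Hypothetical helper that returns a message instead of printing it
describe_values <- function(a, b, c) {
  if (a > b && c > a) {
    "Both conditions are true"
  } else if (a > b || a > c) {
    "At least one of the conditions is true"
  } else {
    "Neither condition is true"
  }
}

both <- describe_values(200, 33, 500)
one <- describe_values(200, 33, 100)
none <- describe_values(10, 33, 500)

print(c(both, one, none))
```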
Loops
A for loop is used for iterating over a sequence
for (x in 1:10) { print(x) }
fruits <- list("a", "b", "c")
for (x in fruits) {
  print(x)
}
dice <- c(1, 2, 3, 4, 5, 6)
for (x in dice) {
  print(x)
}
i <- 1
while (i < 6) {
  i <- i + 1
  if (i == 3) {
    next # skip an iteration
  }
  print(i)
  if (i == 4) {
    break # stop the loop
  }
}
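The loops above can also collect results rather than print them. A minimal sketch, with illustrative variable names, that uses next to skip one element of a for loop and break to leave a while loop; seq_along() is a base R function that yields the index positions of a vector.

```r
fruits <- c("apple", "banana", "cherry")
labels <- character(0)

# Loop over index positions, skipping "banana"
for (i in seq_along(fruits)) {
  if (fruits[i] == "banana") {
    next
  }
  labels <- c(labels, paste(i, fruits[i]))
}

# Keep adding 1, 2, 3, ... until the running total exceeds 10
total <- 0
n <- 0
while (TRUE) {
  n <- n + 1
  total <- total + n
  if (total > 10) {
    break
  }
}

print(labels)
print(c(n, total))
```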
adj <- list("red", "big", "tasty")
fruits <- list("apple", "banana", "cherry")
for (x in adj) {
  for (y in fruits) {
    print(paste(x, y))
  }
}
Functions
A function is a block of code which only runs when it is called. You can pass data, known as
parameters, into a function.
# A function without parameters
my_function <- function() {
  print("Hello World!")
}
my_function()
# A function with two parameters
my_function <- function(fname, lname) {
  paste(fname, lname)
}
my_function("Peter", "Griffin")
my_function() # error: both arguments are missing and have no default
# A function that returns a value
my_function <- function(x) {
  return (5 * x)
}
print(my_function(3))
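A parameter can be given a default value so that the function still works when that argument is left out. The sketch below is an illustration only; greet and its default value are made up for this example.

```r
# Hypothetical function: lname falls back to "Griffin" when it is not supplied
greet <- function(fname, lname = "Griffin") {
  full_name <- paste(fname, lname)
  return(full_name)
}

default_used <- greet("Peter")                      # uses the default lname
both_given <- greet("Peter", "Parker")              # arguments matched by position
by_name <- greet(lname = "Smith", fname = "Anna")   # arguments matched by name

print(c(default_used, both_given, by_name))
```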
Data Structures
Vectors
• A vector is simply a list of items that are of the same type.
• To combine the list of items to a vector, use the c() function and separate the items by a
comma.
# Vector of strings
fruits <- c("banana", "apple", "orange")
# Access the first item (banana)
fruits[1]
# Access the first and third item (banana and orange)
fruits[c(1, 3)]
# Access all items except for the first item
fruits[c(-1)]
# Vector of numerical values
numbers <- c(1, 2, 3)
# Vector with numerical values in a sequence
numbers <- 1:10
# Vector with numerical decimals in a sequence
numbers1 <- 1.5:6.5
# Vector with numerical decimals in a sequence where the last element is not used
numbers2 <- 1.5:6.3
# Vector of logical values
log_values <- c(TRUE, FALSE, TRUE, FALSE)
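Two details of the vector examples are worth seeing in action: the 1.5:6.3 sequence stops before 6.3, and c() converts mixed items to a single common type. A short sketch with illustrative variable names:

```r
# The sequence advances in steps of 1 and stops at the last value not above 6.3
numbers2 <- 1.5:6.3
print(numbers2)

# Mixing types in c() coerces every item to one type (here: character)
mixed <- c(1, "a", TRUE)
print(class(mixed))

# A negative index drops items instead of selecting them
fruits <- c("banana", "apple", "orange")
without_first <- fruits[-1]
print(without_first)
```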
Lists
• A list in R can contain many different data types
inside it.
• A list is a collection of data which is ordered and
changeable.
# List of strings
thislist <- list("apple", "banana", "cherry")
thislist[1]
thislist[1] <- "blackcurrant"
# To add an item to the end of the list
append(thislist, "orange")
thislist <- list("apple", "banana", "cherry")
append(thislist, "orange", after = 2)
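Note that append() returns a new list and leaves the original unchanged, so the result has to be assigned to keep it. A minimal sketch of that behaviour, also showing the difference between single and double brackets; the variable names are only illustrative.

```r
thislist <- list("apple", "banana", "cherry")

# Without assignment the original list keeps its three items
append(thislist, "orange")
length_before <- length(thislist)

# Assign the result to keep the change; after = 2 inserts after the second item
thislist <- append(thislist, "orange", after = 2)
length_after <- length(thislist)

# [ ] returns a list containing the item, [[ ]] returns the item itself
as_sublist <- thislist[3]
as_item <- thislist[[3]]

print(as_item)
```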
Matrices
A matrix can be created with the matrix() function. Specify the nrow and ncol parameters to set the number of rows and columns.
# Create a matrix
thismatrix <- matrix(c(1,2,3,4,5,6), nrow = 3, ncol = 2)
thismatrix <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2, ncol = 2)
# Access the items
thismatrix[1, 2]
thismatrix[2,] # whole row
thismatrix[,2] # whole column
thismatrix[c(1,2),]
thismatrix[, c(1,2)]
# Add a column
newmatrix <- cbind(thismatrix, c("strawberry", "blueberry"))
# Add a row
newmatrix <- rbind(thismatrix, c("strawberry", "blueberry"))
# Remove the first row and the first column
thismatrix <- thismatrix[-c(1), -c(1)]
dim(thismatrix)
length(thismatrix)
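matrix() fills its values column by column unless told otherwise. The sketch below illustrates this with the byrow argument, and shows how cbind() and rbind() change the dimensions; drop = FALSE is a base R indexing option that keeps the result a matrix. Variable names are only illustrative.

```r
# Default: values fill the first column, then the second
m <- matrix(1:6, nrow = 3, ncol = 2)
first_row_second_col <- m[1, 2]

# byrow = TRUE fills the first row, then the second, and so on
m_byrow <- matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)
byrow_first_row_second_col <- m_byrow[1, 2]

# cbind() adds a column, rbind() adds a row
wider <- cbind(m, c(7, 8, 9))
taller <- rbind(m, c(7, 8))

# Selecting one column normally returns a plain vector;
# drop = FALSE keeps it as a 3 x 1 matrix
one_col <- m[, 2, drop = FALSE]

print(dim(wider))
print(dim(taller))
print(dim(one_col))
```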
Arrays
array[row position, column position, matrix level]
Compared to matrices, arrays can have more than two dimensions. Arrays can only have one data type.
# An array with one dimension with values ranging from 1 to 24
thisarray <- c(1:24)
# An array with more than one dimension
multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray[2, 3, 2] # 22
multiarray[c(1),,1] # first row of the first matrix
multiarray[,c(1),1] # first column of the first matrix
How does dim=c(4,3,2) work?
The first and second numbers in the brackets specify the number of rows and columns.
The last number in the brackets specifies how many matrices (levels) of that size we want.
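To see why multiarray[2, 3, 2] is 22, it helps to recall that the values fill the array column by column, one matrix level at a time. A small sketch that recomputes the position by hand; the variable names row, col, level and position are introduced only for this illustration.

```r
multiarray <- array(1:24, dim = c(4, 3, 2))

row <- 2
col <- 3
level <- 2

# Each full column holds 4 values and each full matrix level holds 4 * 3 = 12
position <- row + (col - 1) * 4 + (level - 1) * 4 * 3

value <- multiarray[row, col, level]

# The whole second level is an ordinary 4 x 3 matrix
second_level <- multiarray[, , 2]

print(position)
print(value)
print(dim(second_level))
```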
Data Frame
• Data Frames are data displayed in a table format.
• A Data Frame can hold different types of data, but each column should contain a single type of data.
• Column names should be non-empty, and row names should be unique.
summary(Data_Frame)
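The text does not show how Data_Frame is built, so the sketch below assumes a small three-row example with one text column and two numeric columns. The column names and values are hypothetical and chosen only so that the calls in this section have something to work on.

```r
# Hypothetical example data; the original Data_Frame is not shown in the text
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

print(Data_Frame)

# One summary block per column
print(summary(Data_Frame))

# Access a single column by name
pulse_values <- Data_Frame$Pulse
print(pulse_values)
```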
Use the rbind() function to add new rows in a Data Frame
# Add a new row
New_row_DF <- rbind(Data_Frame, c("Strength", 110, 110))
# Add a new column
New_col_DF <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))
# Remove the first row and column
Data_Frame_New <- Data_Frame[-c(1), -c(1)]
dim(Data_Frame)
ncol(Data_Frame)
nrow(Data_Frame)
length(Data_Frame) # same as ncol
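One thing to watch when adding a row with c(): a vector that mixes text and numbers is stored as text, so the numeric columns of the result become character. The sketch below illustrates this on a hypothetical Data_Frame (its contents are an assumption, not taken from the text) and shows one way to keep the column types by adding the row as a one-row data frame.

```r
# Hypothetical example data; the original Data_Frame is not shown in the text
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# c("Strength", 110, 110) is a character vector, so Pulse turns into character
row_as_vector <- rbind(Data_Frame, c("Strength", 110, 110))
class_after_vector <- class(row_as_vector$Pulse)

# Adding the row as a one-row data frame keeps each column's type
row_as_df <- rbind(
  Data_Frame,
  data.frame(Training = "Strength", Pulse = 110, Duration = 110)
)
class_after_df <- class(row_as_df$Pulse)

print(class_after_vector)
print(class_after_df)
print(dim(row_as_df))
```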
Factors
• Factors are used to categorize data. Examples of factors are:
Demography: Male/Female
Music: Rock, Pop, Classic, Jazz
Training: Strength, Stamina
• To create a factor, use the factor() function and pass a vector as the argument.
• You can also set the levels by adding the levels argument inside the factor() function.
To access the items in a factor, refer to the index number, using [] brackets
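A minimal sketch of creating a factor, setting its levels and accessing an item, using the music categories mentioned above. The sample values and the extra "Other" level are made up for illustration.

```r
# Create a factor from a character vector
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"))

# The levels are the distinct categories, sorted alphabetically by default
genre_levels <- levels(music_genre)

# Levels can be set explicitly, including ones that do not occur in the data
with_other <- factor(
  c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz", "Rock", "Jazz"),
  levels = c("Classic", "Jazz", "Pop", "Rock", "Other")
)

# Access the third item by its index number
third_item <- music_genre[3]

print(genre_levels)
print(third_item)
print(table(with_other))
```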
R script
test.R
myobject <- 5:10 # create an object (myobject = 5:10 also works)
myobject
plot(myobject)
Click Run. The Console and Environment panes are updated.