First steps in R
by Martin Heissenberger
www.r-tutorials.com
1. Download of R and RStudio
2. RStudio Orientation
3. Packages
4. Help Functionalities
help.start()
?hist
apropos("hist")
example(hist)
vignette()
# shows supporting material
vignette("grid")
5. R as a giant calculator
4+ 4+ 5
- spaces do not matter
- operators: +,-,*,/
5+6-4/2*5
- assign values to objects
x <- c(4,5,6)
c(4,5,76) -> y
x = c(4,5,6)
assign("x", c(4.2,1,5))
- c stands for "concatenate" - an important command for creating vectors
- see which object names are already in use
ls()
objects()
- remove objects
rm("x")
- some examples of calculations with vectors
x = c(y, 5, y)
objectrandom <- x < 5
objectrandom
sum(x)
sqrt(x)
- to see which value is stored at a given position
x[1]
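The calculations above all work element by element. A minimal sketch of this behaviour, using the illustrative names v and small (they are not part of the original script):

```r
# v and small are illustrative names, not part of the original script
v <- c(4, 5, 76)

v * 2            # every element is doubled
v + c(1, 2, 3)   # vectors of equal length are added element by element

small <- v < 10  # logical vector: TRUE where the condition holds
sum(small)       # TRUE counts as 1, so this counts the elements below 10
v[small]         # a logical vector can also be used to select elements
```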
6. Types of brackets
() round brackets as the standard brackets, e.g. for function calls
[] square brackets if we are dealing with index positions of vectors
{} curly brackets for functions and loops
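The three bracket types can be seen side by side in a short sketch. The names nums and double_it are made up for this illustration:

```r
# nums and double_it are illustrative names
nums <- c(10, 20, 30)        # round brackets: arguments of the function c()
nums[2]                      # square brackets: element at index position 2

double_it <- function(x) {   # curly brackets: body of a function
  x * 2
}
double_it(nums[2])           # round and square brackets combined
```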
7. Data Sequences
seq(3, 5)
seq(from = 3, to = 5)
seq(from = 3, length =3)
seq(from = 3, length = 3, by = 0.5)
seq(from = 3, by = 0.5, length = 3)
- the order of named arguments does not matter
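The full name of the length argument of seq is length.out; the shorter spelling length used above works because R matches argument names partially. A small sketch with the illustrative names s1 and s2:

```r
# s1 and s2 are illustrative names
s1 <- seq(from = 3, by = 0.5, length.out = 3)
s2 <- seq(length.out = 3, by = 0.5, from = 3)  # same arguments, different order

identical(s1, s2)     # named arguments can be given in any order
seq(3, 5, by = 0.5)   # unnamed arguments are matched by position: from, to
```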
o paste function - for character strings
paste ("xyz", 1:10)
paste ("xyz", c(2,5,7,"test", 4.5))
paste ("xyz", 1:10, sep = "")
What do you think this can be useful for?
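One possible answer, sketched with made-up names: paste is handy for generating series of labels or file names. paste0 is the base R shortcut for paste with sep = "".

```r
# plot_labels and file_names are illustrative names
plot_labels <- paste("plot", 1:3, sep = "_")
file_names  <- paste0("file", 1:3, ".csv")

plot_labels   # "plot_1" "plot_2" "plot_3"
file_names    # "file1.csv" "file2.csv" "file3.csv"
```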
o to repeat sequences
rep(c(3,4,5) , 3)
rep(1:10, times = 3)
x <- c(1,2,3)
rep(x, each = 3)
rep(x, each = 3, times = 3)
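The difference between each and times is easiest to see with a small vector. A minimal sketch; the names by_each, by_times and both are only for illustration:

```r
x <- c(1, 2, 3)

by_each  <- rep(x, each = 2)             # 1 1 2 2 3 3
by_times <- rep(x, times = 2)            # 1 2 3 1 2 3
both     <- rep(x, each = 2, times = 2)  # 1 1 2 2 3 3 1 1 2 2 3 3

length(both)                             # 3 * 2 * 2 = 12 elements
```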
o finding the position of a value
x = c(4:20)
which(x == 10) # note the 2 equal signs, used for comparisons
o the reverse: getting the value at a known position
x[3]
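which also works when several elements meet the condition, and its result can be fed straight back into the square brackets. A short sketch; pos is an illustrative name:

```r
x <- c(4:20)

pos <- which(x > 17)   # positions of all elements greater than 17
pos                    # 15 16 17
x[pos]                 # the values at those positions: 18 19 20
which(x == 10)         # 7
```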
Exercises - Coding Basics
1. Define the object "myobject" and assign the vector 1:10 in at least 3
different ways
2. Get the sum of your object
3. Create the following vector by using the paste function
- [1] "R is great 4 and I will love it"
- [2] "R is great 7 and I will love it"
- [3] "R is great 45 and I will love it"
4. Take the vector 1, 2, 3 and repeat it so that you get 11 x 1, 10 x 2, and 10 x 3
5. What is the value of this vector at position 7?
Solutions
1. Define the object "myobject" and assign the vector 1:10 in at least 3
different ways
myobject <- (1:10)
myobject = (1:10)
(1:10) -> myobject
assign("myobject", 1:10)
2. Get the sum of your object
sum (myobject)
3. Create the following vector by using the paste function
- [1] "R is great 4 and I will love it"
- [2] "R is great 7 and I will love it"
- [3] "R is great 45 and I will love it"
paste ("R is great", c(4,7,45), "and I will love it")
4. Take the vector 1, 2, 3 and repeat it so that you get 11 x 1, 10 x 2, and 10 x 3
x = rep(1:3, length.out = 31)
5. What is the value of this vector at position 7?
x[7]
8. New Datasets
R comes with the preinstalled package "datasets"
in this package you can find many test datasets for practice purposes
?lynx
head(lynx)
head(iris)
tail(lynx)
summary(lynx)
o to get a visual idea
plot(lynx)
hist(lynx)
o many of those datasets (e.g. iris) are data frames - 2 dimensional objects whose columns can have different data types; others, like lynx, are time series
o access a single column of a data frame with $
head(iris)
sum(iris$Sepal.Length)
o or we can use the attach function to make handling this set easier
attach(iris)
sum(Sepal.Length)
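An attached dataset stays on the search path until it is detached again. A minimal sketch of cleaning up with detach, and of with() as one alternative for a single expression; total_with and total_dollar are illustrative names:

```r
attach(iris)                 # columns become visible by name
mean(Sepal.Length)
detach(iris)                 # undo attach() when finished

# with() gives the same convenience for a single expression
total_with   <- with(iris, sum(Sepal.Length))
total_dollar <- sum(iris$Sepal.Length)
total_with == total_dollar   # both approaches give the same sum
```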
9. Functions in R
Brief description: R functions are OBJECTS
They do calculations for you
R function structure: name <- function (argument) {statements}
The arguments specify the components to be used in the function
myfirstfn <- function(x) {x+x}
myfirstfn(10)
o functions that work step by step
mysecondfn <- function(t,z) {
value = z*3
value = value *t
print(value)}
t= 5
z= 9
mysecondfn(t,z)
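mysecondfn prints its result. If the result should be reused later, the function can hand it back with return instead. A sketch under that assumption; scale_value is a hypothetical variant with a default value for z, not part of the original script:

```r
# scale_value is a hypothetical variant of mysecondfn
scale_value <- function(t, z = 3) {   # z = 3 is used when z is not supplied
  value <- z * 3
  value <- value * t
  return(value)                       # hand the result back to the caller
}

result <- scale_value(5, 9)   # 9 * 3 * 5 = 135
result
scale_value(5)                # default z = 3: 3 * 3 * 5 = 45
```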
10. Loops - loops and functions are a crucial part of programming
for loops allow a certain operation to be repeated a fixed number of times
This is in contrast to the while loop, which repeats as long as a condition holds, so the number of repetitions is not fixed in advance
The syntax looks like this: for (name in vector) {commands}
for (i in 1:15) {print (i)}
for (z in 1:15) {print (z)}
o variable name does not matter although you will see i quite often
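A minimal sketch of the difference between the two loop types: a for loop that adds up the numbers 1 to 15, and a while loop that doubles a number until it reaches 100. The names total, n and steps are made up for this illustration:

```r
# for loop: the number of repetitions (15) is known in advance
total <- 0
for (i in 1:15) {
  total <- total + i
}
total   # 120

# while loop: repeats as long as the condition n < 100 holds
n <- 1
steps <- 0
while (n < 100) {
  n <- n * 2
  steps <- steps + 1
}
steps   # 7 doublings: 2, 4, 8, 16, 32, 64, 128
```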
Loops can be used for quite complex calculations
Example: calculation of primes with the sieve of Eratosthenes (the oldest known systematic method)
PrimVec = function(n){
  # the sieve only makes sense for n >= 2
  if (n >= 2) {
    # s holds the sequence of numbers we still have to work with
    s = seq(2, n)
    # p will be the container for our primes; numbers will be moved
    # from s to p step by step if they meet the criteria
    p = c()
    # we start the loop
    for (i in seq(2, n)){
      # we use any to check that i (of this loop round) is still in s;
      # multiples of earlier primes have already been removed
      if (any(s == i)){
        # i is prime: we store it in p together with the previous p
        p = c(p, i)
        # keep only the numbers that leave a remainder when divided by i,
        # and put i itself back into s
        s = c(s[(s %% i) != 0], i)
      }
    }
    return(p)
  }
  # to specify the output if n < 2 (optional)
  else{
    stop("Input at least 2")
  }
}
PrimVec(100)
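One way to double-check the output of a sieve is a simple trial-division test. The sketch below is a different technique from the sieve above, and is_prime is a hypothetical helper that is not part of the original script:

```r
# is_prime is a hypothetical helper using trial division
is_prime <- function(k) {
  if (k < 2) return(FALSE)
  if (k < 4) return(TRUE)            # 2 and 3 are prime
  all(k %% 2:floor(sqrt(k)) != 0)    # no divisor between 2 and sqrt(k)
}

candidates <- 2:30
primes <- candidates[sapply(candidates, is_prime)]
primes   # 2 3 5 7 11 13 17 19 23 29
```

If PrimVec is defined in the same session, all(PrimVec(30) == primes) should return TRUE.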
11. Graphs in R
many different types are available
different packages can help you
easiest way: scatterplot
x=5:7
y=8:10
plot(x,y)
or by using a dataset
plot(lynx)
plot(lynx, main="Lynx Trappings", col="red",
col.main=52, cex.main=1.5)
o the cex family can be used to change magnification factors
handling the labels
plot(lynx, ylab="Lynx Trappings", xlab="")
plot(lynx, ylab="Lynx Trappings", xlab="", las=2)
o changing the orientation of the axis labels with las (values 0:3)
par(mfrow=c(2,2), col.axis="red")
plot(1:8, las=0, xlab="xlab", ylab="ylab", main="LAS = 0")
plot(1:8, las=1, xlab="xlab", ylab="ylab", main="LAS = 1")
plot(1:8, las=2, xlab="xlab", ylab="ylab", main="LAS = 2")
plot(1:8, las=3, xlab="xlab", ylab="ylab", main="LAS = 3")
colors
colors() # huge variety
point symbol types
?pch
x=2:4
plot(x, pch="c")
plot(x, pch=13)
Line Types
par(mfrow=c(1,1), col.axis="black")
# ablineclip comes from the plotrix package; run install.packages("plotrix") once if it is missing
library(plotrix)
plot(1:6, ylab="", main="Line Types lty 1:6", xlab="lty 1:6")
ablineclip(v=1, lty=1, col="sienna2", lwd=2)
ablineclip(v=2, lty=2, col="sienna2", lwd=2)
ablineclip(v=3, lty=3, col="sienna2", lwd=2)
ablineclip(v=4, lty=4, col="sienna2", lwd=2)
ablineclip(v=5, lty=5, col="sienna2", lwd=2)
ablineclip(v=6, lty=6, col="sienna2", lwd=2)
ablineclip(v=7, lty=0, col="sienna2", lwd=2)
Types of plots
o by using "type" we can specify which kind of plot we want
plot(lynx) # plot for time series data
plot(lynx, type="p", main="Type p") # points (default)
plot(lynx, type="l", main="Type l") # lines (default for time series)
plot(lynx, type="b", main="Type b") # points connected by lines
plot(lynx, type="o", main="Type o") # points overlaid by lines
plot(lynx, type="h", main="Type h") # histogram-like (high density) vertical lines
plot(lynx, type="s", main="Type s") # steps
plot(lynx, type="n", main="Type n") # no plotting, only the empty frame
Example: by skilful use of the available parameters you can create quite complex graphs
par(mar=c(4,3,3,3), col.axis="black")
plot(cars$speed, type="s", col="red", bty="n", xlab="Cars ID", ylab="")
# this adds the explanatory text to plot 1
text(8, 14, "Speed in mph", cex=0.85, col="red")
par(new=TRUE) # allows 2 plots in one
plot(cars$dist, type="s", bty="n", ann=FALSE, axes=FALSE, col="darkblue")
axis(side=4) # y axis for plot 2
# explanations to plot 2
text(37, 18, "Stopping distance in ft", cex=0.85, col="darkblue")
title(main="Speed and Stopping\n Distances of Cars")
graphical parameters
par()
?par
12. How to export your graphs
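A minimal sketch of one common way to export a graph from base R: open a file device, draw the plot, and close the device again. The file name and the use of tempdir() are assumptions made for this illustration; png() and jpeg() are used in the same way as pdf().

```r
# out_file is an illustrative name; the file is written to a temporary folder
out_file <- file.path(tempdir(), "lynx_plot.pdf")

pdf(file = out_file, width = 7, height = 5)   # open the file device (size in inches)
plot(lynx, main = "Lynx Trappings")           # the plot is drawn into the file
dev.off()                                     # close the device to finish the file

file.exists(out_file)                         # check that the file was written
```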
13. Alternative graph packages and some examples
Exercise - R Graphs
Get familiar with the "rivers" dataset
- find out how many observations there are
- plot rivers against the index
- name the axis labels and add a header (in red)
- choose an appropriate symbol (not default) in green
# hint: \n for new lines in text
Solution
?rivers
length(rivers) # number of observations: 141
x = 1:length(rivers)
y = rivers
plot(x,y, col = "green", pch = 20,
main = "Lengths of\nMajor N. American Rivers",
col.main ="red", xlab = "",
ylab = "length in miles")
Dear Students
This is the R Tutorials Course Tree, which shows how the courses are structured. They cover R programming intensively so that you are ready to branch out to your specific field.
The course tree has a two-component foundation. The courses R Basics and Level 1 contain all the necessary ingredients to bring you to an intermediate level in a short period of time. They are meant for total beginners, contain exercises and follow a step-by-step approach that keeps you motivated and wanting to learn more.
As soon as you have a solid foundation you can branch out to specific fields. Graphs in R, Statistics in R and Text Mining, Scraping and Sentiment Analysis are advanced courses. You need to be familiar with R in order to benefit the most. You can choose the order in which you take them or even skip one if you do not work with those tools at all.
The Machine Learning course is an even more specialised course which requires solid statistics knowledge. It is therefore recommended to take the Statistics in R course first, so that you get the most out of it.
You can always take a look at the course tree to check which courses are best suited for your career and in which order you want to take them!
Should you have any questions about R or about my courses, just drop me a line within Udemy and I will be happy to help you.
Kind Regards
Martin