
R Programming Lab Manual

This document is a lab manual for the R programming course in the Bachelor of Technology program for Computer Science and Engineering. It outlines the course objectives, outcomes, and includes a detailed syllabus with experiments related to R programming. Additionally, it presents the vision and mission of the university and department, along with ethical guidelines and program educational objectives.

Uploaded by

deepak

Bachelor of Technology

LAB MANUAL

For

Department of Computer Science and Engineering (CSE)
Bachelors in Technology
AIML

Course Name: R programming Lab


Course Code:
Semester: 3rd

Prepared by:
Approved by:
Mr. karan
AIML Trainer

Faculty of Engineering and Technology


Contents of Course File

Sr. No. Content

1. Vision and Mission of the University
2. Vision and Mission of the Department
3. PEOs, POs, PSOs and SOs
4. Code of Ethics
5. Course Syllabus
6. Course Outcome and COs-POs, COs-SOs mapping of the course
7. List of Experiments
8. Experiment 1: R Environment setup and Essentials
9. Experiment 2: R Basic Objects
10. Experiment 3: R Working With Strings
11. Experiment 4: R - Working With Data
12. Experiment 5: R - Visualizing Data
13. Experiment 6: R - Analyzing Data

Vision of the University
To nurture individual’s excellence through value based, cross-cultural, integrated and holistic education
adopting the contemporary and advanced means blended with ethical values to contribute in building a
peaceful and sustainable global civilization.
Mission of the University

• To impart higher education at par with global standards that meets the changing needs of the society.
• To provide access to quality education and to improve quality of life, both at individual and community levels, with advancing knowledge in all fields through innovations and ethical research.
• To actively engage with and promote growth and welfare of the surrounding community through suitable extension and outreach activities.
• To develop socially responsible citizens, fostering ethical values and compassion through participation in community engagement, extension and promotion activities.
• To create a competitive and coordinated environment wherein individuals develop skills and a lifelong learning attitude to excel in their endeavours.
• To develop Centers of Excellence culminating in achieving cutting-edge technology in all fields.

Vision of the Department


To be one of the top leaders who develops competitive and devoted professionals in the field of
computer engineering by providing cutting-edge knowledge in a variety of innovative technology
areas, including artificial intelligence, machine learning, data science, cybersecurity, Internet of
Things (IoT), cloud computing, big data, software development and more.
Mission of the Department
Our mission is to produce computer science experts who can address the challenges of the digital
age. We achieve this through high-quality education and practical training that prepares our
students to contribute in an impactful way to this field. Join us in cultivating the next generation
of computer engineers and driving positive change in the digital world.

Program Educational Objectives (PEOs)

PEO1: Graduates of the Computer Science and Engineering will contribute to the Nation’s growth
through their ability to solve diverse and complex computer science and engineering problems across a
broad range of application areas. (PEO1 is focused on Problem Solving)
PEO2: Graduates of the Computer Science and Engineering will be successful professionals, designing
and implementing Products & Services of global standards in the field of Computer Science &
Engineering, becoming entrepreneurs, Pursuing higher studies & research. (PEO 2 is focused on
Professional Success)
PEO3: To instill leadership qualities in graduates with a sense of confidence, professionalism and
ethical attitude to produce professional leaders for serving the society.
PEO3: Graduates of the Computer Science and Engineering Program will be able to adapt to changing
scenario of dynamic technology with an ability to solve larger societal problems using logical and
flexible approach in decision making. (PEO 3 is focused on Attaining Flexibility and Adaptability)
Program Outcomes (POs)

Engineering Graduates will be able to understand and apply the below mentioned PO’s.
PO1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals,
and an engineering specialization to the solution of complex engineering problems.
PO2: Problem analysis: Identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences.
PO3: Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety, and the cultural, societal, and environmental considerations.
PO4: Conduct investigations of complex problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis of the
information to provide valid conclusions.
PO5: Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modelling to complex engineering activities with an
understanding of the limitations.
PO6: The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.
PO7: Environment and sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
PO8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.
PO9: Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings.
PO10: Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write effective
reports and design documentation, make effective presentations, and give and receive clear instructions.

PO11: Project management and finance: Demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team, to
manage projects and in multi-disciplinary environments.
PO12: Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSOs)
A Graduate of Computer Science and Engineering Program will be able to acquire the below mentioned
PSOs:
PSO1: Exhibit attitude for continuous learning and deliver efficient solutions for emerging challenges in
the computation domain.
PSO2: Apply standard software engineering principles to develop viable solutions for Information
Technology Enabled Services (ITES).

Student Outcomes (SOs)


CAC
SO1 Analyze a complex computing problem and to apply principles of computing and other relevant disciplines to identify solutions. √
SO2 Design, implement and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program's discipline. √
SO3 Communicate effectively in a variety of professional contexts.
SO4 Recognize professional responsibilities and make informed judgments in computing practice based on legal and ethical principles.
SO5 Function effectively as a member or leader of a team engaged in activities appropriate to the program's discipline.
SO6 Identify and analyze user needs and to take them into account in the selection, creation, integration, evaluation, and administration of computing-based systems. √

EAC
SO1 An ability to identify, formulate, and solve complex engineering problems by applying principles of engineering, science, and mathematics. √
SO2 An ability to apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors. √
SO3 An ability to communicate effectively with a range of audiences.
SO4 An ability to recognize ethical and professional responsibilities in engineering situations and make informed judgments, which must consider the impact of engineering solutions in global, economic, environmental, and societal contexts.
SO5 An ability to function effectively on a team whose members together provide leadership, create a collaborative and inclusive environment, establish goals, plan tasks, and meet objectives.
SO6 An ability to develop and conduct appropriate experimentation, analyze and interpret data, and use engineering judgment to draw conclusions. √
SO7 An ability to acquire and apply new knowledge as needed, using appropriate learning strategies. √

Code of Ethics
I. To uphold the highest standards of integrity, responsible behavior, and ethical conduct
in professional activities.
1. to hold paramount the safety, health, and welfare of the public, to strive to comply
with ethical design and sustainable development practices, to protect the privacy of
others, and to disclose promptly factors that might endanger the public or the
environment;
2. to improve the understanding by individuals and society of the capabilities and
societal implications of conventional and emerging technologies, including intelligent
systems.
3. to avoid real or perceived conflicts of interest whenever possible, and to disclose
them to affected parties when they do exist;
4. to avoid unlawful conduct in professional activities, and to reject bribery in all its forms;
5. to seek, accept, and offer honest criticism of technical work, to acknowledge and
correct errors, to be honest and realistic in stating claims or estimates based on available
data, and to credit properly the contributions of others;
6. to maintain and improve our technical competence and to undertake technological
tasks for others only if qualified by training or experience, or after full disclosure of
pertinent limitations;
II. To treat all persons fairly and with respect, to not engage in harassment or discrimination,
and to avoid injuring others.
7. to treat all persons fairly and with respect, and to not engage in discrimination based
on characteristics such as race, religion, gender, disability, age, national origin, sexual
orientation, gender identity, or gender expression;
8. to not engage in harassment of any kind, including sexual harassment or
bullying behaviour;
9. to avoid injuring others, their property, reputation, or employment by false or
malicious actions, rumours or any other verbal or physical abuses;
III. To strive to ensure this code is upheld by colleagues and co-workers.
10. to support colleagues and co-workers in following this code of ethics, to strive to
ensure the code is upheld, and to not retaliate against individuals reporting a
violation.

MCA

1. Name of the Department: Computer Science Engineering

2. Course Name: R programming Lab

3. Course Code:
   L T P: 0 0 4

4. Type of Course (use tick mark): Core () PE (✓) OE ()

5. Pre-requisite (if any):

6. Frequency (use tick marks): Even () Odd (✓) Either Sem () Every Sem ()

7. Total Number of Lectures, Tutorials, Practical (assuming 14 weeks of one semester)

Lectures = 0 Tutorials = 0 Practical = 28

8. Course Description

This course presents a gentle introduction into the concepts of data analysis, the role of a Data Analyst, and
the tools that are used to perform daily functions. You will gain an understanding of the data ecosystem and
the fundamentals of data analysis, such as data gathering or data mining. You will then learn the soft skills
that are required to effectively communicate your data to stakeholders, and how mastering these skills can
give you the option to become a data driven decision maker.

9. Learning objectives:
• Provide you with the knowledge and expertise to become a proficient data
scientist
• Demonstrate an understanding of statistics and machine learning concepts that are
vital for data science;
• Produce Python code to statistically analyse a dataset;
• Critically evaluate data visualisations based on their design and use for
communicating stories from data.

10. Course Outcomes (COs):

1. Explain how data is collected, managed and stored for data science;
2. Understand the key concepts in data science, including their real-world applications and
the toolkit used by data scientists.
3. Implement data collection and management scripts using MongoDB.

11. List of Experiments

1. R Environment setup and Essentials
2. R Basic Objects
3. R Working With Strings
4. R - Working With Data
5. R - Visualizing Data
6. R - Analyzing Data

COs-POs mapping of the course

CO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO-1 - - - 3 - 3 - - - - - - - 3
CO-2 - - 3 - - 3 - 3 - - - - -
CO-3 3 3 - 3 - 3 - - - - - - 3 -

a) Explicitly indicate which of the student outcomes listed in Criterion 3 or any other outcomes
are addressed by the course.

CO's/SO's: CAC SO1 SO2 SO3 SO4 SO5 SO6 | EAC SO1 SO2 SO3 SO4 SO5 SO6 SO7
CO-1 -----
CO-2 🗸 - - - - 🗸 | - - 🗸 - 🗸 - -
CO-3 - 🗸 - - - - | - 🗸 - - - - -

Experiment 1
1. R Environment setup and Essentials.

To install R, follow these steps:
• Open the official website (https://www.r-project.org/).
• Go to the download page (https://cran.r-project.org/mirrors.html).
• Choose a nearby mirror (for India, https://mirror.niser.ac.in/cran/).
• Select the download for your operating system.
• Select base as the subdirectory and click on “Download R 3.2.3 for Windows”.

The latest version at the time of writing is 3.2.3. It may be different when you install R.
If you are a Windows user, you can download an installer for the latest version.
Then run the Windows installer to install R.
Even though the installation process is easy, many users face issues during the installation.
When you choose the components to install, the installer displays four components in a drop-down list. Keep
the default options.
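
Once the installer has finished, a quick check at the R console confirms that R runs. The following is a minimal sketch; the variable name installed_version is illustrative, and the version printed on your machine will probably differ from 3.2.3.

```r
# Print the full version string of the running R installation
R.version.string

# Build "major.minor" from the components stored in R.version
installed_version <- paste(R.version$major, R.version$minor, sep = ".")
installed_version
```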

Installing RStudio
RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download.
The preview version with new features can be downloaded from
https://www.rstudio.com/products/rstudio/download/preview.
Note that RStudio does not include R, so make sure that R is installed before you work in RStudio.
Once you complete the installation and start RStudio, you see its main user interface.

The description below refers to the user interface of RStudio on the Windows operating system.
The main window consists of several parts, each known as a pane.
Each pane performs a different function.
The panes are designed to help data analysts work with data.

The R Console is also embedded in RStudio.
It works like a command prompt or terminal.
RStudio submits the commands that you type at the console to the R engine, which executes them.
RStudio then presents the results back to you.
You can use the console to execute a command, define a variable, or evaluate an expression
interactively, for example to compute a statistical measure, transform data, or produce charts.

While working with data, we not only type commands at the console but also write scripts
(sets of commands that represent a logical flow) in the editor.
The editor is useful for editing R scripts, markdown documents, web pages, and many other types of
configuration files.
The code editor is more advanced than a plain text editor.
It supports features such as syntax highlighting, autocompletion of R code, and
debugging with breakpoints.

The Environment pane shows the variables and functions that have been created and are
available for reuse.
By default, variables are shown in the global environment, which is the user workspace where
you are working.
Whenever you create a new object, you can find a new entry in the Environment pane.
You can see the variable name and the short description of its values.
When you change the value of a symbol, the change is reflected in the environment pane.
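
The contents of the Environment pane can also be inspected from the console. The sketch below uses two illustrative variables (radius and area are assumptions, not part of the manual) to show how objects appear in and disappear from the global environment.

```r
# Create two objects; both show up in the Environment pane
radius <- 2
area <- pi * radius^2

# ls() lists the objects that the Environment pane displays
ls()

# exists() checks for a single object by name
exists("area")

# rm() removes an object, which then disappears from the pane
rm(radius)
exists("radius")
```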

The History pane lists the expressions previously evaluated in the console.
You can repeat a task that was performed earlier by selecting it in the History pane, or simply by
pressing the up arrow key in the console.

In the Files pane, you can see the files in the current folder, navigate between folders,
create new folders, and delete or rename folders and files.
When you work on an RStudio project, you can view and organize the project files in the Files
pane.

You can use the plots pane to see the graphics produced by R code.

If there is more than one plot, previous plots are stored. You can view all the plots by navigating
back and forth.

You can view all the installed packages in the Packages pane.
From this pane you can install or update a package from CRAN, or remove an existing package from
your library.

R provides detailed documentation, which you can find in the Help pane.
Using this documentation, you can learn how to use the functions.
There are three ways to view the documentation of a function:
• Type the function name in the Search box and find it directly.
• Type the function name in the console and press F1.
• Type ? before the function name and execute it.
In practice, you don't have to remember all of R's functions; you only need to remember how to
get help with a function you are not familiar with.
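
The same help can be requested from the console. The sketch below uses mean() and sd() only as familiar examples; any other function name works the same way.

```r
# Open the help page for mean() (both forms are equivalent)
?mean
help("mean")

# List objects whose names start with "mean" when the exact name is unknown
matches <- apropos("^mean")
matches

# Show the arguments of a function without opening its help page
sd_args <- args(sd)
sd_args

# Run the examples given at the end of a help page
example("mean")
```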

The Viewer pane is a new feature; it was introduced as an increasing number of R packages
combine the functionality of both R and existing JavaScript libraries to make rich and interactive
presentations of data.

Experiment 2
2. R Basic Objects
The simplest numeric vector is a scalar number, for example:
> 1.5
[1] 1.5
After creating a value, we can store it in a variable for future use. We can assign it with the equal operator (=), the leftward
operator (<-), or the rightward operator (->):
> # equal operator
> x = 1.5
> x
[1] 1.5
> # leftward operator
> y <- 2.5
> y
[1] 2.5
> # rightward operator
> 3.5 -> z
> z
[1] 3.5
We can use c() to combine several single-element vectors into a multi-element vector.
> c(1,2,3,4)
[1] 1 2 3 4
We can use the : operator to create a series of consecutive integers. The : operator creates an integer vector instead of a numeric
vector.
> 1:5
[1] 1 2 3 4 5
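
The difference between an integer vector and a numeric vector is easy to check with class(). This is a small sketch; the variable names int_seq and num_vec are illustrative.

```r
int_seq <- 1:5                 # created with the : operator
num_vec <- c(1, 2, 3, 4, 5)    # created with c() from numeric literals

class(int_seq)                 # "integer"
class(num_vec)                 # "numeric"

# The values compare equal, but the storage types differ
all(int_seq == num_vec)        # TRUE
identical(int_seq, num_vec)    # FALSE

# An L suffix makes a single integer literal
class(2L)                      # "integer"
```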
The simplest logical vectors are TRUE and FALSE themselves:

> TRUE
[1] TRUE
If we want to perform multiple comparisons at the same time, we can use a numeric vector directly in the comparison:

> c(1,2)>2
[1] FALSE FALSE
> c(1,2)>1
[1] FALSE TRUE
R also uses %in% logical operator. It tells whether each element in the left-hand side vector is contained by the right-hand
side vector:

> 1 %in% c(1,2,3)


[1] TRUE
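
Like the comparison operators, %in% is vectorized: it returns one logical value per element of the left-hand side. A short sketch with illustrative vectors (wanted and available are assumptions):

```r
wanted <- c(1, 4, 2)
available <- c(1, 2, 3)

# One TRUE/FALSE per element of the left-hand vector
found <- wanted %in% available
found                 # TRUE FALSE TRUE

# Use the result to keep only the matching elements
wanted[found]         # 1 2

# any() and all() summarise a logical vector
any(found)            # TRUE
all(found)            # FALSE
```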
A character vector is a group of strings.
Here a character does not mean a single letter or symbol in a language; it means a string such as "this is a
string".
We can use either double quotation marks or single quotation marks to create a character vector, as follows:

> "hello world"
[1] "hello world"
> 'hello world'
[1] "hello world"

You can use the escape character (\) to insert double quotes into a string that starts and ends with double quotes:
> cat("You are attending \"R Programming\" Class")
You are attending "R Programming" Class
We can also create a named vector; the element names do not need quotes:

> n = c(First = "Mary", Last = "John")
> n
 First   Last
"Mary" "John"
Print Vector – We can print a vector in the following ways. Here vec is the named vector c(sum1 = 21, num1 = 22, dum1 = 23, rum1 = 24, lum1 = 25).
a. > vec
sum1 num1 dum1 rum1 lum1
  21   22   23   24   25
b. > (vec)
sum1 num1 dum1 rum1 lum1
  21   22   23   24   25
c. > print(vec)
sum1 num1 dum1 rum1 lum1
  21   22   23   24   25
d. > show(vec)
sum1 num1 dum1 rum1 lum1
  21   22   23   24   25
The following examples use a different named vector, vec = c(sum1 = 21, Lion = 22, dum1 = 42, Tiger = 24, lum1 = 11).
Arrange in alphabetical order of names
> vec[sort(names(vec))]
 dum1  Lion  lum1  sum1 Tiger
   42    22    11    21    24

Arrange in alphabetical order of names (reverse)
> vec[sort(names(vec), decreasing = TRUE)]
Tiger  sum1  lum1  Lion  dum1
   24    21    11    22    42

Replace values with missing values
> vec[c(2,4)] <- NA
> vec
 sum1  Lion  dum1 Tiger  lum1
   21    NA    42    NA    11
You can use [[ ]] to extract exactly one element. You cannot extract more than one element:
> x[[c("a","b")]]
Error in x[[c("a", "b")]] :
attempt to select more than one element in vectorIndex

You cannot use negative integers either:
> x[[-1]]
Error in x[[-1]] : invalid negative subscript in get1index <real>

Subsetting a vector with [ ] at a non-existing position or name produces a missing value. [[ ]], in contrast, raises an error if we try to extract an
element that is beyond the range:
> x[["d"]]
Error in x[["d"]] : subscript out of bounds
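
The contrast between [ ] and [[ ]] can be reproduced with a small named vector. The manual does not show how x was defined, so the definition below is an assumption made only for illustration; tryCatch() is used so that the error does not stop the script.

```r
# Hypothetical named vector; the manual does not show the definition of x
x <- c(a = 1, b = 2, c = 3)

x[["a"]]               # a single element with the name dropped: 1
x[c("a", "b")]         # [ ] may select several elements

# [ ] returns NA for a name that does not exist
missing_value <- x["d"]
is.na(missing_value)   # TRUE

# [[ ]] raises an error instead; tryCatch() captures its message
msg <- tryCatch(x[["d"]], error = function(e) conditionMessage(e))
msg
```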
When we create a matrix, by default, no names are given to the rows and columns. But we can provide the names of the
rows and columns while creating the matrix:

> A = matrix(
+ c(1, 2, 3, 4, 5, 6), #the data elements
+ nrow = 3, #desired number of rows
+ byrow = TRUE, #fill the matrix row by row
+ dimnames = list( #names of rows and columns
+ c('r1','r2','r3'), #row names
+ c('c1','c2') #column names
+ ))
> A #Print the value of A
   c1 c2
r1  1  2
r2  3  4
r3  5  6
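
Once the rows and columns are named, elements can be selected by name as well as by position. The sketch below rebuilds the same matrix A and adds a second matrix B (an illustrative assumption) to show the effect of the default byrow = FALSE.

```r
A <- matrix(1:6, nrow = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2", "r3"), c("c1", "c2")))

dim(A)            # 3 2
A["r2", "c1"]     # 3
A[3, 2]           # 6
A[, "c2"]         # named vector: r1 = 2, r2 = 4, r3 = 6

# With byrow = FALSE (the default) the same data fills column by column
B <- matrix(1:6, nrow = 3)
B[2, 1]           # 2
```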
We can create an array with names for its dimensions using dimnames:

> mularray = array( #Call array() function
+ c(1:24), #1 dimension vector with 24 values
+ dim=c(4,3,2), #specify 4x3x2 dimensional array
+ dimnames = list(
+ c('x1','x2','x3','x4'),#specify the names of 1st dim
+ c('y1','y2','y3'), #specify the names of 2nd dim
+ c('z1','z2') #specify the names of 3rd dim
+ )
+ )
A list can hold members of different types:
> n <- c(2,3,5) #Numeric Vector
> p <- c(TRUE, FALSE) #Logical Vector
> q <- c('a','b','c') #Character Vector
> x <- list(n,p,q,3) #Heterogeneous members
> x #Print List x
[[1]]
[1] 2 3 5

[[2]]
[1] TRUE FALSE

[[3]]
[1] "a" "b" "c"

[[4]]
[1] 3
Extracting Element from List
We can use the $ sign to extract the value of a list element by name:

> x <- list(
+ n=c(2,3,5), #Numeric Vector
+ p=c(TRUE, FALSE), #Logical Vector
+ q=c('a','b','c'), #Character Vector
+ 3)
> x$n
[1] 2 3 5
> x$p
[1] TRUE FALSE
> x$q
[1] "a" "b" "c"
We can rename the list elements with names(). The list has four elements but only three names are supplied, so the fourth name becomes NA:
> names(x) <- c('Num','Pum','Dum')
> x
$Num
[1] 2 3 5

$Pum
[1] TRUE FALSE

$Dum
[1] "a" "b" "c"

$<NA>
[1] 3
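
Besides $, list elements can be reached with [ ] and [[ ]], which behave differently. The following sketch reuses the element names from the example above; sub_list and the added element New are illustrative assumptions.

```r
x <- list(Num = c(2, 3, 5), Pum = c(TRUE, FALSE), Dum = c("a", "b", "c"))

# [ ] keeps the list wrapper: the result is a list of length 1
sub_list <- x["Num"]
class(sub_list)        # "list"

# [[ ]] and $ return the element itself
x[["Num"]]             # 2 3 5
class(x$Num)           # "numeric"

# Elements can be modified or added by name
x$Num[2] <- 30
x$New <- "added"
names(x)               # "Num" "Pum" "Dum" "New"
```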
We can create a data frame using data.frame(), giving the data of each column as a vector of the corresponding type:

> batch <- data.frame(
+ Name = c("John","Nancy","Kate"),
+ Gender = c("Male","Female","Female"),
+ Age = c(20, 19, 21),
+ Major = c("Statistics","Mathematics","Computer Science")
+ )
> batch
Name Gender Age Major
1 John Male 20 Statistics
2 Nancy Female 19 Mathematics
3 Kate Female 21 Computer Science
We can also create a data frame from a matrix
> mtx <- matrix(c(1:9),nrow = 3, byrow = FALSE)
> data.frame(mtx)
X1 X2 X3
1 1 4 7
2 2 5 8
3 3 6 9
> df1 <- data.frame(id=1:5, x=c(-1, 0, 1, 2, 3), y =c(0.76, 0.45, 0.56, 0.63, 0.71))
> df1
  id  x    y
1  1 -1 0.76
2  2  0 0.45
3  3  1 0.56
4  4  2 0.63
5  5  3 0.71
With stringsAsFactors = TRUE, character columns are converted to factors:
> person <- data.frame(
+ Name=c("Kate","Mate","Late","Jate"),
+ Age=c(24, 25, 35, 26),
+ Gender=c("Female","Male","Female","Female"),
+ MaritalStatus =c("Single","Single","Married","Single"), stringsAsFactors = TRUE)
> str(person)
'data.frame': 4 obs. of 4 variables:
$ Name : Factor w/ 4 levels "Jate","Kate",..: 2 4 3 1
$ Age : num 24 25 35 26
$ Gender : Factor w/ 2 levels "Female","Male": 1 2 1 1
$ MaritalStatus: Factor w/ 2 levels "Married","Single": 2 2 1 2
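
Factor columns and row filtering can be explored with a few more calls. This is a sketch on a reduced version of the person data frame; the variable older is an illustrative assumption.

```r
person <- data.frame(
  Name = c("Kate", "Mate", "Late", "Jate"),
  Age = c(24, 25, 35, 26),
  Gender = c("Female", "Male", "Female", "Female"),
  stringsAsFactors = TRUE
)

# A factor stores its distinct values as levels, in alphabetical order
levels(person$Gender)            # "Female" "Male"

# table() counts the rows per level
table(person$Gender)

# A logical condition inside [ , ] keeps only the matching rows
older <- person[person$Age > 24, ]
nrow(older)                      # 3

mean(person$Age)                 # 27.5
```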
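For Loop with List and Data Frame
A for loop can iterate over the rows of a data frame. In the example below, df is a data frame with a numeric column x = c(1, 2, 3) and a
character column y = c("A", "B", "C"). Because str() prints its result immediately and returns NULL, the structure of each row appears before
the row label written by cat():
> for (i in 1:nrow(df)) {
+ cat("row", i, "\n", str(df[i,]),"\n")
+ }
'data.frame': 1 obs. of 2 variables:
$ x: num 1
$ y: chr "A"
row 1

'data.frame': 1 obs. of 2 variables:
$ x: num 2
$ y: chr "B"
row 2

'data.frame': 1 obs. of 2 variables:
$ x: num 3
$ y: chr "C"
row 3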
The while loop keeps running as long as its test expression evaluates to TRUE; it stops once the condition becomes FALSE.
while (test_expression)
{
expr
}
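
As a concrete instance of the template above, the following sketch adds up the integers 1, 2, 3, ... until the running total exceeds 20. The variable names total and i are illustrative.

```r
total <- 0
i <- 0

# The body repeats while the test expression is TRUE
while (total <= 20) {
  i <- i + 1
  total <- total + i
}

i       # 6
total   # 21
```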

Experiment 3
R Working With Strings
Concatenating Strings
We use paste() function to concatenate several character vectors. In this function, spaces are
used as the default separator.
> paste("hello","world")
[1] "hello world"
> paste("hello","world", sep="-")
[1] "hello-world"
To avoid the separator, we can set sep="" or alternatively call paste0():

> paste0("hello","world")
[1] "helloworld"
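
paste() and paste0() are vectorized, and the collapse argument joins the resulting elements into one string. A brief sketch with illustrative vectors (first, last and full are assumptions):

```r
first <- c("Ada", "Alan")
last <- c("Lovelace", "Turing")

# Elements are combined pairwise
full <- paste(first, last)
full                                   # "Ada Lovelace" "Alan Turing"

# collapse joins the elements of the result into a single string
joined <- paste(full, collapse = "; ")
joined                                 # "Ada Lovelace; Alan Turing"

# paste0() with numbers builds labels such as column or file names
labels <- paste0("exp", 1:3)
labels                                 # "exp1" "exp2" "exp3"
```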
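The function tolower() converts text to lower case (toupper() converts it to upper case). In the following function, it makes the type argument case-insensitive:
calc <- function(type, x, y) {
  type <- tolower(type)
  if (type == "add") {
    x + y
  } else if (type == "times") {
    x * y
  } else {
    stop("Not supported type of command")
  }
}
> c(calc("add", 2, 3), calc("Add", 2, 3), calc("TIMES", 2, 3))
[1] 5 5 6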
Counting Character
We can use nchar() to count the number of characters of each element of a character vector.

> nchar("Programming")
[1] 11

The function nchar() is also vectorized

> nchar(c("Learn","R","Programming"))
[1] 5 1 11

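We can use the substr() function to extract part of each string by position, for example the first three characters (the month) of each element of a character vector dates:

> substr(dates, 1, 3)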
[1] "Jan" "Jun" "Sep"
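
The manual does not show how dates was defined, so the sketch below assumes a hypothetical vector whose elements begin with a three-letter month. It also shows that the stop position can differ for each element.

```r
# Hypothetical input; the actual definition of dates is not given
dates <- c("Jan 3", "Jun 11", "Sep 30")

# Characters 1 to 3 of every element
months <- substr(dates, 1, 3)
months                           # "Jan" "Jun" "Sep"

# Everything from position 5 to the end of each string
days <- substr(dates, 5, nchar(dates))
as.integer(days)                 # 3 11 30
```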

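Splitting Text
We can use strsplit() to split each string at a separator. The result is a list with one character vector per input string:
> class <- strsplit(c("Tim,34,USA","Travis,45,Germany","Pascal,23,France"),split=",")
> class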
[[1]]
[1] "Tim" "34" "USA"

[[2]]
[1] "Travis" "45" "Germany"

[[3]]
[1] "Pascal" "23" "France"
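
The list returned by strsplit() is often turned into a data frame for further work. The following is one possible sketch; the column names Name, Age and Country are assumptions, since the manual does not label the fields.

```r
records <- c("Tim,34,USA", "Travis,45,Germany", "Pascal,23,France")
parts <- strsplit(records, split = ",")

# Stack the pieces into a character matrix, one row per record
mat <- do.call(rbind, parts)

# Build a data frame; the column names are illustrative assumptions
people <- data.frame(
  Name = mat[, 1],
  Age = as.integer(mat[, 2]),
  Country = mat[, 3]
)

people$Age          # 34 45 23
mean(people$Age)    # 34
```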
Parsing Text as Date
We can create date and time from a standard text representation

> my_date <- as.Date("2020-01-07")


> my_date
[1] "2020-01-07"
If a date can be represented as a string, why do we need a Date object? The answer is that a
Date object supports arithmetic.
We can add or subtract a number of days from a date to get a new date.

> my_date + 7
[1] "2020-01-14"
> my_date - 80
[1] "2019-10-19"
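
Date objects support a few more operations that plain strings do not. The sketch below uses two illustrative dates (start and end are assumptions) to compute a difference, generate a weekly sequence, and change the display format.

```r
start <- as.Date("2020-01-07")
end <- as.Date("2020-03-01")

# Subtracting two dates gives the number of days between them
days_between <- as.numeric(end - start)
days_between                            # 54

# seq() generates regularly spaced dates
weekly <- seq(start, by = "week", length.out = 3)
format(weekly)    # "2020-01-07" "2020-01-14" "2020-01-21"

# format() converts a date back to text in a chosen layout
format(start, "%d/%m/%Y")               # "07/01/2020"
```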

We can use either as.POSIXct() or as.POSIXlt() to create a date-time from its text representation.
The two functions are different implementations of date/time.
An example using as.POSIXlt() is given below. The time zone shown in the output depends on the system settings.

> my_time <- as.POSIXlt("2020-01-07 14:45:17")
> my_time
[1] "2020-01-07 14:45:17 IST"
Sometimes the text is not in the standard format. In such a case, a format string tells the as.Date() function how to parse the string
into a date.

> as.Date('2017.05.21', format='%Y.%m.%d')


[1] "2017-05-21"
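
Format strings work the same way for date-times. The sketch below parses an illustrative non-standard timestamp with as.POSIXct(); the input text and the choice of the UTC time zone are assumptions made to keep the result independent of the system settings.

```r
# Non-standard date-time text, parsed with an explicit format and time zone
stamp <- as.POSIXct("21/05/2017 14:45:17",
                    format = "%d/%m/%Y %H:%M:%S", tz = "UTC")

format(stamp, "%Y-%m-%d")    # "2017-05-21"
format(stamp, "%H:%M")       # "14:45"

# POSIXct counts seconds, so adding 3600 moves the time one hour forward
later <- stamp + 3600
format(later, "%H:%M")       # "15:45"

# Text that does not match the format yields NA rather than an error
unparsed <- as.Date("May 21, 2017", format = "%Y.%m.%d")
is.na(unparsed)              # TRUE
```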

Experiment 4
R - Working With Data

Importing Data Using Built in Function


The function readLines() can be used to read a text file. This function returns a number of lines as
a character vector

> readLines("data/student.txt")
[1] "Name,Gender,Age,Major" "John,Male,24,Finance"
[3] "Amily,Female,25,Statistics" "Jessie,Female,23,Computer Science"

To import the data from a CSV file, we can call read.csv().

> student <- read.csv("data/student.csv")
> str(student)
'data.frame': 3 obs. of 4 variables:
$ Name : chr "John" "Amily" "Jessie"
$ Gender: chr "Male" "Female" "Female"
$ Age : int 24 25 23
$ Major : chr "Finance" "Statistics" "Computer Science"
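
To try read.csv() without the data/student.csv file, the same records can first be written to a temporary file. This is a self-contained sketch; csv_path and student_f are illustrative names.

```r
# Write a small CSV to a temporary file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Gender,Age,Major",
             "John,Male,24,Finance",
             "Amily,Female,25,Statistics",
             "Jessie,Female,23,Computer Science"),
           csv_path)

student <- read.csv(csv_path)
nrow(student)              # 3
class(student$Age)         # "integer"
mean(student$Age)          # 24

# stringsAsFactors = TRUE converts text columns to factors on import
student_f <- read.csv(csv_path, stringsAsFactors = TRUE)
levels(student_f$Gender)   # "Female" "Male"

unlink(csv_path)           # remove the temporary file
```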

Importing Data Using readr Package


A typical call to the readr package is as follows:
> library(readr)
> student1 <- read_csv("data/student.csv")
── Column specification ────────────────────────────────────────────
cols(
  Name = col_character(),
  Gender = col_character(),
  Age = col_double(),
  Major = col_character()
)
> student1
# A tibble: 3 x 4
  Name   Gender   Age Major
  <chr>  <chr>  <dbl> <chr>
1 John   Male      24 Finance
2 Amily  Female    25 Statistics
3 Jessie Female    23 Computer Science

Reading and writing Excel Worksheet


An Excel workbook is another format for storing tabular data. R does not provide any built-in
function to read an Excel workbook,
but several R packages, such as readxl (https://github.com/hadley/readxl), are available to work
with Excel worksheets. You can install the readxl package from CRAN using
install.packages("readxl").

> library(readxl)
> price <- read_excel("data/price.xlsx")
> price
# A tibble: 6 x 3
Date Price Growth
<dttm> <dbl> <dbl>
1 2020-01-03 00:00:00 136 NA
2 2020-02-03 00:00:00 138 0.0147
3 2020-03-03 00:00:00 137 -0.00725
4 2020-04-03 00:00:00 130 -0.0511
5 2020-05-03 00:00:00 139 0.0692
6 2020-06-03 00:00:00 140 0.00719

Reading and Writing native data files

R has its own data file format that uses the .rds extension. We can use the readRDS() function to read an
R data file.
> dat <- readRDS("ACS.rds")
An .rds file is usually smaller than the equivalent text file and hence takes up less storage space.
The .rds format also preserves data types and classes such as factors and dates, eliminating
the need to redefine data types after loading the file.
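
The writing side is handled by saveRDS(). The sketch below writes an illustrative data frame (scores is an assumption, not the ACS data) to a temporary file and reads it back to show that the factor and Date classes survive the round trip.

```r
scores <- data.frame(
  grade = factor(c("B", "A", "B")),
  taken = as.Date(c("2020-01-07", "2020-01-14", "2020-01-21"))
)

rds_path <- tempfile(fileext = ".rds")
saveRDS(scores, rds_path)        # write one R object to disk
restored <- readRDS(rds_path)    # read it back

class(restored$grade)            # "factor"
class(restored$taken)            # "Date"
identical(scores, restored)      # TRUE

unlink(rds_path)                 # remove the temporary file
```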

Experiment 5
R - Visualizing Data
Creating Scatter Plot

plot(1:10)

x <- rnorm(200)
y <- 2*x + rnorm(200)
plot(x,y)

Customize Chart Elements

We can customize several chart elements such as title (main or title()), the label of the x axis (xlab), the label of y axis (ylab),
the range of the x axis (xlim), and the range of the y axis (ylim)

plot(x, y,
     main = "Correlated Random numbers",
     xlab = "x", ylab = "2x + noise",
     xlim = c(-3, 3), ylim = c(-6, 6))

Scatter Plot – Logical Condition

We can also distinguish the two groups of points by a logical condition. We know that pch is vectorized.

So, we can use ifelse() to specify the point of each observation based on certain condition.

The following example applies pch = 17 to the points satisfying x * y > 1 otherwise pch = 1;

x <- rnorm(200)
y <- 2*x + rnorm(200)
plot(x,y,
     pch = ifelse(x * y > 1, 17, 1),
     main = "Scatter plot with conditional pch")
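
Because ifelse() is vectorized, it returns one plotting symbol per observation, and that vector can be inspected before it is passed to plot(). The following sketch adds a fixed seed and a legend, neither of which is part of the original example; point_symbols is an illustrative name.

```r
set.seed(1)                      # fixed seed so the example is repeatable
x <- rnorm(200)
y <- 2 * x + rnorm(200)

# One symbol code per observation: 17 (triangle) or 1 (circle)
point_symbols <- ifelse(x * y > 1, 17, 1)
length(point_symbols)            # 200
table(point_symbols)             # how many points get each symbol

# A legend explains the two symbols to the reader
plot(x, y, pch = point_symbols, main = "Scatter plot with conditional pch")
legend("topleft", legend = c("x * y > 1", "otherwise"), pch = c(17, 1))
```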

Scatter Plot – 2 Data Sets

A plot containing two separate datasets sharing the same x-axis can be drawn using plot() and points().

In the previous example, a normally distributed vector x, and a linearly correlated random vector y were generated.

For this example, we will generate another random vector, z, that has a non-linear relationship with x. In this example, we
have plotted both y and z against x whereas both the plots have different point styles:

x <- rnorm(75)

y <- 1.5*x + rnorm(75)

z <- sqrt(1 + x ^ 2) + rnorm(75)

plot(x, y, pch = 1,

xlim = range(x), ylim = range(y, z),

xlab = "x", ylab = "value")

points(x, z, pch = 17)

title("Scatter plot with two datasets")

Create Line Plot

In many data analysis problems, such as time series analysis, we use line plots to show the trend and variation
over time. To draw a line plot, we pass type = "l" to plot().

t <- 1:50

y <- 2.5 * sin(t * pi / 60) + rnorm(length(t))

plot(t, y, type = "l", main = "Line plot")

Line Type and Width

In the following example, we draw auxiliary lines in a plot using the function abline().

First, we create a plot of y against time, t.

Then we mark the mean value and the range (minimum and maximum values) of y with horizontal lines, and the times at which the minimum and maximum occur with vertical lines.

Different line types and colors keep these auxiliary lines easy to tell apart.

plot(t, y, lwd = 2, type = "l")

abline(h = mean(y), col = "red", lty = 2)

abline(h = range(y), col = "blue", lty = 3)

abline(v = t[c(which.min(y), which.max(y))], col = "brown", lty = 3)

title("Line plot with auxiliary lines")

Just as we used points() to add to a scatter plot, we can use lines() to add to a line plot.

p <- 40

plot(t[t <= p], y[t <= p], type = "l",

xlim = range(t), xlab = "t", ylab = "y")

lines(t[t >= p], y[t >= p], lty = 2)

title("Two period Line Plot")

Line Plot with Points

We can plot both the lines and points in the same chart. This can be done easily by first plotting a line chart and then adding
points() of the same data to the plot again.

plot(y, type = "l")

points(y, pch = 16)

title("Line plot with points")

Alternatively, we can first draw a scatter plot using the plot() function and then add lines using the lines() function.

plot(y, pch = 16)

lines(y)

title("Line plot with points")

Multi Series Chart with Legend

In the following code, we generate two series, y1 and y2, over time t and draw both series against t in one chart
with a legend.

t <- 1:30

y1 <- 1.5 * t + 6 * rnorm(30)

y2 <- 2.5 * sqrt(t) + 8 * rnorm(30)

plot(t, y1, type = "l", col = "black",

ylim = range(y1, y2), ylab ="y1, y2")

points(t, y1, pch = 15)

lines(t, y2, col = "blue", lty = 2)

points(t, y2, col = "blue", pch = 16)

title ("Plot of two series")

legend("topleft",

legend = c("y1", "y2"),

col = c("black", "blue"),

lty = c(1, 2), pch = c(15, 16),

cex = 0.8, x.intersp = 0.5, y.intersp = 0.8)

Bar Chart

The basic syntax to create a barplot in R is:

barplot(H, xlab, ylab, main, names.arg, col)

H: is a vector or matrix containing numeric values

xlab: label for x-axis

ylab: label for y-axis

main: title of the bar chart

names.arg: vector of names appearing under each bar

col: color for the bars in the graph

Let’s plot a simple bar chart

barplot(1:10, names.arg = LETTERS[1:10])

Project NYCflights – Part 1

Before we can start using the dataset, we need to install the nycflights13 package that contains it, using the command install.packages("nycflights13").

data("flights", package = "nycflights13")

carriers <- table(flights$carrier)

carriers

   9E    AA    AS    B6    DL    EV    F9    FL    HA    MQ    OO    UA    US    VX    WN    YV
18460 32729   714 54635 48110 54173   685  3260   342 26397    32 58665 20536  5162 12275   601

In the previous code, we have used table() to count the number of flights in the record for each carrier. Now sort the
carriers in decreasing order.

carriers_sort <- sort(carriers, decreasing = TRUE)

carriers_sort

   UA    B6    EV    DL    AA    MQ    US    9E    WN    VX    FL    AS    F9    YV    HA    OO
58665 54635 54173 48110 32729 26397 20536 18460 12275  5162  3260   714   685   601   342    32
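
The sorted counts can be passed straight to barplot(). The sketch below is one possible way to draw the chart. To stay self-contained it copies the counts from the output above into a named vector, carrier_counts (a name introduced here for illustration), so it runs without nycflights13. With the package installed, carriers_sort can be used in its place.

```r
# Counts copied from the sorted table above (hypothetical variable name)
carrier_counts <- c(UA = 58665, B6 = 54635, EV = 54173, DL = 48110,
                    AA = 32729, MQ = 26397, US = 20536, "9E" = 18460,
                    WN = 12275, VX = 5162, FL = 3260, AS = 714,
                    F9 = 685, YV = 601, HA = 342, OO = 32)

# las = 2 draws the axis labels perpendicular to the axis,
# so all sixteen carrier codes remain readable
barplot(carrier_counts, las = 2,
        main = "Flights per carrier",
        xlab = "Carrier", ylab = "Number of flights")
```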

Pie Chart
The basic syntax of plotting a pie-chart by using R programming is as follows.

pie(x, labels, radius, main, col, clockwise)

x: vector that contains the numeric values that are used in the pie chart

labels: to provide the description of the slices

radius: to provide the radius of the circle of the pie chart (value between -1 and +1)

main: to provide the title of the chart

col: indicates the color palette

clockwise: indicates whether the slices are drawn clockwise or anti-clockwise
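
As a minimal sketch of these parameters, the following code draws a pie chart of the four largest carrier counts from the bar chart project. The percentage labels and the color names are illustrative choices rather than requirements of pie().

```r
# Flight counts of the four largest carriers, taken from the table above
x <- c(UA = 58665, B6 = 54635, EV = 54173, DL = 48110)

# Build labels such as "UA (27%)" from the share of each slice
pct <- round(100 * x / sum(x))
slice_labels <- paste0(names(x), " (", pct, "%)")

pie(x, labels = slice_labels, radius = 0.8,
    main = "Flights of the four largest carriers",
    col = c("steelblue", "orange", "gray", "tomato"),
    clockwise = TRUE)
```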

Histogram

The basic syntax for creating a histogram is as follows

hist(v,main,xlab,xlim,ylim,breaks,col,border)

The parameters are described as follows:

v: a vector containing the numeric values that are used in the histogram

main: title of the chart

xlab: description of the x-axis

xlim: range of values on the x-axis

ylim: range of values on the y-axis

breaks: the breakpoints between bars, or a single number suggesting how many bars to draw

col: color of the bars

border: border-color of each bar

random_norm <- rnorm(10000)

hist(random_norm, freq = FALSE)  # plot densities so that the dnorm curve is on the same scale

curve(dnorm, add = TRUE, lwd = 2, col ="blue")
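
The breaks argument can be given either as a single number, which R treats only as a suggestion for the number of bars, or as a vector of exact breakpoints. The sketch below compares the two forms. The seed, the sample v and the bin width of 0.5 are assumptions made for illustration.

```r
set.seed(1)                      # arbitrary seed for a reproducible sample
v <- rnorm(1000)

# A single number is a suggestion; R may choose a nearby "pretty" count
h1 <- hist(v, breaks = 20, main = "About 20 bars")

# A vector fixes the breakpoints exactly; it must cover the range of v
edges <- seq(floor(min(v)), ceiling(max(v)), by = 0.5)
h2 <- hist(v, breaks = edges, main = "Bars of width 0.5",
           col = "lightgray", border = "white")

diff(h2$breaks)                  # every bar has width 0.5
```

hist() returns its result invisibly, so assigning it to a variable gives access to the breakpoints and counts that were actually used.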

Box Plot
A boxplot is used to visualize the distribution of the data in a data set. It shows the minimum, maximum,
median, first quartile, and third quartile of the data. You can compare the distribution of data across data sets by
drawing a boxplot for each one of them.

The basic syntax to create a box plot is as follows

boxplot(x, data, notch, varwidth, names, main)

x: vector or a formula

data: data frame

notch: logical value; draws a notch if set to TRUE

varwidth: logical value; if TRUE, the width of each box is proportional to the square root of the sample size

names: group labels that are printed under each boxplot

main: provides the title to the graph

We can plot a simple box plot as follows

x <- rnorm(1000)

boxplot(x)
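
To compare distributions across groups, boxplot() also accepts a formula together with a data frame. The following sketch uses three made-up groups of different sizes. The data frame df, its column names and the group means are assumptions chosen for illustration.

```r
set.seed(1)                      # arbitrary seed for a reproducible sample

# Hypothetical data: three groups with different sizes and means
df <- data.frame(
  value = c(rnorm(50, mean = 0), rnorm(80, mean = 1), rnorm(30, mean = 2)),
  group = factor(rep(c("A", "B", "C"), times = c(50, 80, 30)))
)

# value ~ group draws one box per level of group;
# varwidth = TRUE makes the larger groups visibly wider
bp <- boxplot(value ~ group, data = df, varwidth = TRUE,
              main = "Box plot by group")

bp$n                             # number of observations behind each box
```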

Experiment-6
R - Analyzing Data

Linear Model

x <- rnorm(100)

f <- function(x) 3 + 2 * x

y <- f(x) + 0.5 * rnorm(100)

The lm() function fits a linear model of y on x.

linear_model <- lm(y ~ x)

If we want to access the coefficients of the model, we can use the following code.

coef(linear_model)

(Intercept) x

3.036328 1.994926

The linear_model object is a list, so we can also access the coefficients with the $ operator.

> linear_model$coefficients

(Intercept) x

3.036328 1.994926

We can also call summary(linear_model) to see the statistical properties of the linear model.

> summary(linear_model)

Call:

lm(formula = y ~ x)

Residuals:

Min 1Q Median 3Q Max

-1.39226 -0.31731 0.01711 0.28940 1.26922

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 3.03633 0.04980 60.97 <2e-16 ***

x 1.99493 0.05589 35.69 <2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.497 on 98 degrees of freedom

Multiple R-squared: 0.9286, Adjusted R-squared: 0.9278

F-statistic: 1274 on 1 and 98 DF, p-value: < 2.2e-16

You can plot the data and the regression line using the following code.

plot(x, y, main = "A simple linear regression")

abline(coef(linear_model), col = "blue")

Now we can call the predict() function to make predictions using the fitted model. We can predict y with standard errors
when x = -1 and x = 0.5 using the following code.

> predict(linear_model, list(x = c(-1, 0.5)), se.fit = TRUE)

$fit

1 2

1.041402 4.033791
$se.fit

1 2

0.0772407 0.0554981

$df

[1] 98

$residual.scale

[1] 0.4969659
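
For a simple linear model, the fitted value is just the intercept plus the slope times x, so the result of predict() can be checked by hand. The sketch below refits a model on freshly simulated data, so its numbers will differ from the output above. The seed and the names new_x, manual and auto are assumptions for illustration. It also shows the interval argument of predict(), which returns confidence bounds directly.

```r
set.seed(123)                    # arbitrary seed; output above used other data
x <- rnorm(100)
y <- 3 + 2 * x + 0.5 * rnorm(100)
linear_model <- lm(y ~ x)

new_x <- c(-1, 0.5)

# Prediction by hand: intercept + slope * x
b <- coef(linear_model)
manual <- b[1] + b[2] * new_x

# Prediction with predict(); newdata must contain a column named x
auto <- predict(linear_model, data.frame(x = new_x))
all.equal(unname(manual), unname(auto))

# 95% confidence interval for the mean response at each new x
predict(linear_model, data.frame(x = new_x),
        interval = "confidence", level = 0.95)
```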

Decision Tree

First of all, we need to install the party package by executing the following command in the R console.

install.packages("party")

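After installation, load the package with library(party) so that ctree() and the readingSkills dataset are available. Then create the decision tree using the ctree() function.

library(party)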

input_data <- readingSkills[c(1:105),]

decision_tree <- ctree(

nativeSpeaker ~ age + shoeSize + score,

data = input_data)

# Plot the tree.

plot(decision_tree)
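
The tree above is trained on the first 105 rows of readingSkills, which leaves the remaining rows for a quick check. The following sketch shows one possible way to evaluate the tree. It assumes the party package is installed and that readingSkills has 200 rows, as described in the package documentation. The names train, test and pred are introduced here for illustration.

```r
# Run only if the party package is available
if (requireNamespace("party", quietly = TRUE)) {
  library(party)

  # Same training rows as above; the remaining rows serve as test data
  train <- readingSkills[1:105, ]
  test  <- readingSkills[106:200, ]

  decision_tree <- ctree(nativeSpeaker ~ age + shoeSize + score,
                         data = train)

  # Predicted class for each test row
  pred <- predict(decision_tree, newdata = test)

  # Confusion table and share of correct predictions
  print(table(predicted = pred, actual = test$nativeSpeaker))
  print(mean(pred == test$nativeSpeaker))
}
```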
