R Programming LAB MANUAL
LAB MANUAL
For
Prepared by:
Approved by:
Mr. Karan
AIML Trainer
4. Code of Ethics
5. Course Syllabus
6. Course Outcome and COs-POs, COs-SOs mapping of the course
7. List of Experiments
Vision of the University
To nurture individual’s excellence through value based, cross-cultural, integrated and holistic education
adopting the contemporary and advanced means blended with ethical values to contribute in building a
peaceful and sustainable global civilization.
Mission of the University
To impart higher education at par with global standards that meets the changing needs of the
society
To provide access to quality education and to improve quality of life, both at individual and
community levels with advancing knowledge in all fields through innovations and ethical
research.
To actively engage with and promote growth and welfare of the surrounding community through
suitable extension and outreach activities
To develop socially responsible citizens, fostering ethical values and compassion through
participation in community engagement, extension and promotion activities.
To create competitive and coordinated environment wherein the individual develop skills and a
lifelong learning attitude to excel in their endeavours.
To develop Centers of Excellence culminating in achieving the cutting-edge technology in all
fields.
Program Educational Objectives (PEOs)
PEO1: Graduates of the Computer Science and Engineering will contribute to the Nation’s growth
through their ability to solve diverse and complex computer science and engineering problems across a
broad range of application areas. (PEO1 is focused on Problem Solving)
PEO2: Graduates of the Computer Science and Engineering will be successful professionals, designing
and implementing Products & Services of global standards in the field of Computer Science &
Engineering, becoming entrepreneurs, Pursuing higher studies & research. (PEO 2 is focused on
Professional Success)
PEO3: To instill leadership qualities in graduates with a sense of confidence, professionalism and
ethical attitude to produce professional leaders for serving the society.
PEO3: Graduates of the Computer Science and Engineering Program will be able to adapt to changing
scenario of dynamic technology with an ability to solve larger societal problems using logical and
flexible approach in decision making. (PEO 3 is focused on Attaining Flexibility and Adaptability)
Program Outcomes (POs)
Engineering Graduates will be able to understand and apply the below mentioned PO’s.
PO1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals,
and an engineering specialization to the solution of complex engineering problems.
PO2: Problem analysis: Identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences.
PO3: Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety, and the cultural, societal, and environmental considerations.
PO4: Conduct investigations of complex problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis of the
information to provide valid conclusions.
PO5: Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modelling to complex engineering activities with an
understanding of the limitations.
PO6: The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.
PO7: Environment and sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
PO8: Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.
PO9: Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings.
PO10: Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write effective
reports and design documentation, make effective presentations, and give and receive clear instructions.
PO11: Project management and finance: Demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team, to
manage projects and in multi-disciplinary environments.
PO12: Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSOs)
A Graduate of Computer Science and Engineering Program will be able to acquire the below mentioned
PSOs:
PSO1: Exhibit attitude for continuous learning and deliver efficient solutions for emerging challenges in
the computation domain.
PSO2: Apply standard software engineering principles to develop viable solutions for Information
Technology Enabled Services (ITES).
SO6 Identify and analyze user needs and to take them into account in the selection, creation,
integration, evaluation, and administration of computing-based systems √
EAC
SO1 An ability to identify, formulate, and solve complex engineering problems by applying principles of engineering, science, and mathematics √
SO2 An ability to apply engineering design to produce solutions that meet specified needs with consideration of public health, safety, and welfare, as well as global, cultural, social, environmental, and economic factors √
SO3 An ability to communicate effectively with a range of audiences
SO7 An ability to acquire and apply new knowledge as needed, using appropriate learning strategies. √
Code of Ethics
I. To uphold the highest standards of integrity, responsible behavior, and ethical conduct
in professional activities.
1. to hold paramount the safety, health, and welfare of the public, to strive to comply
with ethical design and sustainable development practices, to protect the privacy of
others, and to disclose promptly factors that might endanger the public or the
environment;
2. to improve the understanding by individuals and society of the capabilities and
societal implications of conventional and emerging technologies, including intelligent
systems.
3. to avoid real or perceived conflicts of interest whenever possible, and to disclose
them to affected parties when they do exist;
4. to avoid unlawful conduct in professional activities, and to reject bribery in all its forms;
5. to seek, accept, and offer honest criticism of technical work, to acknowledge and
correct errors, to be honest and realistic in stating claims or estimates based on available
data, and to credit properly the contributions of others;
6. to maintain and improve our technical competence and to undertake technological
tasks for others only if qualified by training or experience, or after full disclosure of
pertinent limitations;
II. To treat all persons fairly and with respect, to not engage in harassment or discrimination,
and to avoid injuring others.
7. to treat all persons fairly and with respect, and to not engage in discrimination based
on characteristics such as race, religion, gender, disability, age, national origin, sexual
orientation, gender identity, or gender expression;
8. to not engage in harassment of any kind, including sexual harassment or
bullying behaviour;
9. to avoid injuring others, their property, reputation, or employment by false or
malicious actions, rumours or any other verbal or physical abuses;
III. To strive to ensure this code is upheld by colleagues and co-workers.
10. to support colleagues and co-workers in following this code of ethics, to strive to
ensure the code is upheld, and to not retaliate against individuals reporting a
violation.
MCA
3. Course Code 0 0 4
8. Course Description
This course presents a gentle introduction into the concepts of data analysis, the role of a Data Analyst, and
the tools that are used to perform daily functions. You will gain an understanding of the data ecosystem and
the fundamentals of data analysis, such as data gathering or data mining. You will then learn the soft skills
that are required to effectively communicate your data to stakeholders, and how mastering these skills can
give you the option to become a data driven decision maker.
9. Learning objectives:
• Provide you with the knowledge and expertise to become a proficient data
scientist
• Demonstrate an understanding of statistics and machine learning concepts that are
vital for data science;
• Produce Python code to statistically analyse a dataset;
• Critically evaluate data visualisations based on their design and use for
communicating stories from data.
10. Course Outcomes (COs):
1. Explain how data is collected, managed and stored for data science;
2. Understand the key concepts in data science, including their real-world applications and
the toolkit used by data scientists.
3. Implement data collection and management scripts using MongoDB.
PSO2
PO10
PO11
PO12
PO1
PO2
PO3
PO4
PO5
PO6
PO7
PO8
PO9
CO
CO-1 - - - 3 - 3 - - - - - - - 3
CO-2 - - 3 - - 3 - 3 - - - - -
CO-3 3 3 - 3 - 3 - - - - - - 3 -
a) Explicitly indicate which of the student outcomes listed in Criterion 3 or any other outcomes
are addressed by the course.
CAC EAC
CO’s/SO’s
SO1
SO2
SO3
SO4
SO5
SO6
SO1
SO2
SO3
SO4
SO5
SO6
SO7
CO-1 -----
CO-2 🗸 ---- ------ ---- ---- 🗸 ---- ---- 🗸 ----- 🗸 ----- ----
- -
CO-3 ----- 🗸 ------ ---- ---- ------ ---- 🗸 ------ ----- ----- ----- ----
- - -
Experiment 1
1. R Environment setup and Essentials.
Installing RStudio
RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download.
The preview version with new features can be downloaded from
https://www.rstudio.com/products/rstudio/download/preview.
Note that RStudio does not include R, so make sure that R is installed before you work in RStudio.
Once the installation of RStudio is complete, you see the RStudio user interface.
The screenshot of the user interface of RStudio for the Windows operating system is given below.
The main window consists of several parts.
Each part is known as a pane, and each pane performs a different function.
The panes have been designed to help data analysts work with data.
The R Console is also embedded in RStudio.
It works like a command prompt or terminal.
The commands that you type at the console are submitted to the R engine by RStudio.
The R engine is responsible for executing the commands.
RStudio passes the user's input to the R engine and presents the results back to the user.
You can use the console to execute a command, define a variable, or evaluate an expression
interactively to compute a statistical measure, transform data, or produce charts.
While working with data, we not only type commands at the console but also write scripts, sets
of commands that represent a logic flow, in the editor.
The editor is useful for editing R scripts, markdown documents, web pages, and many other types of
configuration files.
The code editor is more advanced than a plain text editor.
It supports advanced functionality such as syntax highlighting, autocompletion of R code, and
debugging with breakpoints.
The Environment pane shows the variables and functions that have been created and that are
available for repeated use.
By default, variables are shown in the global environment, which is the user workspace where
you are working.
Whenever you create a new object, a new entry appears in the Environment pane.
You can see the variable name and a short description of its value.
When you change the value of a symbol, the change is reflected in the Environment pane.
In the Files pane, you can see the files in the current folder, navigate between folders, and
create, delete, or rename folders and files.
When you work on an RStudio project, you can view and organize the project files in the Files
pane.
You can use the Plots pane to see the graphics produced by R code.
If there is more than one plot, previous plots are stored. You can view all the plots by navigating
back and forth.
You can view all the installed packages in the Packages pane.
You can install or update packages from CRAN, or remove an existing package from
your library.
R provides detailed documentation, which you can find in the Help pane.
Using this documentation, you can learn how to use the functions.
Ways to view the documentation of a function are:
Type the function name in the Search box and find it directly
Type the function name in the console and press F1
Type ? before the function name and execute it
In practice, you don't have to remember all of R's functions; you only need to remember how to
get help with a function you are not familiar with.
The Viewer pane is a new feature; it was introduced as an increasing number of R packages
combine the functionality of both R and existing JavaScript libraries to make rich and interactive
presentations of data.
Experiment 2
2. R Basic Objects
The simplest numeric vector is a scalar number. For example:
> 1.5
[1] 1.5
After creating a value, we can store it for future use with the equal operator (=), the leftward operator (<-), or the rightward
operator (->). We can create a variable in the following ways:
> # equal operator
> x = 1.5
> x
[1] 1.5
> # leftward operator
> y <- 2.5
> y
[1] 2.5
> # rightward operator
> 3.5 -> z
> z
[1] 3.5
We can combine several single-element vectors to create a multi-element vector.
> c(1,2,3,4)
[1] 1 2 3 4
We can use the : operator to create a series of consecutive integers. The : operator creates an integer vector rather than a numeric (double)
vector:
> 1:5
[1] 1 2 3 4 5
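The difference between the two vector types can be checked with class(). The following is a minimal sketch; the variable names int_vec and num_vec are illustrative and do not appear elsewhere in this manual.

```r
# The : operator yields an integer vector; c() with plain numbers yields doubles.
int_vec <- 1:5
num_vec <- c(1, 2, 3, 4, 5)

class(int_vec)   # "integer"
class(num_vec)   # "numeric"

# The values compare equal even though the storage types differ.
all(int_vec == num_vec)       # TRUE
identical(int_vec, num_vec)   # FALSE, because the types differ
```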
The simplest logical vectors are TRUE and FALSE themselves:
> TRUE
[1] TRUE
If we want to perform multiple comparisons at the same time, we can use numeric vectors directly in the comparison:
> c(1,2)>2
[1] FALSE FALSE
> c(1,2)>1
[1] FALSE TRUE
R also provides the %in% operator. It tells whether each element of the left-hand side vector is contained in the right-hand
side vector:
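A short sketch of %in% in use is given here. The vectors wanted and pool are made-up names chosen only for this illustration.

```r
# Test membership of each left-hand element in the right-hand vector.
wanted <- c(1, 4)
pool <- c(1, 2, 3)

wanted %in% pool            # TRUE FALSE

# %in% returns one logical value per left-hand element, so it can filter a vector.
pool[pool %in% wanted]      # 1
```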
You can use escape character (\) to insert double quotes into a string that starts and ends with double quotes
> cat("You are attending \"R Programming\" Class")
You are attending "R Programming" Class
We can also create the named vector without using the quotes
42 22 11 21 24
Subsetting a vector with a non-existing position or name produces missing values. In contrast, [[ ]] raises an error when we try to extract an
element that is beyond the range.
> x[["d"]]
Error in x[["d"]] : subscript out of bounds
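The behaviour described above can be reproduced with a small named vector. This is a sketch under the assumption that x has the names a, b and c; the manual does not show how x was originally created.

```r
# Element names can be given without quotes when they are valid R names.
x <- c(a = 1, b = 2, c = 3)

x["a"]            # single brackets keep the name
x[["a"]]          # double brackets return the bare value, 1

# A missing name or position gives NA with single brackets ...
is.na(x["d"])     # TRUE
is.na(x[5])       # TRUE

# ... but an error with double brackets, caught here so the script continues.
msg <- tryCatch(x[["d"]], error = function(e) conditionMessage(e))
msg               # "subscript out of bounds"
```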
When we create a matrix, by default, no names are given to the rows and columns. But we can provide the names of the
rows and columns while creating the matrix:
> A = matrix(
+ c(1, 2, 3, 4, 5, 6), # the data elements
+ nrow = 3, # desired number of rows
+ byrow = TRUE, # fill the matrix row by row
+ dimnames = list( # names of the rows and columns
+ c('r1','r2','r3'), # row names
+ c('c1','c2') # column names
+ ))
> A # print the value of A
   c1 c2
r1  1  2
r2  3  4
r3  5  6
We can create the array with names for these dimensions using dimnames
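As a hedged illustration of dimnames with an array, the sketch below builds a 2 x 3 x 2 array. The dimension sizes and all the names (r1, c1, k1 and so on) are assumptions chosen for this example.

```r
# A 2 x 3 x 2 array; the dimension names below are illustrative.
arr <- array(
  1:12,
  dim = c(2, 3, 2),
  dimnames = list(
    c("r1", "r2"),          # rows
    c("c1", "c2", "c3"),    # columns
    c("k1", "k2")           # third dimension
  )
)

dim(arr)                 # 2 3 2
arr["r1", "c2", "k2"]    # 9, the same element as arr[1, 2, 2]
```

Arrays are filled in column-major order, which is why the element in row 1, column 2 of the second slice is 9.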
[[2]]
[1] TRUE FALSE
[[3]]
[1] "a" "b" "c"
[[4]]
[1] 3
Extracting Element from List
We can use $ sign to extract the value of a list element by name
$Pum
[1] TRUE FALSE
$Dum
[1] "a" "b" "c"
$<NA>
[1] 3
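The list that produced the output above is not shown in this copy. The sketch below uses a hypothetical list lst that reuses the names Pum and Dum from that output; the name Tum and all values are assumptions.

```r
# A small list; the values are assumed for illustration.
lst <- list(Tum = 1, Pum = c(TRUE, FALSE), Dum = c("a", "b", "c"))

lst$Pum          # TRUE FALSE
lst[["Dum"]]     # the same kind of extraction with [[ ]] and a quoted name

# [ ] keeps the result as a list, while $ and [[ ]] return the element itself.
class(lst["Pum"])     # "list"
class(lst$Pum)        # "logical"

# A name that does not exist gives NULL rather than an error.
is.null(lst$Zum)      # TRUE
```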
We can create a data frame using data.frame() and give the data of each column using a vector of the corresponding type
4 4 2 0.63
5 5 3 0.71
> person <- data.frame(
+ Name=c("Kate","Mate","Late","Jate"),
+ Age=c(24, 25, 35, 26),
+ Gender=c("Female","Male","Female","Female"),
+ MaritalStatus =c("Single","Single","Married","Single"), stringsAsFactors = TRUE)
> str(person)
'data.frame': 4 obs. of 4 variables:
$ Name : Factor w/ 4 levels "Jate","Kate",..: 2 4 3 1
$ Age : num 24 25 35 26
$ Gender : Factor w/ 2 levels "Female","Male": 1 2 1 1
$ MaritalStatus: Factor w/ 2 levels "Married","Single": 2 2 1 2
For Loop with List and Data Frame
for (i in 1:nrow(df)) {
+ cat("row", i, "\n", str(df[i,]),"\n")
+}
'data.frame': 1 obs. of 2 variables:
$ x: num 1
$ y: chr "A"
row 1
row 3
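In the listing above, the structure is printed before the row label because str() prints as a side effect and returns NULL, so it runs before cat() writes anything. A sketch that prints the label first is given below. The data frame df and the helper print_rows() are hypothetical, since the manual does not show how df was created.

```r
# Hypothetical data frame; the source does not show how df was created.
df <- data.frame(x = c(1, 2, 3), y = c("A", "B", "C"))

print_rows <- function(d) {
  # seq_len() is safe for zero-row data frames, unlike 1:nrow(d).
  for (i in seq_len(nrow(d))) {
    cat("row ", i, "\n", sep = "")   # label first
    str(d[i, ])                      # then the structure of that row
  }
}

print_rows(df)
```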
The while loop keeps running as long as its test expression evaluates to TRUE and stops once it becomes FALSE.
while (test_expression)
{
expr
}
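A minimal sketch of the while syntax above: the loop adds consecutive integers until the running total reaches at least 20. The variable names and the threshold are arbitrary choices for this example.

```r
# Sum the integers 1, 2, 3, ... until the running total reaches at least 20.
total <- 0
i <- 0
while (total < 20) {
  i <- i + 1
  total <- total + i
}

i       # 6
total   # 21
```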
Experiment 3
R Working With Strings
Concatenating Strings
We use paste() function to concatenate several character vectors. In this function, spaces are
used as the default separator.
> paste("hello","world")
[1] "hello world"
> paste("hello","world", sep="-")
[1] "hello-world"
To avoid the separator, we can set sep="" or alternatively call paste0():
> paste0("hello","world")
[1] "helloworld"
calc <- function(type, x, y) {
  type <- tolower(type) # make the command case-insensitive
  if (type == "add") {
    x + y
  } else if (type == "times") {
    x * y
  } else {
    stop("Not supported type of command")
  }
}
> c(calc("add", 2, 3), calc("Add", 2, 3), calc("TIMES", 2, 3))
[1] 5 5 6
Counting Character
We can use nchar() to count the number of characters of each element of a character vector.
> nchar("Programming")
[1] 11
> nchar(c("Learn","R","Programming"))
[1] 5 1 11
substr(dates, 1, 3)
[1] "Jan" "Jun" "Sep"
Splitting Text
> class <- strsplit(c("Tim,34,USA","Travis,45,Germany","Pascal,23,France"),split=",")
> class
[[1]]
[1] "Tim" "34" "USA"
[[2]]
[1] "Travis" "45" "Germany"
[[3]]
[1] "Pascal" "23" "France"
Parsing Text as Date
We can create date and time from a standard text representation
> my_date + 7
[1] "2020-01-14"
> my_date - 80
[1] "2019-10-19"
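The definition of my_date is missing from this copy. Working backwards from the two results above, it was presumably 7 January 2020; the sketch below rests on that assumption.

```r
# my_date is assumed to be 7 January 2020, inferred from the output above.
my_date <- as.Date("2020-01-07")

class(my_date)    # "Date"
my_date + 7       # "2020-01-14"
my_date - 80      # "2019-10-19"

# A non-standard text format needs an explicit format string.
as.Date("07/01/2020", format = "%d/%m/%Y") == my_date   # TRUE
```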
We can use either as.POSIXct() or as.POSIXlt() to create date time from the text representation.
The two functions are different implementations of date/time.
The implementation of as.POSIXlt() is given below.
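The original listing is not available in this copy. As a hedged sketch of how the two representations differ, the example below parses the same text with both functions; the timestamp and the UTC time zone are illustrative choices.

```r
# The timestamp and time zone below are illustrative.
stamp <- "2020-01-07 10:30:00"

# POSIXct stores the date-time as seconds since 1970-01-01 UTC.
ct <- as.POSIXct(stamp, tz = "UTC")
as.numeric(ct)

# POSIXlt stores the date-time as a list of components.
lt <- as.POSIXlt(stamp, tz = "UTC")
lt$hour           # 10
lt$min            # 30
lt$year + 1900    # 2020, since year is stored as years since 1900
lt$mon + 1        # 1, since months are numbered from 0
```

POSIXct is usually the more convenient choice for columns in a data frame, while POSIXlt is handy when individual components such as the hour are needed.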
Experiment 4
R - Working With Data
> readLines("data/student.txt")
[1] "Name,Gender,Age,Major" "John,Male,24,Finance"
[3] "Amily,Female,25,Statistics" "Jessie,Female,23,Computer Science"
> student <- read.csv("data/student.csv")
> str(student)
'data.frame': 3 obs. of 4 variables:
$ Name : chr "John" "Amily" "Jessie"
$ Gender: chr "Male" "Female" "Female"
$ Age : int 24 25 23
$ Major : chr "Finance" "Statistics" "Computer Science"
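To try read.csv() without the data folder used above, the sample records can be written to a temporary file first. This is a sketch; csv_path is a made-up variable name, and the character columns shown assume R 4.0 or later, where strings are not converted to factors by default.

```r
# Write the sample records shown above to a temporary file so the example runs anywhere.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Name,Gender,Age,Major",
    "John,Male,24,Finance",
    "Amily,Female,25,Statistics",
    "Jessie,Female,23,Computer Science"),
  csv_path
)

student <- read.csv(csv_path)   # the first line is used as the header by default
str(student)

# Columns can then be used directly, for example the mean age.
mean(student$Age)   # 24
```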
> library(readxl)
> price <- read_excel("data/price.xlsx")
> price
# A tibble: 6 x 3
Date Price Growth
<dttm> <dbl> <dbl>
1 2020-01-03 00:00:00 136 NA
2 2020-02-03 00:00:00 138 0.0147
3 2020-03-03 00:00:00 137 -0.00725
4 2020-04-03 00:00:00 130 -0.0511
5 2020-05-03 00:00:00 139 0.0692
6 2020-06-03 00:00:00 140 0.00719
R has its own data file format that uses the .rds extension. We can use the readRDS() function to read an
R data file.
dat <- readRDS("ACS.rds")
An .rds file is usually smaller than the equivalent text file and hence takes up less storage space.
The .rds file format also preserves data types and classes such as factors and dates, eliminating
the need to redefine data types after loading the file.
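The claim that types survive a round trip can be checked with saveRDS() and readRDS(). The sketch below uses a small made-up data frame and a temporary file rather than the ACS.rds file, which is not included here.

```r
# A small data frame with a factor and a Date column; the values are illustrative.
dat <- data.frame(
  grp = factor(c("a", "b", "a")),
  day = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03"))
)

rds_path <- tempfile(fileext = ".rds")
saveRDS(dat, rds_path)           # write a single R object to disk
restored <- readRDS(rds_path)    # read it back

class(restored$grp)                  # "factor"
class(restored$day)                  # "Date"
isTRUE(all.equal(dat, restored))     # TRUE
```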
Experiment 5
R - Visualizing Data
Creating Scatter Plot
plot(1:10)
x <- rnorm(200)
plot(x,y)
We can customize several chart elements such as title (main or title()), the label of the x axis (xlab), the label of y axis (ylab),
the range of the x axis (xlim), and the range of the y axis (ylim)
plot(x, y,
Scatter Plot – Logical Condition
We can also distinguish two groups of points by a logical condition. Because pch is vectorized,
we can use ifelse() to specify the point style of each observation based on a condition.
The following example applies pch = 17 to the points satisfying x * y > 1 and pch = 1 otherwise:
x <- rnorm(200)
plot(x,y,
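The listing above is cut off in this copy. A minimal sketch of the same idea follows. The definition of y (linear in x plus noise), the seed, and the titles are assumptions, since the manual does not show them.

```r
set.seed(1)   # fixed seed so the figure is reproducible

x <- rnorm(200)
# Assumed definition: y is linear in x plus noise; the source does not show it.
y <- 2 * x + rnorm(200, sd = 0.5)

# pch is vectorized: ifelse() returns one symbol code per observation.
point_style <- ifelse(x * y > 1, 17, 1)

plot(x, y,
     pch = point_style,
     main = "Scatter plot with conditional point styles",
     xlab = "x", ylab = "y")
```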
Scatter Plot – 2 Data Sets
A plot containing two separate datasets sharing the same x-axis can be drawn using plot() and points().
In the previous example, a normally distributed vector x and a linearly correlated random vector y were generated.
For this example, we generate another random vector, z, that has a non-linear relationship with x. We then
plot both y and z against x, using a different point style for each:
x <- rnorm(75)
plot(x, y, pch = 1,
Create Line Plot
In many data analysis problems, such as time series analysis, we use line plots to show the trend and variation
across time. To draw a line plot, we pass type = "l" when calling plot().
t <- 1:50
in a plot using the function abline().
We have shown the mean value and the range (minimum and maximum values) of y along with the time.
We can draw these auxiliary lines easily by using different line types and colors.
p <- 40
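The listing above is cut off in this copy, so the following is only a sketch of a line plot with auxiliary lines drawn by abline(). The series y is an assumed noisy curve, and the colors and line types are arbitrary.

```r
set.seed(1)

t <- 1:50
# Assumed series: a smooth curve plus noise; the source does not show how y is built.
y <- 3 * sin(t * pi / 60) + rnorm(length(t), sd = 0.3)

plot(t, y, type = "l", main = "Line plot with auxiliary lines",
     xlab = "t", ylab = "y")

# Horizontal reference lines: mean (solid red), minimum and maximum (dashed blue).
abline(h = mean(y), col = "red", lty = 1)
abline(h = range(y), col = "blue", lty = 2)
```

Passing a vector to the h argument draws one horizontal line per value, so range(y) gives both the minimum and the maximum line in one call.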
Line Plot with Points
We can plot both the lines and points in the same chart. This can be done easily by first plotting a line chart and then adding
points() of the same data to the plot again.
function and then we can add lines using the lines() function.
lines(y)
Multi Series Chart with Legend
In the following code, we have generated two series, y1 and y2, with time t and created a chart with the two series with
respect to time t.
t <- 1:30
legend("topleft",
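The listing above is cut off in this copy. A hedged sketch of a two-series chart with a legend is shown below; the series y1 and y2 are assumed to be random walks, and the colors and line types are arbitrary choices.

```r
set.seed(1)

t <- 1:30
# Assumed series: two random walks; the source does not show how y1 and y2 are built.
y1 <- cumsum(rnorm(30))
y2 <- cumsum(rnorm(30))

# Draw the first series with a y-range wide enough for both, then add the second.
plot(t, y1, type = "l", col = "black", lty = 1,
     ylim = range(c(y1, y2)), xlab = "t", ylab = "value",
     main = "Two series over time")
lines(t, y2, col = "red", lty = 2)

# The legend entries must match the colors and line types used above.
legend("topleft",
       legend = c("y1", "y2"),
       col = c("black", "red"),
       lty = c(1, 2))
```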
Bar Chart
Project NYCflights – Part 1
Before we can start using the dataset, we will use the command install.packages("nycflights13") to install the dataset.
carriers
9E AA AS B6 DL EV F9 FL HA MQ OO UA US VX WN
18460 32729 714 54635 48110 54173 685 3260 342 26397 32 58665 20536 5162 12275
YV
601
In the previous code, we have used table() to count the number of flights in the record for each carrier. Now sort the
carriers in decreasing order.
carriers_sort
UA B6 EV DL AA MQ US 9E WN VX FL AS F9 YV HA
58665 54635 54173 48110 32729 26397 20536 18460 12275 5162 3260 714 685 601 342
OO
32
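The sorted counts can then be drawn as a bar chart. In the sketch below the counts are copied by hand from the output above, so that it runs without the nycflights13 package; in the manual they come from table() and a decreasing sort. The title and axis label are assumptions.

```r
# Flight counts per carrier, copied from the table() output above.
carriers <- c("9E" = 18460, AA = 32729, AS = 714, B6 = 54635, DL = 48110,
              EV = 54173, F9 = 685, FL = 3260, HA = 342, MQ = 26397,
              OO = 32, UA = 58665, US = 20536, VX = 5162, WN = 12275,
              YV = 601)

carriers_sort <- sort(carriers, decreasing = TRUE)

barplot(carriers_sort,
        las = 2,                       # rotate the carrier labels
        main = "Flights per carrier",
        ylab = "Number of flights")
```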
Pie Chart
The basic syntax of plotting a pie-chart by using R programming is as follows.
x: vector that contains the numeric values that are used in the pie chart
radius: to provide the radius of the circle of the pie chart (value between -1 and +1)
Histogram
hist(v,main,xlab,xlim,ylim,breaks,col,border)
v: a vector containing the numeric values that are used in the histogram
xlim: range of values on the x-axis
hist(random_norm)
Box Plot
Boxplot is used to visualize the distribution of the data in a data set. The boxplot represents the minimum, maximum,
median, first quartile, and the third quartile in the data set. You can compare the distribution of data across data sets by
drawing the boxplot for each one of them.
x: vector or a formula
varwidth: logical value; if TRUE, the width of the box is proportional to the sample size
names: group labels that can be printed under each boxplot
x <- rnorm(1000)
boxplot(x)
Experiment 6
R - Analyzing Data
Linear Model
x <- rnorm(100)
f <- function(x) 3 + 2 * x
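The rest of this listing is missing from this copy. A minimal sketch of how the model might be completed follows, assuming Gaussian noise with standard deviation 0.5 is added to f(x). The seed is arbitrary, so the estimates will differ slightly from those printed in this section.

```r
set.seed(1)   # fixed seed; the source output came from a different random draw

x <- rnorm(100)
f <- function(x) 3 + 2 * x
# Assumed noise model: Gaussian noise with standard deviation 0.5.
y <- f(x) + rnorm(100, sd = 0.5)

linear_model <- lm(y ~ x)
coef(linear_model)   # estimates should be close to 3 (intercept) and 2 (slope)
```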
If we want to access the coefficients of the model, we can use the following code.
coef(linear_model)
(Intercept) x
3.036328 1.994926
The linear_model object is a list, so we can also access the coefficients with the $ operator.
> linear_model$coefficients
(Intercept) x
3.036328 1.994926
We can also use summary(linear_model) to access the statistical properties of the linear model.
> summary(linear_model)
Call:
lm(formula = y ~ x)
Residuals:
Coefficients:
(Intercept) 3.03633 0.04980 60.97 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
You can plot the data and the regression line using the following code.
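The plotting code is not present in this copy. One possible sketch uses plot() for the points and abline() for the fitted line. The data are regenerated here under the assumed model y = 3 + 2x plus noise so that the block runs on its own.

```r
set.seed(1)

x <- rnorm(100)
y <- 3 + 2 * x + rnorm(100, sd = 0.5)   # assumed noise level
linear_model <- lm(y ~ x)

plot(x, y, main = "Data with fitted regression line")
abline(linear_model, col = "red", lwd = 2)   # abline() accepts a fitted lm object
```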
Now we can call the predict() function to make predictions using the fitted model. We can predict y with standard errors
when x = -1 and x = 0.5 using the following code.
$fit
1 2
1.041402 4.033791
$se.fit
1 2
0.0772407 0.0554981
$df
[1] 98
$residual.scale
[1] 0.4969659
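The call that produced the output above is not shown in this copy. It was presumably of the following form, sketched here with regenerated data under the same assumed model; the name new_points is made up, and the numbers will differ from those above because the random draw differs.

```r
set.seed(1)

# Same assumed setup as the linear model above: y = 3 + 2x plus Gaussian noise.
x <- rnorm(100)
y <- 3 + 2 * x + rnorm(100, sd = 0.5)
linear_model <- lm(y ~ x)

# newdata must contain a column named like the predictor in the formula.
new_points <- data.frame(x = c(-1, 0.5))
pred <- predict(linear_model, newdata = new_points, se.fit = TRUE)

pred$fit       # predicted y at x = -1 and x = 0.5
pred$se.fit    # standard errors of those predictions
pred$df        # residual degrees of freedom: 100 observations - 2 coefficients = 98
```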
Decision Tree
First of all, we need to install the party package by executing the following command in the R console.
install.packages("party")
data = input_data)
plot(decision_tree)