Statistics With R Programming For Bigdata (Autosaved)

Copyright
© All Rights Reserved

Guest lecture

on
STATISTICS with R PROGRAMMING for DATA SCIENCE

Dr.A.MANIMARAN B.E,M.E,Ph.D
PROFESSOR,
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SAVEETHA SCHOOL OF ENGINEERING, SIMATS, CHENNAI
In this Lecture

 R and RStudio
 How to
 Set the working directory
 Create an R file and save it
 Execute an R file
 Variables
 Basic Data Types
 Advanced Data Structures
 Functions
 Classes
R

 R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand
 Open-source programming language
 One of the most widely used languages for statistical computing and graphics
 Statistical software and data analysis tool
 Command-line interface
 Platforms:
 Windows
 Linux
 macOS
What is RStudio?

 Integrated Development Environment (IDE) for R

 Available as open-source and commercial software

 Editions: desktop version and server version

A first look at RStudio
Basic Program (syntax)
 Depart <- "Welcome AIDS"
 print(Depart)   or   cat("Depart:", Depart)

 Output of print(Depart)
 [1] "Welcome AIDS"

 The value of a variable can be printed using the print() or cat() function.
 The cat() function concatenates multiple items into one continuous line of output.
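The difference between print() and cat() described above can be sketched as follows. The variable name dept is only illustrative, and capture.output() is used here just to inspect what each call writes.

```r
# Illustrative variable; any character value would do
dept <- "Welcome AIDS"

print(dept)                      # shows the value with an index and quotes
cat("Department:", dept, "\n")   # joins the items with spaces, no quotes

# Capture what each call writes so the two can be compared as strings
printed <- capture.output(print(dept))
catted  <- capture.output(cat("Department:", dept, "\n"))
```

print() is convenient for inspecting a single object, while cat() is usually the better fit for building a readable message from several pieces.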
Variable

 A variable is a named memory location allocated for the storage of specific data.

 R Variables Syntax
• Using the equal-to operator
variable_name = value
• Using the leftward operator
variable_name <- value
• Using the rightward operator
value -> variable_name

EXAMPLE
# using equal-to operator
var1 = "hello"
print(var1)
# using leftward operator
var2 <- "hello"
print(var2)
# using rightward operator
"hello" -> var3
print(var3)
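Keep in Mind

 Allowed characters are letters, digits, "_" and ".".

 Always start with a letter (a leading "." is also allowed if it is not followed by a digit).

 No special characters such as @, $, etc.

 No reserved keywords (if, else, for, function, TRUE, ...).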
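One way to check these naming rules is with base R's make.names(), which rewrites a string into a syntactically valid name; if nothing changes, the string was already valid. The helper is_valid_name and the candidate names are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(x) make.names(x) == x

candidates <- c("total_sales", "dept.name", "x1",   # follow the rules
                "2fast", "rate@home", "if")         # break the rules

valid <- is_valid_name(candidates)
print(data.frame(name = candidates, valid = valid))
```

The last three candidates fail because they start with a digit, contain a special character, and use a reserved keyword, respectively.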
R DATA TYPES
Basic Data Type   Values                                 Example
Numeric           Set of all real numbers                numeric_value <- 3.14
Integer           Set of all integers, Z                 integer_value <- 42L
Logical           TRUE and FALSE                         logical_value <- TRUE
Complex           Set of complex numbers                 complex_value <- 1 + 2i
Character         "a", "b", "c", …, "@", "#", "$", …,    character_value <- "Hello Geeks"
                  "1", "2", … etc.
TYPE VERIFICATION

 Syntax:
is.data_type()
# Logical
print(is.logical(TRUE))
# Integer
print(is.integer(3L))
# Numeric
print(is.numeric(10.5))
# Complex
print(is.complex(1+2i))
# Character
print(is.character("12-04-2020"))
# Mismatched types return FALSE
print(is.integer("a"))
print(is.numeric(2+3i))
Convert the Data Type of an Object to Another

 Syntax
as.data_type(object)

# Complex to character
print(as.character(1+2i))

# Not possible: returns NA with a warning
print(as.numeric("12-04-2020"))

# Numeric to logical: any nonzero number becomes TRUE
print(as.logical(10.5))
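A few more conversions, sketched with base R only. Reading "12-04-2020" as day-month-year is an assumption made for this example; the slide does not say which format is intended.

```r
int_val  <- as.integer("42")      # character to integer
zero_lgl <- as.logical(0)         # zero becomes FALSE
yes_lgl  <- as.logical("yes")     # strings other than TRUE/FALSE spellings give NA

# A failed numeric conversion yields NA; suppressWarnings() hides the warning
bad_num <- suppressWarnings(as.numeric("12-04-2020"))

# Assumption: the string is a date written as day-month-year
as_date <- as.Date("12-04-2020", format = "%d-%m-%Y")
print(as_date)
```

Checking the result with is.na() after a conversion is a simple way to detect values that could not be converted.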
Advanced Data Structures (Data Types)
 A data structure is a particular way of organizing data in a computer so
that it can be used effectively.
• Vectors
• Lists
• Dataframes
• Matrices
• Arrays
• Factors
Vectors

 A vector contains a sequence of elements. Atomic vectors are homogeneous (all elements share one type); recursive vectors (lists) may mix types.


 Atomic Vector
 Integer
 Double
 Logical
 Character
 Complex
 Raw
 Recursive Vector
 list
 The function c() combines values into a vector:
x <- c(1, -1, 3.5, 2)
print(x)
print(typeof(x))
Output:
[1]  1.0 -1.0  3.5  2.0
[1] "double"
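Because atomic vectors hold a single type, mixing types inside c() coerces every element to the most general one. A short sketch of this coercion, together with element-wise arithmetic and logical indexing (the variable names are illustrative):

```r
x <- c(1, -1, 3.5, 2)

# Mixed input is coerced to one common type (here: character)
mixed <- c(1, "a", TRUE)
mixed_type <- typeof(mixed)

# Arithmetic is applied element by element
doubled <- x * 2

# A logical condition selects the matching elements
positive <- x[x > 0]

print(mixed)
print(doubled)
print(positive)
```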
Lists
 A list is a generic object consisting of an ordered collection of objects.
 Lists are heterogeneous data structures: components may differ in type and length.
 The function list()
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
empList = list(empId, empName)
print(empList)
Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi"    "Sandeep" "Subham"  "Shiba"
Accessing Components

 By name (all components of a list can be named)
 empList = list("ID" = empId, "names" = empName)
 print(empList$names)

 By indices
 Double brackets "[[ ]]" extract a top-level component itself, while single brackets "[ ]" return a sub-list containing it. To reach an inner element, use "[[ ]]" followed by "[ ]".
 print(empList[[1]])      # the ID vector
 print(empList[[2]][2])   # second name, "Sandeep"
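The distinction between single and double brackets is easy to miss, so here is a minimal self-contained sketch that rebuilds the employee list and inspects what each form returns.

```r
empList <- list(ID = c(1, 2, 3, 4),
                names = c("Debi", "Sandeep", "Subham", "Shiba"))

sub_list <- empList[1]      # single brackets: still a list, of length 1
id_vec   <- empList[[1]]    # double brackets: the numeric vector itself

# Inner element: extract the component first, then index into it
second_name <- empList[["names"]][2]

print(class(sub_list))
print(class(id_vec))
print(second_name)
```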
Manipulating Lists

 A list can be modified by accessing its components and replacing them

 empList[[2]][5] = "manimaran"
 print(empList)

 Concatenation of lists:
 li = c(list1, list2)
Matrices
 A matrix is a rectangular arrangement of numbers in rows and columns
 Function matrix()
 matrix(data, nrow, ncol, byrow, dimnames)
# Elements are arranged sequentially by row.
M <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(M)
# Elements are arranged sequentially by column.
N <- matrix(c(3:14), nrow = 4, byrow = FALSE)
print(N)
# Define the column and row names.
rownames = c("row1", "row2", "row3", "row4")
colnames = c("col1", "col2", "col3")
P <- matrix(c(3:14), nrow = 4, byrow = TRUE, dimnames =
list(rownames, colnames))
print(P)
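Once a matrix exists, elements are usually read with [row, column] indexing. The sketch below reuses the same 4 x 3 matrix filled by row; it is an illustration of base R indexing, not part of the original slides.

```r
M <- matrix(3:14, nrow = 4, byrow = TRUE)

dims      <- dim(M)       # number of rows and columns
one_cell  <- M[2, 3]      # row 2, column 3
first_col <- M[, 1]       # every row of column 1
t_dims    <- dim(t(M))    # transpose swaps rows and columns
large     <- M[M > 12]    # logical indexing returns matching values

print(one_cell)
print(first_col)
print(large)
```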
DATA FRAME

 CREATE
 Access rows and columns
 Edit
 Add new rows and columns
Dataframes
 A data frame is a generic R data object used to store tabular data: each column is a vector of one type, and all columns have the same length
 Function data.frame()
Age = c(22, 25, 45)
Language = c("R", "Python", "Java")
Name = c("Amiya", "Raj", "Asish")
df = data.frame(Name, Language, Age)
print(df)
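The remaining data frame operations from the outline (access, edit, add rows and columns) could look like the following sketch. The Experience column and the extra row are made-up values for illustration only.

```r
df <- data.frame(Name = c("Amiya", "Raj", "Asish"),
                 Language = c("R", "Python", "Java"),
                 Age = c(22, 25, 45),
                 stringsAsFactors = FALSE)

# Access: one column, one row, and a filtered subset
ages      <- df$Age
second    <- df[2, ]
older     <- df[df$Age > 23, c("Name", "Age")]

# Edit: change a single cell
df[2, "Age"] <- 26

# Add a new column (hypothetical values)
df$Experience <- c(1, 3, 20)

# Add a new row (hypothetical values); column names must match
new_row <- data.frame(Name = "Meera", Language = "SQL", Age = 30,
                      Experience = 5, stringsAsFactors = FALSE)
df <- rbind(df, new_row)

print(df)
```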
Arrays

 Arrays are R data objects that can store data in more than two dimensions. Arrays are n-dimensional data structures.
 Function array()

A = array( c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2) )
print(A)
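Array elements are addressed with one index per dimension. A brief sketch using the same 2 x 2 x 2 array, where values fill the first dimension fastest:

```r
A <- array(c(1, 2, 3, 4, 5, 6, 7, 8), dim = c(2, 2, 2))

first_slice <- A[, , 1]          # the first 2 x 2 matrix
one_value   <- A[1, 2, 2]        # row 1, column 2 of the second matrix
slice_sums  <- apply(A, 3, sum)  # total of each matrix along dimension 3

print(first_slice)
print(one_value)
print(slice_sums)
```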
Factors

 Factors in R are data structures used to represent categorical data. Each value is stored as one of a fixed set of categories, called levels.
 Function factor()
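A minimal sketch of creating and inspecting factors with factor(). The sample values (gender, size) are made up for illustration.

```r
# Hypothetical categorical data
gender <- c("female", "male", "male", "female", "other")
f <- factor(gender)

lv     <- levels(f)       # distinct categories, sorted alphabetically
codes  <- as.integer(f)   # each value is stored as an index into the levels
counts <- table(f)        # frequency of each level

# Ordered factor: the levels carry a ranking
size <- factor(c("low", "high", "medium", "low"),
               levels = c("low", "medium", "high"), ordered = TRUE)
low_below_high <- size[1] < size[2]

print(lv)
print(counts)
print(low_below_high)
```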
FUNCTION
 A function is a block of code that runs only when it is called.
 It has some inputs, called arguments, and an output, called the return value.
Creating a Function in R
 by using the keyword function()
TYPES OF FUNCTION IN R LANGUAGE

Built-in Function
 Built-in functions in R are pre-defined functions
# Find sum of numbers 4 to 6.
print(sum(4:6))
# Find max of numbers 4 to 6.
print(max(4:6))
# Find min of numbers 4 to 6.
print(min(4:6))

User-defined Function
 R language allows us to write our own functions
evenOdd = function(x)
{
  if(x %% 2 == 0)
    return("even")
  else
    return("odd")
}
print(evenOdd(4))
print(evenOdd(3))
Example 1
mean2 <- function(x)
{
  n <- length(x)
  sum(x)/n
}
mean2(1:10)

Example 2
square = function(x)
{
  x^2
}
square(4)
Recursive Function in R

 Recursion is when a function calls itself.
 Each call invokes the function again on a smaller input until a base case is reached, at which point the calls return one by one.

rec_fac <- function(x){
  if(x==0 || x==1)
  {
    return(1)
  }
  else
  {
    return(x*rec_fac(x-1))
  }
}
rec_fac(5)   # 120
Find the sum of squares of a given series of numbers.
Sum = 1^2 + 2^2 + … + N^2
sum_series <- function(vec){
if(length(vec)<=1)
{
return(vec^2)
}
else
{
return(vec[1]^2+sum_series(vec[-1]))
}
}
series <- c(1:10)
sum_series(series)
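As a sanity check, the recursive result can be compared with a vectorised computation and with the closed form N(N+1)(2N+1)/6. The sketch below redefines sum_series so that it runs on its own; the closed-form comparison is an addition for illustration, not part of the original slides.

```r
sum_series <- function(vec) {
  if (length(vec) <= 1) {
    return(vec^2)                       # base case: one element left
  }
  vec[1]^2 + sum_series(vec[-1])        # first square plus the rest
}

n <- 10
recursive  <- sum_series(1:n)
vectorised <- sum((1:n)^2)              # same sum without recursion
closed     <- n * (n + 1) * (2 * n + 1) / 6

print(c(recursive, vectorised, closed))
```

For N = 10 all three give 385. In everyday R code the vectorised form is usually preferred, since deep recursion is slower and limited by the call stack.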
R – OBJECT ORIENTED PROGRAMMING
Class and Object

 A class is the blueprint or prototype from which objects are made by encapsulating data members and functions.
 An object is a data structure that contains some methods that act upon its attributes.
 S3 class
 S4 class
 Reference class
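Reference classes are listed above without an example, so here is a minimal sketch using setRefClass() from the methods package. The Student fields mirror the S3 and S4 examples in this lecture; the promote method is hypothetical and exists only to show that methods modify the object in place.

```r
library(methods)

Student <- setRefClass("Student",
  fields = list(name = "character", Roll_No = "numeric"),
  methods = list(
    # Hypothetical method: increases the roll number in place
    promote = function() {
      Roll_No <<- Roll_No + 1
      invisible(.self)
    }
  )
)

s <- Student$new(name = "Adam", Roll_No = 15)
s$promote()
after_first <- s$Roll_No

# Reference semantics: t and s point to the same object
t <- s
t$promote()
after_second <- s$Roll_No

print(c(after_first, after_second))
```

Unlike S3 and S4 objects, which are copied on modification, a reference class object is shared, so the change made through t is visible through s.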
S3 CLASS

 Create a list that contains all the class members

 Then assign the class name to this list with the class() function
 Syntax:
 variable_name <- list(attribute1, attribute2, attribute3, ..., attributeN)
 class(variable_name) <- "class_name"
 # List creation with its attributes name and roll no.
 a <- list(name="Adam", Roll_No=15)
 # Defining a class "Student"
 class(a) <- "Student"
 # Creation of object
 a
S3 CLASS

a <- list(name="manimaran", Roll_No=101)
class(a) <- "Student"
# print() dispatches to print.Student for objects of class "Student"
print.Student <- function(x, ...)
{
  cat("name: ", x$name, "\n")
  cat("Roll No: ", x$Roll_No, "\n")
}
print(a)
S4 CLASS
 An S4 class has a formal definition, created with setClass(), which declares the slots and their types. S4 also provides functions for defining generics and methods.

 Syntax:
setClass("myclass", slots=list(name="character",
Roll_No="numeric"))
 The new() function is used to create an object of the S4 class
 Pass the class name as well as the values for the slots.
S4 CLASS

setClass("Student",slots=list(name="character",
Roll_No="numeric"))
a <- new("Student", name="Adam", Roll_No=20)
a
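Building on the Student class, the sketch below shows slot access with @, a generic with a method, and the type checking that S4 performs on slots. The generic name describe is hypothetical, chosen for this illustration.

```r
library(methods)

setClass("Student", slots = list(name = "character", Roll_No = "numeric"))
a <- new("Student", name = "Adam", Roll_No = 20)

# Slots are read with the @ operator
student_name <- a@name

# Hypothetical generic and its method for Student objects
setGeneric("describe", function(obj) standardGeneric("describe"))
setMethod("describe", "Student", function(obj) {
  paste(obj@name, "has roll number", obj@Roll_No)
})
description <- describe(a)

# A value of the wrong type for a slot is rejected by new()
bad <- tryCatch(new("Student", name = "Adam", Roll_No = "twenty"),
                error = function(e) "rejected")

print(description)
print(bad)
```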
R Programming Structure

Loop statements
[Figure: loop flow chart]

for loop
Syntax:

for(value in vector)
{
  statements
}

Example:
for (i in 1:4)
{
  print(i ^ 2)
}
 Repeat loop: used to iterate over a block of code multiple times.
 It executes the same code again and again until a break statement is found.
 Syntax:

repeat
{
  commands
  if(condition)
  {
    break
  }
}
Example

result <- c("Hello World")
i <- 1
repeat {
  print(result)
  i <- i + 1
  if(i > 5) {
    break
  }
}

Output:
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
[1] "Hello World"
 R while loop syntax:

while (test_expression)
{
  statement
  update_expression
}
# R program to illustrate while loop
result <- c("Hello World")
i <- 1
while (i < 6) {
  print(result)
  i = i + 1
}

Next Statement

 It discontinues a particular iteration and jumps to the next iteration

for (i in c(3, 6, 23, 19, 0, 21))
{
  # Skip the value 0 and continue with the next element
  if (i == 0)
  {
    next
  }
  print(i)
}
print('Outside Loop')

Output:
[1] 3
[1] 6
[1] 23
[1] 19
[1] 21
[1] "Outside Loop"
Break statement

 The break keyword is a jump statement that is used to terminate the loop at a particular iteration.

 Syntax:
if (test_expression)
{
  break
}

Example:
no <- 1:10
for (val in no)
{
  if (val == 5)
  {
    print(paste("Coming out from for loop Where i = ", val))
    break
  }
  print(paste("Values are: ", val))
}
Decision Making in R Programming
if statement
if(condition is true)
{
execute this statement
}
a <- 76
b <- 67
if(a > b)
{ c <- a - b
print("condition a > b is TRUE")
print(paste("Difference between a, b is : ", c))
}
if-else statement
Syntax:
if(condition is true)
{
  execute this statement
} else
{
  execute this statement
}

Example:
a <- 67
b <- 76
if(a > b)
{
  c <- a - b
  print("condition a > b is TRUE")
  print(paste("Difference between a, b is : ", c))
} else
{
  c <- a - b
  print("condition a > b is FALSE")
  print(paste("Difference between a, b is : ", c))
}
THANKS