
CLASS notes-UNIX-R Assignments123

This document provides information on various UNIX commands and R programming concepts. It discusses commands like cd, ls, pwd, cat, rm, mkdir, clear, cp, mv in UNIX and concepts like vectors, sequences, random numbers, lists, arrays, matrices in R. It also covers data types, coercion, naming elements, indexing, appending, deleting in vectors and lists. Operations on vectors like union, intersect are also summarized.

UNIX Commands

1) cd (change directory)
The cd command changes the working directory from one location to another.

It is used to move into a folder; run without an argument, cd returns to the home directory.

2) cd .. moves one step back, to the parent directory.

3) ls (listing)

This command shows all the folders and files available inside the directory.

4) ls -l shows the details of all the files and folders in the directory, such as permissions, owner, size and modification time.

5) pwd

pwd (present working directory) shows the path of the present working directory.

6) The cat command displays the content of a file, for example a .txt file.
7) echo $HOME displays the path of your home folder/directory.

ex: hduser@ubuntu:~$ echo $HOME

/home/hduser
8) rm deletes a file from a folder; rm -r deletes a directory together with its contents.
9) The mkdir command is used for creating a directory.

10) The clear command clears the terminal screen.

11) The cp command copies a file from a source to a destination.

12) cp -r copies a directory recursively, together with the files inside it.

13) The mv command moves files completely from the source to the destination; it is also used to rename them.

“R” Language
ASSIGNMENT-2
R does not provide direct access to the computer’s memory but rather provides a number of
specialized data structures we will refer to as objects. These objects are referred to through symbols
or variables. In R, however, the symbols are themselves objects and can be manipulated in the same
way as any other object. This is different from many other languages and has wide ranging effects.
Every object has two intrinsic properties: mode and length.
The class() and mode() commands are used to find out the type of data of a variable. Mode tells what
kind of data the object holds.
Data types: numeric, character, logical and complex.
The numeric type covers both integers and decimals. Character type: text such as "abc";
logical: TRUE/FALSE; complex: 2+3i, for example. Converting a variable a to integer type and
storing the result in another variable can be written as as.integer(a)->b; this is called type casting.
a=20
b=25.6
c=a+b
The class of c is numeric, the same as that of a, even though c holds a decimal value: a number typed as 20 is stored as a double-precision numeric, not as an integer.
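As a minimal sketch of the points above, the following lines can be typed at the R prompt. The variable name d is introduced here only for illustration.

```r
a <- 20
b <- 25.6
c <- a + b

class(a)            # "numeric": 20 is stored as a double, not an integer
class(c)            # "numeric"
mode(c)             # "numeric"
length(c)           # 1: a single number is a vector of length 1

d <- as.integer(b)  # type casting; the fractional part is dropped
class(d)            # "integer"
```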
Vectors
A vector is a sequence of data elements of the same basic type. Members in a vector are officially
called components or elements. Vector – a collection of ordered homogeneous elements.
A vector containing three numeric values 2, 3 and 5 can be denoted as
> c(2, 3, 5)
[1] 2 3 5
And here is a vector of logical values:
> c(TRUE, FALSE, TRUE, FALSE, FALSE)
[1] TRUE FALSE TRUE FALSE FALSE
A vector can contain character strings.
> c("aa", "bb", "cc", "dd", "ee")
[1] "aa" "bb" "cc" "dd" "ee"
Incidentally, the number of members in a vector is given by the length function.
> length(c("aa", "bb", "cc", "dd", "ee"))
[1] 5
A vector can hold character values or numeric values, but not a mixture: when both are supplied, the numbers are converted to character. Components/elements of a vector are
accessed through indexing operations such as x[i], for accessing the i-th element. R has six basic
vector types: logical, integer, real, complex, string (or character) and raw. The length of a vector is the number of
elements in the vector. We can append to a vector a=c("hello",1,2,3) with the command append(a, "uma").
[Screenshot: accessing vector locations using index operators, and appending]
[Screenshot: creating a character/string vector and observing its class and length]
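Since the screenshots are not available, here is a small sketch of indexing and appending, using the same vector as in the text.

```r
a <- c("hello", 1, 2, 3)   # the numbers are coerced to character
class(a)                   # "character"
length(a)                  # 4

a[1]                       # first element: "hello"
a[2:3]                     # second and third elements: "1" "2"

a <- append(a, "uma")      # add "uma" at the end
length(a)                  # 5
```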
Workspace
The workspace is the current R working environment, the place where all the objects created during the session are
kept.
Suppose a=2, b=3, c=a+b, d=13 are stored in the workspace (in the folder myR, say). We can delete the
variable b from the workspace by using the command rm(b); the command rm(list=ls()) deletes
all the objects in the workspace.
The setwd("/home/hduser/myR") command is used to set the working directory (the myR folder must already exist), and
the getwd() command is used to get the working directory, which is /home/hduser/myR after the above has been executed.
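A short sketch of these workspace commands follows. It assumes nothing about the folders on your machine, so it uses tempdir() in place of /home/hduser/myR; the names before, after and old are only for illustration.

```r
a <- 2; b <- 3; c <- a + b; d <- 13

before <- ls()     # names of the objects currently in the workspace
rm(b)              # delete only b
after <- ls()      # b is no longer listed

old <- getwd()     # remember the current working directory
setwd(tempdir())   # switch to a folder that is known to exist
getwd()            # shows the new working directory
setwd(old)         # switch back

# rm(list = ls())  # would delete every object in the workspace
```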
Sequences-vectors
Notation: vector = 1st element : last element, for example a=1:10
a=seq(1st element, last element, stepping factor) or
a=seq(1st element, by=step factor, last element)
APPENDING A VECTOR: Let a=3:15. To insert the value 4 after the 9th position, use a=append(a,4,9). Another
vector can be inserted after the 3rd position in the same way, for example append(a,c(1,2),3).
REPEAT FUNCTION: rep(element to be repeated, number of times) or rep(1st element:last element,
each=number of repetitions that should occur)
Random Numbers: the runif(N) command generates random numbers between 0 and 1, where N is the number of random
numbers to be generated; a=runif(N, min value, max value) draws them between the given limits.
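The commands of this section can be tried together as below. This is only a sketch; the result names (s1, a2, r1, r2, u) are made up for the example.

```r
a <- 3:15

s1 <- seq(1, 10, by = 2)         # 1 3 5 7 9
a2 <- append(a, 4, after = 9)    # 4 is inserted after the 9th element
a2[10]                           # 4

r1 <- rep(7, 3)                  # 7 7 7
r2 <- rep(1:3, each = 2)         # 1 1 2 2 3 3

u <- runif(5, min = 0, max = 10) # 5 random numbers between 0 and 10
```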
Naming the elements of a vector
Suppose a=c(12,14,16) and names(a)=c("small","medium","large"); then after executing a
or print(a), each value is shown under its name.
Replacement of elements of a vector: vector[position of the element]=substituting element.
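A brief sketch of naming and replacing elements; the replacement value 15 is chosen only for illustration.

```r
a <- c(12, 14, 16)
names(a) <- c("small", "medium", "large")
print(a)          # values are printed under their names

a["medium"]       # access by name: 14
a[2] <- 15        # replace the 2nd element
a["medium"]       # now 15
```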
NULL-NA
NULL is a special object. It is used whenever there is a need to indicate or specify that an object is
absent. It should not be confused with a vector or list of zero length.
The NULL object has no type and no modifiable properties. There is only one NULL object in R, to
which all instances refer. To test for NULL use is.null. You cannot set attributes on NULL.
NA marks a missing value inside a vector. [Screenshots: using the commands is.na(a) and !is.na(a), and omitting NA values]
Let a=c(1:10) and b=a[-c(5)]; then vector b has all the elements of a except the 5th element.
If a=c(1:10), then min(a) gives 1 and max(a) gives 10.
a=c(1:10) is an integer vector; it is converted into a numeric vector with x=as.numeric(a).
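In place of the missing screens, the following sketch shows the NA and NULL commands mentioned above. The sample vector with two NA values is an assumption made for the example.

```r
a <- c(1, NA, 3, NA, 5)
is.na(a)                  # FALSE TRUE FALSE TRUE FALSE
kept <- a[!is.na(a)]      # keep only the non-missing values: 1 3 5
mean(a, na.rm = TRUE)     # ignore NA while computing: 3

is.null(NULL)             # TRUE
length(NULL)              # 0

x <- 1:10
b <- x[-5]                # all elements except the 5th
min(x)                    # 1
max(x)                    # 10
class(as.numeric(x))      # "numeric"
```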
Coercion happens in R when the type of objects are changed during computation either implicitly or
by using functions for explicit coercion (such as as.numeric, as.data.frame, etc.).
For example, as.character(100) converts the number 100 into the character string "100" (conversion of numeric into character).
Conversion of a logical vector into a numeric vector: as.numeric()
Conversion of a logical vector into a character vector: as.character()
Order of precedence (in decreasing order, with character being the highest):
character, numeric, integer and logical.
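The conversions and the order of precedence can be checked with a few lines; this is a sketch with values chosen for illustration.

```r
# Explicit coercion
as.character(100)                  # "100"
as.numeric(c(TRUE, FALSE, TRUE))   # 1 0 1
as.character(c(TRUE, FALSE))       # "TRUE" "FALSE"

# Implicit coercion: the result takes the highest type present
class(c(TRUE, 2L))                 # "integer"
class(c(2L, 2.5))                  # "numeric"
class(c(2.5, "a"))                 # "character"
```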
LISTS
A list is a vector with possibly heterogeneous elements. The elements of a list can be numeric vectors,
character vectors, matrices, arrays, and lists.
Let v1=c("true","false"), v2=1:5, v3=c("cat","dog")
my_list=list(v1,v2,v3) is a collection of three different vectors. my_list is a collection of type list and it
can hold any kind of simple or complex data types.
my_list[1] outputs the first element of the list, the vector v1:
[[1]]
[1] "true"  "false"

Here, the length of the list my_list is the number of elements in the list, i.e. 3.
Flattening a list and picking one value from it uses the command: unlist(list)[position of the value in the
flattened vector]
unlist(my_list)[2]
"false"
The above command directly prints "false", i.e. my_list is temporarily converted into a flat vector when it is
executed.
Naming the list elements:
Let v1=c("true","false"), v2=1:5, v3=c("cat","dog") and my_list=list(v1,v2,v3)
mynames=c("A Boolean vector","An integer vector","Animal string")
names(my_list)=mynames
Check the contents of my_list with the command print(my_list); each element is then printed under its name.
print(my_list$"Animal string") results in [1] "cat" "dog"
List manipulation: add, delete and update list elements
Adding an element to a list
Let v1=c("true","false"), v2=1:5, v3=c("cat","dog") and my_list=list(v1,v2,v3)
If we specify my_list[4]="Hello", then "Hello" is added to the list as its 4th element.
Deletion of an element: specifying my_list[3]=NULL deletes the 3rd element of the list, i.e. the vector v3.
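The list operations described above can be run in sequence as follows. This is a sketch; the name flat is introduced only for the example.

```r
v1 <- c("true", "false")
v2 <- 1:5
v3 <- c("cat", "dog")
my_list <- list(v1, v2, v3)
length(my_list)                  # 3

flat <- unlist(my_list)          # flat character vector of all values
flat[2]                          # "false"

names(my_list) <- c("A Boolean vector", "An integer vector", "Animal string")
my_list$"Animal string"          # "cat" "dog"

my_list[4] <- "Hello"            # add a 4th element
my_list[3] <- NULL               # delete the 3rd element (v3)
length(my_list)                  # 3 again
```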

Merging the lists
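The notes give only the heading for this topic. One common way to merge lists, shown here as a sketch with made-up lists list1 and list2, is the c() function.

```r
list1 <- list(1, "a")
list2 <- list(TRUE, 2:3)

merged <- c(list1, list2)   # one list holding the elements of both
length(merged)              # 4
merged[[4]]                 # 2 3
```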


Set properties and operations
Suppose a=c(1:6), b=c(5:10); then is.element(1,a) or 1 %in% a checks whether 1 is an element of a.
is.element(a,b) checks, for each element of a, whether it is present in b, and results in a logical
vector.
Picking random samples from a vector
a=c(1:10)
The sample(a) function picks a random sample from the vector a. The random sample contains all
the elements of a, but each time the ordering will be different. To ensure that the ordering of the
elements is the same every time, call set.seed(seed number), e.g. set.seed(100), before sampling. If we want the ordering
to be the same, we have to use the same seed value.

Suppose the original vector has 5 elements and we need a sample of 10 elements to be picked.
How to do that?
a=c(1:5), b=sample(a,10,replace=T), where replace=T means that the resultant vector can have repeated
elements in the sample.
How to sort the elements of a vector b in descending order?
c=sort(b,decreasing=TRUE)
or
c=rev(sort(b))
b=c(-1,100,2)
order(b)
order() gives the positions of the elements in ascending order of value. The elements of the vector themselves are not
printed.
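A sketch of sampling, sorting and ordering follows; the names s1, s2 and v are chosen for the example only.

```r
a <- 1:10

set.seed(100)
s1 <- sample(a)                  # a random ordering of a
set.seed(100)
s2 <- sample(a)                  # same seed, so the same ordering
identical(s1, s2)                # TRUE

# A sample larger than the vector needs replace = TRUE
b <- sample(1:5, 10, replace = TRUE)
sort(b, decreasing = TRUE)       # descending order

v <- c(-1, 100, 2)
order(v)                         # positions in ascending order: 1 3 2
v[order(v)]                      # the sorted values: -1 2 100
```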

a=1:10
b=5:10
union(a,b)->c merges the two vectors into a single vector, but the elements are not repeated.
intersect(a,b)->d brings only the common elements of a and b into a new vector.
The any command checks whether a condition holds for at least one element of a vector, e.g. any(a>b[1]) checks whether any element of a is greater than the first
element of b, and returns TRUE or FALSE.
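The set operations can be tried as below. The sketch uses a=1:6 and b=5:10 from the start of this section; the result names u and i are made up.

```r
a <- 1:6
b <- 5:10

1 %in% a            # TRUE
is.element(a, b)    # FALSE FALSE FALSE FALSE TRUE TRUE

u <- union(a, b)    # 1 2 ... 10, no repeated elements
i <- intersect(a, b)  # common elements: 5 6

any(a > 5)          # TRUE, because 6 is greater than 5
any(a > 10)         # FALSE
```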
Swapping
Suppose a=20, b=17. After c=b, d=a, a=c, b=d the values are swapped, and we get a=17 and b=20.

Division operation: Let a=10:15, b=20:25 and c=a/b; the division is carried out element by element.


[Screenshot: division, multiplication and remainder operations on vectors]
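As a stand-in for the missing screen, here is a sketch of element-wise arithmetic and of swapping. The swap uses a single temporary variable tmp, a small variation on the two temporaries used in the text; all result names are made up.

```r
a <- 10:15
b <- 20:25

quot <- a / b    # element-wise division; first value is 0.5
prod <- a * b    # element-wise multiplication; first value is 200
rem  <- b %% a   # element-wise remainder: 0 10 10 10 10 10

# Swapping two values
x <- 20
y <- 17
tmp <- x
x <- y
y <- tmp         # now x is 17 and y is 20
```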
ARRAYS
Consider a vector with 4 elements, a=1:4, and d=c(2,2)
myarr=array(a,d). This statement produces a 2-dimensional array with 2 rows and 2 columns. The
idea of arrays is that the vector elements are arranged into rows and columns as specified by the user.
If a=1:15, b=c(2,2) and myarr=array(a,b), only the first 4 elements are used; access the 2nd row with myarr[2,]
and the 2nd column with myarr[,2].
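A small sketch of creating and indexing an array; note that array() fills column by column. The name big is used only in this example.

```r
myarr <- array(1:4, dim = c(2, 2))
myarr[2, ]     # 2nd row: 2 4
myarr[, 2]     # 2nd column: 3 4
dim(myarr)     # 2 2

# A longer vector is cut down to the number of cells
big <- array(1:15, dim = c(2, 2))
```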
MATRICES
Suppose mymat=matrix(c(1:10),nrow=5,byrow=TRUE)
The print(mymat) command arranges the vector of 10 elements into a table format
(rows and columns) where the number of rows is given as 5.
To understand byrow=TRUE, change it to byrow=FALSE
in the above command and compare: with byrow=TRUE the matrix is filled row by row, with byrow=FALSE column by column.
Suppose a=1:4, myarr=array(a,dim=c(2,3))
print(myarr)

Let mymat=matrix(a,nrow=2,ncol=3,byrow=TRUE)
The vector length should equal the product of nrow and ncol; here a has only 4 elements for 6 cells, so its elements are reused to fill the rest. Vectors and 1-dimensional arrays are the same except that
arrays have the dim attribute.
Naming the matrix (assigning names to the rows and columns)
rnames=c("r1","r2")
cnames=c("c1","c2")
mymat=matrix(c(1:4),nrow=2,byrow=TRUE,dimnames=list(rnames,cnames))
The rows will be named r1 and r2 and the columns c1 and c2.
How to access the elements
mymat[2,2] accesses the element belonging to the 2nd row and 2nd column, which is 4
mymat[1,] accesses the 1st row elements, 1 and 2
mymat[,2] accesses the 2nd column elements, 2 and 4
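The effect of byrow and of dimnames can be seen in the following sketch; m_row and m_col are names invented for the comparison.

```r
m_row <- matrix(1:10, nrow = 5, byrow = TRUE)    # filled row by row
m_col <- matrix(1:10, nrow = 5, byrow = FALSE)   # filled column by column
m_row[1, ]    # 1 2
m_col[1, ]    # 1 6

rnames <- c("r1", "r2")
cnames <- c("c1", "c2")
mymat <- matrix(1:4, nrow = 2, byrow = TRUE,
                dimnames = list(rnames, cnames))
mymat[2, 2]         # 4
mymat["r2", "c2"]   # the same element, accessed by name
mymat[, "c2"]       # 2nd column: 2 4
```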

DATE AND TIME TYPES IN R


To know the current date, the command is Sys.Date(); the date() command gives the current
date with the time. Let d1=Sys.Date() (today) and d2=Sys.Date()-12; then d1-d2 is the number of days between
these two dates, i.e. 12.
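A minimal sketch of the date commands; d2 is computed from d1 here so that both refer to the same day, and gap is a name made up for the example.

```r
d1 <- Sys.Date()    # today's date
d2 <- d1 - 12       # the date 12 days ago
gap <- d1 - d2      # a time difference of 12 days
as.numeric(gap)     # 12
date()              # current date with time, as a character string
```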

IF-ELSE Operations in R and logical operators


a=2
b=5
if (a>b) {
print("a>b")
} else
{
print("a is not greater than b")
}
Let us take the number 5. The nearest perfect square which is less than 5 is 4, so we need to
check whether 5 is divisible by any of the integers from 2 to 4; if it is divisible, then 5 is not a prime number.

a=5
if (a%%2==0 || a%%3==0 || a%%4==0)
{
print("a is not a prime number")
} else
{
print("a is a prime number")
}
A sample program to demonstrate the logical AND operator along with the else-if statement
age=37
education="PG"
status="married"
if (age<35 && education=="PG" && status=="married")
{
print("The candidate is eligible for AUS immigration")
} else if (status=="single")
{
print("The candidate is rejected since the status is single")
} else if (education!="PG")
{
print("The candidate is rejected since education is not PG")
} else if (age>35)
{
print("The candidate is rejected since age is over 35")
}
NESTED IF-ELSE STATEMENTS
age=27
education="PG"
status="single"
if (age<35) { # 1st level of nesting
  if (education=="PG") { # 2nd level
    if (status=="married") { # 3rd level
      print("eligible for immigration")
    } else {
      print("Not eligible for immigration")
    } # end of 3rd level
  } # end of 2nd level
} # end of 1st level

SWITCH CASE STATEMENT


code=1
x=switch(code,"Violet","Indigo","Blue","Green","Yellow","Orange","Red")
print(x)

ASSIGNMENT-3

User defined functions: The R language provides built-in functions like sum and nchar. Users can define

their own functions and call them as and when required.

Let us define myfunc=function(a), where a is the number of repetitions.

If we need to print "Hello" 5 times,

we call myfunc(5). [Screenshot: calling myfunc(5)]

Write a function to print the squares of the first N natural numbers, where N is the
argument the user passes to the function.

Let us define myfunc=function(N), where N is a natural number; we can get the sum of the squares of
the first N numbers. [Screenshot: definition and output of myfunc]
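The function bodies were shown only in the screenshots, so the following is a sketch of how the two functions might be written. The names say_hello and squares are hypothetical, chosen so that the two functions do not overwrite each other.

```r
# Print "Hello" a times
say_hello <- function(a) {
  for (i in seq_len(a)) {
    print("Hello")
  }
  invisible(NULL)
}

# Print the squares of the first N natural numbers and return their sum
squares <- function(N) {
  sq <- (1:N)^2
  print(sq)
  sum(sq)
}

say_hello(5)
total <- squares(4)   # prints 1 4 9 16; total is 30
```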

Functions without arguments

greetings=function() {

print("Hello..goodmorning")

}

greetings()

Functions with default arguments

myfunc=function(a=6,b=4) {

s=a+b

print(s)

}

myfunc()

Functions returning values

mysum=function(a,b) {

s=a+b

return(s)

}

print(mysum(10,20))

Functions returning values in a DIFFERENT WAY

mysum=function(a,b) {

s=a+b # the value of the last expression, s, will be automatically returned

}

print(mysum(10,20))

mysum=function(a,b) {

s=a+b

b=100 # since b is the last object in this block, b will be returned

}

print(mysum(10,20))

Write a program where a vector with 10 numbers is passed to a user defined function, which increments
every number in the vector and returns it so that the contents can be displayed.
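One possible solution is sketched below; the function name increment and the choice of 1:10 as input are assumptions, since the original solution was a screenshot.

```r
# Add 1 to every element; arithmetic in R works on whole vectors
increment <- function(v) {
  v + 1
}

nums <- 1:10
result <- increment(nums)
print(result)   # 2 3 4 5 6 7 8 9 10 11
```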

neo<-function(a,b,...) {

invisible(seq(a,b,...))

}

b=neo(1,10,3)

print(b)

Strings

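a="Hello how are you"

b="Iam fine"

c="Hi Iam fine' how about you"

d='this is reall\'y cool'

are valid strings. A single quote can appear as it is inside a double-quoted string; inside a single-quoted string it has to be escaped with a backslash.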

String concatenation using the paste function

a="Hi"

b="Good morning"

c=paste(a,b,sep=" ")

print(c)

To concatenate 2 strings, pass the strings as arguments to a function, return the result
from the function and display it.

a="Hi Hello"

nchar(a) gives 8, the number of characters in a

toupper(a) gives "HI HELLO"
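A sketch of the string exercise follows; the function name join_strings is hypothetical.

```r
# Concatenate two strings inside a function and return the result
join_strings <- function(x, y) {
  paste(x, y, sep = " ")
}

res <- join_strings("Hi", "Good morning")
print(res)             # "Hi Good morning"

a <- "Hi Hello"
nchar(a)               # 8
toupper(a)             # "HI HELLO"
```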

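A program to demonstrate a function calling another function

myfunc1=function(a){

  i=0

  for(i in a){
    i=i*i
    cat(i,"")
  }

  # myfunc2 is placed inside myfunc1 so that it can see i
  # (assumed; the closing braces are missing from the notes)
  myfunc2=function(b){
    j=1
    for(j in b){
      j=j+i
      cat(i,j," ")
    }
  }

  myfunc2(a) # myfunc1 calls myfunc2 (assumed; the call is not shown in the notes)
}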

A data frame is a collection of vectors of the same length, arranged as the columns of a table of rows and columns, like m x n.

a=1:5

b=6:10

c=11:15

data.frame(a,b,c)->df

print(df)

str(df) shows the structure of the above data frame, summary(df) shows summary statistics for each column, and print(df$a) and print(df$b) print single columns.
[Screenshot: output of these commands]

How to extract rows from a data frame?

row1=df[1,]

print(row1)

row1to2=df[1:2,]

print(row1to2)

The above extracts the first 2 rows from the data frame. The comma is required: df[1:2] without it selects the first 2 columns instead.

row1and5=df[c(1,5),]

Rows 1 and 5 of column 3: df[c(1,5),3]
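The row and column selections can be checked on the data frame defined earlier; this is a sketch and the result names are made up.

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

str(df)                    # structure: 5 observations of 3 variables
df$a                       # a single column as a vector

row1     <- df[1, ]        # first row
row1to2  <- df[1:2, ]      # first two rows
cols1to2 <- df[1:2]        # no comma: first two COLUMNS
row1and5 <- df[c(1, 5), ]  # rows 1 and 5
df[c(1, 5), 3]             # rows 1 and 5 of column 3: 11 15
```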

The stack command stacks all the vectors of a data frame into a single column: stack(df)

unstack() reverses this: unstack(stack(df)) restores the original columns.

The cbind() command is used to add more columns to an existing data frame

d=100:104

df=cbind(df,d)

print(df)

Adding rows to an existing data frame uses the command rbind

row6=1000:1003

df=rbind(df,row6)

fix(name of data frame): the fix command presents the contents of the data frame in a separate

Excel-style window, where they can be edited.

If a=1:5, b=6:10, c=11:15 and data.frame(a,b,c)->df, then the transpose of the data frame is t(df),

which interchanges the rows and columns.
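Stacking, adding columns and rows, and transposing can be tried in one go. This is a sketch; long, wide and tdf are names introduced for the example.

```r
df <- data.frame(a = 1:5, b = 6:10, c = 11:15)

long <- stack(df)        # 15 rows: columns "values" and "ind"
wide <- unstack(long)    # back to the columns a, b, c

d <- 100:104
df <- cbind(df, d)       # adds d as a 4th column

row6 <- 1000:1003
df <- rbind(df, row6)    # adds a 6th row
dim(df)                  # 6 4

tdf <- t(df)             # rows and columns interchanged (a matrix)
dim(tdf)                 # 4 6
```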

Accessing the elements of the data frame

Suppose we have a data frame with 20 rows; using the head command we can see just the first 6 rows of
the data frame

head(name of the data frame)

By default, the head command displays only the first 6 rows of a data frame. head(df,20) displays the
first 20 rows of the data frame, where df is the name of the data frame. head(df,1) displays

the first row of the data frame named df.

Tail command: tail(df) works exactly like the head function, but from the bottom up. If
a=1:7, b=8:14, c=15:21, d=22:28, e=29:35 and data.frame(a,b,c,d,e)->df, then tail(df) shows the last 6 of the 7 rows.
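A short sketch of head and tail on the 7-row data frame described above:

```r
df <- data.frame(a = 1:7, b = 8:14, c = 15:21, d = 22:28, e = 29:35)

head(df)       # first 6 rows
head(df, 1)    # first row only
tail(df)       # last 6 rows, i.e. rows 2 to 7
tail(df, 2)    # last 2 rows
```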

Let v1=1:5

v1[3]=NA

v2=6:10

v3=11:15

v3[4]=NA

mydf=data.frame(v1,v2,v3)

print(mydf)

na.omit(mydf)

Every row which contains an NA is dropped from the result.
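The effect of na.omit on the data frame above can be verified as follows; the name clean is used only in this sketch.

```r
v1 <- 1:5
v1[3] <- NA
v2 <- 6:10
v3 <- 11:15
v3[4] <- NA
mydf <- data.frame(v1, v2, v3)

complete.cases(mydf)     # TRUE TRUE FALSE FALSE TRUE
clean <- na.omit(mydf)   # rows 3 and 4 are dropped
nrow(clean)              # 3
```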

a=1:20, b=1:6, mydf=data.frame(a,b) fails, because all the columns of a data frame must have the same number of rows.

pretty() produces a new vector of equally spaced round values that covers the range of the original vector, with approximately the specified number of
intervals.
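A minimal sketch of pretty(); the input 1:20 and the request for about 4 intervals are assumptions made for the example.

```r
a <- 1:20
p <- pretty(a, n = 4)   # round, equally spaced break points covering 1 to 20
print(p)
diff(p)                 # the spacing between neighbouring values is constant
```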

Factor type vectors contain a set of numeric codes with character-valued levels

Example: a family of two girls (1) and four boys (0),

kids=factor(c(1,0,1,0,0,0),

levels=c(0,1),

labels=c("boy","girl"))

> kids

[1] girl boy girl boy boy boy

Levels: boy girl

> class(kids)

[1] "factor"

> mode(kids)

[1] "numeric"

f1=factor(c("Pig","Hive","Hbase","Pig"))

nlevels(f1) # nlevels prints the number of unique values in the above vector

output=3

f1=factor(c("Hello","world",11))

Factors are used in plotting graphs while working with categorical data

sizes=c("small","medium","large","small")

f=factor(sizes)

> f i.e. on printing the object f

[1] small medium large small

Levels: large medium small

> unclass(f)

[1] 3 2 1 3

attr(,"levels")

[1] "large" "medium" "small"

3 2 1 3 are the numbers assigned to the values inside the factor. The numbers are assigned based
on dictionary order of sorting

> nlevels(f)

[1] 3

This is the number of distinct data elements in the object.

> table(f)

large medium small

1 1 2
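The factor codes can be checked directly, as in the sketch below. The second factor f2, with its levels given explicitly, is an addition for illustration: it shows how to get an order other than dictionary order.

```r
sizes <- c("small", "medium", "large", "small")
f <- factor(sizes)

levels(f)         # "large" "medium" "small" (dictionary order)
as.integer(f)     # 3 2 1 3
nlevels(f)        # 3
table(f)          # large 1, medium 1, small 2

# Levels in a chosen order instead of dictionary order
f2 <- factor(sizes, levels = c("small", "medium", "large"))
as.integer(f2)    # 1 2 3 1
```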

