Learning R

The document provides an overview of the R programming language and its environment, highlighting its characteristics, installation process, and the use of RStudio as an integrated development environment. It includes basic commands, examples of R programs, and exercises for statistical analysis using R. The document serves as a guide for beginners to understand and utilize R for data analytics.

Data Analytics Using R Environment

Prabhat Mittal

University of Delhi
Introduction Getting Started with R Program Exercises

Table of contents

1 Introduction
    About R
    The R Foundation
    The R Environment
    Characteristics of R
    Why R?
    Installation of R
    Help Facility

2 Getting Started with R Program

3 Exercises

Prabhat Mittal Data Analytics Using R Environment 2 / 22



Introduction
About R

R is a language and environment for statistical computing and graphics. It is a GNU project, created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team, of which John Chambers is a member.

R is an implementation of the S programming language combined with lexical scoping. S, the statistical programming language developed by John Chambers and colleagues at Bell Laboratories (formerly AT&T, now Lucent Technologies), is often the vehicle of choice for research in statistical methodology, and R provides an open-source route to participation in that activity.

R is available as free software under the GNU General Public License in source code form (GNU, a Unix-like operating system project, was started in 1984). It compiles and runs on a wide variety of UNIX platforms and similar systems, including Berkeley Software Distribution systems (FreeBSD) and Linux, as well as Windows and macOS.

The R Foundation

The R Foundation is seated in Vienna, Austria, and is currently hosted by the Vienna University of Economics and Business. The R Foundation can be contacted at:

The R Foundation for Statistical Computing
c/o Institute for Statistics and Mathematics
Wirtschaftsuniversität Wien
Welthandelsplatz 1
1020 Vienna, Austria
Tel: (+43 1) 31336 4754

The R Environment

R is a programming language and software environment. Commands can be submitted to R at the command prompt. Alternatively, mark (select) the code in a script and press "Control-R" to execute it.

Characteristics of R

R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:
- an effective data handling and storage facility,
- a suite of operators for calculations on arrays, in particular matrices,
- a large, coherent, integrated collection of intermediate tools for data analysis,
- graphical facilities for data analysis and display, either on-screen or on hardcopy, and
- a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions, and input and output facilities.
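A few of the facilities listed above can be sketched in a handful of lines. This is only an illustration: the data frame `scores`, its values, and the function name `fact` are made up for the example and do not come from the slides.

```r
# Data handling: a small data frame (values are made up for illustration)
scores <- data.frame(id = 1:4, mark = c(62, 75, 48, 90))

# Operators on matrices: element-wise product vs. matrix product
m <- matrix(1:4, nrow = 2)     # filled column-wise: rows are (1, 3) and (2, 4)
elementwise <- m * m
product <- m %*% m

# Conditionals and loops: label each mark as pass or fail
grade <- character(nrow(scores))
for (i in seq_len(nrow(scores))) {
  if (scores$mark[i] >= 50) {
    grade[i] <- "pass"
  } else {
    grade[i] <- "fail"
  }
}
scores$grade <- grade

# A user-defined recursive function: factorial
fact <- function(n) {
  if (n <= 1) return(1)
  n * fact(n - 1)
}

print(scores)
print(product)
print(fact(5))
```

In everyday R code the loop would usually be replaced by a vectorised call such as `ifelse(scores$mark >= 50, "pass", "fail")`; the explicit loop is kept here only to show the control structures.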

Why R?

The term environment is intended to characterize it as a fully planned and coherent system, rather than an incremental accretion of very specific and inflexible tools, as is frequently the case with other data analysis software.

R, like S, is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R dialect of S, which makes it easy for users to follow the algorithmic choices made. For computationally intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.

Many users think of R as a statistics system. We prefer to think of it as an environment within which statistical techniques are implemented.
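As a small sketch of the two points made above (system functions written in R can be read directly, and users can add their own), the lines below print the source of `sd()` and then define a new function. The name `cv` and the data in `values` are hypothetical, chosen only for this illustration.

```r
# Typing a function's name without parentheses prints its source.
# sd() in the stats package is itself written in R.
print(sd)

# A new, hypothetical function: coefficient of variation (SD divided by mean)
cv <- function(x) {
  sd(x) / mean(x)
}

values <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data
print(cv(values))
```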

R can be extended (easily) via packages. There are about eight packages supplied with the R distribution, and many more are available through the Comprehensive R Archive Network (CRAN) family of Internet sites, covering a very wide range of modern statistics.

R has its own LaTeX-like documentation format, which is used to supply comprehensive documentation, both online in a number of formats and in hardcopy.
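One way to see which packages ship with an installation, and how a CRAN package would be added, is sketched below. The exact list of bundled packages depends on the R version, and the package name in the commented-out line is only an example.

```r
# Packages with priority "base" ship with every R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Load an installed package into the session
library(stats)

# Installing from CRAN needs an internet connection, so it is left commented out.
# The package name "readxl" is only an example.
# install.packages("readxl")
```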

Installation of R

R console: R (the programming language) and its console allow you to write and execute code, but the console is not an elegant way to write code.

RStudio: an integrated development environment (IDE) that provides comprehensive facilities to computer programmers for software development. It consists of a source code editor, build automation tools and a debugger.

Characteristics of RStudio

- Syntax highlighting, code completion, and smart indentation
- Execute R code directly from the source editor
- Easily manage multiple working directories using projects
- Interactive debugger to diagnose and fix errors quickly
- Workspace browser and data viewer
- Extensive package development tools

Microsoft R Open with RStudio

Microsoft R Open (MRO) complements the RStudio environment. MRO supports multiple operating systems and provides features that enhance performance, reproducibility of code, sharing, and collaboration in the R language.

- Install RStudio
- Install Microsoft R Open
- Install the readxl package: install.packages("readxl")

After you have installed MRO on your system, open RStudio, go to the "Tools" tab at the top, and select "Global Options". You should see a couple of pop-up windows. If RStudio is not already pointing to MRO, browse to it and click "OK".
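After changing the R installation that RStudio points to, a quick check from the console can confirm which build the session is running. The lines below are a minimal sketch using only base R; the output will differ from machine to machine.

```r
# Which R build is this session running?
print(R.version.string)   # version banner of the running R
print(R.home())           # installation directory of the running R

# Check for readxl without raising an error if it is missing
if (requireNamespace("readxl", quietly = TRUE)) {
  message("readxl is available")
} else {
  message("readxl is not installed; run install.packages(\"readxl\")")
}
```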

The Interface of RStudio

- R Console: This area shows the output of the code you run. You can also write code directly in the console.
- R Script: As the name suggests, this is where you write code. To run it, select the line(s) of code and press Ctrl + Enter.
- R Environment: This space displays the set of external elements added. To check whether data has been loaded properly in R, always look at this area.
- Graphical Output: This space displays the graphs created during exploratory data analysis.

Help Facility

The R Installation and Administration manual can be accessed from a web browser at
https://cran.r-project.org/doc/manuals/R-admin.html

The help facility can be accessed from a web browser at
https://cran.r-project.org/doc/manuals/R-intro.html
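Help is also available from inside an R session. The sketch below uses `mean` purely as an example topic; any function name can be substituted.

```r
# Look up the help page for a function by name
h <- help("mean")
print(class(h))          # printing h itself displays the help page

# At the prompt, ?mean is shorthand for help("mean")

# Find objects on the search path whose names contain "mean"
print(apropos("mean"))
```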


Getting Started with R Program


Basic Commands

- getwd() returns an absolute file path representing the current working directory of the R process.
- setwd("c:/docs/mydir") is used to set the working directory to mydir.
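The two commands can be tried safely as follows. Because "c:/docs/mydir" may not exist on a given machine, this sketch substitutes the session's temporary directory (an assumption made only so the example runs anywhere) and restores the original directory at the end.

```r
old_dir <- getwd()       # remember the current working directory
print(old_dir)

demo_dir <- tempdir()    # a directory that is guaranteed to exist
setwd(demo_dir)          # change the working directory
in_demo <- getwd()
print(in_demo)

setwd(old_dir)           # restore the original working directory
```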



R-Program Example I

Simulate 100 normally distributed random numbers and store them in the object x, which is held in R's memory.
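A minimal sketch of this example is shown below. The slide does not state a mean or standard deviation, so the standard normal defaults of `rnorm()` are assumed; the `set.seed()` call is an addition that only makes the draws reproducible.

```r
set.seed(1)          # optional: makes the random draws reproducible
x <- rnorm(100)      # 100 draws from the standard normal N(0, 1)

print(length(x))     # number of values stored in x
print(head(x))       # first six values
```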



R-Program Example II

Consider the general quadratic (second-degree polynomial) equation $AX^2 + BX + C = 0$, which has the solutions

$$X = \frac{-B \pm \sqrt{B^2 - 4AC}}{2A}$$

Solve the following equation: $X^2 + 3X + 1 = 0$. Construct a vector of length 2 that contains the solutions.
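One possible solution applies the formula directly with A = 1, B = 3 and C = 1. The variable names `disc` and `roots` are chosen for this sketch and are not from the slides.

```r
A <- 1
B <- 3
C <- 1

disc <- B^2 - 4 * A * C                           # discriminant: 9 - 4 = 5
roots <- (-B + c(-1, 1) * sqrt(disc)) / (2 * A)   # vector of length 2

print(roots)

# Check: substituting the roots back should give (numerically) zero
print(roots^2 + 3 * roots + 1)
```

The two roots are approximately -2.618 and -0.382. Multiplying `sqrt(disc)` by `c(-1, 1)` evaluates both signs of the ± in a single vectorised expression.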



R-Program Example III

Construct the object x containing 100 random numbers following $N(\mu = 0.32, \sigma^2 = 0.01)$. Calculate the mean and standard deviation of x and draw its histogram (with a second axis on the right side).
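A sketch of this example follows. Note that `rnorm()` expects the standard deviation rather than the variance, so 0.01 is converted with `sqrt()`. The phrase "second axis on right side" is read here as an additional axis drawn on the right edge of the plot, which is an interpretation rather than something the slide spells out.

```r
set.seed(42)                                    # optional, for reproducibility
x <- rnorm(100, mean = 0.32, sd = sqrt(0.01))   # sd = 0.1 because the variance is 0.01

x_mean <- mean(x)
x_sd <- sd(x)
print(c(mean = x_mean, sd = x_sd))

hist(x, main = "Histogram of x", xlab = "x")
axis(side = 4)                                  # additional axis on the right side
```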



R-Program Example IV

Perform the following tasks:

- Assign the first five positive odd numbers to a vector A.
    A <- seq(1, 10, 2)
- Assign the mean of vector A to variable B.
    B <- mean(A)
- Assign the first five positive even numbers (0 excluded) to X.
    X <- seq(2, 10, 2)
- Add vectors A and X and assign the result to vector Z.
    Z <- A + X

Some useful syntax:
- help(tail) and ?tail are commands for help.
- x <- 5:6 assigns the vector 5 6 to x.
- ls() can be used to list all the R objects stored in the working memory.
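Run together, the tasks above give the following values, shown in the comments.

```r
A <- seq(1, 10, 2)   # 1 3 5 7 9
B <- mean(A)         # 5
X <- seq(2, 10, 2)   # 2 4 6 8 10
Z <- A + X           # element-wise sum: 3 7 11 15 19

x <- 5:6             # the integer vector 5 6

print(Z)
print(ls())          # names of the objects currently in the workspace
```

Because R is case-sensitive, `X` and `x` are two different objects here.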

Exercises

Perform statistical analysis of data using R programming:

- Generate descriptive statistics of the data
- Summarize samples and tables
- Test hypotheses
- Perform Student's t-test and analysis of variance
- Develop plots, correlograms and line charts
- Fit linear and logistic regression models
- Perform cluster analysis
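As a possible starting point for these exercises, the sketch below runs one base R function per topic. The slides do not name a data set, so the built-in `mtcars` data and the particular variables chosen are assumptions made only for illustration.

```r
data(mtcars)                                    # built-in data set, used as a stand-in

print(summary(mtcars$mpg))                      # descriptive statistics
tab <- table(mtcars$cyl, mtcars$am)             # cross-tabulate two variables
print(tab)

# Two-sample t-test; the default is Welch's version,
# pass var.equal = TRUE for the classical Student's t-test
tt <- t.test(mpg ~ am, data = mtcars)
print(tt)

av <- aov(mpg ~ factor(cyl), data = mtcars)     # one-way analysis of variance
print(summary(av))

fit_lm <- lm(mpg ~ wt, data = mtcars)           # linear regression
fit_glm <- glm(am ~ wt, data = mtcars,
               family = binomial)               # logistic regression
print(coef(fit_lm))
print(coef(fit_glm))

set.seed(1)                                     # k-means starts from random centres
cl <- kmeans(scale(mtcars[, c("mpg", "wt")]), centers = 2)
print(table(cl$cluster))                        # cluster sizes
```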



Thank You
Email: [email protected]
URL: http://people.du.ac.in/~pmittal/
