0% found this document useful (0 votes)
3 views

Fds Unit 2 Notes

The document outlines various subjects and courses related to computer engineering, including topics in data science, statistics, and frequency distribution. It explains the types of data, frequency distributions, and the importance of organizing data for analysis. Additionally, it covers graphical representations of data, such as histograms and frequency polygons, to facilitate understanding and interpretation.

Uploaded by

sarangopi2019
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
Page 1 of 80
www.BrainKart.com

AD3491 FUNDAMENTALS OF DATA SCIENCE

UNIT 2
Frequency Distribution and Data: Types, Tables, and Graphs

In statistics, a frequency distribution gives the number of occurrences (frequency) of each distinct value, or of the values falling within a given interval or period of time, in the form of a list, table, or graph.
Types of Frequency Distribution:
There are two types of Frequency Distribution.
• Grouped
• Ungrouped
Data: Any piece of information that is expressed as a value or number is data. Data is basically a collection of information, measurements, or observations.

For example:

• The marks you scored in your Math exam.

• The number of cars that pass over a bridge in a day.

Raw data :

Raw data is an initial collection of information. This information has not yet been organized. After
the very first step of data collection, you will get raw data. For example,

A group of five friends is asked their favourite colour. The answers are Blue, Green, Blue, Red, and Red. This collection of information is the raw data.

Discrete data: Discrete data is recorded in whole numbers, like the number of children in a school or the number of tigers in a zoo. It cannot take decimal or fractional values.

Continuous data: Continuous data need not be in whole numbers; it can take decimal values. Examples are the temperature in a city over a week or your percentage of marks in the last exam.

Example of Data Handling:

• Pictographs

ps://play.google.com/store/apps/details?id=info.therithal.brainkart.annauniversitynotes&hl=en_IN
• Bar Graphs

• Histogram and Pie-Charts

• Chance and Probability

• Arithmetic Mean and Median and Mode


Frequency

The frequency of any value is the number of times that value appears in a data set. In the colour example above, two friends like blue, so the frequency of blue is two. To make sense of raw data we must organise it, and finding the frequency of each data value is how this organisation is done.
Frequency Distribution
It is often not easy or feasible to read frequencies directly off a very large data set. To make sense of the data we build a frequency table and graphs. Take as an example the heights of ten students, in cm:

139, 145, 150, 145, 136, 150, 152, 144, 138, 138

Frequency Distribution Table

Height (cm)   Frequency
136           1
138           2
139           1
144           1
145           2
150           2
152           1

This frequency table helps us make better sense of the data given. When the data set is too big (say, if we were dealing with 100 students), we use tally marks for counting, which makes the task more organized and easier.
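
The tallying can also be done programmatically. The following is a minimal sketch in Python, assuming the ten heights listed above; the variable names are illustrative and not part of the notes.

```python
from collections import Counter

# Heights (in cm) of the ten students from the example above.
heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(heights)

# Print the table in increasing order of height, with tally marks.
print(f"{'Height':>6}  {'Tally':<6} Frequency")
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:>6}  {'|' * count:<6} {count}")
```

The frequencies always add up to the number of observations (here, ten), which is a quick check that no value was missed.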


Frequency Distribution Graph


Using the same above example we can make the following graph:



Types of Frequency Distribution
• Grouped frequency distribution.
• Ungrouped frequency distribution.
• Cumulative frequency distribution.
• Relative frequency distribution.

• Relative cumulative frequency distribution.


Grouped Data
At times, to make correct and relevant observations from a data set, we need to group the data into class intervals. This ensures that the frequency distribution best represents the data. Example: the heights of the students.

Class Interval   Frequency
130-140          4
140-150          3
150-160          3


From the above table, you can see that the value of 150 is put in the class interval of 150-160 and
not 140-150. This is the convention we must follow.
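
The convention can be made explicit in code. This is a small sketch, assuming the same ten heights and treating each class interval as including its lower limit and excluding its upper limit; the names `intervals` and `grouped` are illustrative.

```python
heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]

# Class intervals as (lower, upper) pairs. The lower limit is included
# and the upper limit is excluded, so 150 falls in 150-160, not 140-150.
intervals = [(130, 140), (140, 150), (150, 160)]

grouped = {
    f"{lower}-{upper}": sum(lower <= h < upper for h in heights)
    for lower, upper in intervals
}

for label, count in grouped.items():
    print(label, count)
```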

• The table gives the number of snacks ordered and the number of days as a tally. Find
the frequency of snacks ordered. 2

Answer: From the frequency table, the number of days on which the number of snacks ordered falls in each range is:

• 2 to 4: 4 days

• 4 to 6: 3 days

• 6 to 8: 9 days

• 8 to 10: 9 days

• 10 to 12: 7 days

So the frequencies for the snacks ordered are 4, 3, 9, 9, and 7.

• How to find frequency distribution? 2

Answer: We can find a frequency distribution by the following steps:

• First, calculate the range of the data set (largest value minus smallest value).

• Next, divide the range by the number of groups you want and round up; this gives the class width.

• Then use the class width to create the groups.

• Finally, find the frequency for each group.
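
The four steps can be sketched as a short function. This is an illustration under stated assumptions, not a definitive procedure: the function name `grouped_frequency` is hypothetical, the classes start at the smallest value, each class includes its lower limit and excludes its upper limit, and the width is always rounded up to the next whole number so that the largest value stays inside the last class.

```python
import math

def grouped_frequency(data, num_groups):
    """Return (class_width, table); table is a list of ((lower, upper), frequency)."""
    # Step 1: range of the data set.
    data_range = max(data) - min(data)
    # Step 2: divide by the number of groups and round up to the next whole
    # number (assumption: floor + 1, so the maximum is never left out).
    width = math.floor(data_range / num_groups) + 1
    # Step 3: create the groups, starting at the smallest value.
    table = []
    lower = min(data)
    for _ in range(num_groups):
        upper = lower + width
        # Step 4: count the values with lower <= x < upper.
        count = sum(lower <= x < upper for x in data)
        table.append(((lower, upper), count))
        lower = upper
    return width, table

heights = [139, 145, 150, 145, 136, 150, 152, 144, 138, 138]
width, table = grouped_frequency(heights, 3)
print("class width:", width)
for (lower, upper), count in table:
    print(f"{lower}-{upper}: {count}")
```

With the ten heights and three groups, the range is 152 - 136 = 16, so the class width is 6 and the classes start at 136 rather than at a round number such as 130.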


• Define frequency distribution in statistics. 2

Answer: A frequency distribution lists all the distinct values of a variable together with the number of times each one occurs. In other words, it tells how the frequencies are distributed over the values. Frequency distributions are mostly used to summarize categorical variables.


• Why are frequency distributions important? 2

Answer: Frequency distributions have great importance in statistics. A well-structured frequency distribution makes possible a detailed analysis of the structure of the population with respect to a given characteristic, so the groups into which the population breaks down can be determined.

• State the components of frequency distribution? 2

Answer: The components of a frequency distribution include: class interval, types of class interval, class boundaries, midpoint or class mark, width or size of the class interval, class frequency,

frequency density = class frequency / class width,

relative frequency = class frequency / total frequency, etc.
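
As a worked illustration of these components, the sketch below computes the midpoint, frequency density, relative frequency, and cumulative frequency for the grouped height table (130-140, 140-150, 150-160). The dictionary keys are illustrative names, not terms fixed by the notes.

```python
# Grouped height data from the class-interval table.
intervals = [(130, 140), (140, 150), (150, 160)]
frequencies = [4, 3, 3]

total = sum(frequencies)
rows = []
cumulative = 0
for (lower, upper), f in zip(intervals, frequencies):
    width = upper - lower
    cumulative += f
    rows.append({
        "class": f"{lower}-{upper}",
        "midpoint": (lower + upper) / 2,   # class mark
        "frequency": f,
        "density": f / width,              # class frequency / class width
        "relative": f / total,             # class frequency / total frequency
        "cumulative": cumulative,          # running total of frequencies
    })

for row in rows:
    print(row)
```

The relative frequencies sum to 1, and the last cumulative frequency equals the total number of observations.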

Descriptive Statistics

A population is the group to be studied, and population data is a collection of all elements in the
population. For example:

• All the fish in Long Lake.


• All the lakes in the Adirondack Park.
• All the grizzly bears in Yellowstone National Park.

A sample is a subset of data drawn from the population of interest. For example:

• 100 fish randomly sampled from Long Lake.


• 25 lakes randomly selected from the Adirondack Park.
• 60 grizzly bears with a home range in Yellowstone National Park.

Populations are characterized by descriptive measures called parameters. Inferences about parameters are based on sample statistics.

For example, the population mean (µ) is estimated by the sample mean (x̄). The population variance (σ²) is estimated by the sample variance (s²).

Variables are the characteristics we are interested in.

For example:

• The length of fish in Long Lake.


• The pH of lakes in the Adirondack Park.

• The weight of grizzly bears in Yellowstone National Park.

Variables are divided into two major groups: Qualitative And Quantitative.

1. Qualitative variables

• Qualitative variables have values that are attributes or categories.

• Mathematical operations cannot be applied to qualitative variables.

• Examples of qualitative variables are gender, race, and petal color.

2. Quantitative variables

• Quantitative variables have values that are typically numeric, such as measurements.

• Mathematical operations can be applied to these data. Examples of quantitative variables are age, height, and length.

o Quantitative variables can be broken down further into two more categories: discrete and continuous variables.

o Discrete variables have a finite or countable number of possible values. Think of discrete variables as “hens.” Hens can lay 1 egg, or 2 eggs, or 13 eggs… There are a limited, definable number of values that the variable could take on.

o Continuous variables have an infinite number of possible values. Think of continuous variables as “cows.” Cows can give 4.6713245 gallons of milk, or 7.0918754 gallons of milk, or 13.272698 gallons of milk… There is an unlimited number of values that a continuous variable could take on.

Examples

Is the variable qualitative or quantitative?

Variable    Type
Species     qualitative
Weight      quantitative
Diameter    quantitative
Zip Code    qualitative


Graphs
Data can be described clearly and concisely with the aid of a well-constructed frequency
distribution.
GRAPHS FOR QUANTITATIVE DATA
Histograms
A histogram is a bar-type graph for quantitative data. The common boundaries between adjacent bars emphasize the continuity of the data, as with continuous variables.
A casual glance at the histogram of the weight distribution in the figure confirms previous conclusions: a dense concentration of weights among the 150s, 160s, and 170s, with a spread in the direction of the heavier weights. Let’s pinpoint some of the more important features of histograms.
■ Equal units along the horizontal axis (the X axis, or abscissa) reflect the various class intervals of the frequency distribution.
■ Equal units along the vertical axis (the Y axis, or ordinate) reflect increases in frequency. (The units along the vertical axis do not have to be the same width as those along the horizontal axis.)
■ The intersection of the two axes defines the origin, at which both numerical scales equal 0.

Frequency Polygon
A line graph for quantitative data that also emphasizes the continuity of continuous variables.
An important variation on a histogram is the frequency polygon, or line graph. Frequency polygons may be constructed directly from frequency distributions. However, we will follow the step-by-step transformation of a histogram into a frequency polygon, as described in panels A, B, C, and D of Figure 2.2.
A. This panel shows the histogram for the weight distribution.
B. Place dots at the midpoints of each bar top or, in the absence of bar tops, at midpoints for classes on the horizontal axis, and connect them with straight lines. [To find the midpoint of any class, such as 160–169, simply add the two tabled boundaries (160 + 169 = 329) and divide this sum by 2 (329/2 = 164.5).]
C. Anchor the frequency polygon to the horizontal axis. First, extend the upper tail to the midpoint of the first unoccupied class (250–259) on the upper flank of the histogram. Then extend the lower tail to the midpoint of the first unoccupied class (120–129) on the lower flank of the histogram. Now all of the area under the frequency polygon is enclosed completely.
D. Finally, erase all of the histogram bars, leaving only the frequency polygon.
Frequency polygons are particularly useful when two or more frequency distributions or relative frequency distributions are to be included in the same graph.
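
Steps B and C amount to computing a list of (midpoint, frequency) points and adding a zero-frequency point on each flank. The sketch below shows this under stated assumptions: the classes run from 130–139 to 240–249 as implied by the unoccupied classes named above, but the frequencies are made up for illustration because the weight table is not reproduced in these notes, and the function names are hypothetical.

```python
def class_midpoint(lower, upper):
    """Midpoint of a class: add the two tabled boundaries and divide by 2."""
    return (lower + upper) / 2

def polygon_points(classes, freqs):
    """Return the (x, y) points of a frequency polygon, anchored at both ends."""
    step = classes[1][0] - classes[0][0]   # distance between class lower limits
    first_lo, first_hi = classes[0]
    last_lo, last_hi = classes[-1]
    # Lower anchor: midpoint of the first unoccupied class below the data.
    points = [(class_midpoint(first_lo - step, first_hi - step), 0)]
    # One dot per class, at the class midpoint.
    points += [(class_midpoint(lo, hi), f) for (lo, hi), f in zip(classes, freqs)]
    # Upper anchor: midpoint of the first unoccupied class above the data.
    points.append((class_midpoint(last_lo + step, last_hi + step), 0))
    return points

# Classes 130-139, 140-149, ..., 240-249.
classes = [(lo, lo + 9) for lo in range(130, 250, 10)]
# Hypothetical frequencies, for illustration only.
freqs = [3, 1, 17, 12, 7, 3, 4, 2, 3, 0, 0, 1]

points = polygon_points(classes, freqs)
for x, y in points:
    print(x, y)
```

Connecting these points in order with straight lines gives the polygon; the first and last points sit on the horizontal axis at 124.5 and 254.5.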


Stem and Leaf Displays:


A device for sorting quantitative data on the basis of leading and trailing digits.
Still another technique for summarizing quantitative data is a stem and leaf display. Stem and leaf displays are ideal for summarizing distributions, such as that for weight data, without destroying the identities of individual observations.
Constructing a Display
The stemplot (also called stem and leaf plot) is another graphical display of the distribution of a quantitative variable.
To create a stemplot, the idea is to separate each data point into a stem and a leaf, as follows:

• The leaf is the right-most digit.
• The stem is everything except the right-most digit.
• So, if the data point is 34, then 3 is the stem and 4 is the leaf.
• If the data point is 3.41, then 3.4 is the stem and 1 is the leaf.
• Note: For this to work, ALL data points should be rounded to the same number of decimal places.
EXAMPLE: Best Actress Oscar Winners

We will continue with the Best Actress Oscar winners example (the ages of 32 winners):

34 34 26 37 42 41 35 31 41 33 30 74 33 49 38 61 21 41 26 80 43 29 33 35 45
49 39 34 26 25 35 33

To make a stemplot:

• Separate each observation into a stem and a leaf.
• Write the stems in a vertical column with the smallest at the top, and draw a vertical line at the right of this column.
• Go through the data points, and write each leaf in the row to the right of its stem.
• Rearrange the leaves in increasing order.
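
The steps above can be sketched in a few lines of Python. This is a minimal illustration for whole-number data only (tens digit as stem, units digit as leaf); the function name `stemplot` is hypothetical, and stems with no leaves are kept so that gaps in the data remain visible.

```python
from collections import defaultdict

ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

def stemplot(data):
    """Return {stem: sorted leaves} for non-negative whole numbers."""
    stems = defaultdict(list)
    for value in data:
        stem, leaf = divmod(value, 10)   # e.g. 34 -> stem 3, leaf 4
        stems[stem].append(leaf)
    # Smallest stem first; include empty stems; leaves in increasing order.
    return {s: sorted(stems.get(s, []))
            for s in range(min(stems), max(stems) + 1)}

plot = stemplot(ages)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Because every leaf is kept, the original 32 ages can be read back from the display.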

When some of the stems hold a large number of leaves, we can split each stem into two: one holding the leaves 0-4, and the other holding the leaves 5-9. A statistical software package will often do the splitting for you, when appropriate. Note that when rotated 90 degrees counter-clockwise, the stemplot visually resembles a histogram.

The stemplot has additional unique features:

• It preserves the original data.
• It sorts the data (which will become very useful in the next section).


Typical Shapes

Whether expressed as a histogram, a frequency polygon, or a stem and leaf display, an important
characteristic of a frequency distribution is its shape. Figure 2.3 shows some of the more typical
shapes for smoothed frequency polygons (which ignore the inevitable irregularities of real data).

Normal
Any distribution that approximates the normal shape in panel A of Figure 2.3 can be analyzed, as
we will see in Chapter 5, with the aid of the well-documented normal curve. The familiar bell-
shaped silhouette of the normal curve can be superimposed on many frequency distributions,
including those for uninterrupted gestation periods of human fetuses, scores on standardized
tests, and even the popping times of individual kernels in a batch of popcorn.
Bimodal
Any distribution that approximates the bimodal shape in panel B of Figure 2.3 might, as
suggested previously, reflect the coexistence of two different types of observations in the same
distribution. For instance, the distribution of the ages of residents in a neighborhood consisting
largely of either new parents or their infants has a bimodal shape.
Positively Skewed
The two remaining shapes in Figure 2.3 are lopsided. A lopsided distribution caused by a few extreme observations in the positive direction (to the right of the majority of observations), as in panel C of Figure 2.3, is a positively skewed distribution. The distribution of incomes among U.S. families has a pronounced positive skew, with most family incomes under $200,000 and relatively few family incomes spanning a wide range of values above $200,000. The distribution of weights in Figure 2.1 also is positively skewed.
Negatively Skewed
A lopsided distribution caused by a few extreme observations in the negative direction (to the left of the majority of observations), as in panel D of Figure 2.3, is a negatively skewed distribution. The distribution of ages at retirement among U.S. job holders has a pronounced negative skew, with most retirement ages at 60 years or older and relatively few retirement ages spanning the wide range of ages younger than 60.
A GRAPH FOR QUALITATIVE (NOMINAL) DATA:
The distribution in Table 2.7, based on replies to the question “Do you have a Facebook profile?”
appears as a bar graph in Figure 2.4. A glance at this graph confirms that Yes replies occur
approximately twice as often as No replies. As with histograms, equal segments along the
horizontal axis are allocated to the different words or classes that appear in the frequency
distribution for qualitative data. Likewise, equal segments along the vertical axis reflect
increases in frequency. The body of the bar graph consists of a series of bars whose heights
reflect the frequencies for the various words or classes. A person’s answer to the question “Do
you have a Facebook profile?” is either Yes or No, not some impossible intermediate value, such
as 40 percent Yes and 60 percent No. Gaps are placed between adjacent bars of bar graphs to
emphasize the discontinuous nature of qualitative data. A bar graph also can be used with
quantitative data to emphasize the discontinuous nature of a discrete variable, such as the
number of children in a family.


Misleading Graphs:
Graphs can be constructed in an unscrupulous manner to support a particular point of view.
Indeed, this type of statistical fraud gives credibility to popular sayings, including “Numbers
don’t lie, but statisticians do” and “There are three kinds of lies—lies, damned lies, and
statistics.” For example, to imply that comparatively many students responded Yes to the
Facebook profile question, an unscrupulous person might resort to the various tricks shown in
Figure 2.5:
■ The width of the Yes bar is more than three times that of the No bar, thus violating the custom
that bars be equal in width.
■ The lower end of the frequency scale is omitted, thus violating the custom that the entire scale
be reproduced, beginning with zero. (Otherwise, a broken scale should be highlighted by
crossover lines, as in Figures 2.1 and 2.2.)
■ The height of the vertical axis is several times the width of the horizontal axis, thus violating
the custom, heretofore unmentioned, that the vertical axis be approximately as tall as the
horizontal axis is wide. Beware of graphs in which, because the vertical axis is many times larger
than the horizontal axis (as in Figure 2.5), frequency differences are exaggerated, or in which,
because the vertical axis is many times smaller than the horizontal axis, frequency differences
are suppressed.

AVERAGES
The center of a data set is a way of describing its location. We can measure the center of a data set in three different ways: the mean (average), the median, and the mode.

The two main numerical measures for the center of a distribution are the mean and the
median. Each one of these measures is based on a completely different idea of
describing the center of a distribution. Let us first present each one of the measures,
and then compare their properties.
MEAN
The mean is the average of a set of observations (i.e., the sum of the observations divided by the number of observations).
If the n observations are written as $x_1, x_2, \ldots, x_n$, their mean can be written mathematically as:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

We read the symbol $\bar{x}$ as “x-bar.” The bar notation is commonly used to represent the sample mean, i.e., the mean of the sample.

EXAMPLE: Best Actress Oscar Winners


We will continue with the Best Actress Oscar winners example.
34 34 26 37 42 41 35 31 41 33 30 74 33 49 38 61 21 41 26 80 43 29 33 35 45
49 39 34 26 25 35 33
The mean age of the 32 actresses is:

$$\bar{x} = \frac{34 + 34 + 26 + \cdots + 35 + 33}{32} = \frac{1233}{32} \approx 38.5$$

We add all of the ages to get 1233 and divide by the number of ages, which was 32, to get 38.5. We denote this result as x-bar and call it the sample mean.
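
The same arithmetic can be checked with a few lines of Python. This is only a sketch using the 32 ages listed above; the variable names are illustrative.

```python
ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

total = sum(ages)        # sum of the observations
n = len(ages)            # number of observations
mean_age = total / n     # sample mean (x-bar)

print(total, n, round(mean_age, 1))
```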

EXAMPLE: World Cup Soccer

Often we have large sets of data and use a frequency table to display the data more
efficiently. Data were collected from the last three World Cup soccer tournaments. A
total of 192 games were played. The table below lists the number of goals scored per
game (not including any goals scored in shootouts).
Total # Goals/Game   Frequency
0                    17
1                    45
2                    51
3                    37
4                    25
5                    11
6                    3
7                    2
8                    1

To find the mean number of goals scored per game, we would need to find the sum of
all 192 numbers, and then divide that sum by 192.

Rather than add 192 numbers, we use the fact that the same numbers appear many times. For example, the number 0 appears 17 times, the number 1 appears 45 times, the number 2 appears 51 times, etc.

If we add up 17 zeros, we get 0. If we add up 45 ones, we get 45. If we add up 51 twos, we get 102. Repeated addition is multiplication.

Thus, the sum of the 192 numbers

= 0(17) + 1(45) + 2(51) + 3(37) + 4(25) + 5(11) + 6(3) + 7(2) + 8(1) = 453.

The sample mean is then 453 / 192 ≈ 2.359.

Note that, in this example, the values of 1, 2, and 3 are the most common, and our average falls in this range, representing the bulk of the data.
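
Computing a mean from a frequency table follows the same pattern in code: multiply each value by its frequency, add the products, and divide by the total frequency. A minimal sketch, using the World Cup table above with illustrative variable names:

```python
# Goals per game -> number of games with that many goals.
goal_frequency = {0: 17, 1: 45, 2: 51, 3: 37, 4: 25, 5: 11, 6: 3, 7: 2, 8: 1}

games = sum(goal_frequency.values())                          # total frequency
total_goals = sum(g * f for g, f in goal_frequency.items())   # sum of value * frequency
mean_goals = total_goals / games

print(games, total_goals, round(mean_goals, 3))
```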

MEDIAN

Define and calculate the sample median of a quantitative variable.


The median M is the midpoint of the distribution. It is the number such that half of the observations fall above, and half fall below.
To find the median:
• Order the data from smallest to largest.
• Consider whether n, the number of observations, is even or odd.
• If n is odd, the median M is the center observation in the ordered list. This observation is the one “sitting” in the (n + 1) / 2 spot in the ordered list.
• If n is even, the median M is the mean of the two center observations in the ordered list. These two observations are the ones “sitting” in the (n / 2) and (n / 2) + 1 spots in the ordered list.
EXAMPLE: Median (1)
For a simple visualization of the location of the median, consider the following two simple cases of n = 7 and n = 8 ordered observations, with each observation represented by a solid circle:

Comments:
In the images above, the dots are equally spaced; this need not indicate that the data values are actually equally spaced, as we are only interested in listing them in order. In fact, in the above pictures, two consecutive dots could have exactly the same value. It is clear that the value of the median will be in the same position regardless of the distance between data values.

EXAMPLE: Median (2)


To find the median age of the Best Actress Oscar winners, we first need to order the data. It would be useful, then, to use the stemplot, a diagram in which the data are already ordered.
Here n = 32 (an even number), so the median M will be the mean of the two center observations.

These are located at the (n / 2) = 32 / 2 = 16th and (n / 2) + 1 = (32 / 2) + 1 = 17th spots.

Counting from the top, we find that the 16th ranked observation is 35 and the 17th ranked observation also happens to be 35. Therefore, the median M = (35 + 35) / 2 = 35.
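
The rule for odd and even n translates directly into code. The sketch below is one possible implementation, with a hypothetical function name `median_of`; note that Python lists are indexed from 0, so the k-th spot in the ordered list is index k - 1.

```python
def median_of(data):
    """Median by the (n + 1) / 2 rule for odd n, mean of the two center values for even n."""
    ordered = sorted(data)          # order the data from smallest to largest
    n = len(ordered)
    if n % 2 == 1:
        # Odd n: the observation in the (n + 1) / 2 spot.
        return ordered[(n + 1) // 2 - 1]
    # Even n: mean of the observations in the n / 2 and (n / 2) + 1 spots.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

ordered_ages = sorted(ages)
print(ordered_ages[15], ordered_ages[16])   # 16th and 17th ranked observations
print(median_of(ages))
```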

Comparing the Mean and the Median

The mean and the median, the most common measures of center, each describe the center of a distribution of values in a different way.

The mean describes the center as an average value, in which the actual values of the data points play an important role.

The median, on the other hand, locates the middle value as the center, and the order of the data is the key.

To get a deeper understanding of the differences between these two measures of center, consider the following example. Here are two datasets:

Data set A → 64 65 66 68 70 71 73
Data set B → 64 65 66 68 70 71 730
For dataset A, the mean is 68.1, and the median is 68.

Looking at dataset B, notice that all of the observations except the last one are
close together. The observation 730 is very large, and is certainly an outlier. In this case,
the median is still 68, but the mean will be influenced by the high outlier, and shifted
up to 162.
The message that we should take from this example is:
The mean is very sensitive to outliers (because it factors in their magnitude), while
the median is resistant (or robust) to outliers.
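
The effect of the outlier can be reproduced with Python's standard `statistics` module. This is a small check of the two data sets above; the variable names are illustrative.

```python
from statistics import mean, median

data_a = [64, 65, 66, 68, 70, 71, 73]
data_b = [64, 65, 66, 68, 70, 71, 730]   # same data, with 73 replaced by the outlier 730

for name, data in (("A", data_a), ("B", data_b)):
    print(name, "mean =", round(mean(data), 1), "median =", median(data))
```

Only one value changed, yet the mean moved from about 68.1 to 162, while the median stayed at 68.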

MODE: 3rd Measure

The mode of a data set is the number that occurs most frequently in the set.
• If no value appears more than once in the data set, the data set has no mode.
• If two or more values are tied for the greatest number of occurrences, all of them are modes.
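
These rules can be sketched as a short function. The name `modes` is hypothetical, and the function follows the convention stated above that a data set with no repeated value has no mode (it returns an empty list in that case).

```python
from collections import Counter

def modes(data):
    """Return all modes in increasing order, or [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []   # every value appears once: no mode
    return sorted(value for value, count in counts.items() if count == highest)

ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

print(modes(ages))              # single mode
print(modes([1, 1, 2, 2, 3]))   # two values tied: both are modes
print(modes([1, 2, 3]))         # no repeated value: no mode
```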

For symmetric distributions with no outliers: the mean is approximately equal to the median.


For skewed right distributions and/or datasets with high outliers: the mean is greater than the median.

For skewed left distributions and/or datasets with low outliers: the mean is less than
the median.

When to use which measures?

• Use the sample mean as a measure of center for symmetric distributions with
no outliers. Otherwise, the median will be a more appropriate measure of the
center of our data.

Let’s Summarize

• The two main numerical measures for the center of a distribution are the mean and the median. The mean is the average value, while the median is the middle value.
• The mean is very sensitive to outliers (as it factors in their magnitude), while the median is resistant to outliers.
• The mean is an appropriate measure of center for symmetric distributions with no outliers. In all other cases, the median is often a better measure of the center of the distribution.
Describing Variability

Intuitive Approach
• In Figure 4.1, each of the three frequency distributions consists of seven scores with the
same mean (10) but with different variabilities. (Ignore the numbers in boxes; their
significance will be explained later.) Before reading on, rank the three distributions from
least to most variable. Your intuition was correct if you concluded that distribution A has
the least variability, distribution B has intermediate variability, and distribution C has the
most variability. If this conclusion is not obvious, look at each of the three distributions, one
at a time, and note any differences among the values of individual scores. For distribution
A with the least (zero) variability, all seven scores have the same value (10). For
distribution B with intermediate variability, the values of scores vary slightly (one 9 and
one 11), and for distribution C with most variability, they vary even more (one 7, two 9s,
two 11s, and one 13). Importance of Variability Variability assumes a key role in an analysis
of research results. For example, a researcher might ask: Does fitness training improve, on
average, the scores of depressed patients on a mental-wellness test? To answer this
question, depressed patients are randomly assigned to two groups, fitness training is given
to one group, and wellness scores are obtained for both groups. Let’s assume that the mean
wellness score is larger for the group with fitness training. Is the observed mean difference
between the two groups real or merely transitory? This decision depends not only on the
size of the mean difference between the two groups but also on the inevitable variabilities
of individual scores within each group. To illustrate the importance of variability, Figure 4.2
shows the outcomes for two fictitious experiments, each with the same mean difference of
2, but with the two groups in experiment B having less variability than the two groups in
experiment C. Notice that groups B and C in Figure 4.2 are the same as their counterparts
in Figure 4.1. Although the new group B* retains exactly the same (intermediate) variability

ps://play.google.com/store/apps/details?id=info.therithal.brainkart.annauniversitynotes&hl=en_IN
Page 21 of 80
www.BrainKart.com

as group B, each of its seven scores and its mean have been shifted 2 units to the right.
Likewise, although the new group C* retains exactly the same (most) variability as group
C, each of its seven scores and its mean have been shifted 2 units to the right. Consequently,
the crucial mean difference of 2 (from 12 − 10 = 2) is the same for both experiments. Before
reading on, decide which mean difference of 2 in Figure 4.2 is more apparent. The mean
difference for experiment B should seem more apparent because of the smaller variabilities
within both groups B and B*. Just as it’s easier to hear a phone message when static is
reduced, it’s easier to see a difference between group means when variabilities within
groups are reduced.
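
The comparison in Figure 4.2 can be sketched numerically. The block below is a minimal illustration, assuming the seven scores listed earlier for distributions B and C. The helper `difference_in_sd_units`, and the idea of dividing the mean difference by the within-group standard deviation, are illustrative choices and not part of the original text.

```python
from statistics import mean, pstdev

# Scores for groups B and C as described for Figure 4.1 (mean of 10 in both).
group_b = [9, 10, 10, 10, 10, 10, 11]
group_c = [7, 9, 9, 10, 11, 11, 13]

# Groups B* and C*: every score shifted 2 units to the right.
group_b_star = [x + 2 for x in group_b]
group_c_star = [x + 2 for x in group_c]


def difference_in_sd_units(group, shifted_group):
    """Mean difference divided by the within-group standard deviation.

    Hypothetical helper for illustration: both groups have the same spread,
    so the standard deviation of either one can be used.
    """
    return (mean(shifted_group) - mean(group)) / pstdev(group)


diff_b = mean(group_b_star) - mean(group_b)
diff_c = mean(group_c_star) - mean(group_c)
ratio_b = difference_in_sd_units(group_b, group_b_star)
ratio_c = difference_in_sd_units(group_c, group_c_star)

print(f"Experiment B: difference = {diff_b}, in SD units = {ratio_b:.2f}")
print(f"Experiment C: difference = {diff_c}, in SD units = {ratio_c:.2f}")
```

Both experiments have the same mean difference of 2, but relative to the spread within the groups it is much larger in experiment B than in experiment C, which is why it looks more apparent.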

Range

The range measures the spread of a data set: it is the difference between the highest
and lowest values in the data set. The larger the range, the greater the spread of the
data. The range is the most intuitive measure of variability; it is exactly the distance
between the smallest data point (min) and the largest one (Max).

Range = Max – min

Note: When we first looked at the histogram, and tried to get a first feel for the spread
of the data, we were actually approximating the range, rather than calculating the exact
range.

EXAMPLE: Best Actress Oscar Winners

Here are the ages of the Best Actress Oscar winners:

34 34 26 37 42 41 35 31 41 33 30 74 33 49 38 61 21 41 26 80 43 29 33 35 45
49 39 34 26 25 35 33

In this example:

min = 21 (Marlee Matlin for Children of a Lesser God, 1986)
Max = 80 (Jessica Tandy for Driving Miss Daisy, 1989)

The range covered by all the data is 80 – 21 = 59 years.
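
As a quick check of the arithmetic, the range of the ages listed above can be computed directly. This is a minimal sketch; the variable names are illustrative.

```python
# Ages of the Best Actress Oscar winners, as listed in the example.
ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

smallest = min(ages)
largest = max(ages)
data_range = largest - smallest   # Range = Max - min

print(f"min = {smallest}, Max = {largest}, range = {data_range}")
```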


Variance

The variance is the mean of all squared deviation scores.
Although both the range and its most important spinoff, the interquartile range
(discussed in Section 4.7), serve as valid measures of variability, neither is among the
statistician’s preferred measures of variability. Those roles are reserved for the
variance and particularly for its square root, the standard deviation, because these
measures serve as key components for other important statistical measures.
Accordingly, the variance and standard deviation occupy the same exalted position
among measures of variability as does the mean among measures of central tendency.
Following the computational procedures described in later sections of this chapter, we
could calculate the value of the variance for each of the three distributions in Figure
4.1. Its value equals 0.00 for the least variable distribution, A, 0.29 for the moderately
variable distribution, B, and 3.14 for the most variable distribution, C, in agreement
with our intuitive judgments about the relative variability of these three distributions.

Reconstructing the Variance

To understand the variance better, let’s reconstruct it step
by step. Although a measure of variability, the variance also qualifies as a type of mean,
that is, as the balance point for some distribution. To qualify as a type of mean, the
values of all scores must be added and then divided by the total number of scores. In
the case of the variance, each original score is re-expressed as a distance or deviation
from the mean by subtracting the mean. For each of the three distributions in
Figure 4.1, the face values of the seven original scores (shown as numbers along the X
axis) have been re-expressed as deviation scores from their mean of 10 (shown as
numbers in the boxes). For example, in distribution C, one score coincides with the
mean of 10, four scores (two 9s and two 11s) deviate 1 unit from the mean, and two
scores (one 7 and one 13) deviate 3 units from the mean, yielding a set of seven
deviation scores: one 0, two –1s, two 1s, one –3, and one 3. (Deviation scores above the
mean are assigned positive signs; those below the mean are assigned negative signs.)

Mean of the Deviations Not a Useful Measure

No useful measure of variability can be
produced by calculating the mean of these seven deviations, since, as you will recall
from Chapter 3, the sum of all deviations from their mean always equals zero. In effect,
the sum of all negative deviations always counterbalances the sum of all positive
deviations, regardless of the amount of variability in the group.
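
The two points above (deviations always sum to zero, and the variance is the mean of the squared deviations) can be checked with a short sketch. It assumes the seven scores described for distributions A, B, and C in Figure 4.1 and divides by the number of scores, which reproduces the values 0.00, 0.29, and 3.14 quoted earlier.

```python
from statistics import mean

# The three distributions described for Figure 4.1 (seven scores each, mean 10).
distributions = {
    "A": [10, 10, 10, 10, 10, 10, 10],
    "B": [9, 10, 10, 10, 10, 10, 11],
    "C": [7, 9, 9, 10, 11, 11, 13],
}

results = {}
for name, scores in distributions.items():
    m = mean(scores)
    deviations = [x - m for x in scores]          # deviation scores from the mean
    mean_squared_deviation = sum(d ** 2 for d in deviations) / len(scores)
    results[name] = (sum(deviations), round(mean_squared_deviation, 2))
    print(name, deviations, results[name])
```

For every distribution the deviations sum to zero, so their mean cannot distinguish A from C; squaring the deviations first removes the cancellation.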

Standard Deviation

The standard deviation quantifies the spread of a distribution by measuring how far the
observations are from their mean. It gives the average (or typical) distance between a
data point and the mean.

The more spread out a data set is, the greater the distances of its values from the
mean, and the larger the standard deviation.

There are many notations for the standard deviation: SD, s, Sd, StDev. Here,
we’ll use SD as an abbreviation for standard deviation, and use s as the symbol.

Formula

The sample standard deviation formula is:

$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$

Calculation
In order to get a better understanding of the standard deviation, it would be useful
to see an example of how it is calculated.

EXAMPLE: Video Store Customers

The following are the numbers of customers who entered a video store in 8
consecutive hours: 7, 9, 5, 13, 3, 11, 15, 9

To find the standard deviation of the number of hourly customers:


1. Find the mean, x-bar, of your data:
(7 + 9 + 5 + 13 + 3 + 11 + 15 + 9)/8 = 9

2. Find the deviations from the mean:


• The differences between each observation and the mean here are
(7 – 9), (9 – 9), (5 – 9), (13 – 9), (3 – 9), (11 – 9), (15 – 9), (9 – 9)
-2, 0, -4, 4, -6, 2, 6, 0

• Since the standard deviation attempts to measure the average (typical)
distance between the data points and their mean, it would make sense to
average the deviations we obtained.
• Note, however, that the sum of the deviations is zero, so their average would
always be zero.
3. To solve this problem, we square each of the deviations:

(-2)^2, (0)^2, (-4)^2, (4)^2, (-6)^2, (2)^2, (6)^2, (0)^2

4, 0, 16, 16, 36, 4, 36, 0

4. Sum the squared deviations and divide by n – 1:


(4 + 0 + 16 + 16 + 36 + 4 + 36 + 0)/(8 – 1)

(112)/(7) = 16

• This value, the sum of the squared deviations divided by n – 1, is called the
variance. However, the variance is not used as a measure of spread directly as
the units are the square of the units of the original data.

5. The standard deviation of the data is the square root of the variance calculated
in step 4.
In this case, we have the square root of 16, which is 4. We will use the lowercase
letter s to represent the standard deviation: s = 4

• We take the square root to obtain a measure which is in the original units
of the data. The units of the variance of 16 are in “squared customers” which is
difficult to interpret.

• The units of the standard deviation are in “customers” which makes this
measure of variation more useful in practice than the variance.
6. The interpretation of the standard deviation is that, on average, the actual
number of customers who enter the store each hour is 4 away from 9.
• The standard deviation is the square root of the variance (both population and sample).
• While the sample variance is the positive, unbiased estimator for the population
variance, the units for the variance are squared.
• The standard deviation is a common method for numerically describing the distribution
of a variable. The population standard deviation is σ (sigma) and sample standard
deviation is s.
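
The numbered steps of the video store example can be traced in code. The following is a minimal sketch that mirrors the hand calculation (mean, deviations, squared deviations, division by n – 1, square root) and then compares the result with Python's built-in `statistics.stdev`, which also uses the n – 1 denominator.

```python
from math import isclose, sqrt
from statistics import stdev

customers = [7, 9, 5, 13, 3, 11, 15, 9]
n = len(customers)

# Step 1: the mean (x-bar).
x_bar = sum(customers) / n

# Step 2: deviations from the mean (these always sum to zero).
deviations = [x - x_bar for x in customers]

# Step 3: squared deviations.
squared_deviations = [d ** 2 for d in deviations]

# Step 4: sample variance = sum of squared deviations divided by n - 1.
variance = sum(squared_deviations) / (n - 1)

# Step 5: the standard deviation is the square root of the variance.
s = sqrt(variance)

print(f"mean = {x_bar}, variance = {variance}, s = {s}")
print("matches statistics.stdev:", isclose(s, stdev(customers)))
```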

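Population standard deviation: $$ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} $$

Sample standard deviation: $$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$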

Example 7

Compute the standard deviation of the sample data: 3, 5, 7 with a sample mean of 5.
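
A worked solution for this example, following the same steps as the video store calculation and assuming the sample formula with n – 1 in the denominator, might look like this:

```latex
\begin{align*}
\bar{x} &= 5, \qquad n = 3 \\
\sum_{i=1}^{3} (x_i - \bar{x})^2 &= (3-5)^2 + (5-5)^2 + (7-5)^2 = 4 + 0 + 4 = 8 \\
s^2 &= \frac{8}{3-1} = 4 \\
s &= \sqrt{4} = 2
\end{align*}
```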

DEGREES OF FREEDOM (df)

Degrees of freedom (df) refers to the number of values that are free to vary, given one
or more mathematical restrictions, in a sample being used to estimate a population
characteristic.

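When the n deviations from the sample mean are used to estimate variability in the
population, only n – 1 of them are free to vary, because the deviations must sum to
zero. The sample variance and standard deviation therefore have n – 1 degrees of
freedom, that is, df = n – 1.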
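
The restriction behind df = n – 1 can be illustrated with a small sketch. It reuses the sample 3, 5, 7 from Example 7; the variable names are illustrative.

```python
sample = [3, 5, 7]
n = len(sample)
x_bar = sum(sample) / n

deviations = [x - x_bar for x in sample]

# Once the first n - 1 deviations are known, the last one is forced,
# because all n deviations from the sample mean must sum to zero.
free_deviations = deviations[:-1]
forced_last = -sum(free_deviations)

df = n - 1

print("deviations:", deviations)
print("last deviation implied by the others:", forced_last)
print("degrees of freedom:", df)
```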


Inter-Quartile Range (IQR)

The Inter-Quartile Range or IQR measures the variability of a distribution by giving
us the range covered by the MIDDLE 50% of the data. To find the interquartile range
(IQR), first find the median (middle value) of the lower half and of the upper half of
the data. These values are quartile 1 (Q1) and quartile 3 (Q3). The IQR is the
difference between Q3 and Q1.

IQR = Q3 – Q1

Q3 = 3rd Quartile = 75th Percentile
Q1 = 1st Quartile = 25th Percentile

The following picture illustrates this idea: (Think about the horizontal line as the data
ranging from the min to the Max). IMPORTANT NOTE: The “lines” in the following
illustrations are not to scale. The equal distances indicate equal amounts of data NOT
equal distance between the numeric values.

To calculate the IQR:

1. Arrange the data in increasing order, and find the median M. Recall that
the median divides the data, so that 50% of the data points are below the
median, and 50% of the data points are above the median.

2. Find the median of the lower 50% of the data. This is called the first
quartile of the distribution, and the point is denoted by Q1. Note from the
picture that Q1 divides the lower 50% of the data into two halves, containing
25% of the data points in each half. Q1 is called the first quartile, since one
quarter of the data points fall below it.

3. Repeat this again for the top 50% of the data. Find the median of the
top 50% of the data. This point is called the third quartile of the distribution,
and is denoted by Q3. Note from the picture that Q3 divides the top 50% of the
data into two halves, with 25% of the data points in each. Q3 is called the third
quartile, since three quarters of the data points fall below it.

4. The middle 50% of the data falls between Q1 and Q3, and therefore:

IQR = Q3 – Q1.

Comments:
1. The last picture shows that Q1, M, and Q3 divide the data into four
quarters with 25% of the data points in each, where the median is essentially
the second quartile. The use of IQR = Q3 – Q1 as a measure of spread is therefore
particularly appropriate when the median M is used as a measure of center.

2. We can define a bit more precisely what is considered the bottom or top
50% of the data. The bottom (top) 50% of the data is all the observations whose
position in the ordered list is to the left (right) of the location of the overall
median M. The following picture will visually illustrate this for the simple cases
of n = 7 and n = 8.

Note that when n is odd (as in n = 7 above), the median is not included in either
the bottom or top half of the data; when n is even (as in n = 8 above), the data are
naturally divided into two halves.

EXAMPLE: Best Actress Oscar Winners

To find the IQR of the Best Actress Oscar winners’ distribution, it will be
convenient to use the stemplot.

Q1 is the median of the bottom half of the data. Since there are 16 observations in
that half, Q1 is the mean of the 8th and 9th ranked observations in that half:
Q1 = (31 + 33) / 2 = 32
Similarly, Q3 is the median of the top half of the data, and since there are 16
observations in that half, Q3 is the mean of the 8th and 9th ranked observations
in that half:
Q3 = (41 + 42) / 2 = 41.5
IQR = 41.5 – 32 = 9.5
Note that in this example, the range covered by all the ages is 59 years, while the
middle 50% of the ages is packed into a range of only 9.5 years.
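
The procedure described above (split the ordered data at the median, then take the median of each half) can be sketched as follows. The function name `quartiles` is illustrative; note that library routines such as `statistics.quantiles` follow other conventions and may give slightly different quartile values.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, M, Q3) using the median-of-halves method.

    When the number of values is odd, the overall median is left out of
    both halves, as described in the comments above.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)


# Ages of the Best Actress Oscar winners, as listed in the range example.
ages = [34, 34, 26, 37, 42, 41, 35, 31, 41, 33, 30, 74, 33, 49, 38, 61,
        21, 41, 26, 80, 43, 29, 33, 35, 45, 49, 39, 34, 26, 25, 35, 33]

q1, m, q3 = quartiles(ages)
iqr = q3 - q1

print(f"Q1 = {q1}, M = {m}, Q3 = {q3}, IQR = {iqr}")
```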

The Normal Distribution

Many continuous random variables have a bell-shaped, approximately symmetric
distribution known as a normal distribution. In other words, the relative frequency
histogram of such a variable follows a normal curve.
The curve is bell-shaped, symmetric about the mean, and defined by µ and σ (the mean and
standard deviation).

Figure 9. A normal distribution.


There are normal curves for every combination of µ and σ.
• The mean (µ) shifts the curve to the left or right.
• The standard deviation (σ) alters the spread of the curve.
• The first pair of curves have different means but the same standard deviation.
• The second pair of curves share the same mean (µ) but have different standard
deviations.
• The pink curve has a smaller standard deviation. It is narrower and taller, and the
probability is spread over a smaller range of values.

• The blue curve has a larger standard deviation. The curve is flatter and the tails are
thicker. The probability is spread over a larger range of values.

Figure 10. A comparison of normal curves.

Properties of the normal curve:

• The mean is the center of this distribution and the highest point.
• The curve is symmetric about the mean. (The area to the left of the mean equals the area to
the right of the mean.)
• The total area under the curve is equal to one.
• As x moves away from the mean in either direction, the curve approaches zero but never
touches the horizontal axis.

• The PDF of a normal curve is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$.


• A normal curve can be used to estimate probabilities.
• A normal curve can be used to estimate proportions of a population that have certain x-
values.
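
The properties listed above can be checked numerically. The sketch below assumes the usual normal density, f(x) = exp(-(x - µ)^2 / (2σ^2)) / (σ√(2π)), and uses µ = 6 and σ = 2 purely as example values; the function name `normal_pdf` is illustrative.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


mu, sigma = 6, 2   # example values, chosen only for illustration

# The highest point is at the mean, and the curve is symmetric about it.
peak = normal_pdf(mu, mu, sigma)
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

# Approximate the total area with a midpoint Riemann sum over mu +/- 8 sigma.
step = 0.001
n_steps = int(16 * sigma / step)
area = sum(
    normal_pdf(mu - 8 * sigma + (i + 0.5) * step, mu, sigma) * step
    for i in range(n_steps)
)

print(f"peak = {peak:.4f}, left = {left:.4f}, right = {right:.4f}, area = {area:.6f}")
```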

The Standard Normal Distribution

There are infinitely many possible combinations of means and standard deviations for
continuous random variables.

Finding probabilities associated with these variables would require us to integrate the PDF
over the range of values we are interested in.

To avoid this, we can rely on the standard normal distribution.

The standard normal distribution is a special normal distribution with µ = 0 and σ = 1. We
can use the Z-score to standardize any normal random variable, converting the x-values to
Z-scores, thus allowing us to use probabilities from the standard normal table. So how do
we find the area under the curve associated with a Z-score?

Standard Normal Table

• The standard normal table gives probabilities associated with specific Z-scores.
• The table we use is cumulative from the left.
• The negative side is for all Z-scores less than zero (all values less than the mean).
• The positive side is for all Z-scores greater than zero (all values greater than the mean).
• Not all standard normal tables work the same way.

Example 10

What is the area associated with the Z-score 1.62?

Figure 11. The standard normal table and associated area for z = 1.62.

Reading the Standard Normal Table

• Read down the Z-column to get the first part of the Z-score (1.6).
• Read across the top row to get the second decimal place in the Z-score (0.02).
• The intersection of this row and column gives the area under the curve to the left of the Z-
score.
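
The table lookup for Example 10 can also be reproduced in software. This is a minimal sketch using `statistics.NormalDist` from the Python standard library, whose `cdf` method returns the area to the left of a value, just like a left-cumulative table.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve to the left of Z = 1.62.
area_left = standard_normal.cdf(1.62)

print(f"Area to the left of Z = 1.62: {area_left:.4f}")
```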

Finding Z-scores for a Given Area

• What if we have an area and we want to find the Z-score associated with that area?
• Instead of Z-score → area, we want area → Z-score.
• We can use the standard normal table to find the area in the body of values and read
backwards to find the associated Z-score.
• Using the table, search the probabilities to find an area that is closest to the probability you
are interested in.

Example 11

To find a Z-score for which the area to the right is 5%:

Since the table is cumulative from the left, you must use the complement of 5%.

1.000 – 0.05 = 0.9500

Figure 12. The upper 5% of the area under a normal curve.

• Find the Z-score for the area of 0.9500.


• Look at the probabilities and find a value as close to 0.9500 as possible.

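Figure 13. The standard normal table.

The Z-score for the 95th percentile is approximately 1.64. (The table areas for 1.64 and
1.65 are equally close to 0.9500, so the midpoint 1.645 is often used.)

Area in between Two Z-scores

Example 12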

To find Z-scores that limit the middle 95%:

• The middle 95% has 2.5% on the right and 2.5% on the left.
• Use the symmetry of the curve.

Figure 14. The middle 95% of the area under a normal curve.

• Look at your standard normal table. Since the table is cumulative from the left, it is easier
to find the area to the left first.
• Find the area of 0.025 on the negative side of the table.
• The Z-score for the area to the left is -1.96.
• Since the curve is symmetric, the Z-score for the area to the right is 1.96.

Common Z-scores

There are many commonly used Z-scores:

• Z.05 = 1.645 and the area between -1.645 and 1.645 is 90%
• Z.025 = 1.96 and the area between -1.96 and 1.96 is 95%
• Z.005 = 2.575 and the area between -2.575 and 2.575 is 99%
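
Going from an area back to a Z-score, as in Examples 11 and 12, corresponds to the inverse of the cumulative distribution. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the helper name `upper_tail_z` is illustrative. Software gives about 2.576 for the last value, while reading between table entries gives the 2.575 quoted above.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, standard deviation 1


def upper_tail_z(alpha):
    """Z-score with area alpha to its right under the standard normal curve."""
    # The distribution is cumulative from the left, so use the complement.
    return standard_normal.inv_cdf(1 - alpha)


z_05 = upper_tail_z(0.05)     # bounds the middle 90%
z_025 = upper_tail_z(0.025)   # bounds the middle 95%
z_005 = upper_tail_z(0.005)   # bounds the middle 99%

# Area between -z and +z for the middle 95%.
middle_95 = standard_normal.cdf(z_025) - standard_normal.cdf(-z_025)

print(f"{z_05:.3f} {z_025:.3f} {z_005:.3f} middle area = {middle_95:.4f}")
```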

Applications of the Normal Distribution

Typically, our normally distributed data do not have μ = 0 and σ = 1, but we can relate any
normal distribution to the standard normal distribution using the Z-score. We can
transform values of x to values of z:

z = (x – μ) / σ

For example, if a normally distributed random variable has μ = 6 and σ = 2, then a value of
x = 7 corresponds to a Z-score of (7 – 6) / 2 = 0.5.

This tells you that 7 is one-half a standard deviation above its mean. We can use this
relationship to find probabilities for any normal random variable.
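
The conversion between x-values and Z-scores can be written as two small functions. This is a minimal sketch assuming the standard definition z = (x – μ) / σ; the function names are illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


def from_z_score(z, mu, sigma):
    """Inverse transformation: the x-value that corresponds to a given Z-score."""
    return mu + z * sigma


z = z_score(7, mu=6, sigma=2)
x = from_z_score(z, mu=6, sigma=2)

print(f"z = {z}, back-transformed x = {x}")
```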

Figure 15. A normal and standard normal curve.

To find the area for values of X, a normal random variable, draw a picture of the area of
interest, convert the x-values to Z-scores using the Z-score formula, and then use the
standard normal table to find areas to the left, to the right, or in between.

Example 13

Adult deer population weights are normally distributed with µ = 110 lb. and σ = 29.7 lb. As
a biologist you determine that a weight less than 82 lb. is unhealthy and you want to know
what proportion of your population is unhealthy.

P(x<82)

Figure 16. The area under a normal curve for P(x<82).

Convert 82 to a Z-score:

z = (82 – 110) / 29.7 = –0.94

The x value of 82 is 0.94 standard deviations below the mean.

Figure 17. Area under a standard normal curve for P(z<-0.94).

Go to the standard normal table (negative side) and find the area associated with a Z-score
of -0.94.

This is an “area to the left” problem so you can read directly from the table to get the
probability.

P(x<82) = 0.1736

Approximately 17.36% of the population of adult deer is underweight, OR one deer chosen
at random will have a 17.36% chance of weighing less than 82 lb.
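
Example 13 can be reproduced in two ways: by rounding the Z-score to two decimals as the table requires, or by evaluating the original normal distribution directly. The sketch below uses `statistics.NormalDist`; the small difference between the two results comes only from rounding z to -0.94.

```python
from statistics import NormalDist

mu, sigma = 110, 29.7    # adult deer weights in lb, from the example
x = 82                   # weight threshold in lb

# Table-style calculation: round z to two decimals, then look up the left area.
z = round((x - mu) / sigma, 2)
p_table = NormalDist().cdf(z)

# Direct calculation on the original scale, with no rounding of z.
p_direct = NormalDist(mu, sigma).cdf(x)

print(f"z = {z}, P(x < 82) from table-style lookup = {p_table:.4f}")
print(f"P(x < 82) without rounding z = {p_direct:.4f}")
```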

Example 14

Statistics from the Midwest Regional Climate Center indicate that Jones City, which has a
large wildlife refuge, gets an average of 36.7 in. of rain each year with a standard deviation
of 5.1 in. The amount of rain is normally distributed. During what percent of the years does
Jones City get more than 40 in. of rain?

P(x > 40)

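Figure 18. Area under a normal curve for P(x>40).

Convert 40 to a Z-score: z = (40 – 36.7) / 5.1 ≈ 0.65. The table gives an area of 0.7422
to the left of z = 0.65. This is an “area to the right” problem, so subtract from 1:

P(x>40) = (1-0.7422) = 0.2578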

For approximately 25.78% of the years, Jones City will get more than 40 in. of rain.
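
Because the table is cumulative from the left, an "area to the right" such as the one in Example 14 is obtained as a complement. A minimal sketch with `statistics.NormalDist`, again showing both the table-style result (z rounded to 0.65) and the unrounded one:

```python
from statistics import NormalDist

mu, sigma = 36.7, 5.1    # yearly rainfall in inches, from the example
threshold = 40

# Table-style calculation: round z to two decimals and take the complement.
z = round((threshold - mu) / sigma, 2)
p_table = 1 - NormalDist().cdf(z)

# Direct calculation on the original scale, with no rounding of z.
p_direct = 1 - NormalDist(mu, sigma).cdf(threshold)

print(f"z = {z}, P(x > 40) from table-style lookup = {p_table:.4f}")
print(f"P(x > 40) without rounding z = {p_direct:.4f}")
```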

Assessing Normality

• If the distribution is unknown and the sample size is not greater than 30 (Central
Limit Theorem), we have to assess the assumption of normality.

• Our primary method is the normal probability plot. This plot graphs the observed
data, ranked in ascending order, against the “expected” Z-score of that rank.

• If the sample data were taken from a normally distributed random variable, then the
plot would be approximately linear.

• Examine the following probability plot.

• The center line is the relationship we would expect to see if the data were drawn
from a perfectly normal distribution.

• Notice how the observed data (red dots) loosely follow this linear relationship.
Minitab also computes an Anderson-Darling test to assess normality.

• The null hypothesis for this test is that the sample data have been drawn from a
normally distributed population. A p-value greater than 0.05 means this hypothesis
is not rejected, so the assumption of normality is considered reasonable.

Figure 19. A normal probability plot generated using Minitab 16.
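
The coordinates of a normal probability plot can be sketched without a statistics package. The block below is an illustration under stated assumptions: it uses the plotting position (i – 0.5)/n to get the "expected" Z-score of each rank (software such as Minitab may use a slightly different formula), it reuses the video store data purely as sample input, and it summarizes linearity with a correlation coefficient instead of the Anderson-Darling test.

```python
from statistics import NormalDist, mean


def expected_z_scores(n):
    """Expected Z-score for ranks 1..n, assuming the (i - 0.5)/n plotting position."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Observed data ranked in ascending order (video store counts, for illustration).
observed = sorted([7, 9, 5, 13, 3, 11, 15, 9])
expected = expected_z_scores(len(observed))

# Each (expected z, observed value) pair is one point on the probability plot.
points = list(zip(expected, observed))
r = pearson_r(expected, observed)

for z, value in points:
    print(f"{z:6.3f}  {value}")
print(f"correlation between observed values and expected Z-scores: {r:.3f}")
```

A correlation close to 1 indicates that the points fall near a straight line, which is the pattern expected for data drawn from a normal distribution.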

Compare the histogram and the normal probability plot in this next example. The
histogram indicates a skewed right distribution.

Figure 20. Histogram and normal probability plot for skewed right data.

The observed data do not follow a linear pattern and the p-value for the A-D test is less
than 0.005 indicating a non-normal population distribution.

Normality cannot be assumed. You must always verify this assumption. Remember, the
probabilities we are finding come from the standard NORMAL table. If our data are NOT
normally distributed, then these probabilities DO NOT APPLY.
