BRM File
Definition
Central tendency is a statistical measure that identifies a single value as representative of an entire distribution or dataset. It aims to provide an accurate description of the data as a whole.
Mean
The mean represents the average value of the dataset. It is calculated as the sum of all the values in the dataset divided by the number of values. In general usage, it refers to the arithmetic mean.
Median
The median is the middle value of the dataset when the values are arranged in ascending or descending order. When the dataset contains an even number of values, the median is found by taking the mean of the two middle values.
Mode
The mode is the most frequently occurring value in the dataset. A dataset may contain multiple modes, and in some cases it contains no mode at all.
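As a quick illustration, the three measures can be computed with Python's built-in statistics module; the sample values below are invented for the example.

```python
import statistics

data = [4, 7, 7, 9, 12, 15, 18]  # hypothetical sample values

mean = statistics.mean(data)      # sum of values divided by their count
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(f"mean={mean}, median={median}, mode={mode}")
```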
Data analysis and interpretation is the next stage after collecting data through empirical methods. The dividing line between the analysis of data and its interpretation is difficult to draw, as the two processes overlap and merge imperceptibly; interpretation is inextricably interwoven with analysis.
MEAN, MEDIAN AND MODE
Measures of Dispersion: Variance, Standard Deviation
1. Range: It is simply the difference between the maximum value and the minimum value in a dataset. Example: 1, 3, 5, 6, 7 => Range = 7 − 1 = 6
2. Variance: Subtract the mean from each value in the dataset, square each difference, add the squares, and divide the total by the number of values in the dataset. Variance: σ² = Σ(X − μ)² / N
3. Standard Deviation: The square root of the variance is known as the standard deviation, i.e. S.D. = σ = √(σ²).
4. Quartiles and Quartile Deviation: The quartiles are values that divide a list of numbers
into quarters. The quartile deviation is half of the distance between the third and the first
quartile.
5. Mean and Mean Deviation: The average of numbers is known as the mean and the
arithmetic mean of the absolute deviations of the observations from a measure of central
tendency is known as the mean deviation (also called mean absolute deviation).
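A short Python sketch of these five measures, using NumPy and the same illustrative data as the range example (population formulas, dividing by N as above):

```python
import numpy as np

data = np.array([1, 3, 5, 6, 7])                      # hypothetical data

data_range = data.max() - data.min()                  # 1. range
variance = np.var(data)                               # 2. population variance, sum((x - mean)^2) / N
std_dev = np.sqrt(variance)                           # 3. standard deviation, square root of variance
q1, q3 = np.percentile(data, [25, 75])                # first and third quartiles
quartile_dev = (q3 - q1) / 2                          # 4. quartile deviation, half of Q3 - Q1
mean_dev = np.mean(np.abs(data - data.mean()))        # 5. mean (absolute) deviation from the mean

print(data_range, variance, std_dev, quartile_dev, mean_dev)
```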
MEASURES OF DISPERSION
CORRELATION & REGRESSION
Correlation refers to a process for establishing the relationship between two variables. A simple way to get a general idea of whether two variables are related is to plot them on a scatter plot. While there are many measures of association for variables measured at the ordinal or higher levels of measurement, correlation is the most commonly used approach.
In statistics, correlation studies and measures the direction and extent of the relationship among variables; correlation measures co-variation, not causation. Therefore, we should never interpret correlation as implying a cause and effect relation. If a correlation exists between two variables X and Y, then as the value of one variable changes in one direction, the value of the other variable tends to change either in the same direction (positive correlation) or in the opposite direction (negative correlation). Furthermore, the correlation considered here is linear, i.e. the relative movement of the two variables can be represented by a straight line on graph paper.
Correlation Coefficient
The correlation coefficient, r, is a summary measure that describes the extent of the statistical relationship between two interval or ratio level variables. The correlation coefficient is scaled so that it is always between −1 and +1. When r is close to 0, there is little relationship between the variables; the farther r is from 0, in either the positive or negative direction, the stronger the relationship between the two variables.
The two variables are often given the symbols X and Y. In order to illustrate how the
two variables are related, the values of X and Y are pictured by drawing the scatter
diagram, graphing combinations of the two variables. The scatter diagram is given
first, and then the method of determining Pearson’s r is presented.
Scatter Diagram
A scatter diagram is a diagram that shows the values of two variables X and Y,
along with the way in which these two variables relate to each other. The values of
variable X are given along the horizontal axis, with the values of the variable Y given
on the vertical axis.
Later, when the regression model is used, one of the variables is defined as an
independent variable, and the other is defined as a dependent variable. In
regression, the independent variable X is considered to have some effect or
influence on the dependent variable Y. Correlation methods are symmetric with
respect to the two variables, with no indication of causation or direction of influence
being part of the statistical consideration. A scatter diagram is given in the following
example. The same example is later used to determine the correlation coefficient.
Types of Correlation
The scatter plot explains the correlation between the two attributes or variables. It
represents how closely the two variables are connected. There can be three such
situations to see the relation between the two variables –
Positive Correlation – when the values of the two variables move in the same direction
so that an increase/decrease in the value of one variable is followed by an
increase/decrease in the value of the other variable.
Negative Correlation – when the values of the two variables move in the opposite direction so that an increase/decrease in the value of one variable is followed by a decrease/increase in the value of the other variable.
No Correlation – when there is no linear dependence or no relation between the two
variables.
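To make the idea concrete, here is a small Python sketch that computes Pearson's r for two illustrative variables using scipy.stats.pearsonr; the data are invented for the example.

```python
import numpy as np
from scipy import stats

x = np.array([2, 4, 6, 8, 10, 12])        # hypothetical values of X
y = np.array([1, 3, 7, 9, 12, 13])        # hypothetical values of Y

r, p_value = stats.pearsonr(x, y)         # Pearson correlation coefficient and its p-value
print(f"r = {r:.3f}, p = {p_value:.4f}")  # r near +1: strong positive linear relationship
```

An r near +1 here would indicate that larger X values are systematically paired with larger Y values; an r near −1 would indicate the opposite pattern.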
REGRESSION
Regression is a statistical method used in finance, investing, and other disciplines that
attempts to determine the strength and character of the relationship between one
dependent variable (usually denoted by Y) and a series of other variables (known as
independent variables).
Also called simple regression or ordinary least squares (OLS), linear regression is the most
common form of this technique. Linear regression establishes the linear
relationship between two variables based on a line of best fit. Linear regression is thus
graphically depicted using a straight line with the slope defining how the change in one
variable impacts a change in the other. The y-intercept of a linear regression relationship
represents the value of one variable when the value of the other is zero. Non-linear
regression models also exist, but are far more complex.
Regression analysis is a powerful tool for uncovering the associations between variables
observed in data, but cannot easily indicate causation. It is used in several contexts in
business, finance, and economics. For instance, it is used to help investment managers
value assets and understand the relationships between factors such as commodity
prices and the stocks of businesses dealing in those commodities.
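A minimal sketch of simple linear regression (ordinary least squares) in Python, using scipy.stats.linregress and invented data:

```python
import numpy as np
from scipy import stats

x = np.array([500, 800, 1100, 1500, 2000, 2400])   # hypothetical independent variable X
y = np.array([420, 650, 900, 1250, 1640, 1980])    # hypothetical dependent variable Y

result = stats.linregress(x, y)                    # fits y = intercept + slope * x by least squares
print(f"slope = {result.slope:.3f}")               # change in y per unit change in x
print(f"intercept = {result.intercept:.3f}")       # predicted y when x = 0
print(f"R^2 = {result.rvalue**2:.3f}")             # share of the variation in y explained by the line
```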
[Chart: scatter plot with fitted trend line, f(x) ≈ 0.824x, R² ≈ 0.969]
DISTRIBUTION OF DATA: SKEWNESS, KURTOSIS, KS TEST OF NORMALITY
Skewness is a measure of the asymmetry of a distribution. A distribution is
asymmetrical when its left and right side are not mirror images.
A distribution can have right (or positive), left (or negative), or zero skewness. A
right-skewed distribution is longer on the right side of its peak, and a left-skewed
distribution is longer on the left side of its peak:
Tails are the tapering ends on either side of a distribution. They represent the probability or frequency of values that are extremely high or low compared to the mean. In other words, tails represent how often outliers occur. Kurtosis is the related measure of how heavy a distribution's tails are compared with those of a normal distribution.
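A quick way to inspect skewness and kurtosis in Python is scipy.stats; the data below are generated only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=1000)   # hypothetical right-skewed sample

print("skewness:", stats.skew(data))           # > 0 indicates a longer right tail
print("kurtosis:", stats.kurtosis(data))       # excess kurtosis; 0 matches a normal distribution
```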
OUTLIERS
In data analytics, outliers are values within a dataset that vary greatly from the others. For example, the average height of a giraffe is about 16 feet. However, there have been recent discoveries of two giraffes that stand at 9 feet and 8.5 feet, far below the general giraffe population. When going through the process of data analysis, outliers can cause anomalies in the results obtained. This means that they require some special attention and, in some cases, will need to be removed in order to analyze data effectively.
Types of outliers
A univariate outlier is an extreme value on a single variable. For example, Sultan Kösen, currently the tallest man alive, is an extreme value for the single variable of height. A multivariate outlier is an unusual combination of values on at least two variables. For example, if you're looking at both the height and weight of a group of adults, you might observe that one person in your dataset is 5ft 9 inches tall, a measurement that would fall within the normal range for this particular variable. You may also observe that this person weighs 110lbs. Again, this observation alone falls within the normal range for the variable of interest: weight. However, when you consider height and weight together, the combination is unusual, which makes this person a multivariate outlier.
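One common, simple way to flag potential univariate outliers is the interquartile range (IQR) rule; this Python sketch uses invented data and is only one of several possible approaches.

```python
import numpy as np

heights_cm = np.array([152, 160, 163, 165, 168, 170, 172, 175, 178, 251])  # hypothetical data

q1, q3 = np.percentile(heights_cm, [25, 75])
iqr = q3 - q1                                    # interquartile range
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr    # common 1.5 * IQR fences

outliers = heights_cm[(heights_cm < lower) | (heights_cm > upper)]
print("flagged outliers:", outliers)             # only the 251 cm value is flagged
```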
KS TEST OF NORMALITY
In statistics, the Kolmogorov–Smirnov test (K–S test or KS test) is a nonparametric test of the equality of continuous, one-dimensional probability distributions. It can be used to compare a sample with a reference probability distribution (one-sample K–S test), or to compare two samples (two-sample K–S test).
If the data really do follow a normal distribution in the population, the deviation between the sample distribution and the normal curve should probably be quite small. That is, a small deviation has a high probability value or p-value. Conversely, a huge deviation is very unlikely and suggests that the data do not follow a normal distribution in the entire population. So a large deviation has a low p-value.
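A small Python sketch of a one-sample KS test of normality, standardizing invented data before comparing it with the standard normal distribution; note that because the mean and standard deviation are estimated from the same sample, the nominal p-value is only approximate.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=500, scale=40, size=200)     # hypothetical measurements

standardized = (sample - sample.mean()) / sample.std(ddof=1)
stat, p_value = stats.kstest(standardized, "norm")   # compare with the standard normal CDF

print(f"D = {stat:.3f}, p = {p_value:.3f}")          # a small p-value suggests non-normality
```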
NORMAL DISTRIBUTION
[Chart: normal distribution curve]
KOLMOGOROV-SMIRNOV TEST
T-TEST
A t test is a statistical test that is used to compare the means of two groups. It is
often used in hypothesis testing to determine whether a process or treatment
actually has an effect on the population of interest, or whether two groups are
different from one another. A t test can only be used when comparing the means of
two groups (a.k.a. pairwise comparison). If you want to compare more than two
groups, or if you want to do multiple pairwise comparisons, use an ANOVA test or a
post-hoc test.
The t test is a parametric test of difference, meaning that it makes the same
assumptions about your data as other parametric tests. The t test assumes your
data:
1. are independent
2. are (approximately) normally distributed
3. have a similar amount of variance within each group being compared (a.k.a.
homogeneity of variance)
If your data do not fit these assumptions, you can try a nonparametric alternative to the t test, such as the Wilcoxon signed-rank test, or a variant such as Welch's t test for data with unequal variances.
When choosing a t test, you will need to consider two things: whether the groups being compared come from a single population or two different populations, and whether you want to test the difference in a specific direction.
One-sample, two-sample, or paired t test?
If the groups come from a single population (e.g., measuring before and after
an experimental treatment), perform a paired t test. This is a within-subjects
design.
If the groups come from two different populations (e.g., two different species,
or people from two separate cities), perform a two-
sample t test (a.k.a. independent t test). This is a between-subjects
design.
If there is one group being compared against a standard value (e.g.,
comparing the acidity of a liquid to a neutral pH of 7), perform a one-
sample t test.
If you only care whether the two populations are different from one another,
perform a two-tailed t test.
If you want to know whether one population mean is greater than or less than
the other, perform a one-tailed t test.
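The three variants map directly onto scipy.stats functions; a compact sketch with invented group data follows.

```python
import numpy as np
from scipy import stats

before = np.array([7.1, 6.8, 7.4, 7.9, 6.5, 7.2])    # hypothetical pre-treatment scores
after = np.array([7.6, 7.0, 7.8, 8.4, 6.9, 7.5])     # same subjects after treatment
group_a = np.array([5.1, 5.8, 6.2, 5.5, 6.0])        # hypothetical independent samples
group_b = np.array([6.4, 6.9, 7.1, 6.5, 7.3])

paired = stats.ttest_rel(before, after)               # paired t test (within-subjects)
independent = stats.ttest_ind(group_a, group_b)       # two-sample / independent t test
one_sample = stats.ttest_1samp(group_a, popmean=7.0)  # one group against a standard value

for name, res in [("paired", paired), ("independent", independent), ("one-sample", one_sample)]:
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

All three return two-tailed p-values by default; recent SciPy versions accept an alternative argument for one-tailed tests.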
PAIRED T-TEST
UNPAIRED T-TEST
F-TEST: TWO-SAMPLE FOR VARIANCES
The F test is a statistical test used in hypothesis testing to check whether the variances of two populations or two samples are equal. In an F test, the test statistic follows an F distribution. The test compares two variances by dividing them, and it can be either one-tailed or two-tailed depending on the parameters of the problem.
The F value obtained after conducting an F test is also used to perform the one-way ANOVA (analysis of variance) test. This section covers the F test, the F statistic, its critical value, its formula, and how to conduct an F test for hypothesis testing.
F Test Definition
The F test can be defined as a test that uses the F statistic to check whether the variances of two samples (or populations) are equal. To conduct an F test, the populations should be approximately normally distributed and the samples must be independent. On conducting the hypothesis test, if the results of the F test are statistically significant, then the null hypothesis is rejected; otherwise it cannot be rejected.
F Test Formula
The F test checks the equality of variances using hypothesis testing. The F statistic is the ratio of the two sample variances, F = s₁² / s₂², and the decision criteria for the different hypothesis tests are as follows:
Left-tailed test: if the F statistic < the F critical value, then reject the null hypothesis.
Right-tailed test: if the F statistic > the F critical value, then reject the null hypothesis.
Two-tailed test: if the F statistic > the F critical value, then the null hypothesis is rejected.
F Test vs T-Test
F Test: a test statistic used to check the equality of the variances of two populations.
T-Test: used when the sample size is small (n < 30) and the population standard deviation is not known.
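A sketch of a two-sample F test for equal variances in Python; the ratio of the sample variances is compared against the F distribution, and the data are invented for illustration.

```python
import numpy as np
from scipy import stats

sample1 = np.array([12.1, 14.3, 13.8, 15.2, 12.9, 14.7, 13.5])   # hypothetical sample 1
sample2 = np.array([11.9, 12.4, 12.1, 12.8, 12.2, 12.6, 12.0])   # hypothetical sample 2

var1 = np.var(sample1, ddof=1)                  # sample variances (n - 1 denominator)
var2 = np.var(sample2, ddof=1)
f_stat = var1 / var2                            # F statistic: ratio of the two variances
df1, df2 = len(sample1) - 1, len(sample2) - 1   # degrees of freedom

p_right = stats.f.sf(f_stat, df1, df2)          # right-tailed p-value
f_crit = stats.f.ppf(0.95, df1, df2)            # right-tailed critical value at the 5% level
print(f"F = {f_stat:.3f}, critical = {f_crit:.3f}, p = {p_right:.4f}")
```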
F-TEST
Z-TEST: TWO SAMPLES FOR MEANS
A z-test is a statistical test used to determine whether two population
means are different when the variances are known and the sample size
is large.
In the output of a two-sample z test (for example, from a spreadsheet data-analysis tool), "P(Z <= z) one tail" should be interpreted as P(Z >= ABS(z)), that is, the probability of observing a z value farther from zero than the absolute value of the observed z value when there is no difference between the population means.
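A minimal two-sample z test sketch in Python, computing the statistic directly from population variances that are assumed to be known; the data and variances are invented for the example.

```python
import numpy as np
from scipy import stats

sample1 = np.array([102.0, 98.5, 101.2, 99.8, 103.1, 100.4, 97.9, 101.7])  # hypothetical sample 1
sample2 = np.array([96.3, 97.8, 95.9, 98.2, 96.7, 97.1, 95.4, 98.0])       # hypothetical sample 2
var1, var2 = 4.0, 3.5                       # population variances, assumed known for a z test

diff = sample1.mean() - sample2.mean()
se = np.sqrt(var1 / len(sample1) + var2 / len(sample2))   # standard error of the difference
z = diff / se

p_one_tail = stats.norm.sf(abs(z))          # P(Z >= |z|), the one-tailed p-value
p_two_tail = 2 * p_one_tail
print(f"z = {z:.3f}, one-tailed p = {p_one_tail:.4f}, two-tailed p = {p_two_tail:.4f}")
```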
Z-TEST
ANOVA: SINGLE AND TWO FACTOR
In research across fields such as business, economics, psychology, sociology, and biology, the Analysis of Variance, commonly known as ANOVA, is an extremely important tool for the analysis of data. It is a technique employed by the researcher to compare more than two populations and to perform simultaneous tests, and its purpose is two-fold.
In one-way ANOVA the researcher considers only one factor. In contrast, in two-way ANOVA the researcher investigates two factors concurrently. For a layman these two concepts may seem synonymous; however, there is a difference between one-way and two-way ANOVA.
Comparison Chart
Meaning: One way ANOVA is a hypothesis test used to test the equality of three or more population means simultaneously using variance; two way ANOVA is a statistical technique wherein the interaction between factors influencing the variable can be studied.
Compares: One way ANOVA compares three or more levels of one factor; two way ANOVA compares the effect of multiple levels of two factors.
Number of observations: In one way ANOVA the number need not be the same in each group; in two way ANOVA it needs to be equal in each group.
Design of experiments: One way ANOVA needs to satisfy only two principles; two way ANOVA requires all three principles to be satisfied.
One-way Analysis of Variance (ANOVA) is a hypothesis test in which only one categorical variable, or single factor, is considered. It is a technique that enables us to compare the means of three or more samples with the help of the F-distribution, and it is used to find out whether there is a difference among the categories of that factor.
The null hypothesis (H0) is that all population means are equal, while the alternative hypothesis (H1) is that at least one mean differs. Its assumptions are:
Normal distribution of the population from which the samples are drawn.
Measurement of the dependent variable is at interval or ratio level.
Two or more than two categorical independent groups in an independent variable.
Independence of samples
Homogeneity of the variance of the population.
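A compact one-way ANOVA sketch in Python with three invented groups, using scipy.stats.f_oneway:

```python
import numpy as np
from scipy import stats

# hypothetical scores for three independent groups (one factor, three levels)
group1 = np.array([23, 25, 27, 22, 26])
group2 = np.array([30, 31, 29, 33, 28])
group3 = np.array([24, 26, 25, 27, 23])

f_stat, p_value = stats.f_oneway(group1, group2, group3)  # tests H0: all group means are equal
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")              # small p-value: at least one mean differs
```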
Definition of Two-Way ANOVA
Two-way ANOVA, as its name signifies, is a hypothesis test wherein the classification of data is based on two factors. For instance, the sales made by a firm can be classified first by salesperson and second by region. It is a statistical technique used by the researcher to compare several levels (conditions) of two independent variables involving multiple observations at each level.
Two-way ANOVA examines the effect of the two factors on the continuous dependent variable. It also studies the interaction between the independent variables, if any, in influencing the values of the dependent variable. Its assumptions are:
Normal distribution of the population from which the samples are drawn.
Measurement of dependent variable at continuous level.
Two or more than two categorical independent groups in two factors.
Categorical independent groups should have the same size.
Independence of observations
Homogeneity of the variance of the population.
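A sketch of a two-factor ANOVA with replication in Python using statsmodels; the salesperson and region factors and all values are invented for illustration.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# hypothetical sales, cross-classified by salesperson and region, with replication
data = pd.DataFrame({
    "sales": [20, 22, 19, 25, 27, 24, 30, 29, 31, 18, 21, 20],
    "salesperson": ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
    "region": ["N", "N", "N", "S", "S", "S", "N", "N", "N", "S", "S", "S"],
})

# fit main effects plus the interaction term, then build the ANOVA table
model = ols("sales ~ C(salesperson) + C(region) + C(salesperson):C(region)", data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)
```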
ANOVA: TWO FACTOR WITH REPLICATION
CHI-SQUARE TEST
A Pearson’s chi-square test is a statistical test for categorical data. It is used to determine whether
your data are significantly different from what you expected. There are two types of Pearson’s chi-
square tests:
The chi-square goodness of fit test is used to test whether the frequency distribution of a
categorical variable is different from your expectations.
The chi-square test of independence is used to test whether two categorical variables are
related to each other.
Chi-square is often written as Χ2 and is pronounced “kai-square” (rhymes with “eye-square”). It is also
called chi-squared.
Pearson’s chi-square (Χ2) tests, often referred to simply as chi-square tests, are among the most
common nonparametric tests. Nonparametric tests are used for data that don’t follow
the assumptions of parametric tests, especially the assumption of a normal distribution.
If you want to test a hypothesis about the distribution of a categorical variable you’ll need to use a
chi-square test or another nonparametric test. Categorical variables can be nominal or ordinal and
represent groupings such as species or nationalities. Because they can only have a few specific
values, they can’t have a normal distribution.
Frequency distributions are often displayed using frequency distribution tables. A frequency
distribution table shows the number of observations in each group. When there are two categorical
variables, you can use a specific type of frequency distribution table called a contingency table to
show the number of observations in each combination of groups.
The chi-square statistic is calculated as Χ² = Σ (O − E)² / E, where O is an observed frequency and E is the corresponding expected frequency.
The larger the difference between the observations and the expectations (O − E in the equation), the bigger the chi-square will be. To decide whether the difference is big enough to be statistically significant, you compare the chi-square value to a critical value.
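Both flavours of the test are available in scipy.stats; the frequencies below are invented for illustration.

```python
import numpy as np
from scipy import stats

# goodness of fit: observed counts vs. expected counts for one categorical variable
observed = np.array([45, 35, 20])
expected = np.array([40, 40, 20])          # expected frequencies must sum to the same total
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"goodness of fit: chi2 = {chi2:.3f}, p = {p:.4f}")

# test of independence: contingency table of two categorical variables
table = np.array([[30, 10],
                  [20, 40]])
chi2, p, dof, expected_table = stats.chi2_contingency(table)
print(f"independence: chi2 = {chi2:.3f}, p = {p:.4f}, dof = {dof}")
```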
CHI-SQUARE TEST
WILCOXON SIGNED-RANK TEST
The Wilcoxon signed rank test is a nonparametric hypothesis test that can do the following:
Evaluate the median difference between two paired samples.
Compare a one-sample median to a reference value.
Statisticians often use the Wilcoxon signed rank test when their data do not follow the normal distribution. However, it has other advantages over t-tests, including the ability to analyze ordinal data and to reduce the impact of outliers. While the data don't need to be normally distributed, they must follow a symmetrical distribution. When using the paired form, the distribution of the differences between the paired values must be symmetrical. If the distribution is asymmetric, consider using the sign test. This nonparametric test is like the Wilcoxon signed rank test but can handle asymmetric distributions. However, the sign test is less powerful.
Now, let's delve into the hypotheses of the Wilcoxon signed rank test. There are two sets of hypotheses. Choosing the correct set depends on whether you perform the paired or one-sample test.
Paired Test
The following are the hypotheses for the paired Wilcoxon signed rank test:
Null hypothesis (H0): the median of the differences between the paired values equals zero.
Alternative hypothesis (H1): the median of the differences does not equal zero.
One-Sample Test
The following are the hypotheses for the one-sample Wilcoxon signed rank test:
Null hypothesis (H0): the population median equals the reference value.
Alternative hypothesis (H1): the population median does not equal the reference value.
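A brief Python sketch of both forms using scipy.stats.wilcoxon and invented data:

```python
import numpy as np
from scipy import stats

before = np.array([8.2, 7.9, 9.1, 8.5, 7.7, 8.8, 9.0, 8.3])   # hypothetical paired measurements
after = np.array([7.8, 7.5, 8.6, 8.4, 7.2, 8.1, 8.7, 7.9])

# paired test: is the median of the differences between the pairs zero?
stat, p = stats.wilcoxon(before, after)
print(f"paired: W = {stat:.1f}, p = {p:.4f}")

# one-sample test: is the median of 'before' equal to a reference value of 8.0?
stat, p = stats.wilcoxon(before - 8.0)
print(f"one-sample: W = {stat:.1f}, p = {p:.4f}")
```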
WILCOXON T-TEST