Cong Thuc Mas202 Summary Xac Suat Thong Ke Toan Doanh Nghiep

The document provides an overview of statistical probability and data analysis in a business context, covering topics such as measurement scales, data collection methods, sampling types, and survey errors. It also discusses organizing and visualizing data, numerical descriptive measures, basic probability concepts, and various probability distributions. Additionally, it includes information on hypothesis testing, ANOVA, and simple regression analysis.

Uploaded by

trannhu18312
MAS202 Formulas - Summary: Probability and Statistics for Business

Probability and Statistics for Business (FPT University)

Downloaded by Qu?nh Nh? Tr?n ([email protected])
CHAP 1: DEFINING & COLLECTING DATA
I. Measurement Scales
1. Nominal scale: no ranking is implied.
E.g.: Yes, No; Type of investment: Growth, Value, …
2. Ordinal scale: ranking is implied.
E.g.: Very unsatisfied, Fairly unsatisfied, Neutral, Fairly satisfied, Very satisfied
Freshman, Sophomore, Junior, Senior
A, B, C, D, F
3. Interval scale: an ordered scale in which the difference between measurements is a
meaningful quantity, but the measurements do not have a true zero point.
E.g.: temperature, exam score, …
4. Ratio scale: an ordered scale in which differences are meaningful and the
measurements have a true zero point.
E.g.: height, age, salary, …

II. Collecting Data Via Sampling Is Used When Doing So Is


- Less time consuming
- Less costly
- Less cumbersome and more practical

III. Collecting data from


- Ongoing business activities: data from activities that take place inside the
company
- Compiled by organizations/individuals: industry data from market research
firms, trade associations, newspapers, …
- Survey
- Experiment: trial use; market testing of alternative product promotions to
determine which promotion should be used more widely.
- Observational study: measuring traffic flow through an intersection to
determine whether certain forms of advertising at the intersection are
justified.

IV. Sources of Data


- Primary Source: data from survey, experiment, observation
- Secondary Source: data from print journals, published on the internet
V. Types of Samples
1. Non probability Sample
- Judgment: get the opinions of experts
- Convenience: items that are easy to reach, e.g. family, friends, …
2. Probability Sample:
- Simple Random
- Systematic (every kth item)
- Stratified
Divide the population into subgroups (called strata) according to a common
characteristic -> select a simple random sample from each subgroup ->
combine the samples into one
- Cluster
Divide the population into clusters -> select a simple random sample of clusters
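
The four probability sampling schemes can be sketched in Python. This is only an illustration: the population of 40 numbered items, the two strata, the four clusters, and all function names are assumptions made for the example, not part of the notes.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n items without replacement; every item is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick a random start among the first k items, then take every kth item."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from each stratum and combine them."""
    combined = []
    for members in strata.values():
        combined.extend(rng.sample(members, n_per_stratum))
    return combined

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every item inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]

rng = random.Random(0)                       # fixed seed so the sketch is repeatable
frame = list(range(1, 41))                   # hypothetical population of 40 items
strata = {"A": frame[:20], "B": frame[20:]}  # hypothetical subgroups
clusters = {f"c{i}": frame[i * 10:(i + 1) * 10] for i in range(4)}

srs = simple_random_sample(frame, 8, rng)
sys_s = systematic_sample(frame, 8, rng)
strat = stratified_sample(strata, 4, rng)
clus = cluster_sample(clusters, 2, rng)
```

Note the difference between the last two: the stratified sample takes a few items from every subgroup, while the cluster sample takes every item from a few subgroups.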

VI. Types of Survey Error


1. Coverage error or selection bias
Exists if some groups are excluded from the frame and have no chance of being
selected.
2. Nonresponse error or bias
Occurs when people do not respond.
3. Sampling error
Random differences from sample to sample.
4. Measurement error
Due to weaknesses in question design and/or respondent error.

CHAP 2: ORGANIZING & VISUALIZING VARIABLES


I. Organizing
1. Categorical Data
- One categorical variable => Summary table: tallies frequencies/
percentages of items in a set of categories -> see differences between
categories.
- Two categorical variables => Contingency table (Two-way table)
- Three/ More categorical variables => Multidimensional contingency table
2. Numerical Data
- Ordered Array: rank order, from the smallest value to the largest value
- Frequency Distribution: summary table: data are arranged into
numerically ordered classes.
+ Determine a suitable width for the class groupings and establish the boundaries of
each class
+ A frequency distribution should have at least 5 but no more than 15 classes
+ Width of a class interval = (Highest value – Lowest value) / Number of classes
+ Relative Frequency = Frequency / Total
- Cumulative Distribution
+ Cumulative Percentage = Cumulative Frequency / Total * 100
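
The class width, relative frequency, and cumulative percentage formulas can be combined in a small Python sketch. The ages below are made-up data and the function name is illustrative. In practice the width is usually rounded up to a convenient number, which this sketch does not do.

```python
def frequency_distribution(values, num_classes):
    """Group values into equal-width classes and summarize each class."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes  # width of a class interval
    counts = [0] * num_classes
    for v in values:
        # The largest value is placed in the last class, not in a new one.
        index = min(int((v - low) / width), num_classes - 1)
        counts[index] += 1
    total = len(values)
    rows, cumulative = [], 0
    for i, freq in enumerate(counts):
        cumulative += freq
        rows.append({
            "lower": low + i * width,
            "upper": low + (i + 1) * width,
            "frequency": freq,
            "relative": freq / total,
            "cumulative_pct": cumulative / total * 100,
        })
    return rows

ages = [18, 19, 21, 22, 22, 25, 27, 30, 34, 38]  # hypothetical data
table = frequency_distribution(ages, 5)
for row in table:
    print(row)
```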

II. Visualizing
1. Categorical Data
a. Summary table
- Bar chart
- Pareto chart
+ categorical data (nominal scale)
+ Vertical bar chart: descending order of frequency.
+ cumulative polygon
+ separate the “vital few” from the “trivial many.”

- Pie chart/ Doughnut chart


b. Contingency table
- Side by side bar chart
- Doughnut chart



2. Numerical Data
a. Ordered Array => Stem and Leaf display
- Separate sorted data series: leading digits (stems) & trailing digits (leaves).
b. Frequency Distribution & Cumulative Distribution
- Histogram: no gaps between adjacent bars
[Figure: Histogram: Age of Students. Vertical axis: Frequency (0 to 7); horizontal axis: age classes 5, 15, 25, 35, 45, 55, More]

- Polygon: midpoint of each class represent the data in that class

- Scatter plot (Two numerical variables)

[Figure: Cost per Day vs. Production Volume. Vertical axis: Cost per Day (0 to 250); horizontal axis: Volume per Day (20 to 65)]

- Time Series plot (Two numerical variables)

III. Graphical Error


- No relative basis
- Compressing vertical axis
- No zero point on vertical axis
- Chart junk

CHAP 3: NUMERICAL DESCRIPTIVE MEASURES


I. Median
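- Find the position L = (n + 1) / 2 in the ranked data.
- If L is a whole number (n is odd), Median = $x_L$.
- If L = a.5 (n is even), Median = $(x_a + x_{a+1}) / 2$, the average of the two middle values.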
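
The position rule for the median can be sketched in Python as follows. The function name is illustrative, and positions are 1-based to match the notes.

```python
def median(values):
    """Median via the position L = (n + 1) / 2 in the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2        # 1-based position L
    if position.is_integer():     # odd n: a single middle value
        return ranked[int(position) - 1]
    a = int(position)             # even n: L = a.5, average x_a and x_(a+1)
    return (ranked[a - 1] + ranked[a]) / 2

print(median([3, 1, 2]))      # L = 2, so the median is the 2nd ranked value
print(median([4, 1, 3, 2]))   # L = 2.5, so average the 2nd and 3rd values
```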

II. Z-score



- A data value is considered an extreme outlier if its Z-score is less than -3.0
or greater than +3.0.
|Z| > 3 => X : outlier
- The larger the absolute value of the Z-score, the farther the data value is
from the mean.
Z1 > Z2 => X1 : higher relative position than X2
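
As a minimal sketch, Z-scores can be computed with the usual definition Z = (X - mean) / S, where S is the sample standard deviation. The notes do not print this formula, so treat it as an assumption of the example; the data and function names are also made up.

```python
import statistics

def z_scores(values):
    """Z = (X - mean) / S, using the sample standard deviation S."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return [(x - mean) / s for x in values]

def extreme_outliers(values, cutoff=3.0):
    """Values whose Z-score is below -cutoff or above +cutoff."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > cutoff]

data = [10] * 19 + [100]  # hypothetical data: one value far from the rest
print(extreme_outliers(data))
```

With very small samples no value can reach |Z| > 3, so the rule is only informative for larger data sets.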

III. Shape of Distribution


1. Skewness: Measures the extent to which data values are not symmetrical
- Mean = Median => Symmetric
- Mean < Median => Left-skewed
- Mean > Median => Right-skewed

2. Kurtosis: measures the peakedness of the curve of the distribution

IV. Quartile
- Split the ranked data into 4 segments with an equal number of values
per segment

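- Find Q1: compute the position L1 = (n + 1) / 4, then read Q1 at that position in the ranked data

- Find Q2: compute the position L2 = (n + 1) / 2, then read Q2 at that position
- Find Q3: compute the position L3 = 3(n + 1) / 4, then read Q3 at that position

Interquartile range (IQR) = Q3 – Q1

Data values outside the interval (Q1 – 1.5 IQR ; Q3 + 1.5 IQR) are outliers.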
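
The quartile positions and the 1.5 IQR rule can be sketched in Python. The notes do not say how to handle a fractional position, so the rule in `value_at` is an assumption (a common textbook convention): a whole position picks that value, a position ending in .5 averages its two neighbors, and any other position is rounded to the nearest whole position. The data and function names are illustrative.

```python
def value_at(ranked, position):
    """Value at a 1-based position, using the assumed rounding convention."""
    frac = position - int(position)
    if frac == 0:
        return ranked[int(position) - 1]
    if frac == 0.5:
        lower = int(position)
        return (ranked[lower - 1] + ranked[lower]) / 2
    return ranked[round(position) - 1]  # .25 rounds down, .75 rounds up

def quartiles(values):
    """Return (Q1, Q2, Q3) from the positions (n+1)/4, (n+1)/2, 3(n+1)/4."""
    ranked = sorted(values)
    n = len(ranked)
    q1 = value_at(ranked, (n + 1) / 4)
    q2 = value_at(ranked, (n + 1) / 2)
    q3 = value_at(ranked, 3 * (n + 1) / 4)
    return q1, q2, q3

def iqr_outliers(values):
    """Values outside (Q1 - 1.5 IQR, Q3 + 1.5 IQR)."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low or x > high]

data = [11, 12, 13, 16, 16, 17, 18, 21, 22, 60]  # hypothetical data
print(quartiles(data), iqr_outliers(data))
```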

Comparison                                       Left-Skewed   Symmetric   Right-Skewed
(Median – Xsmallest) vs (Xlargest – Median)           >            ≈            <
(Q1 – Xsmallest) vs (Xlargest – Q3)                   >            ≈            <
(Median – Q1) vs (Q3 – Median)                        >            ≈            <

V. Covariance
- Measures the strength of the linear relationship between two
numerical variables ( X, Y)
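- Sample covariance: $\operatorname{cov}(X, Y) = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$

cov(X,Y) > 0 -> X and Y tend to move in the same direction.
cov(X,Y) < 0 -> X and Y tend to move in opposite directions.
cov(X,Y) = 0 -> X and Y have no linear relationship (this alone does not prove independence).

- Coefficient of Correlation: $r = \dfrac{\operatorname{cov}(X, Y)}{S_X S_Y}$, where $S_X$ and $S_Y$ are the sample standard deviations and $-1 \le r \le 1$.

The closer to –1, the stronger the negative linear relationship.
The closer to 1, the stronger the positive linear relationship.
The closer to 0, the weaker the linear relationship.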
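
A minimal Python sketch of the sample covariance and the coefficient of correlation, assuming the standard definitions (divide by n - 1, and r = cov(X, Y) / (S_X S_Y)). The data are made up and the function names are illustrative.

```python
import math

def sample_covariance(xs, ys):
    """Sum of (X - mean_X)(Y - mean_Y), divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """r = cov(X, Y) / (S_X * S_Y), always between -1 and 1."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # cov(X, X) is the variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)

xs = [1, 2, 3, 4, 5]      # hypothetical data
ys = [2, 4, 6, 8, 10]     # perfectly increasing with xs
print(sample_covariance(xs, ys), correlation(xs, ys))
```

The covariance depends on the units of X and Y, while r is unit-free, which is why r is the measure used to judge strength.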
CHAP 4: BASIC PROBABILITY
I. Basic Probability Concepts
- Probability: 0 ≤ P ≤ 1
- Impossible Event: P = 0
- Certain Event: P = 1
II. Mutually Exclusive: events that cannot occur simultaneously
A ∩ B = ∅
III. Collectively Exhaustive: one of the events must occur
A ∪ B = S
IV. Multiplication rule
P(A and B) = P(A ∩ B) = P(A) x P(B|A) = P(B) x P(A|B)
V. Conditional probability: the probability of one event, given that another event has
occurred
P(A|B) = P(A and B) / P(B)
VI. Independent Events
A and B are independent if and only if either of the following holds:
- P(A|B) = P(A)
- P(A ∩ B) = P(A) x P(B)
Bayes' rule (holds for any events with P(B) > 0):
P(A|B) = P(B|A) x P(A) / P(B)
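
The conditional probability, independence, and Bayes formulas can be checked numerically with a short Python sketch. The probabilities used are hypothetical and the function names are illustrative.

```python
def conditional(p_a_and_b, p_b):
    """P(A|B) = P(A and B) / P(B)."""
    return p_a_and_b / p_b

def bayes(p_b_given_a, p_a, p_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

def are_independent(p_a, p_b, p_a_and_b, tol=1e-12):
    """Independent when P(A and B) equals P(A) * P(B), up to rounding."""
    return abs(p_a_and_b - p_a * p_b) <= tol

# Hypothetical values: P(A) = 0.4, P(B) = 0.5, P(A and B) = 0.2
p_a, p_b, p_ab = 0.4, 0.5, 0.2
p_a_given_b = conditional(p_ab, p_b)   # P(A|B)
p_b_given_a = conditional(p_ab, p_a)   # P(B|A)
print(p_a_given_b, bayes(p_b_given_a, p_a, p_b), are_independent(p_a, p_b, p_ab))
```

Here P(A|B) equals P(A), so both independence conditions hold, and Bayes' rule returns the same P(A|B) as the direct definition.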

VII. Counting Rule


1. Counting Rule 1:
If any one of k different mutually exclusive and collectively exhaustive
events can occur on each of n trials, the number of possible outcomes is

$k^n$
2. Counting Rule 2:
If there are $k_1$ events on the first trial, $k_2$ events on the second trial, … and $k_n$
events on the nth trial, the number of possible outcomes is

$(k_1)(k_2)\cdots(k_n)$

3. Counting Rule 3:
The number of ways that n items can be arranged in order is

$n! = (n)(n - 1)\cdots(1)$
4. Counting Rule 4:
Permutations: the number of ways of arranging X objects selected
from n objects in order is

${}_nP_X = \dfrac{n!}{(n - X)!}$

5. Counting Rule 5:
Combinations: the number of ways of selecting X objects from n
objects, irrespective of order, is

${}_nC_X = \dfrac{n!}{X!\,(n - X)!}$
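
The five counting rules map directly onto functions in Python's standard `math` module (Python 3.8 or later). The numbers below are made-up examples.

```python
import math

k, n = 6, 2
rule1 = k ** n                # a 6-sided die rolled twice: 36 outcomes
rule2 = math.prod([3, 4, 2])  # 3, 4 and 2 choices on three trials: 24 outcomes
rule3 = math.factorial(5)     # orderings of 5 items: 120
rule4 = math.perm(5, 3)       # ordered selections of 3 from 5: 60
rule5 = math.comb(5, 3)       # unordered selections of 3 from 5: 10

print(rule1, rule2, rule3, rule4, rule5)
```

Rule 4 and Rule 5 differ only by the factor X!, the number of orderings of the selected objects: 60 = 10 x 3!.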

CHAP 5: DISCRETE PROBABILITY DISTRIBUTIONS


I. Binomial Distribution
X ~ B(n, π)
X = number of successes in n trials
The n trials are independent
Each trial: success with probability p = π, failure with probability q = 1 – π
$P(X = k) = {}_nC_k \, \pi^k (1 - \pi)^{n-k}$
Mean: E(X) = nπ
Variance: V(X) = nπ(1 – π)
Standard deviation: $\sigma = \sqrt{V(X)}$
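
A short Python sketch of the binomial probability, mean, variance, and standard deviation. The parameter name `pi` stands for π, and the example values n = 4, π = 0.5 are hypothetical.

```python
import math

def binomial_pmf(k, n, pi):
    """P(X = k) = nCk * pi^k * (1 - pi)^(n - k)."""
    return math.comb(n, k) * pi ** k * (1 - pi) ** (n - k)

def binomial_summary(n, pi):
    """Return (mean, variance, standard deviation) of B(n, pi)."""
    mean = n * pi
    variance = n * pi * (1 - pi)
    return mean, variance, math.sqrt(variance)

# Hypothetical example: 4 independent trials, success probability 0.5
print(binomial_pmf(2, 4, 0.5))   # probability of exactly 2 successes
print(binomial_summary(4, 0.5))
```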

II. Poisson Distribution


X = number of events that occur in an interval of length T
λ = mean number of events that occur in one unit of time
$P(X = k) = e^{-\lambda T} \, \dfrac{(\lambda T)^k}{k!}$
E(X) = V(X) = λT
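
The Poisson probability can be sketched in Python as follows. The rate λ = 2 events per unit of time and the interval length T = 1.5 are hypothetical values, and the function name is illustrative.

```python
import math

def poisson_pmf(k, lam, t=1.0):
    """P(X = k) = e^(-lam*t) * (lam*t)^k / k! for an interval of length t."""
    mu = lam * t  # expected number of events in the interval
    return math.exp(-mu) * mu ** k / math.factorial(k)

# Hypothetical example: 2 events per unit of time, interval of length 1.5
probabilities = [poisson_pmf(k, 2, 1.5) for k in range(60)]
print(probabilities[0])      # probability of no events in the interval
print(sum(probabilities))    # close to 1, since larger k are negligible
```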
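III. Covariance of a Probability Distribution
$\sigma_{XY} = \sum (x - \mu_X)(y - \mu_Y)\, P(x, y)$
$\sigma_{XY} > 0$ => X and Y move in the same direction
$\sigma_{XY} < 0$ => X and Y move in opposite directions

IV. Portfolio Expected Return

E(P) = w.E(X) + (1 – w).E(Y), where w is the proportion of the portfolio invested in asset X

V. Portfolio Risk
$\sigma_P = \sqrt{w^2 \sigma_X^2 + (1 - w)^2 \sigma_Y^2 + 2w(1 - w)\sigma_{XY}}$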
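
A minimal Python sketch of the portfolio expected return and portfolio risk formulas. All numeric inputs are hypothetical, and `w` is taken to be the share of the portfolio placed in asset X.

```python
import math

def portfolio_expected_return(w, e_x, e_y):
    """E(P) = w * E(X) + (1 - w) * E(Y)."""
    return w * e_x + (1 - w) * e_y

def portfolio_risk(w, var_x, var_y, cov_xy):
    """Standard deviation of the portfolio return."""
    variance = w ** 2 * var_x + (1 - w) ** 2 * var_y + 2 * w * (1 - w) * cov_xy
    return math.sqrt(variance)

# Hypothetical inputs: half the money in each asset, negative covariance
print(portfolio_expected_return(0.5, 10, 20))
print(portfolio_risk(0.5, 100, 400, -100))
```

With a negative covariance the portfolio risk (about 8.66 here) is lower than the risk of either asset alone (10 and 20), which is the usual argument for diversification.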

CHAP 6: CONTINUOUS RANDOM VARIABLES


I. Uniform Distribution
X ~ U[a, b]
$P(a' \le X \le b') = (b' - a') \cdot \dfrac{1}{b - a}$, for a ≤ a' ≤ b' ≤ b
$E(X) = \dfrac{a + b}{2}$ = Median
$V(X) = \dfrac{(b - a)^2}{12}$
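
The uniform distribution formulas can be sketched in Python. The clipping of the interval to [a, b] is an addition of this sketch so that out-of-range limits still give a valid probability; the example values are hypothetical.

```python
def uniform_probability(a, b, lo, hi):
    """P(lo <= X <= hi) for X ~ U[a, b], with the interval clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) / (b - a)

def uniform_mean(a, b):
    """E(X) = (a + b) / 2, which is also the median."""
    return (a + b) / 2

def uniform_variance(a, b):
    """V(X) = (b - a)^2 / 12."""
    return (b - a) ** 2 / 12

# Hypothetical example: X ~ U[0, 10]
print(uniform_probability(0, 10, 2, 5), uniform_mean(0, 10), uniform_variance(0, 10))
```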
II. Normal distribution
$X \sim N(\mu, \sigma^2)$
$P(X \le a) = P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$
a > µ => P(X ≤ a) > 0.5
a < µ => P(X ≤ a) < 0.5
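
The standardization step can be sketched with `statistics.NormalDist` from the Python standard library, which plays the role of the Z table. The mean 100 and standard deviation 15 are hypothetical values.

```python
from statistics import NormalDist

def normal_cdf(a, mu, sigma):
    """P(X <= a) for X ~ N(mu, sigma^2), computed through the Z-score."""
    z = (a - mu) / sigma
    return NormalDist().cdf(z)  # standard normal: mean 0, std. dev. 1

# Hypothetical example: mu = 100, sigma = 15
print(normal_cdf(100, 100, 15))  # a equals the mean
print(normal_cdf(115, 100, 15))  # a above the mean, Z = 1
print(normal_cdf(85, 100, 15))   # a below the mean, Z = -1
```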

CHAP 7: SAMPLING DISTRIBUTION


I. Sampling distribution of Mean
If the population is normal, or n ≥ 30:
$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
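
As a sketch, and assuming the standard result that the sample mean is approximately normal with mean µ and standard deviation σ/√n (the standard error), probabilities about a sample mean can be computed as below. The population values and function names are hypothetical.

```python
import math
from statistics import NormalDist

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def prob_sample_mean_at_most(a, mu, sigma, n):
    """P(sample mean <= a) under the normal sampling distribution."""
    return NormalDist(mu, standard_error(sigma, n)).cdf(a)

# Hypothetical population: mu = 50, sigma = 10, samples of size n = 25
print(standard_error(10, 25))
print(prob_sample_mean_at_most(52, 50, 10, 25))
```

The same value 52 is only 0.2 standard deviations above the mean for a single observation, but a full standard error above it for a mean of 25 observations.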

CHAP 9: TEST OF HYPOTHESIS FOR A SINGLE SAMPLE


(1) H0: µ = μ0
H1: µ ≠ μ0 (two-tailed)
(2) H0: µ ≥ μ0
H1: µ < μ0 (left-tailed)
(3) H0: µ ≤ μ0
H1: µ > μ0 (right-tailed)
Remark:
α = P(Type I error) = P(rejecting H0 when H0 is true)
β = P(Type II error) = P(failing to reject H0 when H0 is false)
For a fixed sample size, as α increases, β decreases.

p-value < α => Reject H0

p-value ≥ α => Fail to reject H0
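
The p-value decision rule can be illustrated with a Z test for the mean. This sketch assumes the population standard deviation σ is known, which the notes do not state; the sample figures and function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n, tail):
    """p-value of a Z test for the mean; tail is 'two', 'left' or 'right'."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf(z)
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    return 2 * min(cdf, 1 - cdf)  # two-tailed

def decide(p_value, alpha=0.05):
    """Reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Hypothetical sample: n = 25, sigma = 10, H0 value mu0 = 50
p_two = z_test_p_value(52, 50, 10, 25, "two")      # Z = 1
p_right = z_test_p_value(54, 50, 10, 25, "right")  # Z = 2
print(p_two, decide(p_two))
print(p_right, decide(p_right))
```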

CHAP 11: ANOVA


I. One Way ANOVA

II. Two Way ANOVA



CHAP 13: SIMPLE REGRESSION
Prediction Line (Best fit line): Ŷ = b0 + b1X
b0 is the estimated mean value of Y when X = 0
b1 estimates the change in the mean value of Y as a result of a one-unit increase in X.

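Coefficient of Correlation: |r| = Multiple R, and r has the same sign as b1

R Square (coefficient of determination): $r^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}$

Standard Error of Estimate: $S_{YX} = \sqrt{\dfrac{SSE}{n - 2}}$

SSE = error sum of squares
SSR = regression sum of squares, SST = total sum of squares = SSR + SSE
n = sample size

Residual: the difference between an observed value of Y and its predicted value.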



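T Test for slope: $t = \dfrac{b_1 - \beta_1}{S_{b_1}}$, with n – 2 degrees of freedom

F Test for Slope: $F = \dfrac{MSR}{MSE}$, with 1 and n – 2 degrees of freedom

T test for correlation coefficient: $t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$, with n – 2 degrees of freedom

|r| = Multiple R (r takes the sign of b1), ρ = 0 under H0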
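
The regression quantities named in this chapter can be computed from scratch in a short Python sketch. It uses the standard least-squares formulas, several of which the notes name without printing, so treat them as assumptions of the example; the five data points are made up.

```python
import math

def simple_regression(xs, ys):
    """Least-squares fit of Y on X with the usual summary statistics."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ssx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / ssx                      # slope
    b0 = mean_y - b1 * mean_x           # intercept
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((y - mean_y) ** 2 for y in ys)
    s_yx = math.sqrt(sse / (n - 2))     # standard error of the estimate
    return {
        "b0": b0,
        "b1": b1,
        "r_square": 1 - sse / sst,
        "s_yx": s_yx,
        "t_slope": b1 / (s_yx / math.sqrt(ssx)),    # tests H0: slope = 0
        "f": (sst - sse) / (sse / (n - 2)),         # MSR / MSE
        "residuals": residuals,
    }

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
fit = simple_regression(xs, ys)
print(fit["b0"], fit["b1"], fit["r_square"])
```

In simple regression the F statistic equals the square of the t statistic for the slope, so both tests lead to the same conclusion.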
