Econometric S
I. Probability
1. Counting Rules
1.1. The Addition and Multiplication Principles in Combinatorics
Combinatorics often focuses on counting the number of ways an event can occur or the number
of elements in a set. Two of the most fundamental and important rules in combinatorics are the
addition principle and the multiplication principle.
(*) The Addition Principle
Theory:
Statement: If a task (or a choice) can be performed in two (or more) mutually exclusive
ways (i.e., they do not overlap), then the total number of ways to perform this task is the
sum of the number of ways for each separate method.
Expression:
o If there are m ways to perform it by method 1 and n ways by method 2, with the
two methods being mutually exclusive (cannot happen simultaneously), then the
total number of ways is m + n.
o This principle can be extended for multiple mutually exclusive options: if there
are m ways for option 1, n ways for option 2, etc., then the total number of ways
is m + n +…
Illustrative Examples:
Example 1: “To get to point C, one can depart from A or from B.”
o There are 3 roads from A to C.
o There are 4 roads from B to C.
o The two choices “starting at A” and “starting at B” are mutually exclusive (you
cannot start from both A and B at the same time).
o Therefore, by the addition principle, the number of ways to reach C is 3 + 4 = 7.
Example 2: Suppose you have 2 elective course groups: Group 1 has 5 courses, and
group 2 has 4 courses. You only need to pick 1 course from one of the two groups (not
from both at once). Thus, the total number of course choices is 5 + 4 = 9.
(*) The Multiplication Principle
Theory:
Statement: If a task (or a choice) is divided into several consecutive stages, and each
stage has a certain number of ways to be performed, then the total number of ways to
carry out the entire task is the product of the number of ways in each stage.
Expression:
o If the task has 2 stages: stage 1 can be done in m ways, and stage 2 can be done in
n ways, then the total number of ways to complete both stages in sequence is
m×n.
o Likewise, if the task is divided into multiple stages with m1,m2,…,mk ways,
respectively, then the total number of ways to complete the entire task is
m1×m2×⋯×mk.
Illustrative Examples:
Example 1: “To get to point C, you can only depart from A, then must go through B,
before finally reaching C.”
o There are 3 roads from A to B.
o There are 4 roads from B to C.
o Traveling from A to B is stage 1, choosing 1 of 3 roads; then, traveling from B to
C is stage 2, choosing 1 of 4 roads.
o Because these two stages happen in sequence, by the multiplication principle, the
number of ways is 3×4=12.
Example 2: You need to create a password of 2 characters: the first character is a letter
(26 possibilities), and the second character is a digit (10 possibilities). The total number
of different passwords is 26×10=260.
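The road and password examples above can be checked by brute-force enumeration. The following is a minimal Python sketch; the road labels (a1, ab1, bc1, and so on) are hypothetical names introduced only for illustration, and the password letters are assumed to be the 26 uppercase English letters.

```python
from itertools import product
from string import ascii_uppercase, digits

# Addition principle: start from A (3 roads) OR from B (4 roads).
# Road labels are hypothetical placeholders.
roads_from_a = ["a1", "a2", "a3"]
roads_from_b = ["b1", "b2", "b3", "b4"]
addition_count = len(roads_from_a + roads_from_b)

# Multiplication principle: go A -> B (3 roads) THEN B -> C (4 roads).
roads_a_to_b = ["ab1", "ab2", "ab3"]
roads_b_to_c = ["bc1", "bc2", "bc3", "bc4"]
routes_via_b = list(product(roads_a_to_b, roads_b_to_c))
multiplication_count = len(routes_via_b)

# Two-character password: one letter followed by one digit.
passwords = [letter + digit for letter, digit in product(ascii_uppercase, digits)]
password_count = len(passwords)

print(addition_count, multiplication_count, password_count)
```

Enumerating the outcomes is only practical for small cases, but it is a quick way to confirm that the two principles were applied to the right structure (alternatives versus consecutive stages).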
1.2. Permutations
Definition: Given n distinct elements, each way of arranging (ordering) all n elements is
called a permutation of those n elements.
Formula: P(n) = n! (read as “n factorial”)
Where n! = n × (n−1) ×⋯×2×1.
Example: Suppose we have 3 different books A, B, C. The number of ways to line them
up in a row (where order matters) is 3! = 1×2×3 = 6.
1.2.1. Permutations with repetition
We have a total of n elements, but not all are distinct. For instance, the letters of the word
“BANANA,” where A appears 3 times, N appears 2 times, and B appears 1 time.
If there are n elements in total, divided into groups by type (suppose there are k types),
where type 1 appears n1 times, type 2 appears n2 times, …, and type k appears nk times,
then:
The number of distinct permutations can be calculated by the formula
$$\frac{n!}{n_1! \, n_2! \cdots n_k!}$$
Example: The word “BANANA” has 6 letters: B(1), A(3), N(2). Total n = 6. The
number of distinct arrangements of these 6 letters is
$$\frac{6!}{1! \cdot 2! \cdot 3!} = \frac{720}{12} = 60$$
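The BANANA count can be cross-checked in two ways. The sketch below, which assumes nothing beyond the Python standard library, applies the formula and also enumerates every ordering and discards duplicates; the function name `multiset_permutations` is a hypothetical helper introduced for illustration.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def multiset_permutations(word):
    """Count distinct orderings of the letters: n! / (n1! * n2! * ... * nk!)."""
    result = factorial(len(word))
    for repeat_count in Counter(word).values():
        result //= factorial(repeat_count)
    return result

formula_count = multiset_permutations("BANANA")
# Brute force: generate all 6! orderings and keep only the distinct ones.
brute_force_count = len(set(permutations("BANANA")))

print(formula_count, brute_force_count)
```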
1.3. Combinations
Definition: From n distinct elements, we want to select k of them, where order does not
matter. Each selection of k elements is called a combination of k from n.
Formula:
$$nCk = \frac{n!}{k! \, (n-k)!}$$
1.4. Arrangements
Definition: From n distinct elements, one wants to select k of them and arrange them in
order. Each such ordered selection is called an arrangement of k out of n elements or a
partial permutation.
Formula:
$$nAk = \frac{n!}{(n-k)!}$$
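To make the difference between combinations and arrangements concrete, here is a small Python sketch. The values n = 5 and k = 3 and the item labels are hypothetical, chosen only for illustration; `math.comb` and `math.perm` require Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, k = 5, 3          # hypothetical sizes for illustration
items = "ABCDE"      # n distinct elements

combination_count = comb(n, k)   # nCk: order does not matter
arrangement_count = perm(n, k)   # nAk: order matters

# Cross-check by listing the selections explicitly.
listed_combinations = list(combinations(items, k))
listed_arrangements = list(permutations(items, k))

# Each combination can be ordered in k! ways, so nAk = nCk * k!.
ratio = arrangement_count // combination_count

print(combination_count, arrangement_count, ratio)
```

The relation nAk = nCk × k! explains why the two formulas differ only by the factor k! in the denominator.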
2. Probabilities and Events
2.1. Definition
Consider a random experiment:
o The sample space, denoted by S, is the set of all possible outcomes of the
experiment; n(S) denotes the number of outcomes in S.
o If there are m possible outcomes of the experiment, then we will generally
number them 1 through m. Then S = {1, 2, …, m} and n(S) = m.
o When dealing with specific examples, we will usually give more descriptive
names to the outcomes.
o An event, denoted by A, is a set of outcomes that we are interested in when
conducting an experiment; in other words, it is a subset of the sample space.
n(A) denotes the number of outcomes in A.
o A simple event is an outcome or an event that cannot be further broken down
into simpler components.
o The sample space for an experiment consists of all possible simple events; that
is, the sample space consists of all outcomes that cannot be broken down any
further.
Notation for Probabilities
o P denotes a probability
o A, B, C, D, etc. denote specific events
o P(A) denotes the probability that event A occurs. When all outcomes in the
sample space are equally likely:
$$P(A) = \frac{n(A)}{n(S)}$$
Probability limits
o The probability of an impossible event is 0
o The probability of an event that is certain to occur is 1
o For any event A, the probability of A is between 0 and 1 inclusive. That is:
0 ≤ P( A )≤ 1
o Let S = {1, 2, 3, …, m} be the set of all possible outcomes of an experiment
o Let Pi be the probability that i is the outcome of the experiment. Then:
$$P(S) = \sum_{i=1}^{m} P_i = 1$$
o For any two events A and B:
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
Disjoint events
o Events A and B are disjoint (or mutually exclusive) if they cannot occur at the
same time. (That is, disjoint events do not overlap.) In other words, P(A ∩ B) = 0.
o If events A and B are disjoint, then:
$$P(A \cup B) = P(A) + P(B)$$
Independent events
o Events A and B are independent if the occurrence of A does not change the
probability of B:
$$P(B \mid A) = P(B)$$
Multiplication Rule
$$P(AB) = P(B \mid A) \, P(A)$$
Rule of Average Conditional Probabilities
o If B1, B2, B3, ..., Bn is a disjoint partition of the sample space S, then any
event A splits into the disjoint pieces AB1, AB2, ..., ABn, and
$$P(A) = P(AB_1) + P(AB_2) + \dots + P(AB_n) = P(A \mid B_1) P(B_1) + P(A \mid B_2) P(B_2) + \dots + P(A \mid B_n) P(B_n)$$
Bayes Theorem
o Suppose that when conducting an experiment we are interested in an event B and
its complement $\bar{B}$, and that within each of them the event A may or may
not occur (A and $\bar{A}$). Then
$$P(B \mid A) = \frac{P(AB)}{P(A)} = \frac{P(A \mid B) \, P(B)}{P(A \mid B) \, P(B) + P(A \mid \bar{B}) \, P(\bar{B})}$$
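The two-event form of Bayes' theorem translates directly into a short function. The sketch below is illustrative only: the function name `bayes_posterior` and the input numbers (a prior of 1/5 and likelihoods of 9/10 and 3/10) are hypothetical and are not taken from any example in this section.

```python
from fractions import Fraction

def bayes_posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """Return P(B|A) from P(B), P(A|B) and P(A|not B)."""
    # Denominator is P(A), obtained from the rule of average conditional probabilities.
    p_a = p_a_given_b * prior_b + p_a_given_not_b * (1 - prior_b)
    return p_a_given_b * prior_b / p_a

# Hypothetical inputs, chosen only to show the mechanics.
posterior = bayes_posterior(Fraction(1, 5), Fraction(9, 10), Fraction(3, 10))
print(posterior, float(posterior))
```

Using `Fraction` keeps the arithmetic exact, which makes it easier to compare the result with a hand calculation.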
Example 1: A coin is flipped twice. Assuming that all four points in the sample space S={(h, h),
(h, t), (t, h), (t, t)} are equally likely, what is the conditional probability that both flips land on
heads, given that (a) the first flip lands on heads, and (b) at least one of the flips lands on heads?
(*) Solution
Let A = {(h, h)} be the event that both flips land on heads; let B = {(h, h), (h, t)} be the event
that the first flip lands on heads; and let C = {(h, h), (h, t), (t, h)} be the event that at least one of
the flips lands on heads. We have the following solutions:
$$\text{(a)} \quad P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t)\})} = \frac{0.25}{0.5} = 0.5$$
$$\text{(b)} \quad P(A \mid C) = \frac{P(AC)}{P(C)} = \frac{P(\{(h,h)\})}{P(\{(h,h),(h,t),(t,h)\})} = \frac{0.25}{0.75} = \frac{1}{3}$$
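Because the four outcomes are equally likely, both conditional probabilities can be verified by counting outcomes. This is a minimal Python sketch; the helper names `prob` and `cond_prob` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space of two coin flips: (h,h), (h,t), (t,h), (t,t).
sample_space = set(product("ht", repeat=2))

def prob(event):
    """P(event) under equally likely outcomes: n(event) / n(S)."""
    return Fraction(len(event), len(sample_space))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)

A = {("h", "h")}                                     # both flips are heads
B = {o for o in sample_space if o[0] == "h"}         # first flip is heads
C = {o for o in sample_space if "h" in o}            # at least one head

p_a_given_b = cond_prob(A, B)
p_a_given_c = cond_prob(A, C)
print(p_a_given_b, p_a_given_c)
```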
Example 2: Suppose that two balls are to be withdrawn, without replacement, from an urn that
contains 9 blue and 7 yellow balls. If each ball drawn is equally likely to be any of the balls in
the urn at the time, what is the probability that both balls are blue?
(*) Solution
Let B1 and B2 denote, respectively, the events that the first and second balls withdrawn are blue.
$$P(B_1 B_2) = P(B_2 \mid B_1) \, P(B_1) = \frac{8}{15} \cdot \frac{9}{16} = \frac{3}{10}$$
Example 3: Suppose that, with probability 0.52, the closing price of a stock is at least as high as
the close on the previous day and that the results for successive days are independent. Find the
probability that the closing price goes down in each of the next four days but not on the
following day.
(*) Solution
Let Ai be the event that the closing price goes down on day i, so that P(Ai) = 1 − 0.52 = 0.48.
Then, by independence, we have
$$P(A_1 A_2 A_3 A_4 \bar{A}_5) = 0.48^4 \cdot 0.52 \approx 0.0276$$
Example 4: There are 6 passengers and 10 train cars. Calculate the probability that the first car
has 2 passengers, the second car has 3 passengers, and the third car has 1 passenger.
(*) Solution
One person has ten ways to choose a car, so six people have $10^6$ ways to choose a car.
Let A be the event that the first car has 2 passengers, the second car has 3 passengers, and the
third car has 1 passenger.
$$P(A) = \frac{(6C2 \cdot 10A1)(4C3 \cdot 9A1)(1C1 \cdot 8A1)}{10^6} = 0.0432$$
Example 5: Each Vietlott lottery ticket consists of a sequence of 6 randomly chosen numbers.
To win the jackpot, your entire 6-number sequence must match the winning numbers
exactly.
To win the first prize, your ticket must contain 5 consecutive numbers that match 5
consecutive winning numbers.
To win the second prize, your ticket must contain 4 consecutive numbers that match 4
consecutive winning numbers.
To win the third prize, your ticket must contain 3 consecutive numbers that match 3
consecutive winning numbers.
What is the probability that a person who buys 4 random tickets wins all four prizes (jackpot,
first, second, and third prizes)? Suppose that the sequences on each ticket are independent.
(*) Solution
Assume a sequence of 6 random numbers on each ticket is abcdef, then each number (a, b, c, d,
e, or f) will have 10 available choices (from 0 to 9), and the number chosen in a vacant slot can
be repeated.
→ The sample space contains $10^6$ sequences
Let A, B, C, and D be events that he/she wins the jackpot, first, second, and third prizes,
respectively.
Suppose the consecutive winning number is 112234, and his/her sequence is abcdef
The probability of winning a Jackpot
o There is only one way to choose each of a, b, c, d, e, and f, so
n(A) is 1
$$P(A) = \frac{1}{10^6}$$
The probability of winning a First prize
o You need 5 consecutive numbers that match 5 consecutive winning numbers.
o Hence, your sequence must be
Case 1: abcdef = 11223f, giving 1*1*1*1*1*10 = 10 sequences
Case 2: abcdef = a12234, giving 10*1*1*1*1*1 = 10 sequences
o For the free number, a or f, you have 10 available choices; hence, the number of
favorable outcomes is 20
$$P(B) = \frac{20}{10^6}$$
The probability of winning a Second prize
$$P(C) = \frac{n(C)}{n(S)} = \frac{3 \cdot 10^2}{10^6}$$
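Let $A_i$ denote the event that the frog is on leaf A at step i, and $\bar{A}_i$ its complement. By Bayes' theorem,
$$P(A_i \mid \bar{A}_{i-1}) = \frac{P(\bar{A}_{i-1} \mid A_i) \, P(A_i)}{P(\bar{A}_{i-1})} \;\rightarrow\; P(\bar{A}_{i-1} \mid A_i) \, P(A_i) = P(A_i \mid \bar{A}_{i-1}) \, P(\bar{A}_{i-1}), \quad (1)$$
Where:
$$P(\bar{A}_{i-1} \mid A_i) = 1$$
(The frog is sure to jump to leaf B, C, or D if it already stands on leaf A, so it cannot be on leaf A at two consecutive steps.)
$$P(A_i \mid \bar{A}_{i-1}) = \frac{1}{3}$$
(When standing on a leaf other than A, the frog can choose 1 of 3 leaves to jump to, one of which is leaf A.)
$$(1) \leftrightarrow P(A_i) = \frac{1}{3} P(\bar{A}_{i-1}) = \frac{1}{3} \left( 1 - P(A_{i-1}) \right)$$
$$\rightarrow P(A_2) = \frac{1}{3} \left( 1 - P(A_1) \right) = \frac{1}{3}$$
$$\rightarrow P(A_3) = \frac{1}{3} \left( 1 - P(A_2) \right) = \frac{1}{3} \cdot \frac{2}{3} = \frac{2}{9}$$
$$\rightarrow P(A_4) = \frac{1}{3} \left( 1 - P(A_3) \right) = \frac{1}{3} \left( 1 - \frac{2}{9} \right) = \frac{7}{27}$$
$$\rightarrow P(A_5) = \frac{1}{3} \left( 1 - P(A_4) \right) = \frac{1}{3} \left( 1 - \frac{7}{27} \right) = \frac{20}{81}$$
$$\rightarrow P(A_6) = \frac{1}{3} \left( 1 - P(A_5) \right) = \frac{1}{3} \left( 1 - \frac{20}{81} \right) = \frac{61}{243}$$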
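The recursion used in the derivation above can be iterated mechanically. The sketch below assumes the starting value P(A1) = 0, which is the value implied by the step P(A2) = 1/3 in the derivation; the function name `prob_on_leaf_a` is a hypothetical helper.

```python
from fractions import Fraction

def prob_on_leaf_a(step, p_first=Fraction(0)):
    """Iterate P(A_i) = (1/3) * (1 - P(A_{i-1})) starting from P(A_1) = p_first."""
    p = p_first
    for _ in range(step - 1):
        p = Fraction(1, 3) * (1 - p)
    return p

# Probabilities for steps 1 through 6 (P(A_1) = 0 is an assumption, see above).
leaf_a_probs = [prob_on_leaf_a(i) for i in range(1, 7)]
for i, p in enumerate(leaf_a_probs, start=1):
    print(f"P(A_{i}) = {p}")
```

Iterating further shows the values oscillating toward 1/4, the fixed point of p = (1 − p)/3.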
Example 7: There are 10 balls, of which 3 are white and 7 are black, in an urn. Three players –
A, B, C – successively draw from the urn: A first, then B, then C, then A, and so on. The winner
is the first one to draw a white ball. What is the winning probability for each player?
(*) Solution
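Let A, B, and C be the events that it is the turn of player A, B, and C, respectively.
We denote by W the event that the player draws a white ball, and by W” the event that the player draws a black ball.
Hence, W|A is the event that player A wins the game, and W”|A is the event that player A
draws a black ball and the turn passes to player B, and so on. The balls are drawn without
replacement, so player A can win on draw 1, 4, or 7, player B on draw 2, 5, or 8, and player C on
draw 3 or 6.
$$P(W \mid A) = \frac{3}{10} + \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{5}{8} \cdot \frac{3}{7} + \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{5}{8} \cdot \frac{4}{7} \cdot \frac{3}{6} \cdot \frac{2}{5} \cdot \frac{3}{4} = 0.45$$
$$P(W \mid B) = \frac{7}{10} \cdot \frac{3}{9} + \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{5}{8} \cdot \frac{4}{7} \cdot \frac{3}{6} + \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{5}{8} \cdot \frac{4}{7} \cdot \frac{3}{6} \cdot \frac{2}{5} \cdot \frac{1}{4} \cdot 1 = 0.325$$
$$P(W \mid C) = \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{3}{8} + \frac{7}{10} \cdot \frac{6}{9} \cdot \frac{5}{8} \cdot \frac{4}{7} \cdot \frac{3}{6} \cdot \frac{3}{5} = 0.225$$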
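The three winning probabilities can be recomputed exactly by walking through the draws one at a time. This is a minimal sketch that assumes, as the solution does, that balls are drawn without replacement; the function name `win_probabilities` is hypothetical.

```python
from fractions import Fraction

def win_probabilities(white=3, black=7, players=("A", "B", "C")):
    """P(each player is the first to draw a white ball), drawing without replacement."""
    win = {player: Fraction(0) for player in players}
    p_all_black_so_far = Fraction(1)
    total = white + black
    # The first white ball appears on draw black + 1 at the latest.
    for draws_done in range(black + 1):
        remaining = total - draws_done
        player = players[draws_done % len(players)]
        # This player wins now if every earlier draw was black and this one is white.
        win[player] += p_all_black_so_far * Fraction(white, remaining)
        p_all_black_so_far *= Fraction(black - draws_done, remaining)
    return win

win = win_probabilities()
print({player: float(p) for player, p in win.items()})
```

The three probabilities sum to 1 because some player must eventually draw a white ball.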
Example 8: There are 3 bouquets of flowers. Bouquet I has 8 roses, Bouquet II has 7 lilies, and
Bouquet III has 6 tuberoses. A random selection of 7 flowers is chosen from the 3 bouquets to
be placed in a vase. Calculate the probability that the number of roses in the 7 selected flowers is
equal to the number of lilies.
Example 9: A box contains 8 white balls and 12 black balls. Two balls are randomly drawn
one after another. Calculate the probability that both selected balls are of the same color.
Example 10: There are 13 students from a high school who have achieved the title of
Outstanding Student. Among them, Grade 12 has 8 male students and 3 female students,
while Grade 11 has 2 male students. Three students are randomly selected to receive an award.
Calculate the probability that the selected group of 3 students includes both male and female
students and contains at least one student from each of Grade 11 and Grade 12.
Example 11: A box contains 12 balls of the same size, including 5 blue balls numbered from
1 to 5, 4 red balls numbered from 1 to 4, and 3 yellow balls numbered from 1 to 3. Two balls
are randomly drawn. Calculate the probability that the two selected balls differ in both color
and number.
Example 12: A football tournament consists of 9 teams, including 6 international teams and 3
teams from Vietnam. The organizing committee conducts a random draw to divide the teams into
3 groups (A, B, C), with each group containing 3 teams. Calculate the probability that the 3
Vietnamese teams are placed in 3 different groups.
Example 13: Suppose a box contains 4 red balls and 5 white balls (balls of the same color are
identical). Balls are drawn randomly one by one from the box until all are taken out. Calculate
the probability of drawing all 4 red balls before drawing 3 white balls. (Round the result to the
nearest percent).
Example 14: There is a row of 6 chairs arranged in a horizontal line. Six people, consisting of 3
male students, 2 female students, and 1 teacher, are randomly seated in these chairs, with each
person occupying one chair. Calculate the probability that the teacher sits between the two
female students.
Example 15: In a residential area, 15% of the population both smoke and have
nasopharyngeal cancer, 25% smoke and do not have nasopharyngeal cancer, 50% neither
smoke nor have nasopharyngeal cancer, and 10% do not smoke but have nasopharyngeal
cancer. Based on the statistical data above, is it correct to conclude that the risk of
nasopharyngeal cancer among smokers in this residential area is 2.25 times higher than that of
non-smokers?
(*) Solution
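Define A as the event that the person smokes and B as the event that the person has
nasopharyngeal cancer, with $\bar{A}$ and $\bar{B}$ denoting their complements.
Given that
$$P(A \cap B) = 15\%, \quad P(A \cap \bar{B}) = 25\%, \quad P(\bar{A} \cap \bar{B}) = 50\%, \quad P(\bar{A} \cap B) = 10\%$$
The risk of cancer among smokers is
$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(AB)}{P(A\bar{B}) + P(AB)} = \frac{0.15}{0.25 + 0.15} = \frac{3}{8}$$
(P(A|B), by contrast, is the probability that a person smokes given that they have nasopharyngeal cancer.)
The risk of cancer among non-smokers is
$$P(B \mid \bar{A}) = \frac{P(\bar{A}B)}{P(\bar{A})} = \frac{P(\bar{A}B)}{P(\bar{A}B) + P(\bar{A} \cap \bar{B})} = \frac{0.1}{0.1 + 0.5} = \frac{1}{6}$$
$$\rightarrow \frac{P(B \mid A)}{P(B \mid \bar{A})} = 2.25$$
Hence, it is correct to conclude that the risk of nasopharyngeal cancer among smokers in this
residential area is 2.25 times higher than that of non-smokers.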
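The ratio of the two conditional probabilities can be recomputed from the four joint percentages. The following Python sketch is illustrative; the dictionary keys and variable names are hypothetical labels.

```python
from fractions import Fraction

# Joint probabilities from the problem statement (keys are hypothetical labels).
joint = {
    ("smoker", "cancer"): Fraction(15, 100),
    ("smoker", "no cancer"): Fraction(25, 100),
    ("non-smoker", "no cancer"): Fraction(50, 100),
    ("non-smoker", "cancer"): Fraction(10, 100),
}

def cancer_risk(group):
    """P(cancer | group) = P(group and cancer) / P(group)."""
    p_group = joint[(group, "cancer")] + joint[(group, "no cancer")]
    return joint[(group, "cancer")] / p_group

risk_smokers = cancer_risk("smoker")
risk_non_smokers = cancer_risk("non-smoker")
relative_risk = risk_smokers / risk_non_smokers
print(risk_smokers, risk_non_smokers, float(relative_risk))
```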
Example 16: A factory producing electrical equipment operates three production lines, with
each line contributing 40%, 35%, and 25% of the total product output, respectively. The
probability that a product from line 1 is defective is 2%, from line 2 is 3%, and from line 3 is 4%.
A product is randomly selected from the factory and found to be defective. What is the
probability that this product came from line 1?
Example 17: Suppose a certain country has an experiment to test the COVID-19 vaccine with
two groups: 70% of participants received the vaccine (in group A) and 30% of them did not
receive the vaccine (in group B). After the test, for groups A and B, the probability of “a
negative result” is 90% and 60%, respectively.
(a) What is the probability that a randomly chosen person gets a negative result? What is the
probability that a randomly chosen person gets a positive result?
(b) If a person gets a negative result, what is the probability that she or he came from group A?
(c) If a person gets a positive result, what is the probability that she or he came from group B?
Example 18: You ask your neighbor to water a sickly plant while you are on vacation. Without
water, it will die with a probability of 0.8; with water, it will die with a probability of 0.15. You
are 90 percent certain that your neighbor will remember to water the plant.
(a) What is the probability that the plant will be alive?
(b) If it is dead, what is the probability that your neighbor forgot to water it?
Example 19: Last year at Northern Manufacturing Company, 200 people had colds during the
year. One hundred fifty-five people who did not exercise had colds, and the remainder of the
people with colds were involved in a weekly exercise program. Half of the 1,000 employees
were involved in some type of exercise.
(a) What is the probability that an employee will have a cold next year?
(b) Given that an employee is involved in an exercise program, what is the probability that he or
she will get a cold next year?
(c) What is the probability that an employee who is not involved in an exercise program will get
a cold next year?
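(*) Solution
Let A be the event that an employee has a cold and B the event that the employee is involved in
an exercise program; A” and B” denote their complements.
                  B (exercise)    B” (no exercise)    Total
A (cold)          45/1000         155/1000            200/1000
A” (no cold)      455/1000        345/1000            800/1000
Total             500/1000        500/1000            1000/1000
(a) P(A) = 200/1000 = 0.2
(b) P(A|B) = (45/1000) / (500/1000) = 0.09
(c) P(A|B”) = (155/1000) / (500/1000) = 0.31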
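The three probabilities in this example follow from the counts in the problem statement. A minimal Python sketch, with hypothetical variable names, is:

```python
from fractions import Fraction

# Counts taken from the problem statement.
employees = 1000
exercisers = employees // 2
colds_total = 200
colds_no_exercise = 155
colds_exercise = colds_total - colds_no_exercise

p_cold = Fraction(colds_total, employees)
p_cold_given_exercise = Fraction(colds_exercise, exercisers)
p_cold_given_no_exercise = Fraction(colds_no_exercise, employees - exercisers)

print(float(p_cold), float(p_cold_given_exercise), float(p_cold_given_no_exercise))
```

Working from counts rather than from the fractions over 1000 avoids dividing two fractions by hand.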
Example 20: Suppose we have 3 cards identical in form except that both sides of the first card
are colored red, both sides of the second card are colored black, and one side of the third card is
colored red and the other side is colored black. The 3 cards are mixed up in a hat, and 1 card is
randomly selected and put down on the ground. If the upper side of the chosen card is colored
red, what is the probability that the other side is colored black?
Example 21: By using NLP, I can detect spam emails in my inbox. Assume that the word ‘offer’
occurs in 80% of the spam messages in my account. Also, let’s assume the ‘offer’ occurs in 10%
of my desired e-mails. If 30% of the received e-mails are considered spam, and I will receive a
new message which contains an ‘offer’, what is the probability that it is spam?
Example 22: The New York State Health Department reports a 10% rate of HIV for the “at-risk”
population. Under certain conditions, a preliminary screening test for HIV is correct 95% of the
time. (Subjects are not told that they are HIV infected until additional tests verify the results.) If
someone is randomly selected from the at-risk population, what is the probability that they have
HIV if it is known that they have tested positive in the initial screening? If someone is randomly
selected from the at-risk population, what is the probability that they have HIV if it is known that
they have tested negative in the initial screening?
Example 23: A survey showed that 8% of Internet users age 18 and older report keeping a blog.
Referring to the 18–29 age group as young adults, the survey showed that for bloggers, 54% are
young adults, and for non-bloggers, 24% are young adults.
(a) Develop a joint probability table for these data with two rows (bloggers vs. non-bloggers) and
two columns (young adults vs. older adults).
(b) What is the probability that an Internet user is a young adult?
(c) What is the probability that an Internet user who keeps a blog is a young adult?
(d) Suppose that in a follow-up phone survey, we contact someone who is an older adult. What is
the probability that this person keeps a blog?
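(*) Solution
Let A be the event that the user keeps a blog and B the event that the user is a young adult, so
P(A) = 0.08, P(B|A) = 0.54, and P(B|A”) = 0.24.
(a) Joint probability table:
                    B (Young)    B” (Not young)    Total
A (Blogger)         0.0432       0.0368            0.08
A” (Non-blogger)    0.2208       0.6992            0.92
Total               0.264        0.736             1
(b) P(B) = 0.264
(c) P(B|A) = 0.0432 / 0.08 = 0.54. For comparison, P(A|B) = 0.0432 / 0.264 ≈ 0.1636 is the
probability that a young adult keeps a blog.
(d) P(A|B”) = 0.0368 / 0.736 = 0.05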
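The joint probability table for this example can be rebuilt from the one marginal and two conditional probabilities given in the problem. The sketch below is illustrative only; the variable names and dictionary keys are hypothetical.

```python
from fractions import Fraction

# Inputs from the problem statement, kept exact by parsing decimal strings.
p_blogger = Fraction("0.08")
p_young_given_blogger = Fraction("0.54")
p_young_given_non_blogger = Fraction("0.24")

# Joint probabilities: P(row and column) = P(column | row) * P(row).
joint = {
    ("blogger", "young"): p_young_given_blogger * p_blogger,
    ("blogger", "older"): (1 - p_young_given_blogger) * p_blogger,
    ("non-blogger", "young"): p_young_given_non_blogger * (1 - p_blogger),
    ("non-blogger", "older"): (1 - p_young_given_non_blogger) * (1 - p_blogger),
}

p_young = joint[("blogger", "young")] + joint[("non-blogger", "young")]
p_older = 1 - p_young
p_blogger_given_older = joint[("blogger", "older")] / p_older

print(float(p_young), float(p_blogger_given_older))
```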
Example 24: A certain virus infects one in every 200 people. A test used to detect the virus in a
person is positive 90% of the time if the person has the virus and 10% of the time if the person
does not have the virus. (This 10% result is called a false positive.) Let A be the event "the
person is infected" and B be the event "the person tests positive".
(a) Find the probability that a person has the virus, given that they have tested positive. Round
your answer to the nearest tenth of a percent, and do not include a percent sign.
(b) Find the probability that a person does not have the virus, given that they test negative.
Round your answer to the nearest tenth of a percent, and do not include a percent sign.
Example 25: Dollon Web Security Consultants requires all job applicants to submit to a test for
illegal drugs. If the applicant has used illegal drugs, the test has a 90 percent chance of a positive
result. If the applicant has not used illegal drugs, the test has an 85 percent chance of a negative
result. 4 percent of the job applicants have used illegal drugs. If an applicant has a positive test,
what is the probability that he or she has used illegal drugs?
Example 26: During a period in World War II, the U.S. Army Air Forces (AAF) would send
over 300 B-17 bombers daily to raid factories in Germany. These missions, originating in the
U.K., were dangerous. At the peak of the campaign, the return probability for a B-17 crew was
only 80%. In trying to reduce the probability of a failed mission, a Navy statistician, Abraham
Wald, was put in charge of studying the damage patterns in the B-17s that successfully made it
back from a mission. His ultimate goal was to decide where to add extra armor to the planes (you
could not just add heavy armor everywhere, as the planes would be too heavy to fly!). Wald was
able to learn that if a plane made it back from a mission, there was a 67% probability that it was
shot in the fuselage, 15% in the fuel systems, 10% in the cockpit area, and 8% in the engines.
From experiments, Wald was also able to deduce that during combat, a B-17 would be shot in
the fuselage with 56% probability, in the fuel systems with 14%, in the cockpit area with 14%,
and in the engine with 16%. Calculate the probabilities of being hit in the fuel systems and in the
engines, respectively, given that the plane did not return.
(*) Solution
Example 27: In a study on behavior and traffic safety, a team of experts randomly surveyed 800
participants. The collected data revealed that 250 individuals who were of legal driving age and
actively participating in traffic had consumed alcohol. Additionally, 350 participants were under
the legal driving age but were still involved in traffic, while 450 individuals refrained from
drinking alcohol while participating in traffic. Calculate the probability that a randomly selected
individual who participates in traffic does not consume alcohol, given that they are of legal
driving age.
Example 28: You are a corporate investment analyst. After tracking stock A for a while, you
know that the probability of a positive return of over 20% on this stock is 54.25%. Stock A
has a 30% probability of generating a positive return of at most 20% when the
market interest rate increases and a 75% probability of generating a positive return of at most 20%
when the market interest rate decreases. Suppose your boss avoids investing on days when
market interest rates are down, and you are sure that today the positive return on the stock will be
over 20%. What advice would you give?
(*) Solution
Let X be the return of stock A, so that
P(X>0.2) = 0.5425
A is the event that the market interest rate increases
A” is the event that the market interest rate decreases
P(X≤0.2|A) = 0.3 -> P(X>0.2|A) = 0.7
P(X≤0.2|A”) = 0.75 -> P(X>0.2|A”) = 0.25
P(A”|X>0.2) = P(A” ∩ X>0.2) / P(X>0.2) = P(A”)∗P(X>0.2|A”) / P(X>0.2)
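One way to finish the calculation set up above is to recover P(A) from the rule of average conditional probabilities and then apply Bayes' theorem. The sketch below assumes that the interest rate either increases or decreases (so A and A” form a partition); the variable names are hypothetical.

```python
from fractions import Fraction

p_high = Fraction("0.5425")                 # P(X > 0.2)
p_high_given_up = 1 - Fraction("0.3")       # P(X > 0.2 | A)
p_high_given_down = 1 - Fraction("0.75")    # P(X > 0.2 | A'')

# Total probability: p_high = p_high_given_up * p + p_high_given_down * (1 - p).
# Solve for p = P(A), assuming rates either increase or decrease.
p_up = (p_high - p_high_given_down) / (p_high_given_up - p_high_given_down)
p_down = 1 - p_up

# Bayes' theorem: P(A'' | X > 0.2).
p_down_given_high = p_down * p_high_given_down / p_high

print(float(p_up), float(p_down_given_high))
```

Under these assumptions, a small value of P(A”|X>0.2) would indicate that a day with a return above 20% is unlikely to be a day of falling interest rates.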
Example 29: With the advent of online ordering, shipping has become more important to
consumers. One survey showed that 80% of online consumers want same-day shipping. Another
study showed that 24% of online consumers are shopping lovers who enjoy buying and
purchasing often. In addition, suppose that 61% of online consumers who are shopping lovers
want same-day shipping. If an online consumer is randomly selected, what is the probability that
(a) The consumer wants same-day shipping and is a shopping lover?
(b) The consumer does not want same-day shipping, given that the consumer is a shopping
lover?
(c) The consumer is not a shopping lover and the consumer does want same-day shipping?
(d) The consumer does not want same-day shipping, given that the consumer is not a shopping
lover?
                      Same-day shipping    Not same-day shipping    Total
Shopping lover        14.64%               9.36%                    24%
Not shopping lover    65.36%               10.64%                   76%
Total                 80%                  20%                      100%
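The entries of the table above, and the answers to parts (a) through (d), can be derived from the three percentages in the problem statement. This Python sketch is a minimal illustration with hypothetical variable names.

```python
from fractions import Fraction

# Inputs from the problem statement.
p_same_day = Fraction("0.80")
p_lover = Fraction("0.24")
p_same_day_given_lover = Fraction("0.61")

# (a) P(same-day and shopping lover)
p_same_day_and_lover = p_same_day_given_lover * p_lover
# (b) P(not same-day | shopping lover)
p_not_same_day_given_lover = 1 - p_same_day_given_lover
# (c) P(same-day and not shopping lover)
p_same_day_and_not_lover = p_same_day - p_same_day_and_lover
# (d) P(not same-day | not shopping lover)
p_not_lover = 1 - p_lover
p_not_same_day_and_not_lover = p_not_lover - p_same_day_and_not_lover
p_not_same_day_given_not_lover = p_not_same_day_and_not_lover / p_not_lover

for label, value in [("a", p_same_day_and_lover), ("b", p_not_same_day_given_lover),
                     ("c", p_same_day_and_not_lover), ("d", p_not_same_day_given_not_lover)]:
    print(label, float(value))
```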
Example 30: A health survey conducted in a locality showed that 41% of people tested positive
for COVID-19. The survey participants were divided into two groups: vaccinated and
unvaccinated. The false positive rate for COVID-19 was 20%, while the false negative rate was
10%. In the unvaccinated group, the mortality rate was 85% and 60% for those who tested
negative and positive, respectively. In addition, the rate of positive people who died from
COVID-19 in the vaccinated group was 65%. Calculate the probability of randomly selecting a
person who was vaccinated and tested positive for COVID-19, knowing that this person
survived.
II. Descriptive statistics
1. The concept of statistics
When analyzing data series consisting of many observations, employing summary measures to
highlight the most critical characteristics of such series is often highly beneficial. These
measures are commonly known in financial and economic research as summary statistics or
descriptive statistics. Generally, descriptive statistics are calculated based on a specific data
sample rather than inferred or established through existing theoretical frameworks. Before
delving into the discussion of the most widely used summary statistics applied to financial data,
it is crucial first to clarify two fundamental concepts in statistics: population and sample.
The population refers to the entire set of objects or elements of interest in a particular study. For
instance, when assessing the relationship between risk and return for stocks in the UK market,
the relevant population would encompass all stocks listed on the London Stock Exchange (LSE).
In practice, researching this entire population is typically unfeasible due to limitations in
resources, time constraints, or data availability. Therefore, researchers generally choose to
examine a smaller subset of the population, referred to as a sample. A sample is a selection of
items drawn from the population, intended to represent the larger set. The population is defined
as finite when the number of elements it contains is clearly determinable.
Typically, the chosen sample must be representative and appropriate, allowing researchers to
generalize their analytical findings to the original population. A random sample is defined as one
in which each individual within the population has an equal and fair probability of being
selected. In certain cases, researchers may employ a stratified sampling method, meaning the
entire population is subdivided into distinct layers or groups based on specific criteria.
Subsequently, the researcher selects a certain number of observations from each stratum,
ensuring adequate representation of all groups within the population.
Lastly, the sample size refers to the number of observations selected within the sample for
conducting research. The size of the sample can either be predetermined by the researcher or
dictated by technical requirements for estimating particular parameters of the model being
constructed.
2. Measures of Central Tendency
In statistical analysis, the average value of a data series is often referred to as a measure of
location or a measure of central tendency. This measure serves as a representative value,
intended to reflect the ‘typical’ characteristic of the data set. There are three commonly used
methods for measuring central tendency: the arithmetic mean, median, and mode. Among these,
the most widely recognized and extensively utilized method is the arithmetic mean (commonly
known simply as the ‘mean’). For a data series $r_i$ consisting of $N$ elements, the arithmetic
mean ($\bar{r}_A$) is calculated by dividing the sum of all observed values by the total
number of observations: