ASSIGNMENT FINAL REPORT
Qualification Pearson BTEC Level 5 Higher National Diploma in Business
Unit number and title Unit 42: Statistics for management
Submission date Date Received 1st submission
Re-submission Date Date Received 2nd submission
Student Name Student ID
Class Assessor name
Plagiarism
Plagiarism is a particular form of cheating. Plagiarism must be avoided at all costs and students who break the rules, however innocently, may be penalised. It is
your responsibility to ensure that you understand correct referencing practices. As a university level student, you are expected to use appropriate references
throughout and keep carefully detailed notes of all your sources of materials for material you have used in your work, including any material downloaded from the
Internet. Please consult the relevant unit lecturer or your course tutor if you need any further advice.
Student Declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I declare that the work submitted for
assessment has been carried out without assistance other than that which is acceptable according to the rules of the specification. I certify I have clearly referenced
any sources and any artificial intelligence (AI) tools used in the work. I understand that making a false declaration is a form of malpractice.
Student’s signature
Grading grid
P1 P2 P3 P4 P5 P6 M1 M2 M3 M4 D1 D2 D3
❒ Summative Feedback: ❒ Resubmission Feedback:
Grade: Assessor Signature: Date:
Internal Verifier’s Comments:
Signature & Date:
I. Introduction
In light of the continuous advancements in technology and the increasing
availability of information, the use of statistical methods in information systems
and decision-making processes is becoming crucial for businesses. As a Research
Analyst at a company aiming to enhance its information system and decision-
making capabilities, this report is centered around the application of various
statistical techniques to analyze business data, including financial information
and stock market data, as well as to forecast microeconomic and
macroeconomic trends. The report's main objective is to introduce methods for
gathering and processing data from published sources, while applying different
analytical techniques to present business and economic data effectively. It will
also examine the distinctions between descriptive, exploratory, and
confirmatory analysis methods when applied to such data.
The structure of the report is as follows: First, an evaluation of business and
economic data sourced from published materials will be provided. Following
that, raw business data will be analyzed and assessed using a range of statistical
methods, including descriptive statistics, inferential analysis, and correlation
measurements. Lastly, the report will conclude by emphasizing the significance
of applying statistical methods to improve the business's information system
and decision-making process.
II. Evaluate business and economic data/information obtained from
published sources
1. Data
Definition: Data are the facts and figures collected, analyzed, and summarized
for presentation and interpretation. Data, on their own, have no meaning. All
the data collected in a particular study are referred to as the data set for the
study. Data can be represented in many forms such as text, images, audio,
video, measurements, or any format that can be recorded and stored.
(Newbold.P, 2013)
Ex: Collect information about the heights of 10 participants. Numbers like 160
cm, 170 cm, 165 cm, 180 cm, 175 cm.
2. Type of data
2.1 Qualitative data
Qualitative data is descriptive information that focuses on concepts and
characteristics, rather than numbers and statistics. The data cannot be counted,
measured or expressed numerically. Researchers collect qualitative data from
text, images, audio and video files, and other sources. After collecting the data,
they analyze it and share it through data visualization tools, such as word
clouds, timelines, graph databases and infographics (Robert Sheldon, 2023).
Ex: In a survey about food preferences: Responses like "Sweet," "Salty," "Sour,"
"Spicy"
2.1.1 Nominal data
Nominal data is categorical data that cannot be ranked or ordered. The data is
distinguished solely by its labels (Robert Sheldon, 2023).
EX: A dataset might include a category for eye colors, with values such as green,
gray, brown, blue, amber and hazel.
2.1.2 Ordinal data
Ordinal data is categorical data that can be ranked or ordered in a related way.
The data forms a hierarchy that establishes the relationship between the data.
However, the relative differences between the hierarchy's levels cannot be
measured (Robert Sheldon, 2023).
Ex: A category of school grades might include A, B, C, D and F, but there's no
way of knowing if D is twice as bad as C or F is twice as bad as D.
2.2 Quantitative data
Quantitative data is information restricted to numerical values, making it
quantifiable and amenable to statistical analysis. It includes objective and
observable information stated in specified units; for example, height,
temperature, income, sales figures, population size, test scores, and weights are
the types of information that is considered quantitative data. These numerical
representations allow quantitative data to be mathematically examined,
allowing patterns and correlations to be identified. (Timonera.K, 2024)
Ex: In a survey about daily water consumption: Responses like 1 liter, 1.5 liters, 2
liters, 2.5 liters.
2.2.1 Interval
Interval data is a data type measured along a scale on which each point is
placed at an equal distance from the others. Interval data always appears in
the form of numbers or numerical values where the distance between any two
points is standardized and equal; however, the scale has no true zero.
Ex: Times of the day (1am, 2am, 3am, 4am, etc.)
2.2.2 Ratio
A Ratio is the fourth type of measurement scale and is quantitative in nature. It
is similar to interval data, where each value is placed at an equal distance from
its subsequent value. However, it has a ‘true zero’ which means that zero
possesses a meaning. The ratio contains the characteristics of nominal, ordinal,
and interval scales and is, therefore, used widely in market research. (Ingram.O,
2023)
Ex: Weight in kilograms is ratio data: 0 kg means the complete absence of weight, and 40 kg is exactly twice as heavy as 20 kg.
2.3 Data source
2.3.1 Primary data
A primary source provides direct or firsthand evidence about an event, object,
person, or work of art. Primary sources provide the original materials on which
other research is based and enable students and other researchers to get as
close as possible to what actually happened during a particular event or time
period. Published materials can be viewed as primary resources if they come
from the time period that is being discussed, and were written or produced by
someone with firsthand experience of the event. Often primary sources reflect
the individual viewpoint of a participant or observer. Primary sources can be
written or non-written (sound, pictures, artifacts, etc.). In scientific research,
primary sources present original thinking, report on discoveries, or share new
information (Sulbha Wagh, 2023).
Ex: The United Nations monitored the effects of climate change in rural
communities through 10,000 in-person interviews.
2.3.2 Secondary data
Secondary data is second-hand data that has already been collected and
recorded by other researchers for their own purposes. It refers to information that
has already been acquired and recorded by someone other than the user for a
purpose unrelated to the current research challenge. This type of data is
collected from various sources, such as government publications, internal
organizational records, books, journal articles, websites, and reports. (Singh.A,
2024)
Ex: You want to understand how often healthcare concerns were addressed in
political debates during the COVID-19 pandemic. You decide to analyze
transcripts of debates for the frequency of terms such as “pandemic,”
“healthcare,” “virus,” and “vaccination.”
2.4 Information
Information can be seen as data that has been organized or processed in a way
that adds context or meaning. It’s raw facts and figures that have been
structured but not yet interpreted or fully understood. (Hutchinson.C, 2024 )
Ex: The raw numbers 160 cm, 170 cm and 165 cm become information once they
are organized and labelled as "the heights of survey participants", because the
context gives the figures meaning.
2.5 Knowledge
Knowledge extends beyond mere information; it involves understanding,
interpreting, and applying that information effectively. It represents information
that has been processed and internalized through learning, experience, or
guidance. Knowledge can be viewed as solutions or recommendations derived
by analyzing and interpreting information according to specific rules or
frameworks.
Ex: While "a car travels at 60 km/h" is information, knowing that this speed
allows the car to cover 120 km in 2 hours is knowledge.
2.6 Different method of analysis
2.6.1 Descriptive statistics
Descriptive statistics are brief informational coefficients that summarize a given
data set, which can be either a representation of the entire population or a
sample of a population. Descriptive statistics are broken down into measures of
central tendency and measures of variability (spread). Measures of central
tendency include the mean, median, and mode, while measures of variability
include standard deviation, variance, minimum and maximum
variables, kurtosis, and skewness (Adam Hayes, 2024).
Ex: You calculate the average customer satisfaction score (mean = 8.2 out of
10), determine the most common satisfaction level (mode = 9), and measure
the spread of satisfaction scores (standard deviation = 1.4). These statistics
provide a clear summary of the overall satisfaction levels.
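These measures can be computed directly; below is a minimal Python sketch using hypothetical satisfaction scores (not the figures from the example above):

```python
# Descriptive statistics for a hypothetical set of satisfaction scores (1-10).
import statistics

scores = [9, 8, 9, 7, 10, 8, 9, 6, 8, 9]

mean = statistics.mean(scores)      # central tendency: average
median = statistics.median(scores)  # central tendency: midpoint of sorted data
mode = statistics.mode(scores)      # central tendency: most frequent value
stdev = statistics.stdev(scores)    # variability: sample standard deviation

print(mean, median, mode, round(stdev, 2))
```

Together these four numbers summarize the whole data set in the way the example describes.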
2.6.2 Exploratory analysis
Exploratory data analysis (EDA) is used by data scientists to analyze and
investigate data sets and summarize their main characteristics, often employing
data visualization methods (IBM, 2020).
Ex: You analyze survey data from customers to uncover trends. For instance,
you notice from visualization tools (like scatter plots or heatmaps) that younger
customers tend to rate satisfaction higher than older customers or that
customers prefer weekend visits over weekdays. These findings could guide
future research questions.
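This kind of exploratory grouping can be sketched in Python; the survey rows below are hypothetical and stand in for the visual inspection a scatter plot or heatmap would provide:

```python
# Exploratory sketch: group hypothetical survey responses by age group
# and compare average satisfaction, to surface a pattern worth testing.
from collections import defaultdict
import statistics

rows = [("young", 9), ("young", 8), ("older", 6),
        ("older", 7), ("young", 9), ("older", 5)]

by_group = defaultdict(list)
for group, score in rows:
    by_group[group].append(score)

for group, scores in by_group.items():
    print(group, round(statistics.mean(scores), 2))
```

A gap between group averages is not a conclusion, only a lead: it suggests a hypothesis for the confirmatory stage.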
2.6.3 Confirmatory analysis
Confirmatory Data Analysis is the part where you evaluate your evidence using
traditional statistical tools such as significance, inference, and confidence
(Sisense Team, 2023).
Ex: You hypothesize that "Customers who receive faster service are more
satisfied." To test this, you collect satisfaction scores and service time data.
Then, you use statistical techniques like correlation or regression analysis to
confirm if shorter service times are significantly associated with higher
satisfaction scores.
III. Analyse and evaluate raw business data using a number of statistical
methods
1. Statistical methods that are used to analyse and evaluate data
1.1 Qualitative data analysis
Qualitative data analysis is the process of making sense of this unstructured
information, uncovering patterns, themes, and insights that shed light on the
human experience. The goal is to transform raw data into actionable, decision-
driving findings that empower you to create products, services, and experiences
that truly resonate with your target audience (Kritika Oberoi, 2024).
1.2 Quantitative raw data analysis
Quantitative data is information that is represented in numerical values or
measurements. This numeric data type is often used in statistical analysis to find
patterns, trends, and relationships between variables (Airbyte, 2024).
Ex: Quantitative data is extensively used to analyze the financial market,
allowing investors to make well-informed decisions. Besides, businesses can
leverage this data to understand consumer behavior, preferences, and market
dynamics
The difference between qualitative and quantitative raw data analysis: qualitative analysis interprets non-numerical material (text, images, audio) to uncover themes and insights, while quantitative analysis applies mathematical and statistical techniques to numerical data to identify patterns, trends, and relationships between variables.
2. Descriptive statistics
2.1 Measures of central tendency
a. Mean
According to Almond (2021), the mean is the average of a data set. It can be
calculated by adding up all of the numbers in the data set and then dividing by
the total number of values in the set.
Formula of the mean:
x̄ = (Σxᵢ) / n
where x̄ is the point estimator of the population mean, Σxᵢ is the sum of the
values of the n observations, and n is the number of observations in the sample.
Ex: Suppose you have a sample of test scores from 3 students: 78, 75 and 63.
Find the sample mean.
Mean = (78 + 75 + 63) / 3 = 72
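The same calculation as a short Python sketch:

```python
# Sample mean: sum the observations, divide by how many there are.
scores = [78, 75, 63]
mean = sum(scores) / len(scores)
print(mean)  # 72.0
```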
b. Median
The median is the middle value (or midpoint) when a data set is ordered from
least to greatest Almond (2021).
Example
Suppose you have the following data set:
{9, 15, 13, 18, 26, 17, 21}
Arrange the data in ascending order:
{9, 13, 15, 17, 18, 21, 26}
The median is the number in the middle of the data set. In this case, the value
17 is the median because there are three numbers on either side.
Therefore, the median of this data set is 17.
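The same sort-and-pick-the-middle procedure in Python:

```python
# Median of an odd-length data set: sort, then take the middle element.
data = [9, 15, 13, 18, 26, 17, 21]
ordered = sorted(data)       # [9, 13, 15, 17, 18, 21, 26]
median = ordered[len(ordered) // 2]
print(median)  # 17
```

For an even number of values the median would instead be the average of the two middle elements.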
c. Variance
Variance is a statistical measurement that is used to determine the spread of
numbers in a data set with respect to the average value or the mean. The
standard deviation squared will give us the variance. Using variance we can
evaluate how stretched or squeezed a distribution is. (Cuemath, 2018)
Formula of variance:
The variance of a sample is: s² = Σ(xᵢ − x̄)² / (n − 1)
The variance of a population is: σ² = Σ(xᵢ − μ)² / N
Ex: Suppose we have the data set {5, 8, 12, 7} and we want to find the
population variance. The mean is μ = (5 + 8 + 12 + 7) / 4 = 8, so
σ² = [(5 − 8)² + (8 − 8)² + (12 − 8)² + (7 − 8)²] / 4 = (9 + 0 + 16 + 1) / 4 = 6.5.
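The population-variance formula can be checked in Python on the same data set:

```python
# Population variance: average squared deviation from the mean (divide by N).
data = [5, 8, 12, 7]
mu = sum(data) / len(data)                              # 8.0
pop_var = sum((x - mu) ** 2 for x in data) / len(data)
print(pop_var)  # 6.5
```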
d. Standard deviation
Standard deviation is a statistical measurement of the dispersion of a dataset
relative to its mean. If data points are further from the mean, there is a higher
deviation within the data set. It is calculated as the square root of the variance.
(Marshall.H, 2024)
Formula of standard deviation:
The standard deviation of a sample is: s = √s²
The standard deviation of a population is: σ = √σ²
Ex: The final semester scores of F, G, H, I, J are: 70, 80, 85, 90, 75.
Calculate the mean: mean = (70 + 80 + 85 + 90 + 75) / 5 = 80
Calculate the variance: s² = [(70 − 80)² + (80 − 80)² + (85 − 80)² + (90 − 80)² + (75 − 80)²] / (5 − 1) = 250 / 4 = 62.5
Standard deviation: s = √62.5 ≈ 7.91
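The same worked example in Python; statistics.stdev uses the sample (n − 1) denominator, matching the formula above:

```python
# Sample standard deviation of the semester scores.
import statistics

scores = [70, 80, 85, 90, 75]
s = statistics.stdev(scores)   # square root of the sample variance
print(round(s, 2))  # 7.91
```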
e. Use STATA to calculate
Figure 1
The ROA analysis reveals a significant diversity in the operational efficiency of
companies in the sample. On average, companies generate a profit of about
6.95% on total assets. However, the high standard deviation indicates a wide
dispersion in profitability among businesses. The presence of some companies
achieving very high ROA, while others are incurring losses, highlights the stark
differences in management effectiveness and asset utilization. The distribution
of ROA data tends to be right-skewed, with the mean higher than the median.
This suggests that a few companies have exceptionally high performance,
pulling up the average, while the majority of companies have average or
below-average ROA. The large standard deviation of ROA also reflects
investment risk. Investors can achieve high returns by investing in companies
with good ROA, but also face the risk of losses if they choose poorly performing
businesses.
The ROA histogram for the telecommunications industry reveals a rather
characteristic data distribution. A majority of companies in this industry cluster
in the ROA range of 0 to 0.2, forming a peak around 0.05 (5%). This indicates
that the asset utilization efficiency of businesses in this sector is primarily at an
average level. However, the chart also shows a significant dispersion of data,
with some companies achieving a maximum ROA of 30% and others incurring
losses with a negative ROA of 10%.
Key features of the chart:
Slight right skew: Although not very pronounced, the chart exhibits a slight
rightward skew. This suggests that a small number of companies in the industry
achieve superior performance compared to the rest.
No clear peak: Instead of a sharp peak, the chart shows a broad peak from 0 to
0.1. This indicates diversity in the performance of businesses within the
industry.
A noticeable left tail: The chart's tail extends into negative territory, indicating
a substantial number of companies with low or even negative ROA. This is also
reflected in the minimum ROA value of -10%.
Deeper analysis:
Dispersion: With a standard deviation of 0.02, the dispersion of the data is
relatively large. This means that there is a significant difference in performance
between companies in the industry.
Quartiles: 50% of companies in the industry have an ROA ranging from 3% to
7%. This shows that most companies have average performance.
Reasons for such distribution:
Industry nature: The telecommunications industry requires large initial
investments in infrastructure, leading to long capital recovery cycles and intense
competition. This may explain why some companies have low ROA.
Product life cycle: The telecommunications industry is constantly innovating,
leading to rapid obsolescence of products and services. This requires businesses
to invest continuously in research and development, affecting short-term ROA.
Business size: Larger businesses often have economies of scale, allowing them
to reduce costs and increase operational efficiency, resulting in higher ROA.
No profit: Only 12 businesses fall into this category, indicating a very small
proportion of companies are facing difficulties or may even be operating at a
loss.
Low profit: There are 104 businesses in this group, accounting for a significant
proportion. This suggests that a large number of businesses are operating at a
moderate level, with relatively low profits.
Medium profit: This group consists of 88 businesses. This number indicates that
a group of companies have achieved a stable level of profit.
High profit: With 164 businesses, this group represents the largest proportion.
This shows that a large number of companies are operating very efficiently and
achieving high profits.
Some more detailed observations:
The distribution is uneven: The number of businesses in each group varies,
indicating an uneven distribution of profits among businesses.
The majority of businesses are profitable: The total number of profitable
businesses (low-profit, medium-profit, high-profit) accounts for a very large
proportion compared to the number of unprofitable businesses. This suggests
that most businesses are operating efficiently.
Profits are concentrated at the medium and high levels: Most businesses are
concentrated in the medium and high-profit groups, indicating a generally
positive picture of the business situation.
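The skew and dispersion checks described above can be sketched in Python; the ROA values below are hypothetical, not the report's actual sample:

```python
# Compare mean vs. median and inspect quartiles to characterize an
# ROA distribution (hypothetical values for illustration).
import statistics

roa = [-0.10, 0.01, 0.03, 0.04, 0.05, 0.05, 0.06, 0.07, 0.09, 0.30]

mean = statistics.mean(roa)
median = statistics.median(roa)
q1, q2, q3 = statistics.quantiles(roa, n=4)      # quartile cut points
skew_hint = "right" if mean > median else "left"
print(round(mean, 3), median, (round(q1, 3), round(q3, 3)), skew_hint)
```

Here a single very high ROA pulls the mean above the median, the same mean-versus-median comparison used in the analysis above to judge skew.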
2.2 Inferential statistics
Inferential statistics is a branch of statistics that makes use of various
analytical tools to draw inferences about the population data from sample data.
Apart from inferential statistics, descriptive statistics forms another branch of
statistics. Inferential statistics help to draw conclusions about the population
while descriptive statistics summarizes the features of the data set. (CueMath,
2023)
Ex: Suppose the mean marks of 100 students in a particular country are known.
Using this sample information the mean marks of students in the country can be
approximated using inferential statistics.
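A sketch of this idea in Python, using hypothetical marks and a normal (z) approximation for a 95% confidence interval (a t-interval would be more appropriate for a sample this small):

```python
# Inferential sketch: estimate a population mean from a sample,
# with an approximate 95% confidence interval.
import statistics
from statistics import NormalDist

marks = [62, 71, 55, 68, 74, 60, 66, 70, 58, 65]  # hypothetical sample

n = len(marks)
x_bar = statistics.mean(marks)          # point estimate of the population mean
s = statistics.stdev(marks)             # sample standard deviation
z = NormalDist().inv_cdf(0.975)         # ~1.96 for a 95% interval
margin = z * s / n ** 0.5
print(x_bar, (round(x_bar - margin, 2), round(x_bar + margin, 2)))
```

The interval, not just the point estimate, is what lets the analyst generalize from the 100-student sample to the country's students with a stated level of confidence.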
Scatter plot between LEV and Size
The scatter plot illustrates the relationship between the leverage ratio (LEV) and
firm size (SIZE). Based on the chart, we can draw several observations:
Weak positive relationship: The regression line has a positive but relatively flat
slope, indicating a weak positive relationship between LEV and SIZE. This means
that, in general, larger firms tend to have higher leverage ratios compared to
smaller firms.
Large data dispersion: The data points are widely scattered around the
regression line, indicating that the relationship between the two variables is not
entirely strong. There are many large firms with low debt and many small firms
with high debt.
Explanation of the relationship:
Why do larger firms tend to have higher leverage?
Easier access to capital: Larger firms typically have better credit ratings and
easier access to debt financing at lower interest rates.
Diversification of funding sources: Larger firms can utilize a variety of funding
sources (equity, debt) to finance large investment projects, helping to reduce
risk.
Leverage effect: When the return on investment (ROI) is higher than the
interest rate on debt, using leverage can increase returns for shareholders.
Why don't all large firms have high leverage?
Conservative financial policies: Some large firms may have conservative
financial policies, preferring to use equity rather than debt to ensure financial
stability.
Industry characteristics: Some industries are capital-intensive, requiring large
initial investments, so firms in these industries often have lower leverage ratios.
Strategic objectives: Depending on the firm's strategic objectives, they may
choose to use leverage at different levels.
Recommendations from an economic perspective:
For firms:
Carefully consider using leverage: Using leverage can increase profits but also
carries higher risk. Firms should carefully evaluate the profitability of
investment projects, debt repayment capacity, and other risk factors before
deciding to use leverage.
Diversify funding sources: Do not rely too heavily on a single source of funding.
A combination of equity and debt can help mitigate risk.
Develop a robust financial plan: Firms should have a clear financial plan,
including risk scenarios, to ensure their ability to meet debt obligations.
For investors:
Assess the firm's risk management capabilities: When investing in a firm,
investors should carefully consider the firm's ability to manage risk, especially
risks associated with leverage.
Diversify investment portfolios: Do not invest too much in a single firm with a
high leverage ratio. Diversifying the investment portfolio can help reduce risk.
2.3 The differences in application between descriptive statistics, inferential
statistics and measuring association.
2.3.1 Descriptive statistics
Descriptive statistics are brief informational coefficients that summarize a given
data set, which can be either a representation of the entire population or a
sample of a population. Descriptive statistics are broken down into measures of
central tendency and measures of variability (spread). Measures of central
tendency include the mean, median, and mode, while measures of variability
include standard deviation, variance, minimum and maximum
variables, kurtosis, and skewness (Adam Hayes, 2024).
Ex: You calculate the average customer satisfaction score (mean = 8.2 out of
10), determine the most common satisfaction level (mode = 9), and measure
the spread of satisfaction scores (standard deviation = 1.4). These statistics
provide a clear summary of the overall satisfaction levels.
2.3.2 Inferential statistics
Inferential statistics is a branch of statistics that makes use of various
analytical tools to draw inferences about the population data from sample data.
Apart from inferential statistics, descriptive statistics forms another branch of
statistics. Inferential statistics help to draw conclusions about the population
while descriptive statistics summarizes the features of the data set. (CueMath,
2023)
Ex: Suppose the mean marks of 100 students in a particular country are known.
Using this sample information the mean marks of students in the country can be
approximated using inferential statistics.
2.3.3 Measuring association.
Measures of association refer to a wide variety of statistics that quantify
strength and direction of the relationship between exposure and outcome
variables, enabling comparison between different groups. The measure
calculated depends on the study design used to collect data. Odds ratios should
be used for case-control and cross-sectional studies, whereas relative risk
should be used in cohort studies. (Michelle Roberts, 2019)
Ex: In marketing, you can use regression analysis to predict revenue based on
advertising budget and evaluate the relationship between these factors.
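For instance, the odds ratio mentioned above for case-control designs can be computed from a 2×2 table; the counts below are hypothetical:

```python
# Odds ratio from a 2x2 exposure/outcome table: OR = (a*d) / (b*c).
#               outcome+   outcome-
# exposed         a=30       b=70
# unexposed       c=10       d=90

a, b, c, d = 30, 70, 10, 90
odds_ratio = (a * d) / (b * c)
print(round(odds_ratio, 2))
```

An odds ratio above 1 indicates that the exposed group has higher odds of the outcome than the unexposed group.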
2.3.4 The differences
In short, descriptive statistics summarize the data actually collected (the
central tendency and spread of a sample); inferential statistics go further,
using that sample to draw conclusions about the wider population; and
measures of association quantify the strength and direction of relationships
between variables. In practice the three are complementary: a study typically
describes its data first, then tests hypotheses about the population, and
finally examines how the variables relate to one another.
IV. Conclusion
In conclusion, this report has thoroughly examined business and economic data
by systematically analyzing information from various sources. We started by
explaining the basic ideas of data, information, and knowledge, showing how
they are connected and transformed. We then looked at different data analysis
methods—descriptive, exploratory, and confirmatory—detailing their uses and
differences in studying business and economic data. We also applied various
statistical methods to assess raw business data. Descriptive statistics helped us
measure central trends and variability, providing clear insights into the data.
Inferential statistics allowed us to make broader conclusions, and examining the
relationships between variables showed important business connections.
Overall, this report highlights the essential role of careful data analysis in
making well-informed business decisions, emphasizing the value of both
qualitative and quantitative methods in gaining useful insights.
V. References
Newbold, P., Carlson, W. L. and Thorne, B. M. (2013). Statistics for Business and
Economics. 4th ed. England: Pearson.
Bluman, A. G. (2017). Elementary Statistics: A Step by Step Approach. 10th ed.
McGraw-Hill Education.
Timonera, K. (2024) What is quantitative data? characteristics & examples,
Datamation.
Airbyte (2024). What is Quantitative Data Analysis? [online] Airbyte.com.
Available at: https://airbyte.com/data-engineering-resources/what-is-
quantitative-data-analysis#quantitative-vs-qualitative-data-analysis [Accessed 1
Dec. 2024].
CueMath (2023). Inferential Statistics - Definition, Types, Examples, Formulas.
[online] Cuemath. Available at: https://www.cuemath.com/data/inferential-
statistics/.
Hayes, A. (2024). Descriptive statistics: Definition, overview, types, example.
[online] Investopedia. Available at:
https://www.investopedia.com/terms/d/descriptive_statistics.asp.
Hillier, W. (2022). What is Secondary Data? [Examples, Sources & Advantages].
[online] CareerFoundry. Available at: https://careerfoundry.com/en/blog/data-
analytics/what-is-secondary-data/.
Hiter, S. (2021). What is Quantitative Data? | Datamation | Big Data.
Datamation. [online] 7 Apr. Available at: https://www.datamation.com/big-
data/what-is-quantitative-data/.
IBM (2020). What Is Exploratory Data Analysis? | IBM. [online] www.ibm.com.
Available at: https://www.ibm.com/topics/exploratory-data-analysis.
National Library of Medicine (2023). Qualitative Data | NNLM. [online]
www.nnlm.gov. Available at:
https://www.nnlm.gov/guides/data-glossary/qualitative-data.
Oberoi, K. (2024). Qualitative Data Analysis: A Complete Guide [2024]. [online]
www.linkedin.com. Available at: https://www.linkedin.com/pulse/qualitative-
data-analysis-complete-guide-2024-kritika-oberoi-bwvsf.
Robert Sheldon (2023). What is Qualitative Data? [online] SearchCIO. Available
at: https://www.techtarget.com/searchcio/definition/qualitative-data.
Roberts, M., Ashrafzadeh, S. and Asgari, M. (2019). Research Techniques Made
Simple: Interpreting Measures of Association in Clinical Research. Journal of
Investigative Dermatology, [online] 139(3), pp.502-511.e1.
doi:https://doi.org/10.1016/j.jid.2018.12.023.
Sisense (2019). Exploratory and Confirmatory Analysis: What’s the Difference? l
Sisense. [online] Sisense. Available at:
https://www.sisense.com/blog/exploratory-confirmatory-analysis-whats-
difference/.
Ingram, O. (2023) Ratio Data: Definition, examples, and analysis,
ResearchProspect.
Available at: https://www.researchprospect.com/ratio-data-definition-
examples-and-analysis/
Hutchinson, C. (2024) Information vs knowledge: What’s the difference and why
does it matter?, Claned.
Available at: https://claned.com/information-vs-knowledge/