
Understanding Probability and Statistics

The document covers the fundamentals of statistics, including the nature and importance of statistics, types of statistics (descriptive and inferential), and various data types and sampling techniques. It also discusses methods of data collection, experimental design, and common misuses of statistics. Additionally, it explains frequency distributions, types of graphs for data visualization, and the appropriate graph types based on data characteristics.

Uploaded by

23-60992

Math: Data Analysis

2nd Semester

Chapter 1: Nature of Probability and Statistics

Introduction to Statistics

 Statistics: The science of collecting, organizing, summarizing, analyzing, and drawing conclusions from data.

 Uses of Statistics:

o Analyzing survey results.

o Decision-making in scientific research.

o Used in fields like sports, public health, education, and business.

 Importance of Studying Statistics:

1. Understand statistical studies in various fields.

2. Conduct research, design experiments, and analyze data.

3. Make informed decisions as consumers and citizens.

1. Descriptive and Inferential Statistics

Descriptive Statistics

 Summarizes and describes the main features of a dataset.

 Focuses on presenting data without making inferences.

 Example:

o A company records test scores of 100 students.

o Descriptive statistics include:

 Mean (Average): 75

 Median: 78

 Standard Deviation: 10

 25% of students scored below 70



o These describe the dataset but do not generalize beyond it.
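
The summary measures named above can be computed with Python's standard `statistics` module. This is only a sketch: the ten scores below are hypothetical values invented for illustration, not the 100 scores from the example.

```python
import statistics

# Hypothetical scores, invented for illustration (not the example's data).
scores = [55, 62, 68, 70, 74, 78, 78, 81, 85, 89]

mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
stdev = statistics.stdev(scores)      # sample standard deviation (n - 1 denominator)
share_below_70 = sum(s < 70 for s in scores) / len(scores)

print(f"mean={mean}, median={median}, stdev={stdev:.2f}, below 70={share_below_70:.0%}")
```

Each number describes only these ten scores; nothing is claimed about any larger group.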

Inferential Statistics

 Makes inferences from samples to populations.

 Includes hypothesis testing, determining relationships, and making predictions.

 Example:

o Using test scores from 100 students to estimate the average math score of all students in a district.

o A confidence interval (e.g., 95% confidence that the district’s average score is between 73 and 77).

o Hypothesis testing to check if the average score significantly differs from 80.

o Generalizing results from a sample to a larger group.
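
The interval of 73 to 77 in the example can be reproduced with a large-sample z interval. The sketch below assumes the 100 students have the mean (75) and standard deviation (10) from the descriptive example, and that the sample standard deviation can stand in for the population value; the notes do not state either assumption.

```python
import math
from statistics import NormalDist

n = 100            # sample size from the example
sample_mean = 75   # assumed: mean from the descriptive example
sample_sd = 10     # assumed: standard deviation from the descriptive example

standard_error = sample_sd / math.sqrt(n)
z_crit = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence

ci_low = sample_mean - z_crit * standard_error
ci_high = sample_mean + z_crit * standard_error

# Two-sided z test of the hypothesis that the district mean equals 80.
z_stat = (sample_mean - 80) / standard_error
p_value = 2 * NormalDist().cdf(-abs(z_stat))

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f}), z = {z_stat:.1f}, p = {p_value:.2g}")
```

Under these assumptions the interval is roughly 73.04 to 76.96, and the test statistic of -5 lies far from zero, so the sample mean differs significantly from 80.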

2. Variables and Types of Data

Definition of Key Terms

 Variable: A characteristic or attribute that can assume different values.

 Data: The values (measurements or observations) that variables assume.

 Random Variable: A variable whose values are determined by chance.

 Data Set: A collection of data values.

 Datum: A single value in a dataset.

Types of Variables

1. Quantitative Variables

o Numerical and can be ordered or ranked.

o Types:

 Discrete Variables: Countable values (e.g., number of students, books on a shelf).

 Continuous Variables: Measurable values, including fractions and decimals (e.g., height, weight, temperature).

2. Qualitative Variables

o Non-numerical; categorized based on characteristics.

o Examples: Gender, religion, nationality, eye color.

Levels of Measurement

1. Nominal Level:

o Data categorized without ranking.

o Examples: Gender, blood type, nationality, car brands.

2. Ordinal Level:

o Data can be ranked, but the differences between ranks are not necessarily equal or measurable.

o Examples: Educational levels, customer satisfaction ratings, job performance ratings.

3. Interval Level:

o Data is ordered with equal intervals, but no true zero.

o Examples: Temperature (°C, °F), IQ scores, calendar years.

4. Ratio Level:

o Data has equal intervals and a true zero.

o Examples: Height, weight, income, distance traveled.

3. Data Collection and Sampling Techniques

Methods of Data Collection

1. Telephone Survey:

o Less costly than personal interviews; respondents may answer more candidly without face-to-face contact.



o Some people may not answer or have unlisted numbers.

2. Mailed Questionnaire:

o Covers a larger geographic area, maintains anonymity.

o Low response rates, potential for misunderstood questions.

3. Personal Interview:

o In-depth responses, but costly and may introduce interviewer bias.

Sampling Techniques

1. Random Sampling:

o Every member of the population has an equal chance of selection.

o Uses chance methods or random number generators.

2. Systematic Sampling:

o Selecting every kth member from the population, starting from a randomly chosen member among the first k.

o Example: Choosing every 40th subject from a population of 2000 gives a sample of 50.

3. Stratified Sampling:

o Population divided into subgroups (strata), then random samples taken from each.

o Example: Selecting students from different grade levels.

4. Cluster Sampling:

o Population divided into groups (clusters), then entire clusters are selected randomly.

o Example: Selecting all residents from randomly chosen apartment buildings.
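
The four techniques above can be sketched with Python's `random` module. The population of 2000 subject IDs, the grade-level strata, and the apartment-building clusters are all hypothetical stand-ins chosen to mirror the examples in the list; the sample sizes are arbitrary.

```python
import random

rng = random.Random(42)              # fixed seed so the sketch is repeatable
population = list(range(1, 2001))    # hypothetical subject IDs 1..2000

# 1. Random sampling: every subject has the same chance of selection.
simple = rng.sample(population, 50)

# 2. Systematic sampling: random start among the first k, then every kth subject.
k = 40
start = rng.randrange(k)
systematic = population[start::k]

# 3. Stratified sampling: a random sample is drawn from every stratum.
strata = {grade: [f"g{grade}-s{i}" for i in range(100)] for grade in (7, 8, 9, 10)}
stratified = [s for members in strata.values() for s in rng.sample(members, 5)]

# 4. Cluster sampling: whole clusters are chosen at random and kept in full.
clusters = {b: [f"bldg{b}-r{i}" for i in range(20)] for b in range(10)}
chosen = rng.sample(sorted(clusters), 3)
cluster_sample = [r for b in chosen for r in clusters[b]]

print(len(simple), len(systematic), len(stratified), len(cluster_sample))
```

The key contrast is between the last two: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.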

Other Sampling Methods

 Convenience Sampling: Uses readily available subjects (e.g., mall surveys).

 Volunteer Sampling: Respondents choose to participate (e.g., call-in surveys).

Types of Sampling Errors

1. Sampling Error:

o Occurs when a sample does not perfectly represent the population.

o Example: A survey estimates 55% support for a candidate, but actual votes show 52%.

2. Non-Sampling Error:

o Errors in data collection, recording, or survey design.

o Example: Biased survey questions, misreported income, or excluding certain groups.
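
Sampling error can be seen in a small simulation. The sketch below assumes a hypothetical population of 100,000 voters in which exactly 52% support the candidate, then polls 1,000 of them at random; the sizes are illustrative choices, not figures from the notes.

```python
import random

rng = random.Random(0)   # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = supports the candidate, 0 = does not.
population = [1] * 52_000 + [0] * 48_000
true_support = sum(population) / len(population)

sample = rng.sample(population, 1_000)
estimate = sum(sample) / len(sample)

# Sampling error: the gap between the sample estimate and the population value.
sampling_error = estimate - true_support

print(f"true={true_support:.1%}, estimate={estimate:.1%}, error={sampling_error:+.1%}")
```

Even with a properly random sample the estimate rarely equals the true value exactly. Non-sampling errors, such as a biased question, would add to this gap and would not shrink with a larger sample.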

4. Experimental Design

Types of Studies

1. Observational Study:

o Researcher observes subjects without manipulation.

o Types:

 Cross-Sectional Study: Data collected at one time.

 Retrospective Study: Uses past records.

 Longitudinal Study: Data collected over time.

o Example: Studying smoking habits and lung cancer over 10 years.

2. Experimental Study:

o Researcher manipulates variables to observe effects.

o Example: Studying effects of exercise on stress levels with control and experimental groups.

Advantages & Disadvantages



 Observational Study:

o ✅ Natural setting, ethical for sensitive topics.

o ❌ Cannot establish cause-and-effect, expensive.

 Experimental Study:

o ✅ Researcher controls variables, can determine cause-and-effect.

o ❌ May not apply to real-world settings, subjects may change behavior due to observation (Hawthorne effect).

5. Uses and Misuses of Statistics

Common Misuses of Statistics

1. Suspect Samples:

o Small, convenience, or volunteer samples may not represent the population.

2. Ambiguous Averages:

o Different measures of average (mean, median, mode) can be used to mislead.

3. Changing the Subject:

o Using different values to represent the same data to influence perception.

o Example: "3% budget increase" vs. "$6 million increase."

4. Detached Statistics:

o Claims without comparisons.

o Example: "This drug is 50% more effective" (compared to what?).

5. Implied Connections:

o Suggests relationships without proof.

o Example: "Eating fish may lower cholesterol."



6. Misleading Graphs:

o Improperly drawn graphs can exaggerate trends.

7. Faulty Survey Questions:

o Poorly worded questions influence responses.

o Example: "Do you support raising taxes to build a stadium?" vs. "Should a new stadium be built?"
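
To make the "Ambiguous Averages" item in the list above concrete, the sketch below computes three different "averages" for one data set. The salaries (in thousands) are hypothetical values invented for illustration, with one very large value included on purpose.

```python
import statistics

# Hypothetical salaries in thousands; the 400 is a deliberate outlier.
salaries = [30, 30, 35, 40, 45, 50, 400]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # middle value, resistant to the outlier
mode_salary = statistics.mode(salaries)      # most frequent value

print(f"mean={mean_salary}, median={median_salary}, mode={mode_salary}")
```

All three values (90, 40, and 30) can honestly be called the "average" salary, which is why a claim about an average should say which measure was used.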

Summary

 Statistics helps collect, analyze, and interpret data.

 Two branches: Descriptive (summarizing data) and Inferential (drawing conclusions).

 Data types: Quantitative (discrete, continuous) and Qualitative.

 Measurement levels: Nominal, Ordinal, Interval, Ratio.

 Sampling techniques: Random, Systematic, Stratified, Cluster.

 Research methods: Observational vs. Experimental studies.

 Beware of statistical misuse through biased sampling, misleading graphs, and ambiguous claims.

Chapter 2: Frequency Distribution and Graphs

1. Frequency Distribution

 Data collected in original form is called raw data.

 Nominal- or ordinal-level data that can be placed in categories are organized in categorical frequency distributions.

 A frequency distribution is a table that organizes data into classes or groups, showing the number of observations in each category.

 Purpose:

o Organizes large data sets into a manageable format.

o Makes patterns and trends easier to identify.

o Helps in data analysis and interpretation.

Components of a Frequency Distribution Table

1. Class Intervals – The specific ranges into which the data is grouped.

2. Frequency (f) – The number of values falling within a class interval.

3. Class Boundaries – The actual limits of a class interval that separate it from adjacent intervals.

4. Class Width – The difference between the lower boundaries of consecutive classes.

5. Midpoint – The central value of a class interval, calculated as:

$$\text{Midpoint} = \frac{\text{Lower Class Limit} + \text{Upper Class Limit}}{2}$$

6. Relative Frequency – The proportion of total observations that fall into a class, given by:

$$\text{Relative Frequency} = \frac{\text{Class Frequency}}{\text{Total Frequency}}$$

7. Cumulative Frequency – The running total of frequencies, indicating how many values are below a particular class boundary.
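
The components above can be computed for every class in one pass. The class limits and frequencies below are hypothetical values invented for illustration, and the boundaries are set half a unit beyond the limits on the assumption that the data are whole numbers.

```python
# Hypothetical class limits and frequencies, invented for illustration.
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [4, 10, 5, 1]

total = sum(frequencies)
class_width = classes[1][0] - classes[0][0]   # gap between consecutive lower limits

rows = []
running = 0
for (lower, upper), f in zip(classes, frequencies):
    running += f
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # assumes whole-number data
        "midpoint": (lower + upper) / 2,
        "frequency": f,
        "relative": f / total,
        "cumulative": running,
    })

for row in rows:
    print(row)
```

The relative frequencies sum to 1 and the last cumulative frequency equals the total, which makes a quick check on any hand-built table.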

2. Types of Frequency Distributions

1. Ungrouped Frequency Distribution:

o A simple count of occurrences for each individual data point.

o Used for small data sets or discrete data.

2. Grouped Frequency Distribution:

o Data is grouped into class intervals for better readability.

o Used for larger data sets or continuous data.

3. Relative Frequency Distribution:

o Expresses class frequencies as proportions or percentages.

o Useful for comparing different data sets of varying sizes.

4. Cumulative Frequency Distribution:

o Displays cumulative totals, showing how frequencies build up over intervals.

3. Constructing a Grouped Frequency Distribution



The following data represent the record high temperatures for each of the
50 states. Construct a grouped frequency distribution for the data using 7
classes.

112 100 127 120 134 118 105 110 109 112

110 118 117
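
A possible way to carry out the construction is sketched below. Only the 13 values shown above are used, not the full set of 50, so the counts illustrate the procedure rather than the complete answer. The rule for the class width (the smallest whole number that lets 7 classes cover every value) is an assumption about the intended method.

```python
# Only the 13 values that appear above; the full data set has 50.
temps = [112, 100, 127, 120, 134, 118, 105, 110, 109, 112, 110, 118, 117]
num_classes = 7

low, high = min(temps), max(temps)
# Smallest whole-number width for which num_classes classes cover low..high.
width = (high - low) // num_classes + 1

table = []
for i in range(num_classes):
    lower = low + i * width
    upper = lower + width - 1
    count = sum(lower <= t <= upper for t in temps)   # tally values in this class
    table.append((lower, upper, count))

for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

With these values the range is 34 and the width is 5, so the classes run from 100-104 up to 130-134.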

4. Graphical Representations of Frequency Distributions

 Graphs help in visualizing frequency distributions effectively.

 Types of Graphs:

A. Histogram

 A bar graph representing frequency distribution.

 Characteristics:

o No gaps between bars (unlike bar charts).

o X-axis represents class intervals.

o Y-axis represents frequencies.

o Heights of bars indicate the frequency of observations.

B. Frequency Polygon

 A line graph connecting midpoints of class intervals.

 Steps to construct:

1. Mark the class midpoints on the X-axis.

2. Plot each class frequency above its midpoint, using the Y-axis for frequencies.

3. Connect the points with straight lines.

C. Ogive (Cumulative Frequency Graph)

 A line graph that represents cumulative frequency.



 Types:

1. Less than Ogive – Shows cumulative frequency below each class boundary.

2. Greater than Ogive – Shows cumulative frequency above each class boundary.

 Usage:

o Helps determine median and percentiles.
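
The points behind a frequency polygon and a less-than ogive can be computed before any drawing is done. The class boundaries and frequencies below are hypothetical, and reading the median off the ogive by linear interpolation is one common convention rather than the only one.

```python
from itertools import accumulate

# Hypothetical class boundaries and frequencies, invented for illustration.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5]
frequencies = [2, 8, 18, 12]

# Frequency polygon: one point per class at (midpoint, frequency).
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
polygon_points = list(zip(midpoints, frequencies))

# Less-than ogive: (boundary, cumulative frequency), starting from zero.
cumulative = [0] + list(accumulate(frequencies))
ogive_points = list(zip(boundaries, cumulative))

def value_at(target):
    """Read the ogive at a cumulative count using linear interpolation."""
    for (x0, c0), (x1, c1) in zip(ogive_points, ogive_points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative range")

median_estimate = value_at(sum(frequencies) / 2)   # half of the observations
print(polygon_points, ogive_points, round(median_estimate, 2))
```

Other percentiles follow the same way: `value_at(0.9 * sum(frequencies))` would estimate the 90th percentile.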

D. Bar Graph

 Represents categorical data using rectangular bars.

 Bars can be vertical or horizontal.

 Spacing: Bars are separated (unlike histograms).

E. Pie Chart

 A circular graph divided into proportional segments.

 Represents relative frequency of categories.

F. Dot Plot

 Uses dots to show individual data points.

 Suitable for small data sets.

G. Stem-and-Leaf Plot

 Represents data while maintaining original values.

 Splits numbers into stems (leading digits) and leaves (trailing digits).
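
A stem-and-leaf plot can be built in a few lines. The data below are hypothetical, and the sketch assumes non-negative whole numbers with the tens as the stem and the ones digit as the leaf.

```python
from collections import defaultdict

# Hypothetical data, invented for illustration.
data = [24, 12, 38, 21, 41, 15, 30, 24, 37]

def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(data)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Every original value can be read back from the plot (for example, stem 2 with leaves 1 4 4 gives 21, 24, 24), which is what distinguishes it from a histogram.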

5. Choosing the Right Graph

DATA TYPE                    RECOMMENDED GRAPH
QUANTITATIVE (CONTINUOUS)    Histogram, Frequency Polygon, Ogive
CATEGORICAL                  Bar Graph, Pie Chart
SMALL DATA SETS              Dot Plot, Stem-and-Leaf Plot

6. Summary

 Frequency distributions help in organizing data systematically.

 Graphs make it easier to visualize and interpret data.

 Choice of graph depends on the type of data being analyzed.
