Definitions of some common terms*
Probability **
The extent to which an event is likely to occur. A measure of uncertainty with a value between zero and
one.
Set
A well-defined collection of objects.
Permutation**
An arrangement of all or part of a set where the order of the arrangement is important.
Combination**
A selection of all or part of a set where the order of the selection is NOT important.
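The difference between the two definitions above can be checked by counting. The sketch below is only an illustration: the four-letter set and the choice of two items at a time are assumptions, not part of the definitions.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C", "D"]  # hypothetical set of four objects
k = 2                         # take two objects at a time

# Count without listing anything.
n_perm = math.perm(len(items), k)  # order matters:        n! / (n - k)!
n_comb = math.comb(len(items), k)  # order does not matter: n! / (k! (n - k)!)

# Enumerate explicitly to confirm the counts.
perm_list = list(permutations(items, k))
comb_list = list(combinations(items, k))

print(n_perm, len(perm_list))  # 12 12
print(n_comb, len(comb_list))  # 6 6
```

("A", "B") and ("B", "A") are two different permutations but the same combination, which is why there are half as many combinations here.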
Distribution **
A function or listing showing all the possible values and how often each value occurs.
Frequency
The number (or count) of times a particular event occurs.
Relative Frequency
The fraction or percentage of times a particular event occurs.
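As a small illustration of frequency and relative frequency, the sketch below tallies a made-up list of die rolls (the data are an assumption chosen for the example). The resulting table of values and counts is also a simple listing-style distribution.

```python
from collections import Counter

rolls = [1, 3, 3, 6, 2, 3, 6, 1, 5, 3]  # made-up die rolls

# Frequency: how many times each value occurs.
frequency = Counter(rolls)

# Relative frequency: each count divided by the total number of observations.
n = len(rolls)
relative_frequency = {value: count / n for value, count in frequency.items()}

print(frequency[3])           # 4
print(relative_frequency[3])  # 0.4
```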
Statistic
A numerical fact or collection of facts calculated from a sample of data.
Parameter
A numerical fact or collection of facts that describes a population.
Event
A set of outcomes to which a probability is assigned.
Random Event
An event whose occurrence cannot be predicted with certainty and is therefore described by a probability.
Population**
A complete set of items or events of interest.
Sample**
A subset of the population.
Independence**
Two events are independent when the occurrence of one does not affect the probability of the other: the
probability of one event is the same whether the other event occurs or not.
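One way to check independence is to compare the probability of an event with its probability given that another event has occurred. The sketch below does this by enumeration; the two-dice sample space and the events A and B are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice (assumed example).
outcomes = list(product(range(1, 7), repeat=2))

a = {o for o in outcomes if o[0] % 2 == 0}  # event A: first die is even
b = {o for o in outcomes if sum(o) == 7}    # event B: the two dice sum to 7

p_a = Fraction(len(a), len(outcomes))       # P(A)
p_a_given_b = Fraction(len(a & b), len(b))  # P(A given that B occurred)

print(p_a, p_a_given_b)  # 1/2 1/2
```

Because the two values match, knowing that B occurred does not change the probability of A, so A and B are independent in this example.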
Venn Diagram**
A pictorial representation of the logical or probabilistic relationships between events.
Union**
A combination of sets of events that includes every element belonging to at least one of the sets being
combined.
Intersection**
A combination of sets of events that includes only the elements that are common to all the sets being
combined.
Mutually Exclusive Events**
Events that cannot occur at the same time.
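Union, intersection, and mutual exclusivity map directly onto set operations. The sketch below uses the outcomes of a single die roll; the event names are assumptions chosen for illustration.

```python
# Events defined on a single roll of a six-sided die (assumed example).
even = {2, 4, 6}
odd = {1, 3, 5}
greater_than_three = {4, 5, 6}

union = even | greater_than_three         # elements in at least one set
intersection = even & greater_than_three  # elements common to both sets

# Mutually exclusive events share no outcomes, so their intersection is empty.
mutually_exclusive = even.isdisjoint(odd)

print(sorted(union))         # [2, 4, 5, 6]
print(sorted(intersection))  # [4, 6]
print(mutually_exclusive)    # True
```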
Hypothesis Testing
A method of statistical inference that uses data from a sample to draw conclusions about a
population.
Censoring
Labeling certain data/observations as incomplete or only partially known. These data points are still
valuable but some relevant aspects of the observation are unknown. An example is test samples that
complete a test without failing. The non-failure data is still valuable but the time to failure is unknown.
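The test-sample example above can be written down as data. This is a minimal sketch: the `Observation` record, the 100-hour test limit, and the failure times are all hypothetical and only show one way of labeling censored observations.

```python
from dataclasses import dataclass

TEST_LIMIT_HOURS = 100.0  # hypothetical length of the test


@dataclass
class Observation:
    hours: float  # time on test
    failed: bool  # False means censored: the unit was still working at `hours`


# Made-up results: None marks a unit that completed the test without failing.
raw_failure_times = [42.0, None, 87.5, None, 63.0]

observations = [
    Observation(hours=t, failed=True)
    if t is not None
    else Observation(hours=TEST_LIMIT_HOURS, failed=False)
    for t in raw_failure_times
]

censored = [o for o in observations if not o.failed]
print(len(censored))  # 2
```

The censored units are kept rather than discarded: each one still records that the time to failure is longer than 100 hours, even though the exact value is unknown.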
Mean **
A measure of central tendency of observed data or a distribution. The sum of the data values divided by
the number of values. The mean of a sample is used as an estimate of a population’s average value.
Median **
A measure of central tendency of observed data or a distribution. The mid-point of the data such that
half of the data values are smaller and half are larger.
Mode **
A measure of central tendency of observed data or a distribution. The most often observed data value
or the peak of a distribution.
Range**
A measure of spread of observed data or a distribution. The maximum value minus the minimum value.
Variance**
A measure of the spread of observed data or a distribution. The average of the squared differences
between each data value and the mean of the data.
Standard Deviation **
A measure of the spread of observed data or a distribution. This is equal to the square root of the
variance. Often used in practice because the units are the same as that of the data.
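The measures of central tendency and spread defined above can all be computed with Python's standard `statistics` module. The data set below is made up for illustration. Both the population form of the variance (divide by n) and the sample form (divide by n - 1) are shown, since the definition above does not say which one is intended.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # mid-point of the sorted data
mode = statistics.mode(data)      # most often observed value
data_range = max(data) - min(data)

pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
pop_std_dev = statistics.pstdev(data)        # square root of pop_variance

print(mean, median, mode, data_range)  # 5 4.5 4 7
print(pop_variance, pop_std_dev)       # 4 2.0
```

The standard deviation (2.0) is in the same units as the data, while the variance (4) is in squared units.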
Confidence **
A value, associated with an interval, that gives the fraction of times the interval will include the
population parameter if the experiment is repeated many, many times.
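The "repeated many, many times" idea can be simulated. The sketch below is an illustration under stated assumptions: a normal population with a known mean and standard deviation, samples of 30, and a 95% interval built from the known standard deviation. All of these values are hypothetical.

```python
import random
import statistics

random.seed(0)           # fixed seed so the run is repeatable
MU, SIGMA = 50.0, 10.0   # assumed "true" population mean and standard deviation
N, TRIALS = 30, 2000     # sample size and number of repeated experiments

# Multiplier for a 95% interval (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

hits = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    center = statistics.fmean(sample)
    half_width = z * SIGMA / N ** 0.5  # SIGMA is treated as known here
    if center - half_width <= MU <= center + half_width:
        hits += 1

coverage = hits / TRIALS  # fraction of intervals that include MU
print(coverage)
```

The printed fraction should land close to 0.95: about 95% of the intervals include the population mean, which is what a 95% confidence level describes.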
Central Limit Theorem **
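The Central Limit Theorem states that the sampling distribution of the sample mean approaches a
normal distribution as the sample size gets larger, regardless of the shape of the population
distribution (provided the population has a finite variance).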
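A quick simulation can make the theorem concrete. The sketch below is an illustration only: the exponential population, the sample size of 50, and the number of repeated samples are assumptions. It draws many samples from a strongly skewed population and checks that the sample means behave like a normal distribution centered on the population mean.

```python
import random
import statistics

random.seed(1)   # fixed seed so the run is repeatable
N = 50           # size of each sample (assumed)
TRIALS = 5000    # number of repeated samples (assumed)

# Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(TRIALS)
]

mean_of_means = statistics.fmean(sample_means)  # should be near 1
sd_of_means = statistics.stdev(sample_means)    # should be near 1 / sqrt(N)
expected_sd = 1.0 / N ** 0.5

# For a normal distribution, about 68% of values fall within one standard
# deviation of the mean.
within_one_sd = sum(abs(m - 1.0) <= expected_sd for m in sample_means) / TRIALS

print(mean_of_means, sd_of_means, within_one_sd)
```

Even though individual exponential values are far from normally distributed, the fraction of sample means within one standard deviation should come out near 0.68.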
*Definitions provided are not from any official statistical dictionary. These are drawn from many
sources including personal understanding.
** These topics are covered in more detail in videos.