Measurement Uncertainty Course

The focus of making quality measurements is to reduce uncertainty where possible, and to increase confidence in the measurements. It doesn't matter where the measurements are made: knowing about measurement uncertainty is important in a calibration laboratory, a medical or industrial testing laboratory, the production area of a factory, and many other places.

This module will address the internationally accepted concept of measurement uncertainty.

Measurements and measurement systems are usually monitored and tracked using one of three methodologies:

1. The Program Control method
2. The Measurement Assurance method
3. The Process Oriented method

Introduction

Each of the three methods is a way of increasing confidence in measurement results, increasing the quality of the measurements being made, and reducing the risk of process or product limits being exceeded.

We will also look at basic elements of a calibration program, and a manufacturing method that relies on knowledge of measurement uncertainty.

Methodologies
Program Control Method

The Program Control method is the traditional method of controlling the calibration of inspection, test and measuring instruments and tools. It is described in detail in American National Standard ANSI/ASQ M1 Calibration Systems. It principally applies to the business management program for the system of calibrating instruments and tools that affect the quality of other products and services.

Calibrations are performed periodically on all equipment whose measurements will affect the quality of a finished product. This may include production-line instruments and testers, laboratory instrumentation, and instruments and standards used to calibrate or check other equipment.
Program Control Method

The program control method is primarily concerned with documentation of the system
— the procedures for ensuring that calibration standards have sufficient accuracy, the
calibration environment is controlled, all calibrated items are recalled and recalibrated
at definite intervals, records are maintained, documented calibration procedures are
used, and all calibrated items are labeled to indicate their status. Measurement
uncertainty is mentioned in that it has to be "taken into account" in setting calibration
acceptance limits.

With program control, measuring instruments or whole measurement systems are calibrated on a regular, periodic basis. History files of these calibrations are kept, so that if the current results are not consistent with previous records, someone could investigate to see if there is a problem and perhaps take corrective action.

Of course, instruments do drift and standards do wear. The results don't have to be
exactly the same every time, but large or unexpected changes are a clue that
something might be wrong and should be investigated.

Program Control Method

The Program Control method does have problems, though.

When the cost to recall product that was incorrectly measured is high (imagine recalling
a heart pacemaker that has been surgically placed in a patient's chest), reliance on
program control is very risky.

Program control is often used to calibrate individual test equipment and tools, but rarely
checks out the entire measurement system they may be used in. When a production
measurement is made in a fixture, in a varying environment, by different operators, and
possibly using different preparation procedures, all of these factors can and do
contribute to the overall uncertainty of the result. Yet, program control only calibrates
the test instrument and usually does not take these other factors into account.

Measurement Assurance Method

When Program Control is not adequate, the Measurement Assurance method can be
used.
Measurement Assurance can, and should, be used any time calibrations supporting
important or risky measurements are made.

As described in the standards (next page) the Measurement Assurance method applies
to metrology and calibration, to check the stability and continued correct operation of
the calibration systems. The basic method, under another name, can be used in
manufacturing to check the measurements made of products or processes.

The Measurement Assurance method is a newer method of controlling the calibration of inspection, test and measuring instruments and tools. It is also described in detail in ANSI/ASQ M1, and is also principally applied to the business management program for the system of calibrating instruments and tools that affect the quality of other products and services, but with an important improvement. The newer ANSI/NCSL Z540.3 standard, Requirements for the Calibration of Measuring and Test Equipment, also includes requirements for and about Measurement Assurance methods.

The measurement assurance method improves the program control method by adding
elements of statistical process control (SPC). Product samples or designated units of
material are designated as check standards. They are used to monitor the
performance of the calibration system on an ongoing basis.

The check standards are measured frequently as part of the routine measurement
activity going on for regular production, but the data from their measurements are kept
separate from production data and are analyzed for consistency. This is all in addition to
routine calibration of the measurement systems. Greater emphasis is also placed on
knowledge of the measurement uncertainty of each measurement system.

Measurement Assurance Method

Measurement Assurance uses a check standard to keep track continually of how the
measurement system is performing.

Measurements are checked frequently, so that the risk of costly failure between
calibrations is greatly reduced.

The check standard is usually a specific item (or product for manufacturing), and is
measured in the same way as any other item.
Because we do this, all of the factors contributing to the uncertainty are experienced.
Fixture, environment, operator, preparation, etc., all have the same effect on the check
standard as they would on regular product measurements.

Measurement Assurance Method

By statistically analyzing the data from the check standard, usually with SPC control
charts, we can tell very quickly whether the measurement process has changed.

If it has, recalibration or repair will be required, as usual. If analysis shows that the
measurement process has not changed, repair or recalibration are not needed.

We will discuss control charts and give examples of measurement assurance later.
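As a quick preview, the heart of that analysis can be sketched in a few lines of Python. This is a minimal sketch, not part of the course material; the data are hypothetical and the three-standard-deviation limits are a common control-chart convention, not a requirement of any particular standard.

    import statistics

    def control_limits(history):
        # Center line and limits from historical check-standard readings;
        # 3-sigma limits are a common control-chart convention.
        center = statistics.mean(history)
        sigma = statistics.stdev(history)  # sample standard deviation
        return center - 3 * sigma, center + 3 * sigma

    history = [10.02, 10.01, 10.03, 9.99, 10.00, 10.02, 10.01]  # hypothetical data
    low, high = control_limits(history)

    new_reading = 10.09
    if not (low <= new_reading <= high):
        print("Measurement process may have changed: investigate before trusting results")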

Measurement Assurance Method

Measurement Assurance can be used for continuous control of the measurements made
in a calibration process. Again, a similar method can be used in manufacturing
production and will be detailed later.

It includes monitoring of operator errors, instrument errors, errors due to environmental effects and overall measurement uncertainty.

Continuous control is established by having operators or automatic testers take measurements of check standards as part of the normal process. This provides data on the total measurement system. It also provides statistical information on the values and offsets of product targets.

Process Based Method

The Process Based method is the newest methodology for control of measurements. It is
fully described in International Standard ISO 10012, Measurement management
systems — Requirements for measurement processes and measuring
equipment. Where the methods described in ANSI/ASQ M1 apply principally to the
calibration activities, the measurement management system of this standard applies
to all other measurements in an organization that affect the quality of products and
services.

ISO 10012 is part of the ISO 9000 family of standards. As such, its basis is a process
model where customer input requirements are transformed into measurement results
that meet the requirements. This standard describes a model for a
complete measurement management system for an organization, in the same way
that ISO 9001 describes a model for an organization's quality management system.

ISO 10012 is an aid to meeting the measurement and measuring process requirements
of standards such as ISO 9001 and ISO 14001.

Process Based Method

The Process Based method, the measurement management system, brings the need to
know about measurement uncertainty to ALL product quality related measurements
that are made in an organization. Briefly, here is what the measurement management
system requires.

1. Determine what measurements are needed to meet customer requirements
2. Determine where (in the process and in the facility) each measurement will be made
3. For each measurement, determine the nominal value and tolerance limits
4. For each measurement and physical location, determine what instrument will be used to make the measurement. Then determine the measurement uncertainty, taking into account all factors including the physical environment where the measurement is made.
5. For each measurement and instrument, evaluate the measurement uncertainty to be sure it meets the quality requirements
6. Make sure each measuring instrument is calibrated
7. Make sure each instrument is labeled to indicate where it is permitted to be used
8. Document all of these items
9. Repeat this process (called "metrological confirmation") at planned intervals

"Metrological" means something pertaining to metrology, the science and practice of


measurement.

Methodologies
Process Based Method

The purpose of metrological confirmation is to verify that the measurement characteristics of the measuring equipment satisfy the measurement requirements of the process where the measurement is actually made.

Metrological confirmation has two major parts. First, it requires the use of calibrated
instruments. Second, it verifies that the right instrument is being used in the right place
for each measurement task in the process of creating the product or service. Neither
part is sufficient by itself.

Calibration of an instrument provides two pieces of information: the uncertainty of measurement, and measurement traceability to the units of measure of the International System of Units.

The calibration process itself is not part of metrological confirmation. Other standards,
such as ISO/IEC 17025 and ANSI/NCSL Z540.3, apply to calibration laboratories.

Calibration
What is Calibration?

"Calibration" is one of many measurement-related terms formally defined in an


international guide: JCGM 200:2008, International vocabulary of metrology — Basic and
general concepts and associated terms (VIM). This document is available as a free
download from the web site of the International Bureau of Weights and Measures
(BIPM).

Less formally, "calibration" is the process of comparing a measuring instrument to a reference (a measurement standard) with traceable values and known uncertainty. The comparison establishes the amount of error, if any, in the measuring instrument. Calibration should always be done using a documented and validated procedure.

If an instrument is found to be outside established limits, it may be adjusted or repaired as necessary. It must then be re-calibrated as a quality check for the adjustment/repair. The need for adjustment is a decision made based on the results of a calibration — it is not part of the calibration itself.

Why Calibrate?

The fundamental reason for calibrating inspection, test and measuring instruments and tools is to provide assurance that they are capable of making correct measurements when used correctly. It also assures that an instrument made in one country will measure exactly the same way when used in another country. This is because the measurement standards in all countries are based on the same units — the International System of Units.

If an instrument or standard is found to be out of its established tolerance, an established documented procedure must be followed. The procedure must include determining if the out of tolerance condition affected any product, and may include recalling all products tested with that instrument since it was last determined to be performing satisfactorily. Product recalls are very costly to an organization.

Calibration reduces the risk of making incorrect measurements. It is impossible to eliminate all risk, but an effective calibration program will allow top management to know the risk and keep it within acceptable limits.

Calibration is usually performed in a controlled environment, although this may not always be necessary.

Some companies routinely perform calibration at their customer's site. This usually
reduces equipment downtime and allows the influences of the surrounding environment
to be part of the calibration process.

For a diagram depicting the Calibration Cycle, CLICK on Resources at left, and then
select Calibration Cycle.

Calibration

Careful consideration must be given to the design of a calibration program.

Measurement quality levels must be established (precision, accuracy, resolution) for each measurement supported.

These should be based on the requirements of the product or process being measured
(the customer requirements). The quality of the calibration system supporting the
production measurements can then be determined.

Calibration
Data collection and/or adjustment rules must be set up where required to allow for the
impact of instruments found to be out of the determined tolerance.

This includes identifying which measurements or which specific products might have
been affected (reverse tracking), and if the effect is significant for that product.

Whenever a decision to calibrate is made (for example, how often should you calibrate
something?), you are balancing the cost of calibrating against the risk (including
financial risk) of finding out later that some product was incorrectly approved or some
incorrect results released.

If the consequences of wrong measurements are expensive, it is especially important to ensure that measurements are correct.

Calibration
Measurement Traceability

Measurement systems, and the calibrations that are done on them, must be traceable. A
traceable calibration is one in which there is an unbroken chain of comparisons
connecting its measurement values and measurement uncertainty to national or
international standards.

Each of the comparison steps in this chain must have a statement describing the
uncertainty of the comparison. This is usually documented on a calibration certificate.

If the traceability conforms to a standard, the uncertainty of the comparisons must be stated in a standard way as well. The policy identifying the application of traceability and measurement uncertainty must be established and followed to assure consistent measurement data in agreement with internal and external customers.

Measurement traceability applies only to a measured value, and is incomplete without knowledge of the measurement uncertainty. Note that a document, record, instrument or organization cannot have measurement traceability — only a value can.

Calibration
Measurement Traceability

What is required to show measurement traceability? Standards usually require, and auditors look for, several things. They should all be documented on a calibration certificate. The most important three are listed here.

Unique identification of the certificate and a link to the instrument: the calibration certificate should have a unique identification from the calibration laboratory. It must identify the customer, the specific instrument calibrated, and when the calibration was performed.

Measured values: a certificate that only shows "pass/fail" results does not show
measurement traceability.

Measurement uncertainty: preferably a value for each reported measurement, and a statement in the report about the "coverage factor", usually identified with the symbol k. The coverage factor gives you information needed to treat the uncertainty as a statistical value. Optionally, the certificate may state something, such as a test uncertainty ratio, that allows a rough guess of uncertainty levels.

Note that "NIST Test Numbers" are NEVER acceptable evidence of measurement
traceability. They are not unique, and they do not meet any of the other requirements
above.

Calibration
Calibration Labels

When a regular calibration is performed, the date for the next one is established and a
sticker placed on the tool or system. It becomes the responsibility of each user of that
tool or gauge to check the date on the sticker before using the tool. If the calibration is
out of date, the tool should not be used.

In order to be consistent, every tool or measuring system should have a sticker. If the
regular use of the tool does not require calibration, for example if the measurements
made using the tool do not affect the quality of a product, a sticker should still be used
saying "for reference only" or "calibration not required." Records of these tools should
also be found in the files but they need not be updated because the tool is not
periodically calibrated.

Calibration
Calibration Intervals

All measuring instruments eventually have some amount of wear or drift. So, they all
need to be recalibrated at regular intervals.

Calibration intervals vary widely, from every few weeks to several years. Some types of
instruments should even be calibrated before each use! Intervals are based on the
characteristics of the equipment type, but also on how much it is used, how much wear
is caused by use, and the environment it is used in.

When measurements do drift, calibrating more frequently reduces the associated risk. If
the calibration intervals are short enough, discovery of major problems (and their
associated costs) happens infrequently.

Calculation of calibration intervals is a very complex topic and beyond the scope of this
course.

Process Measurement Assurance

Process Measurement Assurance (PMA) bridges the gap between the calibration process
and manufacturing. It can be used to link the process-based measurement management
system to the calibration function. It uses methods similar to the Measurement
Assurance methods used in calibration laboratories, but applies them to the production
environment.

A control standard (or check standard) is designed that closely mimics the product
features being measured while maintaining long term stability.

This standard is usually developed with joint involvement of a metrologist and production engineer or personnel with expert knowledge of product requirements and measurement standards.

Process Measurement Assurance

The resulting control standard is carefully characterized by repeated precision measurements, and statistical reference limits are established by the metrologist. When required, corrections can be made for the bias caused by external influences, such as temperature, humidity, vibration, barometric pressure, local gravity, air density, and other influences.

This control standard is measured as part of the manufacturing process and the data is
plotted to determine an "in control" condition of the total measurement and control
process used in product manufacture.

Process Measurement Assurance

Process Measurement Assurance (PMA) data can also be used to evaluate and correct
bias in production targets. The process bias is equal to the average of the control
measurements minus the control standard's certified value.
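Expressed as code, this is a one-line calculation. A minimal sketch (the readings and certified value below are hypothetical):

    import statistics

    control_measurements = [25.013, 25.011, 25.014, 25.012]  # readings of the control standard
    certified_value = 25.010                                 # the control standard's certified value

    # Process bias = average of the control measurements minus the certified value
    process_bias = statistics.mean(control_measurements) - certified_value
    print(f"Process bias: {process_bias:+.4f}")  # +0.0025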

Physical recalibration of the measuring or control equipment in the manufacturing process may be reduced or eliminated due to the continuous data being taken on the control standard. Also, many measurement processes are improved, because the operator can identify their influence on the measurement process and correct their technique for a more consistent measurement.

How often PMA data is taken should also be considered carefully. Any change in
influencing factors to the measurement uncertainty may justify taking a new data point.

Process Measurement Assurance

Using PMA alone or as part of a measurement management system, along with a calibration program, does not always assure success.

When the measuring system bias and randomness is too great, the product
measurement data variation cannot be separated from the measurement system
variation.

Improvements can be obtained by using tools to identify the major contributors to variation and taking action to reduce them.

Process Measurement Assurance

PMA does not establish the required bias and variation limits for the measuring system. These need to be designed into the production process. PMA methods merely show you what the actual performance of the process is.

The quality of the measurement data needs to be defined by measurement traceability and uncertainty. These become extremely important when multiple production lines or integrated vendors are involved.

Everyone must be making the same quality measurements to achieve consistent product quality levels.

Robust Tolerance Manufacturing

The ultimate goal of a manufacturing organization is to produce a quality product that the customer desires, at a competitive price.

During manufacturing, processes must be controlled and/or product characteristics must be measured for conformance to limits. These conformance limits are derived from the desired quality level of the product. The quality level of the product includes meeting the intended function(s) of the product at a specific performance level.

An example would be a laser printer. It must print the output from a computer in either
black & white or color at a performance level of so many pages per minute. These are
not the only criteria of the laser printer, but they are usually included in the base
criteria for comparison to other printers.

Robust Tolerance Manufacturing

Another example would be a multimeter. It must measure electronic signals in the form
of AC Voltage, DC Voltage, AC Current, DC Current and Resistance. It does this by
displaying the result of the measurement with a certain resolution and accuracy. Again
these are not the only criteria of a multimeter, but they are usually in the base criteria
for comparison to other multimeters.

Still another example would be the hinge on a car door. The hinge must allow the car door to open and close a minimum number of times before failure while holding the door in the position set by the user. The specific performance level in this case includes a wear factor that includes both functionality and door tension.

Robust Tolerance Manufacturing

All of these examples are end results of a good design and a controlled manufacturing
process.

The design and the ability to manufacture the product depend on limits set on the components or characteristics of the product.

To realize the product performance level or product quality, these limits or boundaries
cannot be exceeded.

Robust Tolerance Manufacturing

It is no longer acceptable to base manufacturing quality on the rejection of nonconforming components or products. Quality must be designed into the product and the process — it cannot be achieved only by inspection.

The manufacturing process must produce the components or products within target
windows that allow only a certain defect rate.

The allowable defect rate can be based on market studies, reliability requirements,
current manufacturing technology, humanitarian needs and cost to produce. In the final
analysis it boils down to customer need and cost.

Robust Tolerance Manufacturing

By starting at the desired customer need and cost (Output), you can work your way back to the design process that sets the Inputs and develops the Process to act on the Inputs.

Each Input and Process action has targets and limits. These must be
controlled/measured to assure the output remains within the desired limits at an
acceptable defect rate.

Since no process can be monitored or controlled without making a measurement, it really comes down to how much confidence we can have in the measurements being made.

Robust Tolerance Manufacturing

An example of how these fit together is shown in the diagrams to the right and on the next screen.

The diagram at right shows a statistical process tolerance for a process/product output.

The process/product target is "T."

The two distributions are targeted around the upper and lower manufacturing windows
allowing the upper and lower product/process specification limits to be met.

Robust Tolerance Manufacturing

This diagram is a Pareto chart made from process standard deviation curves of different widths, turned on their sides to show the relative variation contribution to the overall process/product output. One of these Input curves is the measurement (Meas) impact on the process.

The measurement error bias and random variability will contribute to the spread of the
two curves in the top figure and to the height of the measurement curve in the bottom
figure. If this contribution is too great, the process/product limits will be exceeded and
the product/process defect level will go up. A company policy on measurement
traceability and measurement uncertainty must be established and followed to assure
consistent application and accurate assignment of measurement error bias and
variability.

There are other tools available that affect the product design and the ultimate resulting
product. Some of these are:

- Design of Experiments
- Robust Tolerance Analysis
- Taguchi Methods
- Dual Response

These tools will help set the targets and limits for each step in the manufacturing
process that was previously outlined.

Robust Tolerance Manufacturing

These tools, and others, focus on a robust product design while reducing the variability in the process.

These tools do not bear directly on making good measurements; however, making a good measurement or controlling the variability in a process is part of the Input and Process blocks set up by the results of using these tools.

It is therefore important to keep these tools in mind to see how the measurement process affects the total manufacturing output.

Some of these robust design and tolerancing tools are also parts of the Six Sigma ®
methodology, although all of them are far older than that program.

Concepts and Tools

To understand the approaches to measurement uncertainty, we need to address concepts and tools available to today's metrologist.

Some of the concepts and tools used in the measurement and control of process
uncertainty include:

- Definitions
- Sources of Uncertainty and error budgets
- The two categories used to express uncertainty in modern standards
- Defining and Expressing Uncertainty correctly
- Risk Analysis
- Measurement Assurance
- Gage R&R, measurement capability studies, SPC, and Analysis of Variance (ANOVA)
- Standards and Software

Basics

If there were no uncertainty, no measurement error, there would be no need for metrology.

In fact, all measurements do vary. All of the concepts and tools of metrology involve the use of statistics, and work by defining and separating the contributors to variations in measurement and then handling them appropriately.

In the past, when analyzing measurement and process error we spoke of two major components: systematic error (bias) and random error (noise). This is called the Traditional or Error Approach. This distinction was useful in describing the effects on the system, but often adds confusion when calculating measurement uncertainty.

Today, we try to avoid using the concepts of systematic error and random error in metrology. Systematic and random errors still exist, of course, and are defined in the VIM (2.17 and 2.19). Consideration of systematic and random effects is completely appropriate and common in many contexts of quality assurance (including measurement quality assurance), and they are often discussed in publications.

In metrology, though, it is now accepted practice to group uncertainty contributors by the methods used to determine their values instead of their action on the system. This is called the Uncertainty Approach, and is the method presented in the Guide to the expression of uncertainty in measurement, generally referred to as the GUM.

[Numbers in parentheses refer to specific definitions in the VIM.]

Basics

The goal is to make measurements that completely satisfy customers' requirements and
needs at a reasonable cost.

ISO 9001 and the rest of the ISO 9000 family of standards have many measurement-related requirements. The organization must determine measurement activities based on the product and the criteria for product acceptance by the customer. Use of measuring equipment is required, and that equipment must have calibration that is traceable. The measuring equipment must be capable of making the measurements required to determine the product's conformity to requirements.

ISO 10012, Measurement management systems — Requirements for measurement processes and measuring equipment (one of the ISO 9000 family standards), requires that measurement uncertainty be taken into account and all relevant data be recorded.
Measurement uncertainty must be estimated and recorded for each measurement
process in the measurement management system — that is, in the product realization
process. This means that knowledge of measurement uncertainty is required in all
production processes, not just in the metrology laboratory.

Basics

This means that we're not allowed to just design a measurement system any way we
like. We are required to determine the necessary measurement requirements. Then we
have to use the measuring equipment in such a way that the measurement uncertainty
is known and consistent with what is necessary for the product to conform to
requirements.

Basics

Given unlimited resources, we can make the ultimate measurement. However, this must
be tempered by the speed and cost required to provide a competitive product.

Sometimes the desire to make the best measurement becomes the goal. In fact, people have been known to go to great lengths to justify and demonstrate how well they can make a measurement without taking the customer's needs into account at all.

This resource and time would be better spent helping the customer make a
measurement or control a process to within the desired limits for component or product
quality.

Requirements

What is the difference between "accuracy" and "measurement uncertainty?"

The VIM describes accuracy as a qualitative term (2.13). It is appropriate for the
traditional error approach to measurement, but not for the measurement uncertainty
approach.

Measurement uncertainty (2.26) is the term used to quantify the inaccuracy of a measurement. Specifically, measurement uncertainty gives a value to the dispersion (spread) of the measured value. The dispersion describes the range of values that is believed to contain the measured value.
In calibration, this should include the uncertainty determined for the measuring
instrument being calibrated and the uncertainty of the calibration process itself,
including traceability. In any other measurement process, this should include the
characteristics of what is being measured as well as the measurement instrument and
measuring process.

The definition of traceability (metrological traceability, VIM 2.41) specifies an unbroken chain of comparisons stretching from the measurement or calibration up to national or international standards. Each comparison must include a value and state an uncertainty.

Requirements

We have all been exposed to a barrage of terms alluding to measurement performance, often lacking specifics and at times misleading. Metrologists have long been aware that there was a need to address this problem.

Until the late 1990s little guidance was broadly recognized by industry or end users.
Since then several standards documents have facilitated the effort. Significant ones are
ISO/IEC 17025 for calibration and testing laboratories, ISO 10012 for all other industries,
the GUM and VIM for everyone, and a few more specialized standards for specific
scientific and technical professions. The most important standards are listed on the next
page.

The Guide to the Expression of Uncertainty in Measurement (GUM) shows the modern way to categorize uncertainties and provides guidance for analyzing and reporting measurement performance. This important document is available in three forms, one of them free.

Requirements

International standards are normally published in two official languages: English (UK) and French. Some countries, including the United States, publish national versions of many standards. The national version may be translated to a local language and have other minor changes to accommodate local grammar and usage. Otherwise they are identical in content and applicability to the international version.
For example, the United States' local version of the ISO 9001 standard is translated to
English (US), and is numbered ANSI/ISO/ASQ Q9001. Other common national
translations are Spanish, Arabic, Chinese and Russian. In this course, the international
numbering is used for all international standards documents.

All standards documents are subject to copyright. There is a cost ($$$) for standards
documents, except for publications of the US Government or where "free" is indicated.

Requirements

Resistance to change from past practices, obsolete but previously widely accepted
standards documents, ignorance of or misinformation about new requirements, and
marketing forces have often overshadowed metrology practices that support the
modern, well-defined and standardized statements of measurement performance.

Manufacturer's specifications of instrument performance, generally expected by end users, can aggravate the problem.

An example is the elusive (and obsolete) 10:1 test accuracy ratio (TAR).

Requirements

In common usage, TAR is the ratio of the accuracy required of the product or unit being measured to the accuracy of the standard or test equipment used to measure it. It is a traditional and "quick and dirty" way to decide if measurement capability is "good enough".

If a plastic ruler is being used to measure to an accuracy of 1.6 mm (1/16 inch), a reasonable expectation, then in order to maintain a 10:1 TAR it should be calibrated with a tool whose accuracy is at least 0.16 mm (1/160 inch).
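As arithmetic, the ratio check is trivial; here is a minimal sketch of the ruler example (values as above, and the ratio convention described above):

    # TAR = accuracy required of the unit under test / accuracy of the standard
    ruler_accuracy = 1.6       # mm, what the plastic ruler must measure to (about 1/16 inch)
    standard_accuracy = 0.16   # mm, accuracy of the tool used to calibrate the ruler

    tar = ruler_accuracy / standard_accuracy
    print(f"TAR = {tar:.0f}:1")  # 10:1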

Requirements

In practice, 10:1 measurement ratios are seldom achieved, particularly with the higher level measurements. This is further limited when cost factors and resource constraints are considered.
Far more important, though, is that the TAR refers only to the accuracy specifications of
the instrument or standard used for the calibration or other measurement.

Metrologists have always realized that there are many other factors besides the
standard that contribute to the overall quality of a measurement result. In recent
decades, they have placed much greater emphasis on understanding and quantifying
these factors. This is why there are new ways to calculate and state measurement
uncertainty.

Requirements

Some international, national and US Government standards permit lower test accuracy
ratios. But even the 3:1 or 4:1 ratios allowed as a minimum by ISO 10012 and
ANSI/NCSL Z540.3 respectively are not readily achievable in some instances.

These newer standards are careful to state that their permitted minimum ratios refer
only to the reference used and not to the uncertainty of the total process.

In any case, TAR is being superseded by other concepts, especially the measurement uncertainty approach to measurement.

Key Terminology

Most of us involved in measurements have seen advertisements for measuring devices with claims such as those shown at right.

Those statements are incomplete and/or misleading. There is no indication of whether errors are based on a percentage of reading or a percentage of full scale. However, the purpose of many advertised specifications (as opposed to those in the service manuals) is to position the product against its competition. While usually not deliberately false, it is not unusual for advertisements to be less than completely truthful. In measurement, we need the best truth we can afford.

For example: Linearity by itself provides no assurance of accuracy; neither does precision. A measurement result can be both precise and inaccurate, or both accurate and imprecise. Resolution is not necessarily an indication of performance; an instrument can have eight significant digits of resolution but have so much internal noise that only five significant digits are actually usable.

Key Terminology
MEASUREMENT

Now we need to define a few terms. They may not be familiar to you yet, but they will become more familiar as you use the GUM and look up terms in the VIM. You may also find that the technical definitions of some terms that "everyone knows" are not exactly what you thought they were. These are technical definitions, but then measurement is a technical field of work, and the VIM is its approved technical dictionary.

Measurement — the process of experimentally obtaining one or more quantity values that can reasonably be attributed to a quantity. (VIM 2.1)

Did you know you are performing an experiment every time you make a measurement?
You are, and you are also taking a statistical sample. Every measurement is one sample
from an infinite population of all possible measurements of that quantity using that
measuring system.

The act of measuring something implies you already know something about what you
are trying to measure, at least a description of it relative to what you intend to do with
the information. It also implies you have at least a basic measurement procedure
(although it may not be documented), a measuring system of some kind, and are
working in some kind of physical environment. All of these can have an effect on the
measurement.

Key Terminology
MEASURAND

Measurand — a quantity intended to be measured. (VIM 2.3)

As defined, the measurand is what you intend to measure. It may or may not be the
same as what is actually measured by the measurement system. Here is an example.

You have a task to measure the value of a resistor, an electronic component with a value in units of ohms. If you use a digital multimeter, you will see controls marked in ohms and the display will likely show the symbol Ω, which represents ohms. However, the meter (the measuring system) cannot directly measure what you want. So, what the meter is actually doing is passing a constant fixed current through the resistor, measuring the voltage developed across the resistor, and internally using Ohm's Law to show you the value in ohms on the display. (Resistance = Voltage ÷ Current)

The measurand is resistance in ohms, but the quantity measured by the system is
voltage. One is transformed to the other using knowledge of a third quantity value.
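A minimal sketch of what the meter does internally (the current and voltage values below are hypothetical, not taken from any real instrument):

    # The meter passes a fixed, known current through the resistor, measures
    # the voltage across it, and applies Ohm's Law to display a resistance.
    test_current = 0.001        # amperes, the meter's fixed test current
    measured_voltage = 0.1003   # volts, the quantity actually measured

    displayed_resistance = measured_voltage / test_current  # ohms, shown on the display
    print(f"{displayed_resistance:.1f} ohms")  # 100.3 ohms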

We have all seen the use of the term accuracy in trade publications, standards documents, advertisements, manuals, specifications, etc. This term has been frequently used and misused, in different ways, to describe the quality of a measurement or instrument. Let's consider:

- Why do we call it accuracy?
- What is accuracy?
- Is this the same as measuring uncertainty or inaccuracy?
- Is gage repeatability and reproducibility (gage R&R) another way of specifying measuring performance?
- Is being precise the same as being accurate?
- Can we be precise and not accurate; or, can we be accurate and not precise?

Key Terminology

ACCURACY

Accuracy — the closeness of the agreement between a measured quantity value and a
true quantity value of a measurand. (VIM 2.13)

Accuracy is a qualitative concept. No measured values (quantities) are associated with a qualitative term. If one measurement is considered more accurate than another, it is a matter of opinion.

"Accuracy" is a holdover from the traditional error approach to measurement. The main
reason accuracy is considered a qualitative value is that the "true value" is unknown —
we may have a value assigned to it, but (except for a few special cases) that value also
has an associated and non-zero uncertainty. The Measurement Uncertainty approach to
measurement does not use accuracy, but the specifications of instruments used to
make measurements do.

Accuracy and precision are not the same. The term precision should never be used
when accuracy is meant.

Key Terminology
UNCERTAINTY

We also just noted that accuracy is a qualitative concept, not a quantitative one.
Uncertainty, on the other hand, IS a quantitative concept.

Measurement uncertainty — a non-negative parameter characterizing the dispersion of the quantity values being attributed to a measurand, based on the information used. (VIM, 2.26)

Again, uncertainty is quantitative. It has a numerical value. The use of the statistics
term "dispersion" in the definition indicates that some type of analysis is needed to
determine uncertainty. Dispersion means the amount by which we can normally expect
the values to spread, or be different from one another. The definition also indicates that
some kind of information is needed.

As shown in the illustration, measurement uncertainty forms a band on either side of the measured value. That band is where we expect, with good likelihood, that the value of a theoretically perfect (zero uncertainty) measurement would be located.

CALIBRATION

Calibration — an operation that, under specified conditions, in a first step, establishes a relation between the quantity values with measurement uncertainties provided by measurement standards and corresponding indications with associated measurement uncertainties and, in a second step, uses this information to establish a relation for obtaining a measurement result from an indication. (VIM, 2.35)

- NOTE 1: A calibration may be expressed by a statement, calibration function, calibration diagram, calibration curve, or calibration table. In some cases, it may consist of an additive or multiplicative correction of the indication with associated measurement uncertainty.
- NOTE 2: Calibration should not be confused with adjustment of a measuring system, often mistakenly called "self-calibration", nor with verification of calibration.
- NOTE 3: Often, the first step alone in the above definition is perceived as being calibration.

(continued on next page)

CALIBRATION (continued)

That is a long definition. Here is a simple version.

- Calibration is performed under specified conditions (temperature, humidity, procedure, measuring equipment and so on).
- Calibration is a process that compares the indication of a measuring instrument to the known value of a measurement standard.
- The result of a calibration is knowledge of the relationship between the indication of the instrument and the quantity represented by the measurement standard. The knowledge is usually expressed as a value with measurement uncertainty.

The notes (as in all definitions) are also important.

- First, the result of a calibration can be a statement, diagram, mathematical function, graph, table of values, or a set of corrections.
- Second, adjustment of the item being calibrated is not part of calibration.
- Third (and rarely), some places may only perform the comparison and do nothing else.

(Continued on next page)

CALIBRATION (continued)

Notice that the official technical definition of calibration has gone against a lot of things
that "everybody knows". But that's how it is. "Calibration" is also one of the most used
and misused words in measurement science. However, it is the responsibility of
professionals working in measurement to use this, and other terms, correctly whenever
possible.

Calibration is only a comparison and reporting process. Calibration never implies or requires a need for adjustment (even if the device being calibrated is adjustable).

Based on the results of a completed calibration procedure, you may determine that an instrument needs to be adjusted or otherwise repaired to correct its operation or indications.

After adjustment or repair is complete, the original calibration must always be repeated.

All measurements in a calibration must have the property of "metrological traceability", but that is a topic for the next page.

TRACEABILITY

Metrological traceability — a property of a measurement result whereby the result can be related to a reference through a documented unbroken chain of calibrations, each contributing to the measurement uncertainty. (VIM, 2.41)

The term metrological traceability, or more commonly measurement traceability, is used to distinguish this from similar concepts in other fields of work. For example, we can have "traceability" for raw materials, parts, documents, financial transactions, and many other things. All share the concept that the history or pedigree of something has been documented and can be followed back in time and location. In our case, when we speak of measurement traceability we are concerned with the "pedigree" of a measured value.

Note that the definition says that each calibration (comparison) in the traceability chain
contributes to the uncertainty of the measurement. This means that the measurement
uncertainty increases as your calibration gets further from your national standards (or
other reference). It also means that the calibration certificates must indicate what the
uncertainty is. Finally, it implies that for best results there should be some standard methods to calculate and report the uncertainty. The GUM provides those standard methods.

Key Terminology
TRACEABILITY (continued)

Notice that measurement traceability is a property of only one thing — the result of a measurement (a value, a number).

Nothing other than the result of a measurement can have measurement traceability.

Instruments, documents, procedures, and organizations cannot have measurement traceability. This means that in the context of measurements, it is not possible for an instrument or a laboratory or a calibration certificate to be traceable to a measurement standard. (No matter what is said in the advertising.)

Only a measured value can be traceable to a measurement standard.

REPEATABILITY

Measurement repeatability is measurement precision under a set of repeatability conditions of measurement. (VIM, 2.21)

The repeatability conditions of measurement are those creating a set of conditions that includes the same measurement procedure, same operators, same measuring system, same operating conditions and same location, and replicate measurements on the same or similar objects over a short period of time. (VIM, 2.20)

So, repeatability is the variation in measurements obtained with one measuring instrument when it is used by one person while making repeated measurements of the same characteristic, on the same part, in the same place, in a short time.

Repeatability can be considered the short-term variation present in the measurement system.

REPRODUCIBILITY

Measurement reproducibility is measurement precision under reproducibility conditions of measurement. (VIM, 2.25)

The reproducibility conditions of measurement are those creating a set of conditions that includes different locations, operators, measuring systems, and the same measurements on the same or similar object. (VIM, 2.24)

So, reproducibility is the variation in measurements obtained with different measuring systems, used by different people, possibly at different places and times, while making repeated measurements of the same characteristic, on the same or similar part.

Reproducibility can be considered the variation present between different measurement systems and operators.

Repeatability and reproducibility alone do not identify the uncertainty of a measurement, even though a precondition is that all of the tools are calibrated. They are elements or components of the overall uncertainty.

Measurements that are both repeatable and reproducible can still lack accuracy.
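To make the distinction concrete, here is a minimal sketch with hypothetical data from two operators measuring the same part. A real gage R&R study uses an ANOVA-based method; this simplified comparison only illustrates the difference between the two concepts.

    import statistics

    # Repeated measurements of the same part by two operators
    operator_a = [5.01, 5.02, 5.01, 5.00, 5.02]  # same system, short time
    operator_b = [5.05, 5.06, 5.04, 5.05, 5.06]

    # Repeatability: spread within one operator/measuring system
    repeatability = statistics.stdev(operator_a)

    # Reproducibility: spread between the operators' average results
    reproducibility = statistics.stdev(
        [statistics.mean(operator_a), statistics.mean(operator_b)]
    )
    print(f"repeatability ~ {repeatability:.4f}, reproducibility ~ {reproducibility:.4f}")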

Key Terminology
LINEARITY

Most measuring instruments are designed to give a linear indication. This means that a
graph comparing the actual values of a range of measurands to the readings of those
values by the instrument will be a straight line. A graph, like the illustration, is often the
best way to see and understand linearity.

Linearity is, most commonly, the difference between actual displayed values on an
instrument compared to an ideal straight line drawn between the minimum and
maximum range points of the instrument. This type of linearity is also called terminal
linearity.

If necessary, one or both scales of the graph can be logarithmic, especially if needed to
cover a very wide range of values.
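A minimal sketch of the terminal-linearity calculation described above (the readings are hypothetical, and real instruments are checked at many more points):

    # Deviation of instrument readings from the ideal straight line drawn
    # between the minimum and maximum range points (terminal linearity).
    actual   = [0.0, 25.0, 50.0, 75.0, 100.0]   # reference values applied
    readings = [0.0, 25.3, 50.4, 75.2, 100.0]   # instrument indications

    slope = (readings[-1] - readings[0]) / (actual[-1] - actual[0])
    deviations = [r - (readings[0] + slope * a) for a, r in zip(actual, readings)]

    linearity_error = max(abs(d) for d in deviations)
    print(f"Terminal linearity error: {linearity_error:.2f}")  # 0.40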

Type A & Type B Methods of Evaluation

The Guide to the Expression of Uncertainty in Measurement (GUM) identifies two ways
to evaluate (calculate) measurement uncertainty contributors.

TYPE A Method of evaluation — Type A methods are those that determine standard uncertainty by performing statistical analysis of repeated measurements of the measurand. (GUM, 4.2)

TYPE B Method of evaluation — Type B methods are those that determine standard uncertainty by any method except statistical analysis of repeated measurements. (GUM, 4.3)

As noted in the list of standards, the GUM or equivalent document is available from at
least three sources, not counting the guides developed from the standard. The easiest
and most economical (it's free!) is to download the JCGM 100:2008 Evaluation of
measurement data — Guide to the expression of uncertainty in measurement directly
from the BIPM web site.

Type A & Type B Methods of Evaluation


Type A Methods of Evaluation

Type A evaluation of measurement uncertainty is statistical analysis of your own data. For measurement uncertainty, the result of a Type A evaluation is a value for the standard deviation. This is the common statistical measure of the dispersion (spread) of a data set.

You can have several different uncertainty contributors evaluated for a single
measurement process. Some examples are repeated measurements at each data point,
or temperature records taken at the measurement location. Modern electronic sensors
and measurement automation systems make this very easy.

The standard deviation has the same units as the quantity being measured. For
example, if a series of measurements of mass was made and expressed in grams, the
standard deviation of those measurements would also be expressed in grams.

This is very convenient and easy to visualize and use.
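A minimal sketch of a Type A evaluation (the mass readings, in grams, are hypothetical):

    import statistics

    # Ten repeated measurements of the same measurand, in grams
    readings = [100.012, 100.015, 100.011, 100.014, 100.013,
                100.012, 100.016, 100.013, 100.011, 100.014]

    mean = statistics.mean(readings)
    s = statistics.stdev(readings)   # sample standard deviation, also in grams

    # If the reported result is the mean of the n readings, the standard
    # uncertainty of that mean is s / sqrt(n) (GUM 4.2.3).
    u_mean = s / len(readings) ** 0.5
    print(f"mean = {mean:.4f} g, s = {s:.4f} g, u(mean) = {u_mean:.4f} g")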

Remember: Type A and Type B are different methods of evaluating or calculating the
components of measurement uncertainty. They are not, themselves, types of
uncertainties.

Type A & Type B Methods of Evaluation

There are many types of measurement uncertainty sources that can be calculated by
Type A (statistical) methods. A few examples are:

- Repeated measurements of the same measurand. The measurement should be repeated as many times as needed to get values of the mean (arithmetic average) and standard deviation good enough for your needs. This can be as few as three or five; many people make at least ten repeat measurements.
- The results of measurement assurance processes (control chart data, for example)
- Results of gage repeatability and reproducibility (gage R&R) studies
- Evaluation of the data from a series of calibration reports. This is especially good if all of the reports contain "as found" data as well as "as left" data.
- Data from experiments specifically designed to evaluate a parameter or condition. For example, you might want to record data on the performance of a measurement system at different temperatures, to find out how it really performs — for example, if you do work at a customer's non-air-conditioned site instead of in your home office laboratory.

Very occasionally, an instrument manufacturer may identify some factor in a specification as being a contributor evaluated by Type A methods. That's fine for them. For you, the user, it is treated as a Type B uncertainty contributor, because you or your people did not make the relevant measurements.

Remember: Type A methods of evaluation use statistics on data created by or available to you. Type A is a method of evaluation, not a type of uncertainty.

Type A & Type B Methods of Evaluation


Type B Methods of Evaluation

Type B methods of evaluating measurement uncertainty components include any


reasonable method except statistical evaluation of your own measurements. The result
of a Type B evaluation is called a standard uncertainty.

For example, the result of a length measurement is often affected by changes in temperature because parts and gages expand and contract as they get hot or cold. If we repeatedly measured the length at various actual temperatures and calculated the uncertainty that resulted, it would be a Type A method. If instead we looked up the expansion coefficient of the material and multiplied that by the expected range of temperatures, it would be a Type B method of evaluation. In either case the uncertainty component would be the thermal coefficient of expansion.
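A minimal sketch of that Type B calculation, using hypothetical values for a 100 mm steel part. Treating the temperature excursion as a rectangular distribution is a common GUM convention and gives the divide-by-square-root-of-3 step:

    import math

    length = 100.0     # mm, nominal length being measured
    alpha = 11.5e-6    # per degree C, expansion coefficient of steel (handbook value)
    half_range = 2.0   # degrees C, expected temperature excursion about the nominal

    # Largest length change expected over the temperature range
    max_effect = length * alpha * half_range        # mm

    # Assume the effect is equally likely anywhere in +/- max_effect (rectangular)
    u_thermal = max_effect / math.sqrt(3)           # standard uncertainty, mm
    print(f"u(thermal) = {u_thermal:.6f} mm")       # about 0.0013 mm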

Another uncertainty component evaluated by Type B methods, included in all measurements and calibrations, is the measurement uncertainty of the standards used for the calibration of the instrument. The uncertainty of the standard is obtained from its calibration certificate.

Uncertainty Budgets
An uncertainty budget is a listing of all known sources of uncertainty, their magnitudes, how they were obtained, and whether they are evaluated by Type A methods or Type B methods. The end result is a value for the measurement uncertainty of that specific measurand using that specific measurement system.

An uncertainty budget may also be used as part of a process to think about all of the
different sources of uncertainty, both to figure out how they contribute to the overall
uncertainty and to look for areas that can be improved.

The details of the illustration will be covered later in this course. One thing that should
be obvious, though, is that the reproducibility of the measurement system accounts for
almost all of the measurement uncertainty.
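For a sense of how the numbers in a budget combine, here is a minimal sketch. The contributor names and values are hypothetical; the root-sum-of-squares combination assumes the contributors are independent, and k = 2 is a common coverage factor:

    import math

    # (source, standard uncertainty in micrometers, evaluation method)
    budget = [
        ("repeatability",      0.12, "A"),
        ("reproducibility",    0.80, "A"),
        ("reference standard", 0.10, "B"),
        ("thermal effects",    0.05, "B"),
    ]

    # Combine independent standard uncertainties by root-sum-of-squares
    u_combined = math.sqrt(sum(u ** 2 for _, u, _ in budget))
    U_expanded = 2 * u_combined  # coverage factor k = 2, approx. 95% confidence

    print(f"combined = {u_combined:.3f} um, expanded (k=2) = {U_expanded:.3f} um")

Note how the 0.80 reproducibility contributor dominates the combined value of about 0.82 in this made-up budget, the same pattern described above.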

Uncertainty Budgets

In other cases, many sources of uncertainty are quantified during a statistical process
such as measurement assurance (check standards).

The uncertainty that is reported here can appear as a single line in the budget but it
may represent several sources. When we determine uncertainty this way, we can't
figure out the separate individual contributions of these several sources, but much of
the time we don't need to — we only need the total.

In fact, we only need to formalize the separation of sources in a measurement assurance process if we need to know how big each separate one is. This is only useful if the overall uncertainty is too large and we plan to design a better measurement. At that point, we need to know which sources need the most improvement.

Uncertainty Budgets

Type B evaluation of uncertainties may require engineering calculations, tempered by experience and judgment. No rule states how deep you are required to go in performing a Type B evaluation, except that "all known significant contributors" should be evaluated.

In the example below, where temperature affected length, you might consider the
uncertainty in measurement of the temperature as another Type B in the length
equation. Length standards will change depending on whether they are lying down or
standing up. Gravity will shorten a standing gage block due to its own weight (but only a
very small amount).

Including or omitting each of these considerations is optional, but there is an expectation that any significant factor will be included. If there are known uncertainty contributors that your analysis determines are not significant (that is, omitting them will not materially change the result) they may be omitted. Omitted contributors should still be referred to in a note stating what was considered and omitted, and that they are not significant.

Specifications

Frequently, we are required to calculate the uncertainty of a measurement system where we aren't sure about some factors that aren't really under our control. For example, suppose we are using a multimeter to measure the resistance of an RTD (resistance temperature detector, such as a platinum thermometer or a thermistor).

The uncertainty of the multimeter must be taken into account, but unless we conduct a
statistical experiment and evaluate the uncertainties using Type A methods, how do we
include it?

Of course the manufacturer of the multimeter has given us specifications, but whatever
methods the manufacturer used to determine them, for us they are still something to be
evaluated by Type B methods.

Specifications

In converting manufacturer's specifications for use in Type B evaluations, we are allowed to use any method that makes good engineering sense.

We will look at four approaches that are acceptable, but they are not the only ones
available. If none of these fit a particular situation, it would be best to consult an expert.

Specifications
Method 1

You know the coverage factor and/or confidence level

The manufacturer specification, or a calibration report, gives you a statement of uncertainty that indicates the values are based on standard uncertainties determined using methods conforming to the GUM. Either the confidence level, or the coverage factor and confidence level, must be given with the specifications. Look for a statement that the specifications are "for a 95% confidence interval", or "coverage factor k = 2 at the approximately 95% confidence interval". In the last case you will also usually see the term "expanded uncertainty".

For this type of statement, find the standard uncertainty by dividing the values given by the coverage factor. If only confidence levels are given, divide by 2 if it is 95% (strictly, k = 1.960), or by 3 if it is 99.73% (for 99%, the strict factor is k = 2.576). You will rarely if ever see 68% (k = 1), but it is possible.

Note that 95% is used as an example here; 99% is also common. Any other given value that can be unambiguously traced to the standard deviation of a normal distribution may be used.

The resulting value is a standard uncertainty and can be used as-is as one contributor in
the Type B evaluation section of your uncertainty budget.
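A minimal Python sketch of Method 1, assuming the quoted value and either its coverage factor or its confidence level are known; the mapping from confidence level to factor assumes a normal distribution, as described above. The function name and example values are illustrative.

```python
from statistics import NormalDist

def standard_uncertainty(quoted_value, k=None, confidence=None):
    """Convert an expanded uncertainty to a standard uncertainty.

    Divide by the stated coverage factor if one is given; otherwise
    derive the factor from the stated confidence level, assuming a
    normal distribution.
    """
    if k is None:
        # e.g. confidence=0.95 -> k=1.960, confidence=0.99 -> k=2.576
        k = NormalDist().inv_cdf(0.5 + confidence / 2)
    return quoted_value / k

print(standard_uncertainty(0.0015, k=2))              # 0.00075
print(standard_uncertainty(0.0015, confidence=0.95))  # ~0.000765
```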

Specifications
Method 2

You know the standard deviation.

In some rare cases, the manufacturer's specification may explicitly state that the values
are standard deviations or standard uncertainty, but not give the coverage factor.
(Standard uncertainty can include other factors that have been mathematically
converted to a value equivalent to a standard deviation.)

In this case, assume that the values represent one standard deviation and use them as-
is as one contributor to your Type B evaluation.

Specifications
Method 3

The manufacturer's specification is a tolerance and you have no other knowledge.

Most often, the accuracy of an instrument will simply be stated in what appears to be a
tolerance, as in ...

Accuracy for any range = ± 2 microvolts


In fact, we can only assume what this specification implies. The usual assumption is that
the value could be anywhere in that range with equal probability. This is called a
rectangular distribution.

For this instrument, we must assume that what the specification means is that the
instrument reading will be within ±2 microvolts of the "true value" of the measurand,
with no statement at all about where in the interval the reading is likely to be.


Specifications
Method 3 (continued)

For the rectangular distribution method, you can find the equivalent standard uncertainty (u) by taking one side of the tolerance and dividing by the square root of 3. If the specification is ± a, then

u = a / √3
This is a powerful method. It gives us the opportunity to quickly and easily calculate an uncertainty that can be used in GUM evaluation methods, but without knowing very much about its source. It can be done with confidence that the result is reasonable and reflects good engineering judgment.

The downside is that the resulting uncertainty is quite large compared to what it might
be if we spent the time and other resources to fully characterize the instrument and
determine a real standard deviation. Of course that's what we lose by not knowing
enough about our uncertainty sources.

Specifications
Method 3 (continued)

There are a lot of things, other than what are normally considered specifications or
tolerances, that are also assumed to be rectangular distributions. Here are a few
examples.

Resolution of digital displays: half of the interval between the least significant digits of a digital display. This is usually but not always one count. (Intervals of two or five are not uncommon.) The least significant digit is the one on the right. Example: if the meter is reading on the 1.000 V range and the interval is one digit, you would take 0.0005 V (half of 0.001) for use in finding the standard uncertainty. After dividing by √3 (the rectangular-distribution factor), that is 0.00029 V.

Resolution of analog meters: half of the smallest division on an analog meter, in the vicinity of the reading. Example: if the meter is reading on the 9 V mark on the 10 V range, and that range has five divisions between each volt mark, the small divisions represent 0.2 V. So you would take 0.1 V (half of 0.2) for use in finding the standard uncertainty. After dividing by √3, that is 0.06 V.

No uncertainty on a calibration report; test uncertainty ratio (TUR) is stated: in a lot of cases a calibration report will not give uncertainty values but will quote a TUR. The most common values are 4:1 and 10:1. In those cases, use the manufacturer specification divided by the ratio as the Type B contributor from the calibration uncertainty. Example: a calibration report for a digital multimeter does not list uncertainty but gives a 4:1 TUR. You need to know the uncertainty attributable to calibration when measuring a 100 Ω resistor on the 400.0 Ω range. The specification of the meter is ± (1.2% of reading + 4 digits) on that range, so a reading of 100.0 Ω would be ± 1.6 Ω. You would divide that by 4, and use 0.4 Ω as the calibration uncertainty contributor in your Type B evaluation. After dividing by √3, that is 0.23 Ω.
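The three conversions above all follow the same rectangular-distribution pattern. This short sketch reproduces the numbers from the examples, using the values assumed in the text:

```python
import math

def rectangular_u(a):
    """Standard uncertainty for a rectangular distribution of half-width a."""
    return a / math.sqrt(3)

# Digital display: 1.000 V range, one-count interval of 0.001 V
print(rectangular_u(0.001 / 2))    # ~0.00029 V

# Analog meter: 0.2 V small divisions, take half a division
print(rectangular_u(0.2 / 2))      # ~0.06 V

# 4:1 TUR: spec +/-(1.2% of 100.0 ohm + 4 digits of 0.1 ohm) = +/-1.6 ohm
spec = 0.012 * 100.0 + 4 * 0.1     # 1.6 ohm
print(rectangular_u(spec / 4))     # ~0.23 ohm
```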

Specifications
Method 4

A tolerance is specified but we know something about the distribution.


If you are given a tolerance, but through engineering judgment you know something about the characteristics of the instrument, another distribution can be assumed. For example, if the instrument in question has very little drift compared to its random variation (noise), you may choose to use a triangular distribution. This approach says that you know there is a center value with random variation around it, and that it is far more likely for the value to be near the center than the edges, but you don't know much else about the variation.

The equation to use to find standard uncertainty (u) from a specification (± a) when assuming a triangular distribution is

u = a / √6
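For comparison, a short sketch showing how much smaller the triangular assumption makes the standard uncertainty for the same ± 2 microvolt specification used earlier:

```python
import math

a = 2e-6  # +/- 2 microvolt specification

u_rect = a / math.sqrt(3)   # no knowledge of the distribution
u_tri  = a / math.sqrt(6)   # values known to cluster near the center

print(f"rectangular: {u_rect*1e6:.2f} uV, triangular: {u_tri*1e6:.2f} uV")
# rectangular: 1.15 uV, triangular: 0.82 uV
```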

Specifications
Method 4 (continued)

Note that some functions of electrical instruments tend to drift slowly during the calibration interval. Instruments that drift must not be characterized by a triangular distribution; use a rectangular distribution in this case. Examples of this kind of instrument include ones that generate or measure frequency or time interval: frequency standards, signal generators, oscilloscopes, and network analyzers.

This may apply to only some functions of an instrument. For example, the timebase of
an oscilloscope (the part that determines time interval and frequency) will always drift,
but the voltage measuring functions may be very stable.

Specifications
Side Trip: Lower-case and Upper-case Symbols

Some symbols have been presented already. A symbol is a convenient short way to
represent a value or type of value, without writing everything out every time. In
measurement uncertainty (and other areas of scientific and technical mathematics), it
can make a huge difference if a symbol is in the wrong case.

For example, consider electrical power: 1 mW is a vastly different amount than 1 MW. The first, one milliwatt or 0.001 W, might be the power consumed by your MP3 player. The second, one megawatt or 1,000,000 W, might be the power generated by a small local power plant — probably more than enough to power every MP3 player in the world at the same time. Upper-case and lower-case actually mean something!

In the case of measurement uncertainty, differences in letter case also matter. For example, the lower-case u that you have already seen represents the standard uncertainty of an uncertainty contributor. On the other hand, upper-case U represents something called the expanded uncertainty, which will be covered soon.

Specifications
Important concepts covered so far:

Specifications
Combined Standard Uncertainty

So far, the discussion has been about individual standard uncertainties — either
standard deviations or other values, such as specifications, converted to equivalents of
standard deviations. These are represented by the symbol u. Notice that this is in lower-
case. After the uncertainty budget has been tabulated, the uncertainties are combined
to yield a quantity called the Combined Standard Uncertainty, or uc. That is, lower-case
"u" with subscript "c".

Since uncertainties are always expressed as standard deviations, they must be combined using the normal rules for standard deviations. This rule is called root-sum-square, or RSS. It's applied by squaring each standard deviation, adding up these squares, and taking the square root of the result.

So if I had three uncertainties, u1, u2 and u3, the combined standard uncertainty would be expressed as

uc = √(u1² + u2² + u3²)
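A minimal Python sketch of the RSS rule, with the expanded uncertainty U = k × uc included as well, since it appears in the Reporting section that follows; the three example values are arbitrary.

```python
import math

def combined_standard_uncertainty(uncertainties):
    """Root-sum-square (RSS) combination of standard uncertainties."""
    return math.sqrt(sum(u**2 for u in uncertainties))

# Arbitrary example values
u_c = combined_standard_uncertainty([0.0005, 0.0002, 0.0001])
U = 2 * u_c   # expanded uncertainty with coverage factor k = 2

print(f"uc = {u_c:.6f}, U (k=2) = {U:.6f}")
```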

Specifications

There are some rare exceptions to this combining rule that happen when the underlying
measurements are interrelated in a fashion that is called correlated.

This rule is also not exactly correct when the measurements are not based on a linear
scale — decibels are the most common example. These exceptions are usually handled
by experienced metrologists and are not often an important factor in daily work.

Reporting
Expanded Uncertainty

When communicating uncertainty in a calibration report or specification, you can simply state the combined standard uncertainty, uc, as long as it is identified that way. However, the preferred method is to state a value called the Expanded Uncertainty, represented by the symbol U (upper-case). Expanded uncertainty is the combined standard uncertainty multiplied by the coverage factor. In symbols, that is:

U = k × uc
Reporting

Expanded uncertainty should always be labeled by name, and the coverage factor must
be stated. That way, everyone will know that you have used the standard method for
determining uncertainty and what you are talking about.

Example: The measured length of this gage block is 1.000002 inch, with expanded
uncertainty of 0.00006 inch, k = 2, at 20 °C.

Reporting

Sometimes, it's more convenient or more conventional to specify the uncertainty of the
measurand or the measurement system in the form X ± U. (X is the value of the
measurand, and U is the uncertainty or related value.)

For example, a thermometer manufacturer might specify the "accuracy" of a precision liquid-in-glass thermometer as ± 0.5 °C over its range.

Manufacturers and engineers have used this kind of specification in different ways in the past. Now, if what is meant is the expanded uncertainty, then when using uncertainty in this way we should specify the coverage factor k.

The coverage factor (k) can be any value, but is almost always between 2 and 3. When used in this way the result is called the expanded uncertainty, and is written ± U, where the expanded uncertainty U = k × uc, k is the coverage factor, and uc is the combined standard uncertainty.

Reporting

Example

Within one year after calibration, this multimeter's uncertainty on the 1.0000 volt range is ± 0.0015 V, k = 2.

By mentioning the coverage factor k, you are communicating that the specification is an expanded uncertainty and was calculated in a manner that followed the GUM methods.

Reporting

If a standard normal distribution (also called a Gaussian or bell-shaped distribution) is assumed for the results, a coverage factor of 2 would provide for a measurement uncertainty statement with an exact confidence level of 95.45%, and a coverage factor of 3 would provide an exact 99.73% confidence level.

The statistical terms "confidence level" and "confidence interval" have exact definitions
in statistics and are based on the standard normal distribution. Since Type B methods of
uncertainty evaluation involve estimates, approximations and assumptions, the exact
values shown above are generally not justified. Also, the real world does not always fit
neatly into the normal distribution.

Reporting

Because the estimates, assumptions and approximations are useful in many measurement situations, though, we can speak of an approximately 95% level of confidence for k = 2, and an approximately 99% level of confidence for k = 3.

Be extremely careful when communicating about expanded uncertainty, to make sure that your audience realizes that these are only approximate statements.

Reporting

Uncertainty statements are also subject to other limitations when reporting the capability of measurement and test equipment (M&TE). Since M&TE measures over one or more specified ranges, the uncertainty statement must be reported separately for each range of operation.

If the instrument is capable of measuring more than one parameter (voltage, current,
resistance, etc.) and has one or more ranges, the uncertainty statement would need to
be reported for each range and parameter.

Dimensional Example
Reproducibility

The reproducibility was computed as the standard deviation of 30 readings taken on a gage block using a good quality micrometer. Each of three operators took ten readings on a 0.5 inch Class 1 gage block. The one-standard-deviation reproducibility on the 0.5 inch block was 30 microinches.

Dimensional Example
Resolution

The resolution of an instrument is defined as the least count of the readout. For a 0 - 1 inch vernier micrometer, it is assumed to be 0.0001 inch. One standard uncertainty is one-half the resolution divided by the square root of three, or 29 microinches in this instance. The larger of the resolution and the reproducibility was used as the Type A uncertainty in our uncertainty budget.

Dimensional Example
Master Values

The uncertainty due to the calibration of the gage blocks was evaluated from the information in the calibration report. All the blocks were Grade 1 blocks, and the calibration report did not give measurement uncertainties. It is assumed the values of the blocks could be anywhere within the block tolerance; therefore, the tolerance was divided by the square root of three to get one standard uncertainty.

Dimensional Example
Temperature

The room is maintained at 20 °C ± 2 °C. However, the thermometer in the room may be in error. Assuming the room thermometer is good to ± 0.25 degrees, the temperature of the micrometer could be anywhere between 17.75 and 22.25 °C. It is also possible that the block and the micrometer may not be at the same temperature. All three factors can contribute to the uncertainty of the calibration. In our case the operator wore gloves, we did not measure the temperature, and the difference in temperature between the block and the micrometer was estimated not to exceed 1.0 °C.

Dimensional Example
Temperature (continued)

If the block and micrometer are at the same temperature, the temperature could deviate from the standard 20 °C by 2.25 °C. Even when both are at the same temperature, possible differences in the thermal coefficient of expansion can cause uncertainties in the measurement. It is reasonable to assume a difference in coefficient of expansion of 2 × 10⁻⁶ per °C between the steel in the gage block and the steel in the micrometer. If the block and micrometer were 2.25 degrees from nominal, the maximum error would be:

block size × δT × (2 × 10⁻⁶) = 0.5 × 2.25 × (2 × 10⁻⁶) inch, or 2.25 microinches.

If the micrometer (steel, with a coefficient of expansion of about 11.5 × 10⁻⁶ per °C) is one degree warmer than the block, the maximum uncertainty would be:

0.5 × (11.5 × 10⁻⁶) × 1 inch, or 5.75 microinches.

Dimensional Example
Temperature (continued)

We will take the root sum of squares of the two temperature values to get a possible uncertainty caused by temperature of 6.2 microinches.

But it is unlikely that the worst case will always happen, so we will treat it as a rectangular distribution and divide by the square root of 3 to get about 4 microinches as the standard uncertainty due to temperature.

Dimensional Example

It can be seen from the uncertainty budget that the uncertainty in the calibration of the
micrometer is limited by the resolution of the instrument.
Electrical Example

A high resolution digital voltmeter is used to measure the voltage developed across a standard resistor and an unknown resistor of the same nominal value as the standard, when the series-connected resistors are supplied from a constant current source. The value of the unknown resistor RX is given by:

RX = RS × (1 + RD + RT) × (VX / VS)

where:

 RS = Calibration value for the standard resistor
 RD = Relative drift in RS since previous calibration
 RT = Relative change in RS due to the temperature of the oil bath
 VX = Voltage across RX
 VS = Voltage across RS

Electrical Example

The calibration certificate for the standard resistor reported an uncertainty of ± 1.5 ppm at a level of confidence of not less than 95% (k = 2).

A correction was made for the estimated drift in the value of RS. The uncertainty in this correction, RD, was estimated to have limits of ± 2.0 ppm.

The relative difference in resistance due to temperature variations in the oil bath was estimated to have limits of ± 0.5 ppm.

The same voltmeter is used to measure VX and VS, and although the uncertainty contributions will be correlated, the effect is to reduce the uncertainty, and it is only necessary to consider the relative difference in the voltmeter readings due to linearity and resolution, which was estimated to have limits of ± 0.2 ppm for each reading.

Electrical Example
Type A Evaluation

Five sets of voltage measurements were made and used to calculate five values for the VX / VS ratio, in ppm.

The ratios are:

10.4, 10.7, 10.6, 10.3, 10.5

The average of the five ratios is 10.5 ppm, with a standard deviation of 0.16 ppm.

The uncertainty in the ratio (the standard deviation of the mean, s/√5) is 0.07 ppm.

Electrical Example
Reported Result

Assuming the standard resistor was exactly 10,000 Ω, the value of the unknown resistor is (10,000 + (10,000 × 10.5 ppm)) = 10,000.105 Ω, which is rounded to 10,000.10 Ω.

The expanded uncertainty U is (10,000 × 2.836 ppm) = 0.0284 Ω, which is rounded to 0.03 Ω.

So, the reported value of the unknown resistor will be 10,000.10 Ω ± 0.03 Ω.
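The 2.836 ppm figure can be reproduced from the contributors listed above. This sketch combines them by RSS, treating the tolerance-style limits as rectangular distributions (the assumption this example appears to use), and applies k = 2:

```python
import math
from statistics import stdev

ratios = [10.4, 10.7, 10.6, 10.3, 10.5]           # ppm
type_a = stdev(ratios) / math.sqrt(len(ratios))   # ~0.07 ppm

contributors = [
    1.5 / 2,              # standard resistor: +/- 1.5 ppm at k = 2
    2.0 / math.sqrt(3),   # drift correction, rectangular
    0.5 / math.sqrt(3),   # oil bath temperature, rectangular
    0.2 / math.sqrt(3),   # voltmeter reading of VX, rectangular
    0.2 / math.sqrt(3),   # voltmeter reading of VS, rectangular
    type_a,               # Type A: standard deviation of the mean
]

u_c = math.sqrt(sum(u**2 for u in contributors))
U = 2 * u_c
print(f"uc = {u_c:.3f} ppm, U (k=2) = {U:.3f} ppm")   # U ~ 2.836 ppm
```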

Click "Back" to review uncertainty budget.

Temperature Example

Uncertainty guidelines follow JCGM 100:2008 (GUM). The expanded uncertainty, U, assigned to the measurement is calculated by:

U = k × √(s² + Σ ui²)

where:

 k = the coverage factor
 s = Type A standard uncertainty based on the statistical analysis of a series of measurements
 ui = the estimated Type B standard uncertainty for each known component in the measurement process that cannot be directly measured

Temperature Example

The uncertainty computed using this equation with k = 2 gives a 95.45% level of confidence for a normal distribution. This is usually rounded to "approximately 95%", and is consistent with the GUM methods.

Temperature Example
Type A Uncertainty — Process Standard Deviation

Type A components are characterized statistically using long term process control
charts of bath stability. The Type A uncertainty includes the following component:

Bath Stability

Temperature Example
Other Uncertainty Contributors — Type B methods

 Standard thermometer sensor uncertainty from calibration report
 Standard thermometer meter uncertainty from calibration report
 Temperature uniformity of the thermal bath

Note: An additional uncertainty comes from the readability (resolution) of the thermometer being calibrated. Graduations on a liquid-in-glass thermometer may be as small as 0.2 °C or less, as large as 5 °C or more, and several values in between. For a digital thermometer, the equivalent value is the display resolution, which may be from 1 °C to 0.001 °C or less. Because this is a general example, the resolution is not included in the uncertainty budget example. The results are normally rounded to the resolution of the thermometer being calibrated; in this example the result is rounded to 0.001 °C.
Temperature Example

The uncertainty generally should be calculated for each measured value. Values can be
grouped if the calculations show it is appropriate. For example, the table Temp SOP 40
shows that only three uncertainty budgets would be needed for this calibration system:
one at the triple point of water (0.01 °C, TPW), one for the range from above TPW to
150 °C, and a third from above 150 °C to 400 °C.
Mass Example

The calibration is carried out using a mass comparator whose performance characteristics have previously been determined, and a weight of OIML Class F2. The measured value of the unknown weight is obtained from:

WX = WS + DS + δId + δC + Ab

where:

 WS = Weight of the standard
 DS = Drift of the standard since last calibration
 δId = The rounding of the value to the least significant digit of the indication
 δC = Difference in comparator readings
 Ab = Correction for air buoyancy
Mass Example

The calibration certificate for the standard mass gives an uncertainty of ± 30 mg at a level of confidence of approximately 95%.

The monitored drift limits for the standard mass have been set equal to the k = 2 (approximately 95% confidence level) uncertainty of its calibration, and are ± 30 mg.

The least significant digit Id for the mass comparator represents 10 mg. Digital rounding δId has limits of ± 0.5 Id (± 5 mg) for the indication of values of both the standard and the unknown weights. Combining these two rectangular distributions gives a triangular distribution, with uncertainty limits of ± Id, that is, ± 10 mg.

Mass Example

The linearity error of the comparator over the 2.5 g range permitted by the laboratory's
quality system for the comparison was estimated from previous measurements to have
limits of ± 3 mg.

A previous Type A evaluation of the repeatability of the measurement process (10 comparisons between standard and unknown) gave a standard deviation, s(WR), of 8.7 mg. This test replicates the normal variation in positioning a single weight on the comparator, and therefore includes effects due to eccentricity errors.

No correction is made for air buoyancy, for which the uncertainty limits were estimated
to be ± 1 ppm of nominal value, i.e. ± 10 mg.

Mass Example

Three results were obtained from the unknown weight using the conventional technique
of bracketing the reading with two readings for the standard. The results were as
follows:
 Mean difference + 0.02 g
 Mass of standard 10,000.005 g
 Calibration result 10,000.025 g

Mass Example

Since three comparisons between standard and unknown were made (using three readings on the unknown weight), this is the value of n that is used to calculate the standard deviation of the mean.

The calculation is:

s = s(WR) / √n = 8.7 / √3 ≈ 5.0 mg
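The reported ± 0.049 g can be checked by combining the contributors above. A sketch, under the assumptions that the drift, linearity, and buoyancy limits are rectangular and the rounding term is triangular:

```python
import math

contributors_mg = [
    30 / 2,              # standard mass calibration: +/- 30 mg at ~95% (k = 2)
    30 / math.sqrt(3),   # drift limits, rectangular
    10 / math.sqrt(6),   # digital rounding, triangular with limits of +/- 10 mg
    3 / math.sqrt(3),    # comparator linearity, rectangular
    10 / math.sqrt(3),   # air buoyancy, rectangular
    8.7 / math.sqrt(3),  # repeatability: s(WR) / sqrt(n), with n = 3
]

u_c = math.sqrt(sum(u**2 for u in contributors_mg))
U = 2 * u_c
print(f"uc = {u_c:.1f} mg, U (k=2) = {U:.1f} mg")  # ~24.6 mg, U ~ 49 mg = 0.049 g
```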


Reported Result

The measured value of the 10 kg weight is 10,000.025 g ± 0.049 g.

Introduction

The standard document ISO/IEC 17025, General requirements for the competence of testing and calibration laboratories, requires accredited calibration and testing laboratories to have and apply procedures for estimating the uncertainty of their measurements, using accepted methods of analysis. Even if it is not reported to the customer, measurement uncertainty must still be estimated and recorded. (ISO/IEC 17025:2005, clauses 5.4.6, 5.6.2, 5.10.3, 5.10.4 and others.)

Introduction

The requirements for estimation of measurement uncertainty apply to all results provided by calibration laboratories.

They also apply to results produced by testing laboratories under the following
circumstances:

 Where it is required by the client.
 Where it is required by the specification to which the test is carried out.
 Where the uncertainty is relevant to the application or validity of the result; e.g., where the uncertainty affects compliance to a specification or stated limit.

Classification of Components

The result of a measurement is an approximation or estimate of the value of the specific quantity subject to measurement (the measurand), and thus the result is complete only when accompanied by a quantitative statement of its uncertainty.

The uncertainty of the result of a measurement generally consists of several components which, in the GUM approach, may be grouped into two categories according to the method used to estimate their numerical values:

 Type A: those which are evaluated by statistical methods, and
 Type B: those which are evaluated by any other methods.
Classification of Components

There is no simple relationship between the classification of uncertainty components into categories A and B and the commonly used classification of uncertainty components as "random" and "systematic." Both random and systematic components still exist, of course. Examples are noise (random) and bias (systematic). What is different is the way they are evaluated: by statistical methods (Type A) or by other methods (Type B). There is usually no relationship between "random or systematic" and "Type A or Type B methods". For this reason, the terms random and systematic usually are not used any more in the context of calculating measurement uncertainty.

You will see the terms random and systematic used in older (before the late 1990s)
literature on measurement uncertainty, and in more recent literature in contexts other
than estimation of measurement uncertainty. For example, the terms are commonly
used in the context of process quality assurance.

Classification of Components
Note

The difference between error and uncertainty should always be borne in mind. For
example, the result of a measurement after correction may be very close to the
unknown value of the measurand, and thus have a small error, even though it may have
a large uncertainty.

Because the uncertainty is large, though, we'll never know whether this result is very
close to the unknown. All we know for sure is that the value of the uncertainty gives us
a range of values into which the unknown is very likely to fall.

Basic Calculation of Uncertainty

Basic to the GUM approach is the representation of each component of uncertainty of a measurement result by an estimated standard deviation.

This standard deviation is called standard uncertainty, represented by the suggested symbol ui, and equal to the positive square root of the estimated variance.

Basic Calculation of Uncertainty


An uncertainty component evaluated using Type A methods is represented by a statistically estimated standard deviation si, equal to the positive square root of the statistically estimated variance si², and the associated number of degrees of freedom νi.

For such a component the standard uncertainty is ui = si.

The evaluation of uncertainty by the statistical analysis of a series of observations is termed a Type A evaluation (of uncertainty).

Basic Calculation of Uncertainty

In a similar manner, an uncertainty component evaluated using Type B methods is represented by a quantity uj that may be considered an approximation to the corresponding standard deviation; it is equal to the positive square root of uj², which may be considered an approximation to the corresponding variance and which is obtained from an assumed probability distribution based on all the available information.

Since the quantity uj² is treated like a variance and uj like a standard deviation, for such a component the standard uncertainty is simply uj.

The evaluation of uncertainty by means other than the statistical analysis of series of observations is termed a Type B evaluation (of uncertainty).

Basic Calculation of Uncertainty

Correlations between components (evaluated by either type of method) are characterized by estimated covariances or estimated correlation coefficients.

These correlations are almost always ignored in everyday practice in the field, and are usually left for use when working at the highest levels of accuracy.

This is not always right, as correlations can sometimes make a significant difference in results, but they won't be covered in this lesson. Where they appear in some uncertainty budget examples, the value is set to 1, which is mathematically the same as ignoring it.

Basic Calculation of Uncertainty


Type A Methods of Evaluation

Type A evaluation of standard uncertainty may be any valid statistical method for
treating data.

Examples are calculating the standard deviation of the mean of a series of independent
observations, using the method of least squares to fit a curve to data in order to
estimate the parameters of the curve and their standard deviations, or carrying out an
analysis of variance (ANOVA) in order to identify and quantify effects in certain kinds of
measurements.

Basic Calculation of Uncertainty


Type A Methods of Evaluation (continued)

Gage repeatability and reproducibility (R&R) studies are similar to ANOVA studies, but
often omit important effects. If the measurement situation is especially complicated,
consider obtaining the guidance of a statistician, or consulting other reference material.
Similarly, it's not a good idea to estimate uncertainty effects from Gage R&R results
without further statistical evaluation.

Basic Calculation of Uncertainty


Type B Methods of Evaluation

Type B evaluation of standard uncertainty is usually based on scientific judgment using all the relevant information available, which may include:

 Previous measurement data
 Experience with, or general knowledge of, the behavior and properties of relevant materials and instruments
 Manufacturer's specifications
 Data provided in calibration and other reports, and
 Uncertainties assigned to reference data taken from handbooks.

On the following screens are some useful methods for calculating an uncertainty
component using Type B methods.

Basic Calculation of Uncertainty


Type B: Calculation Method 1

Convert a multiple of an estimated standard deviation (e.g., ± 2σ) to a standard uncertainty by dividing the quoted uncertainty by the multiplier.

Basic Calculation of Uncertainty


Type B: Calculation Method 2

Convert a quoted uncertainty that defines a "confidence interval" such as 95 or 99 percent to a standard uncertainty by treating the quoted uncertainty as if a normal distribution had been used to calculate it, and dividing it by the appropriate factor for such a distribution.

These factors are 1.960 for 95% and 2.576 for 99%.
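These factors are the two-sided quantiles of the standard normal distribution, and can be reproduced with the Python standard library (a sketch, not part of the course material):

```python
from statistics import NormalDist

def divisor_for_confidence(confidence):
    """Two-sided normal-distribution factor for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

print(f"{divisor_for_confidence(0.95):.3f}")  # 1.960
print(f"{divisor_for_confidence(0.99):.3f}")  # 2.576
```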

Basic Calculation of Uncertainty


Type B: Calculation Method 3

Model the quantity in question by a normal distribution and estimate lower and upper limits a-minus and a-plus such that the best estimated value of the quantity is

[(a-minus + a-plus)/2]

(i.e. the center of the limits), and there is 1 chance out of 2 (i.e. a 50 percent probability) that the value of the quantity lies in the interval a-minus to a-plus.

Then uj = 1.48a, where a = [(a-plus - a-minus)/2] is the half-width of the interval.

Basic Calculation of Uncertainty


Type B: Calculation Method 4

Model the quantity in question by a normal distribution and estimate lower and upper limits a-minus and a-plus such that the best estimated value for the quantity is

[(a-minus + a-plus)/2]

and there is about a 2 out of 3 chance (i.e. a 67 percent probability) that the value of the quantity lies in the interval a-minus to a-plus.

Then uj = a, where a = [(a-plus - a-minus)/2] is the half-width of the interval.

Basic Calculation of Uncertainty


Type B: Calculation Method 5

Estimate lower and upper limits a-minus and a-plus for the value of the quantity in question such that the probability that the value lies in the interval a-minus to a-plus is, for all practical purposes, 100 percent.

Provided that there is no contradictory information, treat the quantity as if it is equally probable for its value to lie anywhere within the interval a-minus to a-plus; that is, model it by a uniform or rectangular probability distribution.

The best estimate of the value of the quantity is then

[(a-minus + a-plus) / 2]

with uj = a/√3, where a = [(a-plus - a-minus) / 2] is the half-width of the interval.

Basic Calculation of Uncertainty


Type B: Calculation Method 5 (continued)

If the distribution used to model the quantity is triangular rather than rectangular, then uj = a/√6.

If the quantity in question is modeled by a normal distribution, there are no finite limits that will contain 100 percent of its possible values.

Basic Calculation of Uncertainty


Type B: Calculation Method 5 (continued)

However, plus and minus 3 standard deviations about the mean of a normal distribution corresponds to 99.73 percent limits.

Thus, suppose the lower and upper limits a-minus and a-plus of a normally distributed quantity with mean

[(a-minus + a-plus) / 2]

are considered to contain almost all of the possible values of the quantity, that is, approximately 99.73 percent of them.

Then uj = a/3, where a = [(a-plus - a-minus) / 2] is the half-width of the interval.
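The half-width conversions in Methods 3 through 5 all follow the same pattern. This sketch collects them in one place; the divisors come straight from the text above, and the ± 2 example limits are arbitrary:

```python
import math

def half_width(a_minus, a_plus):
    return (a_plus - a_minus) / 2

# Standard uncertainty u_j from limits, by assumed distribution
CONVERSIONS = {
    "normal, 50% within limits":    lambda a: 1.48 * a,
    "normal, 67% within limits":    lambda a: a,
    "normal, 99.73% within limits": lambda a: a / 3,
    "rectangular":                  lambda a: a / math.sqrt(3),
    "triangular":                   lambda a: a / math.sqrt(6),
}

a = half_width(-2.0, 2.0)   # e.g. limits of +/- 2 units
for name, f in CONVERSIONS.items():
    print(f"{name}: u_j = {f(a):.3f}")
```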

Basic Calculation of Uncertainty

The rectangular distribution is a reasonable default model in the absence of any other
information.

But if it is known that values of the quantity in question near the center of the limits are
more likely than values close to the limits, a triangular or a normal distribution may be a
better model.

Basic Calculation of Uncertainty

Because the reliability of evaluations of components of uncertainty depends on the quality of the information available, it is recommended that all parameters upon which the measurand depends be varied to the fullest extent practicable, so that the evaluations are based as much as possible on observed data.

Whenever feasible, the use of empirical models of the measurement process founded
on long-term quantitative data, and the use of check standards and control charts that
can indicate if a measurement process is in a state of statistical control, should be part
of the effort to obtain reliable evaluations of components of uncertainty.

Type A evaluations of uncertainty based on limited data are not necessarily more
reliable than soundly based Type B evaluations.

Introduction

The classic approach to stabilizing and controlling measurements is the Program Control
method that we discussed earlier. This is the method of periodic calibrations, historical
records of instrument or system performance, and stickers on the equipment.

This method is still almost universally used and is still valid. It's also less expensive for
many applications where calibrations are easy and the consequences of being out of
calibration are not great.

The other, newer, method is called measurement assurance. It's based on the principle
that measurement is a process and can therefore be monitored and controlled by
common process control tools such as statistical process control.

Check Standard

One method of performing measurement assurance is regularly using a check standard.

A check standard is an item similar to or identical to the regular, routine items being
measured by the measurement system being controlled.

If the system usually measures carbon film resistors, the check standards should be
carbon film resistors. If the system usually measures radar transmitters, the check
standard should be a radar transmitter.

Check Standard

If measuring the check standard does not change its value (which is normally true
unless the test is destructive) a perfect measurement system should get the same
answer every time.

Of course, no real measurement system will get the same answer every time because of
the many usual sources of variation.

Because of the assumption that the check standard does not change, any variability
observed in measurements of the check standard over a period of time must represent
the variability of the measurement system during that time.

Check Standard

Measurement assurance does not calibrate the instruments used in the system.

A check standard does not have to be calibrated. It might actually be calibrated, but calibration is not necessary for use as a check standard. Therefore, use of a check standard is never a substitute for calibration of the measurement system.

When using a check standard, numerical values of the data collected are most meaningful when analyzed to demonstrate the stability and variability of the measurement process. If the process is shown to be stable, it may be possible to extend the calibration interval of the instruments in the measurement system. On the other hand, use of the check standard might instead show that the instruments should be calibrated more often in order to stay within acceptable limits. Either way, using a check standard gives you useful information, but is never a substitute for calibration.

It is important that the same check standard always be used with a given measurement
system. For example, if the measurement process is calibration of coaxial attenuators
and a 10 dB attenuator is used as a check standard, the same one - by serial number -
must always be used as the check standard.

Check Standard

By using a check standard and measuring it frequently, considerable history can be collected on the performance of the measurement system.

In addition, since the check standard is the same as the regular measurement workload,
any influences on routine measurements, such as temperature or an operator
difference, will be accurately reflected in the measurements of the standard.

Statistics

The statistics of the measurement assurance data are useful for two things.

First, any meaningful change in the data, as indicated by process control charts,
indicates a meaningful change in the measurement system. Unless we have been
deliberately improving our system, a change is usually an indication that something is
wrong.

Even a decrease in measurement variation would seem to be good, but if it happened by accident, it's important to understand why and to make sure that it's not the result of broken equipment or of a new operator not understanding how to make a particular measurement.

Statistics
Example:

Measurement assurance was used to track 25 durometers (rubber hardness testers) in a factory. These devices are troublesome and require frequent calibration.

By maintaining control charts and check standards, the ongoing good behavior of each
durometer was recorded once per day.

When one durometer did fail, the operator caught it right away when she measured the
check standard. The instrument was found to be defective and was replaced
immediately, saving many days of incorrect measurements that would have happened
by waiting until the next calibration cycle.

Statistics

The second use of the measurement assurance data is to record and report the overall
uncertainty of the measurement system.

Since the check standard experiences the same influences as other measurements, its
variation should accurately record variations in the measurement system as a whole.
When expressed as a standard deviation, this is a variation that has been evaluated by
Type A methods.

This uncertainty can then be included in a calculation of combined uncertainty for the
entire measurement process.

Statistics

The best part of using measurement assurance for this purpose is that it is not
necessary to identify and separate all of the various causes of variability.

When doing a complete assessment of uncertainty using Type B methods, each component must be separately identified and its contribution calculated. This is very useful when it's necessary to know each source of uncertainty, but it is more work than recording and analyzing the check standard data.

When calculating overall uncertainty, though, all we need to know is the total and not
the individual pieces. Measurement assurance provides us with exactly that.
Statistics

When measurement assurance is applied to a system, the frequency of recalibration may be reduced if the data support that conclusion.

The control charts that monitor the check standard will show if the measured value of the standard appears to be drifting. Since the check standard itself is assumed not to drift, apparent drift is an indication that the measurement system may need calibration.

More important, if the control chart shows NO drift, it's a good bet that the
measurement system is not drifting and does not need recalibration at this time.
Calibration intervals can often be greatly lengthened while at the same time reducing
the risk that product will be measured incorrectly and require recall, repair, or rework.

The assumption that the check standard does not change should be checked at
intervals. If the check standard is a device that is calibrated anyway, the calibration will
show any change provided it is calibrated using a different measurement system. Other
types of check standards can be periodically measured, again using a measurement
system other than the one it is a check standard for.

Statistics

Of course the check standard might drift even though it's likely to be more stable than
the measurement system.

In addition, the system might drift one way while the standard is drifting the other way
and the control chart will show no change.

If either of these is a concern, one method is to simply use two check standards. The
chances that both will drift the same way are very small indeed.
Frequency

How often should you measure a check standard? It depends on your measurement
process.

If measurements are frequent and easy, five to ten measurements a day is a good
number to start with. It usually requires 30 to 40 points to be recorded before a process
control chart is meaningful, and another 30 points or so to recover from a significant
change and display the new check standard values.

If measurements are difficult or expensive, or it is a type of measurement normally not


done every day, consider how fast you expect the system to drift or otherwise change.
As above, it's a good idea to have about 30 points at the beginning of your charts. If
maintaining the charts is costly, reduce check standard measurement frequency once
the chart is stable.


Use of Data

How do you use measurement assurance data for uncertainty statements?

When calculating limits for the process control charts, you will generate either a standard deviation or an average range (R-bar). Either of these numbers can, in fact, be used as an uncertainty evaluated by Type A methods.

Since the usual uc is a standard deviation, R-bar can be converted to a standard deviation using a simple factor, usually called d2, taken from a table. The tables can be found in any statistical process control text, and many other sources. A couple of these sources are:

NIST Engineering Statistics Handbook, chapter 6 (www.itl.nist.gov/div898/handbook/index.htm) Free!

ASTM MNL 7, Manual on Presentation of Data and Control Chart Analysis, 8th edition (www.astm.org)
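A short sketch of the R-bar conversion, using a few d2 values of the kind tabulated in SPC texts; the sample data here are invented for illustration:

```python
# d2 factors for converting an average range (R-bar) to an estimated
# standard deviation, by subgroup size n (from standard SPC tables)
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_from_rbar(rbar, subgroup_size):
    """Estimate the process standard deviation as R-bar / d2."""
    return rbar / D2[subgroup_size]

# Invented example: average range of 0.12 units from subgroups of 5
print(f"{sigma_from_rbar(0.12, 5):.4f}")   # ~0.0516
```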

Use of Data

Data from measurement assurance is not the whole uncertainty. At a minimum, the
uncertainty inherited from the process by which the measurement system was
calibrated must be estimated using Type B methods and included in an overall
uncertainty.

Some other factors that were not experienced by the check standard may also have to
be figured. For example, if the temperature was pretty stable while the check standard
was in use, but it is known to fluctuate more widely at some other times, the wider
known temperature variation should be used instead of the stable temperature range.

It's important not to double-count your uncertainty. For example if normal temperature
fluctuations were experienced by the check standard and included in the uncertainty
using Type A methods, they should NOT also be calculated and added in using Type B
methods. They are already included.

Introduction
Every measurement decision has the potential of being incorrect. When the results of a measurement process indicate that the unit under test (UUT) is of acceptable quality, there is a calculable probability that the UUT is actually of unacceptable quality. The converse is also possible.

Those probabilities can be referred to as consumer and producer risks, respectively. Calculating those probabilities exactly is a mathematical challenge, since double integration is the standard process used.

The main focus of this section is to present an approximation method that is easier to grasp intuitively. Elaborate software is not needed.

A tool that can be used for these calculations is a common computer spreadsheet
application. Popular examples include Microsoft® Excel® and OpenOffice.org™ Calc. An
approximation method can easily be made to give consumer and producer risk
estimates within ±0.1% of the more sophisticated double integration method. A
comparison of results verifies the accuracy of those estimates.

Every measurement decision has an associated probability of being correct as well as incorrect. This is a reality recognized by the practice of measurement decision risk (MDR) analysis.

If measurement data indicate that the unit under test (UUT) is within specification limits,
there is a finite probability that it actually is not. This is called the probability of false
acceptance (PFA). If a decision is made that the UUT is within specification but actually
is not, the consumer is penalized.

Conversely, if a decision is made that the UUT is not within specification but actually is,
the producer is penalized. This is called the probability of false rejection (PFR).

Introduction

Both cases can jeopardize business relations and profit margins.

It is not enough to calibrate within a quality corridor (85% to 95%) or above a reliability
target (>85%).

It is not enough to calibrate to a 95% level of confidence.

It is not enough to have a low out-of-tolerance history based on evaluation at recalibration time.

The above characteristics, alone or together, are not sufficient measures of the risk of accepting an out-of-tolerance item or of rejecting an in-tolerance item. Measurement risk itself must be quantified.

Introduction

To not quantify those risks is to accept unnecessary calibration costs in some areas and not to expend necessary calibration costs in other areas.

Measurement decision risks are not necessarily better because they are low. They only need to be as low as requirements specify or needs dictate.

Any risk more or less than that required or otherwise desired to meet a specific objective is not in the best interests of business operations.

The level of acceptable risk varies according to the application. For example, the risks
associated with an incorrect fuel gauge reading are different if you are driving across
town, in an airplane flying across the Pacific Ocean, or in a spacecraft half-way to Mars.
You would not want the automobile fuel gauge on the spacecraft, but the fuel gauge in
the spacecraft probably costs several times as much as the whole automobile.

Consumer and Producer Risk

There are risks associated with every measurement decision made. The risks
considered in this section are of two types, consumer risk and producer risk.

Consumer risk is defined here to be, "The unconditional probability that a measurand
(measured quantity) is out-of-tolerance, but measured to be in-tolerance." In ANSI/NCSL
Z540.3 this is called the probability of false acceptance (PFA).

Producer risk is defined here to be, "The unconditional probability that a measurand is in-tolerance, but measured to be out-of-tolerance." In ANSI/NCSL Z540.3 this is called the probability of false rejection (PFR).

Consumer and Producer Risk


Unconditional is defined here to be, "A lack of knowledge that an event has occurred, or
a lack of knowledge that a condition exists."

In the case of consumer risk, this could be a lack of knowledge that a measurand is out-
of-tolerance when it is measured to be in-tolerance.

In the case of producer risk, this could be a lack of knowledge that a measurand is in-
tolerance when it is measured to be out-of-tolerance.

Consumer and Producer Risk


ANSI/NCSL Z540.3:2006, Requirements for the Calibration of Measuring and Test
Equipment

In the USA, there is a national standard that may be a customer requirement in a contract with your company. ANSI/NCSL Z540.3 has, in subclause 5.3, two specific uncertainty-related requirements.

Subclause 5.3 a states "Where calibrations provide for reporting measured values, the measurement uncertainty shall be acceptable to the customer and shall be documented." This is a mandatory requirement to determine and report measurement uncertainty. It should be familiar from the sections of this course up to this point.

Consumer and Producer Risk


ANSI/NCSL Z540.3:2006, subclause 5.3 b

The first part of subclause 5.3 b states "Where calibrations provide for verification that
measurement quantities are within specified tolerances, the probability that incorrect
acceptance decisions (false accept) will result from calibration tests shall not exceed 2%
and shall be documented."

The final part of subclause 5.3 b states "Where it is not practicable to estimate this
probability, the test uncertainty ratio shall be equal to or greater than 4:1." This was
covered in the second module of this course.

Consumer and Producer Risk


ANSI/NCSL Z540.3:2006, subclause 5.3 b

Subclause 5.3 b applies in situations where calibration procedures require verification that the measurements made are within a specified tolerance. This is almost always the case. For example, when applying a stimulus of 100.00 mA to a current meter, the procedure may require the reading to be within the limits of 99.4 mA to 100.6 mA. This requirement would not apply in cases where a value is simply reported. For example, when a gage block is calibrated the report lists the measured value and the measurement uncertainty. There are no tolerance limits, and the owner determines usability based on the reported data.

Subclause 5.3 b also requires that, where it applies, the probability of false acceptance
(PFA) must not be greater than 2%, and that it must be documented.

It can be shown that the 2% PFA roughly corresponds to the 4:1 TUR. However, the
standard allows use of the TUR only if it is not practical to estimate the PFA.

Consumer and Producer Risk


ANSI/NCSL Z540.3:2006, <= 2% PFA

This course will not go into the mathematics of determining the probability of false
acceptance. There are at least two sources for additional information on this.

First, Workplace Training offers courses on this subject. Some laboratory accreditation
bodies and some consultants also occasionally provide courses.

Second, NCSL International has published a handbook that explains every aspect of the
Z540.3 standard. Their discussion of subclause 5.3 is very extensive.

Both the standard and the handbook can be purchased from NCSL International. If your company is a member of NCSLI then your member delegate can purchase it at a discounted price; you also get the discount if you are an Individual Professional member. Both the standard and the handbook are available as paper publications; the standard is also available as an electronic publication.

Consumer and Producer Risk


Assumptions and Related Definitions

In order to evaluate bilateral and unilateral consumer and producer risks, lacking the ability to treat distribution types, biases or skewnesses, it is necessary to assume certain conditions:

 Calibration system probability distribution type is approximately normal, unbiased and unskewed.
 Measurand probability distribution type is approximately normal, unbiased and unskewed.
 Average quality level of the measurand is approximately known.

Consumer and Producer Risk


Spreadsheet Approximation Method

Consumer risk is approximated by summing the products formed by multiplying the probability that a measurand is out-of-tolerance by the probability that the measurand is measured to be in-tolerance.

The process is repeated, beginning with one product and increasing the number of products, resulting in an increasingly accurate estimate of the true consumer risk, given certain qualifying assumptions.

A set of linked worksheets in a spreadsheet application can be the computer application used for the process.
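The same summed-product approximation is easy to express outside a spreadsheet. This Python sketch discretizes the assumed-normal distribution of true measurand values and, for each slice, multiplies the probability mass of the slice by the probability that the measurement lands in-tolerance; summing over the out-of-tolerance slices approximates consumer risk. The tolerance, sigma, and grid values are illustrative assumptions, not figures from this course.

```python
from statistics import NormalDist

def consumer_risk(tol, sigma_uut, sigma_meas, n=200_000, span=8.0):
    """Approximate consumer risk (PFA) by a summed-product method."""
    uut = NormalDist(0.0, sigma_uut)     # population of true UUT values
    meas = NormalDist(0.0, sigma_meas)   # measurement error distribution
    width = 2 * span * sigma_uut / n
    risk = 0.0
    for i in range(n):
        x = -span * sigma_uut + (i + 0.5) * width   # one slice of true values
        if abs(x) > tol:                            # truly out-of-tolerance
            p_x = uut.pdf(x) * width                # probability of this slice
            # probability that the measured value falls inside the tolerance
            p_accept = meas.cdf(tol - x) - meas.cdf(-tol - x)
            risk += p_x * p_accept
    return risk

# Illustrative case: tolerance of +/- 1.0, process sigma 0.5,
# measurement sigma 0.125 (roughly a 4:1 ratio)
print(f"PFA = {consumer_risk(1.0, 0.5, 0.125):.4%}")
```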

Consumer and Producer Risk

To achieve an approximation within 0.1% of the true consumer risk by this summed-product method, it is necessary to sum approximately 160,000 products for the most demanding case.

That number of products requires ten worksheets, each capable of 16,384 products. Each spreadsheet row has been designed to calculate one product.

The spreadsheet's statistical functions and mathematical operations perform the data reduction with minimal data inputs.

Introduction
Statistical Process Control (SPC) is a powerful tool for understanding processes,
monitoring them, and providing information about them.

Invented in the 1920s by Walter Shewhart of Bell Laboratories, its use slowly increased
in industry. Use of SPC was massively increased after the USA entered World War II,
driven by quality requirements of the US War Department (now the Department of
Defense). Use of SPC methods declined after the war, however. It was not widely used
again in the USA until the quality revolution of the 1980s when the usefulness and
economic value of SPC was rediscovered by industries in the USA and other countries.
By then, however, Japanese industry had been using SPC and applying other quality
lessons since the late 1940s, when many of their leaders were taught the methods by
American experts W. Edwards Deming and J. M. Juran. It was the competitive pressure
from the superior quality of various Japanese-manufactured products that forced
industries in other countries to reconsider and start to apply SPC methods again.

Today, SPC is almost universally used in manufacturing, and has many applications in
service industries, business management, and other areas - anywhere an activity can
be looked at as a process.

Introduction

Measurement is such a process. Eisenhart, Cameron, and Pontius, working at the National Bureau of Standards (now NIST), viewed measurement as a manufacturing process whose output is numbers.

If you look at things that way, then you can easily see that all of the tools that are used
to control manufacturing processes can be used for measurement. The application of
quality tools to measurement is called measurement assurance, or sometimes
measurement quality assurance.

We will discuss measurement assurance in detail, but first we need to understand more
about SPC.

Components
To understand SPC, let's break it into its component words and understand each of
them separately.

Process

A process is something that is intended to happen the same way every time. Engineers
and system scientists speak of a process as something that has inputs and outputs, with
something happening in between (a transformation) to produce outputs from the inputs.

This is another good definition, but for our purposes here, let's just say that a process is
intended to happen the same way every time.

Components

Of course, nothing in life is that consistent.

Processes experience both planned and unplanned variation. A car painting process, for
example, has planned variation because it will have different colors.

Different colors may also require different film thicknesses to account for metallic
finishes, and possibly even a different number of coats for different paints.

Still, there's one process, and variations of color, etc. are planned.

Components

Other variations in the process are unplanned and usually undesired.

For example, the coating thickness may be targeted at 1.5 mils, but may vary from 1.3
to 1.6 in different places on a panel. Other panels for the same vehicle may have been
painted in other paint booths and may vary from 1.5 to 1.8 mils (1 mil = 0.001 inch or
25.4 µm).

If this unplanned variation exceeds certain limits, panels intended to be the same color
and intended for the same vehicle may not match!

Components
Process Control

How can we deal effectively with unplanned variation? We need to control the process.

This principle has been known for millennia, and since the early 1900s a considerable
body of information and mathematics has been developed. The principle used is
feedback.

While the process is running, or after its results (products) have been produced, one or
more measurements are made of the product or process to determine whether the
results are as desired.

Components

Information from measurements at the output is sent back into the process input as
feedback for the purpose of correcting any unplanned variation.

If the paint starts to get too thin, the measurement data are returned to the process so
as to increase the thickness in the future. This is called negative feedback because it is
of the opposite sign from the variation: when the paint gets thinner, the feedback says
to get thicker.

Negative feedback can stabilize the process and can minimize some kinds of unplanned
variation. It can be included in instruments and control electronics and therefore
happen automatically. It can reduce costs by improving quality, reducing rejects, and in
our example, save paint by controlling the thickness to the desired value.

Components

Process Control Problems

Process control is not without limitations, though. Every process experiences small
variation due to the random nature of things.

Suppose, for example, that a process value fluctuates upward purely by chance. If the
feedback mechanism senses this increase, it will send back a signal ordering the
machine setting to compensate by going down.

Components
Well, the increase wasn't due to a real change in the process, just a fluctuation, and the
next measurement shows that the process would have come back down anyway.

Now, though, we've gone down twice: first due to natural variation, and second due to
the action of the feedback system.

Now the process value is low by twice the size of the original fluctuation, and the
feedback will react more vigorously to turn things up again.

Components

If the feedback mechanism is automatic, this is called overcontrol. If the changes are
applied manually, it's called tampering.

In either case, without finding ways to reduce this effect, overcontrol can actually
double these small unplanned variations. Of course, process control feedback is still
very useful.

Large changes in process values will still be detected and corrected, and the small
variations, even after being magnified by overcontrol, may still be too small to affect
the final results very much.
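The variance-doubling effect of overcontrol is easy to demonstrate by simulation. The short Python sketch below is a hypothetical illustration: the process produces pure common-cause noise around a target, and a naive controller compensates fully for every observed deviation.

```python
# A minimal simulation of overcontrol (tampering), using illustrative
# values: the process is pure common-cause noise, and the "controller"
# compensates fully for each observed deviation.
import random, statistics

random.seed(1)
N, target, sigma = 10_000, 0.0, 1.0

# Untampered process: output is the target plus random noise.
untouched = [target + random.gauss(0, sigma) for _ in range(N)]

# Tampered process: after each reading, the setting is moved by the
# negative of the observed deviation, chasing the noise.
tampered, setting = [], target
for _ in range(N):
    y = setting + random.gauss(0, sigma)
    tampered.append(y)
    setting -= (y - target)          # feedback reacting to pure noise

print("std dev, hands off:", round(statistics.stdev(untouched), 3))
print("std dev, tampered :", round(statistics.stdev(tampered), 3))
# The tampered standard deviation comes out near sigma * sqrt(2): the
# variance of the small unplanned variation is roughly doubled.
```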

Components

Random vs. Systematic Variations

In order to make process control work better, we need to find a way to distinguish
between random variations, which should not be changed by feedback, and systematic
variations, for which the corrective action of feedback would be useful.

Consider a system in which we have added a statistical filter.

Using very simple statistical tools, usually ones called process control charts, our
system can now distinguish between the two kinds of unplanned variation experienced
by the process.

Components

In the terms of SPC, we call the random variation either:


 common cause variation, or
 common cause noise.

We say that the non-random variation is due to special causes.

By being able to tell the difference between common causes and special causes just by
looking at the data, SPC provides a powerful tool for monitoring, measuring, controlling,
and eventually improving processes.

For measurement uncertainty, analysis of random variation by use of a process control
chart is a Type A method of evaluation.

Note: in this context, "noise" is a synonym for common cause or random variation. It
does not refer to acoustic noise.

Further Advantages

Control engineers have found other ways to reduce the effects of noise on feedback, but
SPC gives other important benefits as well.

Processes can change in many ways, but the feedback we have studied so far can only
detect a shift in the mean, or center line, of the process values.

A process can also change by getting noisier, or by the noise unexpectedly decreasing.
The noise shows up as changes in the overall levels of random variation.

SPC will detect these changes as well, and will produce signals that can be used as
feedback or as diagnostic information to allow process operators to find problems and
take the necessary corrective action.

Further Advantages

A change in noise, or process variation, is especially important to monitor when SPC is
used to track the operation of a measurement process.

Sometimes a manufacturing process will get noisy because a bearing has worn, or
because an incoming material exhibits changed properties. This could be important, but
it might also be okay. If a measurement process becomes noisier, this will inevitably
mean that its measurement uncertainty has increased.

Knowing this is especially important. If the noise, and therefore the uncertainty, of a
measurement decreases, it could be a good thing, but such good luck is not very likely.
It's more probable that something has gone wrong with the measurement system -
something that appears to decrease the variability, but might also be giving incorrect
answers in some other way.

Further Advantages

All significant changes in a process, whether a change of center line or mean or a
change in noise or variation, are signaled by the process control chart.

Just as important, and sometimes even more useful: if no signal is given by the chart, it
means that there are no significant changes to the process being monitored. Many
times, process operators will worry about an apparent process problem, and may even
tamper with the process trying to get it to 'work better'.

If a process control chart says that the process has not changed and is stable within
acceptable limits, this is strong evidence that any unplanned process adjustment can
only make matters worse. Process changes that actually improve the system usually
take a considerable amount of analysis, study and planning to put in place successfully.

We'll learn about how to make and interpret control charts later in the lesson. Let's look
now at some examples of how SPC and control charts can be used to monitor, control
and improve measurements.

Measurement Assurance

Measurement is a process, and by monitoring this process, we can determine if the
process is stable and reliable.

When the measurement changes, we will know about it as soon as the next monitoring
point is recorded. If we're worried about the measurement, we can just look at the
monitoring data and become confident that the measurement process is the same as it
has been.

This method is called measurement assurance, and the first important component of
measurement assurance is the process control chart on which we can view the ongoing
performance of the measurement system.
Measurement Assurance

How can you monitor a measurement process while it is being used in production?

Suppose, for example, that we have a thickness gage measuring the paint from our
previous example. The paint data is always changing because in fact it's recording
variation in the painting process. Under these conditions, we can't tell the difference
between variation of the paint and variation of the measurement.

The answer to this question is in fact the second important component of measurement
assurance. It's called a check standard.

Measurement Assurance

To monitor the measurement of paint thickness, we keep aside some samples of
painted parts, preferably ones just like the parts being manufactured.

These are the check standards, and we measure them once in a while, mixing them in
with new parts that are the routine target of the paint measuring system.

As long as repeatedly measuring these check standards doesn't change them, we can
measure them once in a while and record those results on our process control chart.

Since measuring check standards doesn't change their paint thickness, any variation
observed on the process control chart must have been due to a change in the
measurement process.

Measurement Assurance

If the measurement process varies, and we know that it will, that variation will be
recorded on and analyzed by the process control chart.

As long as the paint thickness of the check standard stays at about the same value, and
as long as the variation of these measurements doesn't change much over time, we
have strong evidence that the measurement system is also making good measurements
of the thickness of new painted parts.

An important attribute of a check standard is that it must be something that is not
changed by the measurement process. In most cases this is true. For example,
measuring the thickness of the paint film does not damage the part or change the
thickness. If the test is a destructive test, a check standard cannot be used. For
example, a peel test to measure the adhesion of the paint to the metal is a destructive
test and it is not possible to have a check standard that can be used repeatedly.

Application to Calibration

The principles of measurement assurance apply just as well to the workload of a
calibration laboratory.

In fact, this is an ideal way to satisfy a requirement of ISO/IEC 17025 (the standard for
operation of calibration labs) for in-service checks of lab equipment between
calibrations.

By keeping check standards such as stable resistors or gage blocks around, calibrating
them occasionally mixed in with a laboratory's regular work, and control charting the
results separately, ongoing confidence in the lab's work can be established during the
times between calibration of the lab's master and working standards.

Application to Calibration

Note that the check standards do not themselves have to be calibrated or traceable -
only stable over time.

A lab or a production line is required to have a traceable calibration periodically, but this
is usually a separate process and may be quite involved or even require sending the
instrument to an outside service.

Once traceability has been established, the measurement process may be effectively
monitored until the next calibration by the use of check standards and process control
charts.

Application to Calibration

Wouldn't it be wonderful if every measuring instrument had a red light on it to indicate
when it needs to be recalibrated?

In fact, this is possible. If ongoing measurements of check standards show that the
measurement process is stable and reliable (under statistical control), there is a great
deal of evidence that the process has not changed, and therefore lengthening the
recalibration interval may be considered.

If the measurement goes out of control, of course, it indicates that repair and/or
recalibration may be necessary immediately, or that something else is wrong with the
total measurement process.

Application to Calibration
Two Voices and Measurement Capability

Any time a measurement is made of an unknown - whether it's an instrument to
calibrate or a paint sample for a thickness test - there are two 'voices' to be heard about
the results.

One is the voice of the customer - this represents what the process owner
(manufacturer, instrument owner, etc.) would like the process to do in order to do the
job for which it has been designed.

The voice of the customer is expressed in terms of specifications, spec limits,
uncertainties, or measurement tolerances. It is the desired performance of the system.

Application to Calibration

The other voice is the voice of the process - this represents what the process itself is
actually doing, in contrast with what the process owner would like it to do.

The control limits on a Shewhart process control chart represent a calculation of what
process behavior is likely or unlikely, and a control alarm signal happens when
something statistically unlikely occurs.

This notion of likely or unlikely is based strictly on the past behavior of the process and
has NO relation to what the process owner wants the process to do.

Application to Calibration

The concept of process capability connects the voices of the customer and the process
by indicating the extent to which the voice of the process is saying what the customer
wants to hear.
When a process or measurement is capable, the expected values of the product (or
measurement of a check standard) fall within the specification limits or tolerance.

Application to Calibration

When a process or measurement is not capable, some proportion of the results will be
out of spec or out of tolerance in some fashion.

If a measurement is not capable, it can be made capable, but this can only be achieved
by an engineering improvement in the measurement process (or by changing the
tolerances so that the customer agrees with the current process performance).

Making Control Charts

There are several different types of control charts, usually designated by letters such as
X-bar, R, S, NP, P, C, and U.

The choice of a chart depends on several factors, but mostly on the nature of the data.
Attribute data (counts of pass-fail, or number of blemishes per surface, for example)
should be analyzed on attribute charts such as P and U.

Measurement data, consisting of measured numbers or variables, should be placed on
variables charts such as X-bar-R.

When measuring check standards for the purposes of measurement assurance, we
often make single measurements at each check rather than organized groups.

The chart most likely to be appropriate for this application is called the Individual-X
moving-range, or IX-MR, chart. Since this type of chart is very general and is widely used
for measurement applications, we'll study it here.

Examples are on the following screens.

There are many good sources for information about all of the different chart types
should you need to know about others.

Making Control Charts

The diagram shows what a basic process control chart looks like.
The top graph plots each actual data point on a horizontal axis representing time. This
is called an X chart, a run chart, or an Individual X or IX chart. (All the names mean the
same thing. The only difference between an X-bar chart and an Individual X chart is that
each point on the first is an average of several measurements, while each point on the
Individuals chart is only a single measurement).

While the amount of time for each point is not important, the sequence is important.
Each point must have occurred later than the one to its left (this is the normal sequence
of a graph anyway). So the horizontal axis of the graph indicates relative time, while the
vertical axis is the measured values.

Making Control Charts

The bottom graph plots the moving range of the data.

The simplest and most common way to compute the moving range is to subtract the X
value of the last point from the X value of the current point, then make the sign
positive. This difference is plotted on the Moving Range or MR chart.

These two charts make a pair that should not be separated, because both pieces of
information are needed to interpret the behavior of the process.

Each point on the IX chart must have a corresponding point on the MR chart. The only
exception is that when starting a new chart, there cannot be a range value for the first
measurement point because there is no previous point to subtract from.

Making Control Charts


The diagram has several horizontal lines on it as well as the data. They include:

The center line for IX. This is the average (mean) value of IX, called IX-bar (the bar
denotes an average).

The center line for MR. This is the mean value of MR, called MR-bar.

Making Control Charts

They also include upper and lower limit lines for both IX and MR. These are called the
control limits, and are usually drawn so as to be three standard deviations above and
below the center lines.

They are named UCL(IX), the upper control limit for the IX chart, and LCL(IX), the lower
control limit for the IX chart.

Making Control Charts

Likewise, we have UCL(MR) and LCL(MR) for the MR chart.

Sometimes, zone lines are drawn at ± one and ± two standard deviations around the
center lines as well. Most of the time, though, we can just imagine that they are there.

Making Control Charts


All of these lines are used to aid in the visual interpretation of the charts.

To calculate the location of these lines, use the simple rules described on this and the
following screens.

The center line of the IX chart is just the mean of the IX values.

Making Control Charts

The center line of the MR chart is the mean of the MR values.

The MR values are calculated as the absolute value of the difference between the
current IX point and the one before it: MR(i) = |IX(i) - IX(i-1)|. Note that in this formula
MR starts with point 2, because there is no point before point number 1.

If you have ongoing data, you will have a data point before the one you chose as the
first point and you can use all of the data, even the first point, from your sample.

Making Control Charts

The upper and lower control limits of the IX chart may be calculated from the standard
deviation of the IX data, but it is simpler to approximate it using the average moving
range as a measure of spread, and converting to an estimate of the standard deviation.

The exact location of the limit lines is not important anyway, since they are just visual
aids for interpretation.

These limit lines are symmetrical around the IX center line.


The moving range only has an upper limit since we deliberately ignored the sign of the
range when calculating it. (The lower limit is always zero.)
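As a concrete illustration of these calculations, here is a short Python sketch that computes the IX-MR center lines and limits from a set of illustrative data. The constants 2.66 and 3.267 are the commonly published factors for converting the average moving range (of two points) into approximate three-standard-deviation limits; verify them against your own reference table.

```python
# A minimal sketch of the IX-MR calculations, with illustrative data.
data = [1.52, 1.48, 1.55, 1.50, 1.47, 1.53, 1.49, 1.51, 1.54, 1.46]

# Moving ranges start with point 2: MR(i) = |IX(i) - IX(i-1)|
mr = [abs(x - px) for px, x in zip(data, data[1:])]

ix_bar = sum(data) / len(data)        # center line of the IX chart
mr_bar = sum(mr) / len(mr)            # center line of the MR chart

ucl_ix = ix_bar + 2.66 * mr_bar       # approx. 3-sigma limits via MR-bar
lcl_ix = ix_bar - 2.66 * mr_bar
ucl_mr = 3.267 * mr_bar               # the MR chart has only an upper limit

print(f"IX: center {ix_bar:.3f}, limits {lcl_ix:.3f} .. {ucl_ix:.3f}")
print(f"MR: center {mr_bar:.3f}, upper limit {ucl_mr:.3f}")
```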

So that's how to make a chart. What does it mean?

Making Control Charts

If a new point is (statistically) NOT consistent with the history - that is, out-of-control -
the process probably HAS changed.

The variation that was experienced is too large, or too small, or somehow exhibits a
pattern that indicates a 'special cause', and was not expected. Any out-of-control
situation means that the process, or the measurement, has changed and is not
behaving the way it used to.

Usually, the appearance of the chart can give you a good indication of how things have
changed as well.

A long drift downwards, for example, will show out-of-control after seven or eight points,
but it's easy to tell from the chart what kind of problem is happening as well as the fact
that there is a significant problem.

Making Control Charts

It's up to the user of the chart to decide on the data used to calculate the limits.

They can be found just by using the last 30 to 50 or so data points, and recalculating
every time a new point appears. These are called natural control limits.

Most of the time, we choose a time period during which external evidence tells us that
the process is running in its usual fashion and calculate the limits from 30 to 50 or more
data points during that time. These are called historical control limits.

It's especially important to use historical control limits for measurement assurance,
because otherwise slow drifts in the IX center line could inadvertently be ignored.

Making Control Charts

Finding out-of-control conditions in a chart is done by visual inspection plus some
attention to a set of rules. These rules can be applied to both the IX and MR charts.

Illustrations of various out-of-control patterns are provided in the accompanying course
figures.

Making Control Charts

After a while and some practice, you will be able to recognize these patterns easily.
There are numerical rules for reading control charts, too.

The rules don't find different control problems than the patterns you saw in the figures -
each of those patterns has violated one or more of the rules, so you can use your
experience, the rules, or both when interpreting control charts.

Making Control Charts

These rules are often referred to as the Western Electric rules (from the company that
developed them), and there are quite a few of them. Here are the most commonly used
ones.

 One point beyond the control limits (upper or lower)
 Two out of three points in a row beyond two standard deviations (up or down)
 Five out of six points in a row beyond one standard deviation
 Seven in a row on one side of the center line
 Seven in a row increasing or decreasing without interruption

Each of these indicates a statistically improbable event based on the history of the
measurement system.
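Two of these rules are sketched below in Python as an illustration of how they can be checked automatically. The data, function names, and the assumed center line and standard deviation are hypothetical, not from the course.

```python
# A minimal sketch of checking two of the listed rules on an IX chart,
# given a center line and an estimated standard deviation (sigma).
def beyond_limits(points, center, sigma):
    """Rule: one point beyond the 3-sigma control limits."""
    return [i for i, x in enumerate(points) if abs(x - center) > 3 * sigma]

def seven_one_side(points, center):
    """Rule: seven points in a row on one side of the center line."""
    hits, run, side = [], 0, 0
    for i, x in enumerate(points):
        s = 1 if x > center else -1
        run = run + 1 if s == side else 1
        side = s
        if run >= 7:
            hits.append(i)
    return hits

data = [10.1, 9.8, 10.2, 10.4, 10.3, 10.2, 10.5, 10.1, 10.6, 10.2, 13.0]
print("beyond limits at index:", beyond_limits(data, center=10.0, sigma=0.5))
print("7-in-a-row ending at  :", seven_one_side(data, center=10.0))
```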
Making Control Charts

These control chart rules are not cast in stone (nor are they out of date, or incorrect,
they're just one useful set of rules).

Let the process control chart be your servant rather than your master. The choice of
control limits and rules for interpretation is always a tradeoff between a high false-alarm
risk (when the limits are close together), and lack of sensitivity to real signals (for wide
limits). In the first case, output may be delayed by trying to fix problems that are not
important, and the second case may result in shipping defective product. Either one can
cost you money. This was recognized by Walter Shewhart in 1931 when he published
his book on the subject, The Economic Control of Quality of Manufactured Product.

The currently popular rules are a good compromise in that area and a good starting
point, but other limits can be chosen should the situation warrant it.

SPC in Measurement Assurance

Measurement assurance uses check standards and control charts.

When the process control chart shows only common cause variation, it means that the
measurement system is stable and is behaving the way it did when the control limits
were calculated.

When the process control chart shows an out-of-control condition, the measurement
system should be investigated to see what changed.

SPC in Measurement Assurance

If that change was significant, it should probably be corrected.

Sometimes, it may mean that the measuring instrument needs calibration. Because out-
of-control can indicate out-of-calibration, and because measurement assurance
indicates quickly when a measurement system is giving suspect answers, a reduction in
the frequency of full calibrations can sometimes be considered if the measurement
system stays in a state of statistical control for a long time.
Gage R&R

Introduction

The purpose of performing a Gage Repeatability and Reproducibility (R & R or R&R)
study is to assess those two sources of variability in the use of a measurement device.

These two sources of measurement variability account for the width or spread in the
measurement error.

The evaluation of sources of measurement error is important in the overall
assessment of the suitability of the device for its intended use, and should not be
overlooked or assumed to be insignificant.

Strictly speaking, Gage R&R is not a measurement uncertainty method, because it is
based on the old error model of measurement instead of the modern uncertainty model.
However, it is an important requirement in many industries, and a tool in some
improvement methodologies (such as Six Sigma®), so it is an important topic to know
about.

Introduction

Measurement equipment used in a Gage R&R should meet the following conditions:

1. Have a discrimination that allows at least one-tenth of the expected process
variation to be read directly. For example, if the process variation is 0.001, then
the equipment should be able to read a change of 0.0001.
2. Be within its calibration period.

We will consider three methods for assessing Gage R&R:

1. the Range Method
2. the Average and Range Method
3. the ANOVA method
Range Method

The Range Method provides a quick approximation of measurement variability.

This method will only provide the overall picture of the measurement system. It does
not partition the variability into repeatability and reproducibility.

The Range Method can be useful in obtaining a quick screening of a measurement
system, but is not recommended for a detailed study.

It typically uses two appraisers and five parts for the study.

An appraiser is a person who regularly makes the type of measurement being studied,
using the system being studied. A part is a device that is normally measured using the
system being studied.

Range Method

The Range method does not include repeated measurements by the same appraiser on
the same part.

The measurement error standard deviation, σm, is estimated from the average range
divided by a constant: σm = R̄/d₂, where the constant d₂ is obtained from a published
table of constants.

The combined R&R is calculated by multiplying σm by 5.15 (99% spread) or by 6
(99.73% spread). The most commonly used factor is 5.15.

The symbol s is generally accepted as the symbol for a sample standard deviation,
while σ is generally accepted as the symbol for a population standard deviation.

This course uses the σ symbol for the Gage R&R sample standard deviation.

Range Method

The Percent R&R is calculated by dividing the combined R&R by the process variation
or feature tolerance and multiplying by 100, as shown below.

% R&R = [(5.15 x σm) / tolerance] x 100


The process variation may be known from previous experience with the process or from
the data gathered on the parts used in the Gage R&R.

If data from the Gage R&R is used, the experimenter should ensure that the parts
selected for the Gage R&R represent the process range.

The Range Method is not recommended for studies supporting the use of measurement
systems in production or for verification/validation activities.
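To make the arithmetic concrete, here is a minimal Python sketch of the Range Method with invented data for two appraisers and five parts. The d₂ value of 1.19 used here is an assumed constant for this subgroup structure, taken from commonly published tables; confirm the correct value for your own study.

```python
# A minimal sketch of the Range Method, with illustrative data: two
# appraisers each measure the same five parts once.
appraiser_a = [0.85, 0.75, 1.00, 0.45, 0.50]
appraiser_b = [0.80, 0.70, 0.95, 0.55, 0.60]

ranges = [abs(a - b) for a, b in zip(appraiser_a, appraiser_b)]
r_bar = sum(ranges) / len(ranges)     # average range across the parts

d2 = 1.19                             # tabulated constant (assumed value)
sigma_m = r_bar / d2                  # measurement error std. deviation

tolerance = 1.0                       # illustrative feature tolerance
grr = 5.15 * sigma_m                  # combined R&R at 99% spread
pct_rr = grr / tolerance * 100

print(f"R-bar = {r_bar:.3f}, sigma_m = {sigma_m:.3f}, %R&R = {pct_rr:.1f}%")
```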

Average and Range Method

The Average and Range method is an extension of the Range method in that each
operator makes repeated measurements on the same part.

In addition, to improve the statistical power of the study, a minimum of three
appraisers, ten parts, and three repeated measurements per operator per part is
recommended.

The study should be carefully planned. In addition to determining the number of
appraisers, number of sample parts, and number of repeat readings in advance, there
are several other factors to consider when planning the study.

Average and Range Method

Other Factors to Consider

 The appraisers chosen should be selected from those who normally operate
the instrument or measurement system.
 The sample parts must be selected from the process and represent its
entire operating range.
 Assure that the measuring method (that is, appraiser and instrument) is
measuring the dimension of the characteristic and is following the defined
measurement procedure.
 The measurements should be made in random order to ensure any drift or
changes that could occur will be spread randomly throughout the study.
The appraisers should be unaware which numbered part is being checked
in order to avoid any possible knowledge bias.
 In reading the equipment, the readings should be estimated to the nearest
number that can be obtained.
 Each appraiser should use the same procedure - including all steps - to
obtain the readings.

Average and Range Method

As with the range method, the Percent R&R can be determined from the ratio of the
combined R&R to the process variation or feature tolerance.

See the equation below.

% R & R = [(5.15 x σm) / tolerance] x 100

ANOVA

ANOVA stands for Analysis of Variance.

The advantages of using ANOVA are:

 ANOVA is capable of handling any experimental setup, such as more than
three operators, more than three replications per operator, or more than
ten parts.
 ANOVA can detect interaction between parts and appraisers. In some
situations, this can be a substantial contribution to the combined R&R.
Determining the cause of, and then eliminating, an interaction can be an
important step in improving the measurement system.
 With ANOVA, estimates of variance components are performed more
accurately and are not reliant on tabulated constants.

ANOVA

The planning guidelines listed in the previous section are appropriate for the ANOVA
method.

Good planning is critical, particularly in assuring that statistical independence is
achieved in the study.

Analysis is performed using computer programs with modules specializing in Gage R&R.
ANOVA

The Gage R&R ANOVA method uses the random effects model for analysis.

This assumes that the appraisers and parts were selected at random from a larger
population of potential appraisers or parts. If general ANOVA software is used for
analysis, rather than a specific Gage R&R module, the fixed effects model may be
assumed.

This is the case with the two-factor ANOVA with replication available in Microsoft Excel.
Do not use the fixed effects model for Gage R&R analysis.

Standards
Introduction

A measurement result is complete only when accompanied by a quantitative statement
of its uncertainty.

The uncertainty is required in order to decide if the result is adequate for its intended
purpose, to ascertain if it is consistent with other similar results, and to determine
measurement traceability to the relevant units of the SI.

International Perspectives

Over the years, many different approaches to defining, evaluating and expressing the
uncertainty of measurement results have been used.

Because of the lack of international agreement, in 1977 the International Committee for
Weights and Measures (CIPM), the world's highest authority in the field of measurement
science (metrology), asked the International Bureau of Weights and Measures (BIPM) to
address the problem in collaboration with the various national metrology institutes, and
to propose a specific recommendation for its solution.

International Perspectives
This led to the development of Recommendation INC-1 (1980) by the Working Group on
the Statement of Uncertainties convened by the BIPM. This recommendation was
approved in 1981 and reaffirmed five years later.

The important recommendations of INC-1 (1980) led to the modern GUM. They are
summarized with explanation on the next few screens.

International Perspectives
Expression of Experimental Uncertainties

1. The uncertainty in the result of a measurement generally consists of
several components which may be grouped into two categories according
to the way in which their numerical value is estimated.

 Type A. Those which are evaluated by statistical methods
 Type B. Those which are evaluated by other means

Note that these are methods of evaluating uncertainty components, not types of
uncertainties.

International Perspectives

In many areas of work – Quality Assurance, for example – it is common to refer to types
of uncertainties as "systematic" or "random". A systematic uncertainty is one that can
be identified and minimized, removed from the system, or compensated for. A random
uncertainty is one that it is not, at present, economically feasible to identify and remove
or control; it is unavoidable variation in the system from an unknown, or known but
uncontrollable, source.

There is NO correspondence between "random" and "systematic" uncertainties and
evaluation by Type A methods or Type B methods. These are different concepts, with
different terms, used for different purposes. The concept of random and systematic
uncertainties is very useful and appropriate in many areas, such as process
improvement. But when you are going to get a numerical result representing the quality
of the output of that process, those uncertainties are evaluated using Type A methods
or Type B methods according to the available information, not by the labels used
elsewhere in the process.

Any detailed report of uncertainty should consist of a complete list of the components,
specifying for each the method used to obtain its numerical value. An uncertainty
budget is an example of such a detailed report. In most cases, such as on a report of
calibration, only the final value of the measurement uncertainty will be reported. The
organization performing the work should be maintaining the records that contain the
details, in case they are ever necessary.

International Perspectives

2. Components of measurement uncertainty evaluated by Type A methods
(statistical analysis) are represented by their estimated variances s² (the
squares of the estimated standard deviations s) and the number of degrees
of freedom ν.

When appropriate (when there is a correlation between the uncertainty components),
the appropriate covariance values should be calculated and given.

International Perspectives

3. Components of measurement uncertainty evaluated by Type B methods
(any method except statistical analysis) are represented by their estimated
variances u² (the squares of the estimated standard deviation equivalents
u). For Type B methods, u² is equivalent to the variance s² used in Type A
methods. The degrees of freedom ν is considered to be infinite.

When appropriate (when there is a correlation between the uncertainty components),
the appropriate covariance values should be calculated and given.
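In practice, a Type B evaluation means converting whatever information is available into a standard-deviation equivalent u. The short Python sketch below shows two common conversions; the limits, certificate value, and assumed distributions are illustrative examples, not from the course.

```python
# A minimal sketch of common Type B conversions to standard-deviation
# equivalents u, under assumed distributions and illustrative values.
import math

# Rectangular (uniform) distribution: a value known only to lie within
# +/- a, e.g., a resolution limit of +/- 0.005 units. u = a / sqrt(3).
a = 0.005
u_rect = a / math.sqrt(3)

# A calibration certificate quoting U = 0.012 at k = 2 (assumed normal):
# divide by the coverage factor to recover the standard uncertainty.
u_cert = 0.012 / 2

print(f"u (rectangular, +/-{a}) = {u_rect:.5f}")
print(f"u (certificate, k=2)    = {u_cert:.5f}")
```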

International Perspectives

4. The combined uncertainty should be characterized by the numerical value
obtained by applying the usual method for the combination of variances. In
most cases, variances are added to get a total variance.

The combined uncertainty and its components should be expressed in the
form of standard deviations. The standard deviation is the square root of
the sum of the variances, and is called the combined uncertainty.

5. If, for particular applications, it is necessary to multiply the combined
uncertainty by a factor to obtain an overall uncertainty, the multiplying
factor used must always be stated. In practice, this is usually the case.
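A minimal Python sketch of these recommendations appears below. The component values are assumptions chosen for illustration; the point is that variances add (for uncorrelated components), the combined uncertainty is the square root of the total, and any multiplying (coverage) factor is stated with the result.

```python
# A minimal sketch of combining uncertainty components per these
# recommendations, with illustrative assumed values.
import math

# Standard-deviation-equivalent components (u values), however evaluated:
type_a = [0.12]                 # e.g., s from repeated readings (Type A)
type_b = [0.05, 0.08]           # e.g., resolution, reference cert (Type B)

# Variances add (assuming uncorrelated components, so covariances are zero):
total_variance = sum(u**2 for u in type_a + type_b)

combined_u = math.sqrt(total_variance)   # combined uncertainty, one std. dev.

k = 2                                    # stated multiplying (coverage) factor
expanded_U = k * combined_u

print(f"combined uncertainty u = {combined_u:.4f}")
print(f"expanded uncertainty U = {expanded_U:.4f} (k = {k})")
```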

International Perspectives

Eventually, the CIPM asked the International Organization for Standardization (ISO) to
develop a detailed guide based on recommendation INC-1. ISO was selected because it
could more easily reflect the requirements from the broad interests of industry and
commerce.

International Perspectives

The ISO Technical Advisory Group on Metrology (TAG 4) was given this responsibility.
The TAG 4 Working Group 3 was assigned to develop a guidance document based upon
the recommendation of the BIPM, which provides rules on the expression of
measurement uncertainty for use within standardization, calibration, laboratory
accreditation, and metrology services.

The purpose of such guidance is:

 to promote full information on how uncertainty statements are determined
 to provide a basis for the international comparison of measurement results

International Perspectives
The Guide to the Expression of Uncertainty in Measurement

The end result of the work of ISO TAG 4/WG 3 was the Guide to the Expression of
Uncertainty in Measurement (or GUM). It was published in 1993 in the name of the
seven international organizations that supported its development. (These are listed at
right.) The 1993 GUM was corrected and republished in 1995. Both of these versions are
now replaced by a newer version.

The current version of the GUM that can be purchased from ISO and other standards
organizations is ISO/IEC Guide 98-3:2008, Uncertainty of measurement – Part 3: Guide
to the expression of uncertainty in measurement.

An identical version of the GUM, which is available as a FREE download from BIPM, is
JCGM 100:2008, Evaluation of measurement data – Guide to the expression of
uncertainty in measurement (GUM). Except for the title pages and cost, it is identical to
ISO/IEC Guide 98-3. JCGM 100:2008 (and other publications) can be freely downloaded
from bipm.org.

International Perspectives

The focus of the GUM (ISO/IEC Guide 98-3 or JCGM 100:2008) is the establishment of
"general rules for evaluating and expressing uncertainty in measurement that can be
followed at various levels of accuracy and in many fields – from the shop floor to
fundamental research."

The principles of the GUM are intended to be applicable to a broad spectrum of
measurements, including but not limited to those required for the activities listed below.

 maintaining quality control and quality assurance in production
 complying with and enforcing laws and regulations
 conducting basic research and applied research and development, in
science and engineering
 calibrating standards and instruments and performing tests throughout a
national measurement system in order to achieve traceability to national
standards
 developing, maintaining, and comparing international and national physical
reference standards, including reference materials

International Perspectives
Wide Acceptance of the GUM

Since its original publication in 1993, the GUM has found wide acceptance around the
world. It has been translated into many languages and often published as local national
standards. For example, the version of the 1995 ISO GUM that was translated for use in
the United States is ANSI/NCSL Z540-2-1997, American National Standard: U.S. Guide to
the Expression of Uncertainty in Measurement. It can be purchased from ANSI or NCSL
International. (Why did we say "translated"? The spelling and some terms were changed
from internationally used UK English to American English; number formats were
changed from the internationally used format [123 456,78 for example] to the format
used in the United States [123,456.78]; and some other changes were made.)
The GUM is the accepted guide for measurement uncertainty calculation at all levels of
metrology: BIPM, national metrology institutes (such as NIST, NPL, NRC, PTB and
others), standards laboratories, and many others. Laboratory accreditation bodies
require the calibration and testing laboratories they accredit to use GUM methods for
their measurement uncertainty calculations.

International Perspectives

In many countries, guides to interpretation and application of the GUM have been
published by standards organizations, national metrology institutes, laboratory
accreditation bodies and other organizations. While none of these guides are a legal
substitute for the actual GUM (ISO/IEC Guide 98-3 or JCGM 100:2008), many are very
useful for learning to interpret and apply the recommendations of the GUM. Like JCGM
100:2008, many of these guides are free. Two of the better-known ones are listed here.

UKAS M3003 second edition (2007), The Expression of Uncertainty and Confidence in
Measurement. This guide, published by the UK Accreditation Service, contains a lot of
examples in various measurement disciplines, with worked out solutions. It is free,
based on the corrected 1995 version of the GUM, and intended for general use in
industry.

NIST Technical Note 1297 (TN 1297), Guidelines for Evaluating and Expressing the
Uncertainty of NIST Measurement Results. This guide, published by the National
Institute of Standards and Technology (NIST, an agency of the US Department of
Commerce), is free, based on the 1993 version of the GUM, and intended for internal
use by NIST laboratories. Relative to the current version of the GUM, TN 1297 does not
include the corrections that have been made since 1993.

International Perspectives
The Joint Committee for Guides in Metrology (JCGM)

The Joint Committee for Guides in Metrology (JCGM) was formed in 1997 to develop and
promote accessible and internationally accepted guides for metrology.
The membership was initially the same organizations that originally developed the GUM
and the VIM; in 2005 they were joined by the International Laboratory Accreditation
Cooperation (ILAC).
ISO TAG 4 has been eliminated and re-established as the Joint ISO/IEC TAG, Metrology,
which focuses on metrological issues internal to ISO and IEC; it also represents ISO and
IEC on the JCGM.

Currently, the two main publications from the JCGM, both available as free downloads,
are:

1. JCGM 100:2008, Evaluation of measurement data – Guide to the expression
of uncertainty in measurement (GUM)
2. JCGM 200:2008, International vocabulary of metrology – Basic and general
concepts and associated terms (VIM)

Both can be downloaded for free from www.bipm.org/en/publications/guides/.


Software
Introduction

A variety of software packages are available for calculation and reporting measurement
uncertainty statistics.

In these packages, various rules apply depending on the level of traceability and the
Quality System/Accreditation being adopted.

ISO/IEC 17025:2005 implies in section 5.4.6 that methods of determining uncertainty
should conform to the recommendations specified within the ISO Guide to the
Expression of Uncertainty in Measurement.

Introduction

Issues that often arise during the development, use and assessment of uncertainty
statements within commercially available calibration software are:

 The ability of the software to calculate the Expanded Uncertainty of
individual calibration test points in accordance with international standards
 The ability to provide the flexibility to combine Standards (Reference)
uncertainty specifications at 95% and/or 99% confidence levels
 The ability to accept the total expanded measurement uncertainty from an
external uncertainty analysis, entered within a procedure
 The ability to calculate the Test Uncertainty Ratio using expanded
measurement uncertainty

Introduction

It is not acceptable to directly combine uncertainties which may be at different
coverage factors (confidence probabilities). They can be combined if they can be
converted to a common level.

In some software, calibration uncertainties and specifications for the calibration
standard are converted to a Standard Confidence Level before combination. Conversion
to Standard Confidence Level is performed by dividing the contribution by its Coverage
Factor (k).

A table in the original course materials shows the coverage factors available for use in
one software application; others may be different.

Introduction

After being combined, uncertainties are converted to an Expanded Uncertainty. This is
based on the user's required confidence level (set to 95%, 99%, or a custom level in the
software system).

The individual components of a specification or calibration uncertainty may be
combined by either linear addition or by the Root Sum of their Squares (RSS). Linear
addition gives a "worst case" solution; RSS gives a "most probable" solution.
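The convert-then-combine flow just described can be sketched briefly in Python. The contribution values and coverage factors below are illustrative assumptions, and both combination methods are shown for comparison.

```python
# A minimal sketch of the conversion-then-combine flow described above,
# with assumed illustrative values.
import math

# (expanded uncertainty, coverage factor k) for each contribution:
contributions = [(0.20, 2.00),   # e.g., reference standard cert at ~95%
                 (0.15, 2.58),   # e.g., a spec stated at ~99%
                 (0.05, 1.00)]   # e.g., a component already at 1 sigma

# Convert each to the "standard confidence level" by dividing by its k:
standard_u = [U / k for U, k in contributions]

linear = sum(standard_u)                          # "worst case"
rss = math.sqrt(sum(u**2 for u in standard_u))    # "most probable"

k_out = 2                                         # required level, ~95%
print(f"linear: U = {k_out * linear:.4f} (k = {k_out})")
print(f"RSS:    U = {k_out * rss:.4f} (k = {k_out})")
```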

The coverage factor and the method of combination are individually stored in the
database for all specification and calibration uncertainty table entries.

The system Manager may alter coverage factor and combination method codes for any
reference instrument.
Introduction

Everything discussed up to this point has been direct analysis of relatively simple
measurement models, with a limited number of influence factors that generally have
known values or close estimates. What if the measurement model does not meet those
constraints? There is another group of methods, described in GUM Supplement 1,
called Monte Carlo methods (MCM) of analysis.

Introduction
Monte Carlo Methods

Monte Carlo methods are advanced mathematical methods that use the statistics of
sampling theory. Each influence factor at the input is described by its known or
assumed probability density function. The measurement model is defined by a
mathematical expression that combines the inputs and produces output values of the
measurand.

When the program is run, it takes a random sample of each input and combines them
in the model to produce an output value. This process is repeated a large number of
times, often 10,000 or more. The set of output values is usually shown as a histogram.
From that, you can determine the estimated value, uncertainty and probability density
function of the measurand.

MCM analysis is useful when some input values are not well known, or if the probability
density function is known to be something other than what is used in the GUM. MCM
analysis can also be used to check or validate the results of conventional analytical
methods of uncertainty calculation.
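A toy Monte Carlo analysis in Python may make the procedure concrete. The measurement model, input distributions, and values below are assumptions chosen only for illustration.

```python
# A minimal Monte Carlo sketch under assumed inputs: a measurand
# Y = X1 + X2, where X1 is normal and X2 is uniform (e.g., a reading
# plus a resolution correction).
import random, statistics

random.seed(42)
N = 100_000                      # number of Monte Carlo trials

samples = []
for _ in range(N):
    x1 = random.gauss(10.0, 0.05)      # normal input: mean 10, sigma 0.05
    x2 = random.uniform(-0.1, 0.1)     # uniform input: +/-0.1 half-width
    samples.append(x1 + x2)            # the measurement model Y = X1 + X2

samples.sort()
y_est = statistics.fmean(samples)      # estimated value of the measurand
u_y = statistics.stdev(samples)        # standard uncertainty
lo, hi = samples[int(0.025 * N)], samples[int(0.975 * N)]   # 95% interval

print(f"y = {y_est:.4f}, u(y) = {u_y:.4f}")
print(f"95% coverage interval: [{lo:.4f}, {hi:.4f}]")
```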

Monte Carlo methods are used in a lot of other areas, not just in measurement
uncertainty. MCM is often used in physics, medical and chemical analysis. It is also
useful in a lot of planning areas, such as planning for traffic flow or crowd control, as
well as numerous other applications.

Vendor Information

The following screens have been included as a reference to introduce you to some
available software tools. There are several different types listed: stand-alone
measurement uncertainty calculators; calculators designed to work with Microsoft®
Excel®; calibration management software; general-purpose statistical analysis
software; and a mathematical analysis application.

WorkPlace Training makes no endorsement of these packages. Carefully examine the
manufacturers' specifications and match them to your particular needs before selecting.
The lists on the following pages are for your information only, do not represent all
available software, and do not state all features or benefits.

Measurement uncertainty calculators, both stand-alone and Excel add-ons, are designed
to do only one thing – perform calculations of measurement uncertainty using the GUM
methods.

Calibration management software generally is designed to do all of the record-keeping
of a laboratory and facilitate quality management of the lab; some packages can also
automate calibration procedures.

General-purpose statistical analysis software is not specifically designed for
measurement uncertainty. However, it can do a lot of the calculations for you, and
prepare graphs and charts of data. Most can also do measurement system analysis and
Gage R&R analysis.

The general-purpose mathematical application, again, is not designed for measurement
uncertainty, but can be set up to do the math and Monte Carlo analysis.

Notes:
 $$$ = $100 to $999; $$$$ = $1000 to $9999; $$$$$ = $10,000+
 MUSE and MCM Alchimia only do Monte Carlo analysis.
 UnCalc has an accompanying paper that should also be downloaded.
 Uncertainty Toolbox requires Microsoft Excel to operate.

 $$$ = $100 to $999; $$$$ = $1000 to $9999; $$$$$ = $10,000+


 Gage InSite requires an add-on Uncertainty module to calculate
measurement uncertainty.
 Mudcats calculates measurement uncertainty.
 Calibration Manager and MET/CAL can use measurement uncertainty
calculated externally.
 SureCAL literature does not say anything about measurement uncertainty.

Notes:

 $$$ = $100 to $999; $$$$ = $1000 to $9999; $$$$$ = $10,000+


 Analyse-It requires Microsoft Excel.
 Minitab does measurement system analysis, Gage R&R and Monte Carlo.
 QI Macros requires Microsoft Excel. Does measurement system analysis,
Gage R&R.

Notes:

 $$$ = $100 to $999; $$$$ = $1000 to $9999; $$$$$ = $10,000+


 Analytica can do uncertainty and Monte Carlo analysis.

Policy Document for Measurement Uncertainty

An organization that uses measurement uncertainty should have a policy document
addressing the subject. The measurement uncertainty policy establishes guidance in
stating and using measurement uncertainty. The goals of a policy document might be
to:

 allow the user to interpret measurement values consistently
 allow the user to interpret manufacturer's specifications consistently
 allow the user to combine measurement uncertainties consistently
 allow the user to combine data uncertainties in multi-variable measurement
situations
 provide guidance for establishing coverage (k) factors
 provide guidance for accounting for measurement uncertainty when
making decisions about conformance to requirements

Policy Document for Measurement Uncertainty (continued)

 provide guidance for establishing risk analysis of product quality related to
measurement and process assurance
 provide guidance for risk analysis for measurement traceability
 provide guidance for risk analysis of measurement failure and its impact to
product or measurement traceability
 provide guidance on when and how to use statistical analysis tools such as
Gage R&R, ANOVA, SPC, PMAP and so on
 provide guidance in establishing product parameter and process targets
and product parameter boundaries, identifying influencing factors that
impact the above conditions, and analyzing defects
 provide guidance in establishing, implementing and maintaining a
measurement assurance program
 provide for periodic evaluation of measurement uncertainty by participation
in interlaboratory comparison and proficiency test arrangements
