Introduction: The Purpose of Accelerated Testing
The scientific theory of accelerated testing is highly developed, but the application of this theory
has proven difficult, especially in the mobility industries. The required design life for many
components exceeds 10 years, and the application environment is harsh and highly variable.
Vehicles must operate reliably in arctic conditions and in desert conditions. Driving profiles
range from the 16-year-old male to the 90-year-old female. An airliner may fly long-haul ocean
routes for 20 years, while an identical model may fly short-range routes that result in many
more takeoffs and landings over the life of the aircraft. Combining this variety into a realistic
test that can be completed in a reasonable time frame with a reasonable budget is difficult and
requires compromises.
Accelerated tests fall into two categories: (1) development tests, and (2) quality assurance tests.
During research, short inexpensive tests are needed to evaluate and improve performance. The
progress of a product in these development tests is often monitored statistically with a reliability
growth program. Some quality assurance tests are as follows:
• Design verification
• Production validation
• Periodic requalification
Quality assurance tests are often tied to statistical sampling plans with requirements such as a
demonstrated reliability of at least 95% at 10 years in service with a confidence level of 90%.
Statistically, 95% reliability with 90% confidence can be demonstrated by testing 45 units to the
equivalent of 10 years in service. Table 1.1 gives the required sample sizes for some common
reliability requirements.
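The sample sizes in Table 1.1 follow from the success-run (zero-failure) relationship n = ln(1 − C)/ln(R), where R is the reliability to be demonstrated and C is the confidence level. A minimal sketch of this calculation in Python is shown below; depending on the rounding convention used, an entry or two may differ from the table by a single unit.

    import math

    def success_run_sample_size(reliability, confidence):
        # Units that must survive one design life with zero failures to
        # demonstrate `reliability` at `confidence` (success-run formula).
        return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

    for r, c in [(0.99, 0.95), (0.99, 0.90), (0.99, 0.50),
                 (0.95, 0.95), (0.95, 0.90), (0.95, 0.80),
                 (0.90, 0.90), (0.90, 0.80)]:
        print(f"R = {r:.0%}, C = {c:.0%}: n = {success_run_sample_size(r, c)}")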
TABLE 1.1
RELIABILITY DEMONSTRATION SAMPLE SIZES

Reliability    Confidence    Sample Size
99%            95%           299
99%            90%           229
99%            50%           69
95%            95%           59
95%            90%           45
95%            80%           31
90%            90%           22
90%            80%           16

Before proceeding with a test of 299, 45, or even 16 samples, the purpose of the test should be
investigated. What does it cost to test 299 units? The following costs should be considered:

• Prototype costs
• Instrumentation costs (results monitoring)
• Setup costs
• Labor costs
• Laboratory costs (many tests take two or more months to complete)
The sample sizes shown in Table 1.1 assume that no failures occur during the test. If a failure
occurs, do timing and budget constraints allow changes and a repeat of the test? What are the
implications of bringing a product to market if that product did not demonstrate the required
reliability with an appropriate level of confidence?
Design Life
Determining the design life to be simulated with an accelerated test can be difficult. Many
automobile manufacturers specify a design life of 10 years for brake systems (excluding pads),
but how does 10 years in service translate to an accelerated test? According to a study of brake
system usage for minivans in 1990, the following statements are true:
• The median number of brake applies was 322,000 for 10 years in service.
• Five percent of the vehicles had more than 592,000 brake applies in 10 years of service.
• One percent of the vehicles had more than 709,000 brake applies in 10 years of service.
• The force of the brake apply ranged from 0.1 to 1.0 g-force, with the shape of the distribu-
tion shown in Figure 1.1.
How many times should the brake system be cycled in the accelerated test representing 10 years?
Engineers design for the most stressful conditions; therefore, does this mean that the number of
cycles is determined by the most stressful driver?
User profiles are often defined by percentile. The 95th percentile point is the point with 5% of
the users having a more stressful profile. One percent of the users have a driving profile that is
more stressful than the 99th percentile driver.

Figure 1.1 Probability density of brake apply deceleration (g-force), with the 95th percentile customer indicated.

Table 1.2 gives the number of brake applications
as a function of the user percentile; these data also are shown in Figure 1.2.
TABLE 1.2
PERCENTILES FOR MINIVAN BRAKE APPLICATIONS

Percentile    Number of Brake Applications
50th          321,891
60th          361,586
70th          405,155
75th          429,673
80th          457,241
85th          489,671
90th          530,829
95th          592,344
97.5th        646,007
99th          708,571
99.9th        838,987
Figure 1.2 Number of brake applications (thousands) versus user percentile.
As shown in Figure 1.2, the number of brake applications increases dramatically as the percent
of the population covered nears 100%. This is typical of many other applications, such as door
slams, ignition cycles, and trunk release cycles. To increase the percent of the population covered
from 75% to 99.9% requires an approximate doubling of the number of cycles in the accelerated
test. Not only does this increase the cost and duration of the test, but the cost of the component
increases because the number of cycles in the test is part of the design requirement.
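When a coverage target falls between the tabulated percentiles, the requirement can be estimated by interpolating Table 1.2. A minimal Python sketch using the tabulated values also confirms the near-doubling between 75% and 99.9% coverage.

    import numpy as np

    # Percentiles and brake applications from Table 1.2
    percentile = np.array([50, 60, 70, 75, 80, 85, 90, 95, 97.5, 99, 99.9])
    applications = np.array([321_891, 361_586, 405_155, 429_673, 457_241,
                             489_671, 530_829, 592_344, 646_007, 708_571, 838_987])

    def cycles_for_coverage(p):
        # Linear interpolation between tabulated percentiles (an approximation).
        return float(np.interp(p, percentile, applications))

    print(cycles_for_coverage(90))                     # 530,829 cycles
    print(applications[-1] / cycles_for_coverage(75))  # ~1.95: near doubling from 75% to 99.9%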
The percent of the population covered is a compromise among development cost, development
time, component cost, and the field performance of the component. For safety-critical items, the
user percentile may exceed 100% to allow a safety margin. For other items, such as glove box
latches, the user percentile may be as low as 80%. In reality, there is no 95th percentile user.
There is a 95th percentile user for number of cycles, a 95th percentile user for temperature, a
95th percentile user for salt exposure, a 95th percentile user for vibration, and so forth.
However, determining the 95th percentile user for the combination of conditions is unrealistic.
The worst-case user profile may not be at the high end for the number of cycles of operation.
Consider a parking brake. The worst case may be a brake that is used for the first time after the
vehicle is 10 years old. This type of user profile must be incorporated into a test separate from
a test utilizing the 95th percentile of parking brake applications.
Accelerating a test by eliminating the time between cycles can introduce unrealistic conditions.
Consider a durability test for an automobile door. The door is opened and closed 38,000 times
in 12 hours. Opening and closing the door this quickly does not allow the door hinges or latches
to cool, nor does it give any contaminants that may be introduced in the hinges time to form
corrosion. Consider an automobile engine: the 95th percentile user profile for engine on-time is
approximately 7,000 hours. Does running the engine for 7,000 consecutive hours approximate
7,000 hours of operation over 10 years? Consider an automobile starter: the 95th percentile user
profile for the number of engine starts is approximately 4,000. Starting the engine 4,000 times
as quickly as possible does not stress the starter as much as actual usage conditions because the
engine would be warm for nearly every engine start. To more adequately represent true usage
conditions, the engine would need to be cooled for some of the starts.
It may be possible to obtain a random sample representative of the population for periodic
requalifications, but it is nearly impossible for new product development. Thus, designing tests
to demonstrate reliability with statistical confidence is not always possible. The best alternative
is to test with worst-case tolerances.
Consider the simple two-component system shown in Figure 1.3. Even with this simple system, many tolerances must be accounted for.
Figure 1.3 A simple system consisting of Component A and Component B.
The number of tolerance combinations can become unmanageable. Table 1.3 shows the
number of possible tolerance combinations as a function of the number of dimensions. With
10 characteristics to consider for worst-case tolerancing in this simple two-component system,
there are more than 1,000 combinations of tolerances to consider. Determining which of these
1,000 combinations is the worst case is often difficult.
TABLE 1.3
NUMBER OF TOLERANCE COMBINATIONS

Number of Characteristics    Number of Tolerance Combinations
2                            4
3                            8
4                            16
5                            32
10                           1,024
20                           1,048,576
50                           1.126 × 10^15
100                          1.268 × 10^30
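The counts in Table 1.3 are simply 2 raised to the number of characteristics, because each characteristic can sit at either its low or its high tolerance limit. A short Python sketch that enumerates the combinations (the tolerance limits shown are hypothetical) makes the explosion evident.

    from itertools import product

    def worst_case_combinations(limits):
        # limits: one (low, high) pair of tolerance limits per characteristic.
        # Returns every combination of limits; the count is 2 ** len(limits).
        return list(product(*limits))

    limits = [(9.9, 10.1), (4.95, 5.05), (1.98, 2.02)]  # three hypothetical characteristics
    combos = worst_case_combinations(limits)
    print(len(combos))           # 8 combinations for 3 characteristics
    print(2 ** 10, 2 ** 20)      # 1,024 and 1,048,576, as in Table 1.3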
Confounding the problem is the fact that the worst-case tolerance combination for a specific
environmental condition may be the best-case tolerance combination for another environmental
condition. Manufacturing capabilities also complicate testing at worst-case tolerance levels.
Ideally, if all characteristics are within tolerance, the system would work perfectly and survive
for the designed life. And if one or more characteristics are out of tolerance, the system would
fail. Reality demonstrates that a component with a characteristic slightly out of tolerance is
nearly identical to a component with the same characteristic slightly within tolerance. Toler-
ances are not always scientifically determined because time and budget do not always allow
for enough research. There is a strong correlation between the defect rate in the manufacturing
facility and field reliability. A portion of the reduction in defect rate has been due to a reduc-
tion of manufacturing variability. As manufacturing variability is reduced, characteristics are
grouped closer to the target.
Consider a motor with its long-term durability dependent on the precision fit of three compo-
nents in a housing. The three components are stacked in the housing; historically, the tolerance
stackup has caused durability problems, and the maximum stackup of the three components has
been specified at 110. To meet this requirement, an engineer created the specifications shown
in Table 1.4.
TABLE 1.4
MOTOR COMPONENT TOLERANCES

Component       A     B     C     Total
Target Size     30    20    10    60
Maximum Size    50    30    15    95
If the three components are manufactured to the target, the total stackup is 60. However, there
is always variance in processes, so the engineer specifies a maximum allowable size. If the
manufacturing capability for each of the components is 3 sigma (a defect rate of 67,000 parts
per million), the process will produce the results shown in Figure 1.4 for the stackup of the
system.
By increasing the manufacturing capability for each of the components to 4 sigma (a defect
rate of 6,200 parts per million), the process will produce the results shown in Figure 1.5 for the
stackup of the system.
The motor housing has a perfect fit with the three components if the stackup is 60. Any devia-
tion from 60 will reduce the life of the motor. As long as the total stackup is less than 110, the
motor will have an acceptable life; however, motors with a stackup closer to 60 will last longer.
It is easy to see that the reduced variance in manufacturing will increase the life of the motors.
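A rough Monte Carlo sketch illustrates this point. The distributional assumptions are illustrative only: each component is taken as normally distributed about its Table 1.4 target, with the standard deviation set so that the maximum size sits at 3 (or 4) standard deviations above the target.

    import numpy as np

    rng = np.random.default_rng(0)

    targets = np.array([30.0, 20.0, 10.0])    # Components A, B, C (Table 1.4)
    max_sizes = np.array([50.0, 30.0, 15.0])

    def simulate_stackup(sigma_level, n=100_000):
        # Assumption: normal components centered on target, with the maximum
        # size located sigma_level standard deviations above the target.
        sigma = (max_sizes - targets) / sigma_level
        parts = rng.normal(targets, sigma, size=(n, 3))
        return parts.sum(axis=1)

    for level in (3, 4):
        stack = simulate_stackup(level)
        print(f"{level}-sigma capability: stackup std = {stack.std():.1f}, "
              f"share within 60 ± 10 = {(np.abs(stack - 60) <= 10).mean():.1%}")

Under the 4-sigma assumption, a noticeably larger share of motors falls close to the ideal stackup of 60, which is the point of the example.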
Figure 1.4 Distribution of system lateral runout (stackup) with 3-sigma component capability; frequency versus lateral runout, with the upper specification at 110.
Figure 1.5 Distribution of system lateral runout (stackup) with 4-sigma component capability; frequency versus lateral runout, with the upper specification at 110.
Financial Considerations
Priorities for a reliability program are determined in the same way as those for most other programs. The
number one priority is an emergency. If there is a hole in one of the water pipes in your home,
you will find a way to fix it, regardless of your financial situation.
The next level of priority is an obvious need that can be delayed with a risk. Consider again the
leaking water pipe. If the water pipe is patched with duct tape and epoxy, and while fixing the
pipe it is determined that all the water pipes in the home are in danger of bursting, then obvi-
ously there is a need to replace all the water pipes in the house. However, this is an expensive
task and can be delayed. There is no immediate crisis, but by delaying the repair, there is a risk
of an expensive accident. If a water pipe bursts, thousands of dollars of damage will result.
This risk is tolerated because the immediate expense of correcting the problem is perceived to
be greater than the cost of the water pipes bursting weighted by the probability of the water
pipes bursting.
The most dangerous priority is one that is not known. Consider a home that is being consumed
by termites without the owner’s knowledge. Nothing is done to correct the problem because
the owner is ignorant of the problem. For reliability programs, the largest expenses are often
overlooked.
a. Environment.
ii. Humidity.
b. Duty cycle.
c. Load.
i. Pounds of force.
ii. Pressure.
iii. Voltage.
iv. Current.
d. Reliability goals.
a. FRACAS (failure reporting, analysis, and corrective action system)—Parts from test failures,
internal production failures, external production failures, and field returns must be
analyzed and cataloged.
3. Begin the FMEA (failure modes and effects analysis) process. The FMEA will be
updated during the entire process.
4. Intelligent design.
a. Use design guides—All lessons from previous incidents must be captured in design
guides. This includes all information from the FRACAS.
b. Parameter design—Choose the design variable levels to minimize the effect of uncon-
trollable variables.
a. Early in the development phase, have short, inexpensive tests to provide approximate
results. The purpose of these tests is to provide engineering feedback.
b. Every concept must pass an independent (i.e., not conducted by engineering) verifica-
tion test. The concept should include design limits. For example, the component has
been validated to operate at up to 85°C, withstand exposure to brake fluid, and endure 4.8 Grms
of random vibration over a frequency range of 0 to 800 Hz.
a. Early in the development phase, have short, inexpensive tests to provide approximate
results. The purpose of these tests is to provide engineering feedback.
b. Every design must pass an independent (not conducted by engineering) verification test.
Be careful not to burden the company with timing and cost issues when specifying the
test. Build on the results of the concept verification and any other implementations of
the concept.
c. System simulation.
7. Manufacturing.
a. Parts from production intent tooling must pass the design validation test.
ii. For the first week of production, the sampling rate is 100%.
iii. If a Cpk of 1.67 is achieved for the first week, the sampling rate may be reduced.
iv. Each drawing specification must have a control plan that details the critical pro-
cesses affecting the drawing specification. Each of these processes also must be
monitored with statistical process control (SPC).
1. Rubber ages.
iii. Are the temperature and vibration profiles during transportation significantly dif-
ferent from those of the vehicle specification?
iv. Is the part protected from corrosion caused by the salt in the air during transporta-
tion on an ocean liner?
After analyzing a reliability program, there appear to be many opportunities for savings.
2. How many prototypes will be used during the parameter design and tolerance design pro-
cesses? What is the cost of a prototype?
a. HALT (highly accelerated life testing)
b. Design verification
c. Production validation
d. Step-stress testing
Although we agree that the approach of Reliability is correct, we are supporting Finance. The
program is behind schedule, and reducing the reliability effort will improve program timing.
In addition, with recent cutbacks (or the recent increase in business), Engineering lacks the
resources to complete the entire reliability program.
The program proposed by Reliability will ensure initial quality, but it is too costly. With recent
[enter any excuse here], manufacturing cannot meet the initial production schedule with the
restrictions proposed by Reliability. The reliability program would also require additional per-
sonnel to maintain the SPC program.
These responses result in an organization with a strong incentive to gamble that there will be no
consequences for abandoning a thorough reliability program in favor of a program that is less
expensive. This behavior sub-optimizes the finances of the company by assuming any potential
failure costs are near zero. To be effective, a good reliability program must include a financial
assessment of the risks involved if reliability activities are not completely executed.
The urge to reduce the investment in the reliability program can be combated by visualizing the
failure costs that the reliability program is designed to prevent. In addition to warranty costs,
other failure costs are as follows:
• Customer returns
• Customer stop shipments
• Retrofits
• Recalls
These costs often are ignored because they are not quantified. An effective method for quan-
tifying these costs is to record a score for each element in the reliability program and compare
this to the field performance of the product. This can be done by auditors using a grade scale
of A through F for each element of the program. The grades for each element can be combined
into a grade point average (GPA) for the program using 4 points for an A, 3 points for a B, and
so forth.
Table 1.5 gives an example of how a reliability program may be scored, and Table 1.6 shows
how the field performance is recorded. When quantifying these costs, be sure to include all
associated labor and travel costs. For example, a customer may charge $500 for returning a
single part; however, the associated paperwork, travel, and investigation could easily be several
thousand dollars.
TABLE 1.5
EXAMPLE RELIABILITY PROGRAM SCORES

Reliability Program Item                  Score (GPA)
Understanding of Customer Requirements    B (3)
FMEA                                      A (4)
FRACAS                                    C (2)
Verification                              C (2)
Validation                                D (1)
Manufacturing                             B (3)
Overall Program Score                     2.33
TABLE 1.6
EXAMPLE FIELD RELIABILITY PERFORMANCE

Reliability Performance Item          Cost
Customer Returns                      $8,245
Customer Stop Shipments               $0
Retrofits                             $761,291
Recalls                               $0
Overall Program Unreliability Cost    $769,536
Figure 1.6 is a scatter chart of the results of several programs. The slope of the trend line
quantifies the loss when the reliability program is not fully executed. For this example, mov-
ing the GPA of the overall reliability program by one point is expected to result in a savings of
$755,000 in failure costs. This savings can be used to financially justify the investment in the
reliability program.
Figure 1.6 Failure costs (millions of dollars) versus reliability program GPA.
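The trend line in Figure 1.6 is an ordinary least-squares fit of failure costs against program GPA. A minimal Python sketch of the calculation follows; the program data here are hypothetical placeholders, not the programs plotted in the figure.

    import numpy as np

    # Hypothetical audit results: (program GPA, failure cost in dollars)
    gpa  = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
    cost = np.array([3.2e6, 2.8e6, 2.2e6, 1.9e6, 1.4e6, 1.0e6, 0.7e6])

    # Fit a straight line; the (negative) slope is the failure cost avoided
    # per point of GPA improvement.
    slope, intercept = np.polyfit(gpa, cost, 1)
    print(f"Each GPA point is worth about ${-slope:,.0f} in avoided failure costs")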
These failure costs are similar to the cost of water pipes bursting in your house. You know
of the risk, and you decide to act on the risk or tolerate the risk, based on the finances of the
situation.
Another method to focus management’s attention on reliability is by presenting the effects of the
data shown in Table 1.6 on corporate profits. The data in Figure 1.7 are examples of the effects
of a poor reliability program. Money was saved years earlier by gambling with a substandard
reliability program, but as shown in Figure 1.7, the short-term gain was not a good long-term
investment.
Figure 1.7 Effect of unreliability on profits (millions of dollars): potential profit, and profit after recalls, stop-ships, retrofits, and returns.
Similar to termites damaging a home without the owner’s knowledge, hidden reliability costs
are causing poor decisions to be made and are damaging profits. The losses caused by these
hidden costs can be orders of magnitude greater than warranty costs. To illustrate this concept,
consider the automotive industry.
For model year 1998, the average vehicle manufactured by General Motors, Ford, or Chrysler
(the “Big Three”) required $462* in repairs. These automakers sell approximately 13 million
vehicles in North America annually, resulting in a total warranty bill of $6 billion. That may
sound like a lot of money, but it is by far the smallest piece of the cost of poor reliability.
Table 1.7 illustrates the retail value for several 1998 model year vehicles with sticker prices all
within a $500 range.
* Day, Joseph C., address to the Economic Club of Detroit, December 11, 2000.

TABLE 1.7
VEHICLE RESALE VALUE

Vehicle               Retail Value        Consumer Reports
(1998 Model Year)     as of July 2001     Reliability Rating*
A                     $8,430              –45
B                     $9,500              20
C                     $9,725              18
D                     $11,150             25
E                     $11,150             30
F                     $13,315             –5
G                     $14,365             55
H                     $15,215             50

* The Consumer Reports scale is from –80 to 80, with –80 being the worst and 80 being the best.

For lease vehicles, the manufacturer absorbs the $5,715 difference in resale value between
Vehicle B and Vehicle H. For non-lease vehicles, the owner of Vehicle B absorbs the cost. But
this does not mean the manufacturer is not impacted. The reduced retail value is reflected in the
ability of the manufacturer to price new vehicles. The manufacturer of Vehicle H can charge
more for new vehicles because they depreciate more slowly. Considering that sales for many
of these midsized sedans topped 200,000 units, the $5,715 difference in resale value is worth
more than $1 billion annually. Figure 1.8 shows the correlation of the reliability of a vehicle
and its resale value. Using the slope of the regression line shown in Figure 1.8, a single point
in Consumer Reports’ reliability rating is worth $51.58.
Figure 1.8 Residual value (thousands of dollars) versus Consumer Reports reliability rating.
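The slope can be reproduced directly from the data in Table 1.7 with an ordinary least-squares fit, as in the following Python sketch.

    import numpy as np

    # Vehicles A-H from Table 1.7: Consumer Reports rating and retail value (dollars)
    rating = np.array([-45, 20, 18, 25, 30, -5, 55, 50])
    value  = np.array([8_430, 9_500, 9_725, 11_150, 11_150, 13_315, 14_365, 15_215])

    slope, intercept = np.polyfit(rating, value, 1)
    print(f"${slope:.2f} of resale value per reliability point")  # ~ $51.58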
When faced with pressures to compromise on your reliability program, consider the system cost.
For example, Design A uses a material that is a nickel less expensive than Design B, and the
component is expected to be installed on 300,000 vehicles. If Design A is chosen, the savings is
$15,000 in material cost. This is a good decision only if the expected impact to warranty, resale
value, and market share is less than $15,000. Before using Design A, verify the reliability of
the design.
Summary
The theoretical science of accelerated testing is exact, but implementing an accelerated testing
plan requires several compromises. Ultimately, the science involved is inexact and serves only
as a guideline for engineering judgment. Reliability demonstration based on statistically
determined sample sizes is often invalid because the samples cannot be a random representation
of production parts. Testing to worst-case tolerance limits is difficult because of the number of
combinations and the difficulty of producing parts at the desired tolerance level.
Automobile manufacturers often specify a design life of 10 years for automobiles. Many types
of aircraft have a design life of 25 years, and some B-52 bombers have been in service for more
than 40 years. Accelerating a test for an automobile component by a factor of 10 would yield
a test with a one-year duration. This obviously is unacceptable. Obtaining a test with a dura-
tion of one month requires an acceleration factor of 120. A test lasting 24 hours would have an
acceleration factor of 3,653.
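Stated as a formula, with AF denoting the acceleration factor (a symbol used here for convenience):

    \mathrm{AF} = \frac{\text{design life}}{\text{test duration}}, \qquad
    \frac{10\ \text{years}}{1\ \text{month}} = 120, \qquad
    \frac{10 \times 365.25\ \text{days}}{1\ \text{day}} \approx 3{,}653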
Is this type of acceleration possible? Although this is a slightly controversial subject, many
experts suggest it is impossible to accelerate a test by more than a factor of 10 without losing
some correlation to real-world conditions. This is another of the ambiguities faced when accel-
erating testing conditions. A test is required, and the test is useless if it cannot be completed in
a reasonable time frame. However, the greater the acceleration, the less realistic the test.
Accelerated testing is a balance between science and judgment. Do not let the science cause bad
decisions to be made. For example, if demonstrating 95% reliability at 150,000 miles in service
calls for testing 45 units for 2,000 hours without failure, do not be concerned if only 35 units can
be tested. The sample size of 45 assumes random sampling from a population representative
of production. Violating this assumption is more important than testing with a reduced sample
size. In this situation, try to ensure that the test is representative of real-world conditions, and
secure 35 samples with key characteristics set at worst-case tolerance levels.