FAP Slides Unit 1

Failure analysis

Uploaded by

pavanchavhanpsi
Failure Analysis and Prevention

SUPRIYA B
Department of Mechanical Engineering

This subject is extremely important for the heavy engineering industry and for
everyone involved in manufacturing or using mechanical components.

The main issue is that the failure of any component disrupts service. Failures of
mechanical systems such as a lathe, a boiler, a car, or an aircraft cause a range of
service-related problems, and sometimes lead to loss of life.

Failure analysis is an investigation to determine the cause of a failure, usually with
the aim of taking corrective action to fix the problem and guard against further
failures. It is undertaken across all branches of manufacturing industry to prevent
future asset and product failures and to protect against potentially dangerous risks
to people and the environment.

THE IMPORTANCE OF FAILURE ANALYSIS IN ENGINEERING


Modern psychologists claim that the best way to deal with failure is to focus on variables
within your control. To do this, you must first be able to identify these variables.
This is why smart failure analysis is essential to managing and optimizing everything from
personal goals to multibillion-dollar business projects.

WHAT IS FAILURE ANALYSIS?


Failure analysis is the systematic process of gathering and analyzing data in order to
determine the cause of a failure. Failure analysis is essential to smart engineering for two
reasons.
First, in many cases, the goal of failure analysis is to determine the best corrective action.
Without it, you could make the same mistakes over and over, without even realizing why
you’re making them.
Second, success and excellence are often directly related to your ability to effectively
manage failure. Failure leaves clues, and proper failure analysis helps you follow these clues
to the source of the failure.

A BASIC OVERVIEW OF FAILURE ANALYSIS


The failure analysis process usually starts with hypothesis development.
This is where you predict the probable causes of failure. Your hypothesis might focus on the
failure of parts, machines, structures or people. Engineers might use a cause-and-effect
diagram to identify and evaluate these variables.
Next comes the step of recreating the process or conditions that led to the failure.
During this step, engineers might use computer models to simulate the environmental and
operational factors. The results of this step are then analyzed to see how well the
observed cause(s) of the failure match the hypothesis.
Once the recreation is complete, the analysis step follows.
During this step, experts are called upon to analyze the data gathered during the second
phase. These experts might analyze mechanical, chemical, and/or metallographic components,
on both a macroscopic and microscopic level.
Finally, the process moves to the damage classification, reporting or prevention stage.
Depending on the nature of the failure, and the results of the analysis, the goal of this stage
can be to prevent future failures or to administer disciplinary action. In the case of the latter,
the goal would be to hold any persons responsible for damages caused by the failure.

THREE BENEFITS OF FAILURE ANALYSIS


1. Better End Results
Whether you’re creating/refining a product or a service, failure analysis can help you make it
better with each new version, model or variation.
It can also help you optimize your budget and your ability to set and reach project deadlines. In
business, improving these results can prove just as important as improving the thing you’re
producing.
2. Failure Prevention
Failure analysis can help gather information that will aid in preventing failures in future
projects. This is because failure analysis often uncovers conditions and/or mechanical
components which are creating problems or weaknesses which you didn’t know about.
3. Planning Future Projects
Every engineer is familiar with the carpenter’s rule: measure twice, cut once. In other words,
every successful building or engineering project starts with a prediction about which actions
are required to produce the desired outcome.
Failure analysis is the perfect way to become better at planning new projects before you
commit time and materials to getting them completed.

Titanic ship failure (1912)

The sinking of the Titanic led to the loss of more than 1,500 lives. The failure
occurred primarily because of the rivets: about 3 million rivets joined the hull plates
of the ship, and their impact resistance was found to be very poor.

The main cause identified for the failure was the poor impact resistance of the rivets,
which resulted from the low-quality iron used. Another problem lay in the design of the
ship: its sixteen watertight compartments were separated from each other, but all of
them were connected near the ceiling. This allowed water to spill from one compartment
into the next and finally led to the sinking of the ship.

Iron loses its toughness at low temperatures. The temperature at which this loss of
toughness is observed is called the ductile-to-brittle transition temperature (DBTT).
Toughness is measured in terms of the energy absorbed, and the energy absorbed falls
as the temperature falls. For most mild and structural steels, this transition
temperature is about −20 to −30 °C. The iron rivets used in the Titanic had low
toughness under low-temperature conditions, so when the ship hit the iceberg the
rivets fractured, which led to the separation of the hull plates.
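
The drop in absorbed energy around the transition temperature can be sketched numerically. The snippet below is only an illustration: it assumes a hyperbolic-tangent curve, one common empirical shape for impact-energy transition data, and every parameter (shelf energies, transition width) is a hypothetical value, not a measurement of any real steel or of the Titanic rivets. The midpoint of −25 °C is simply taken from the −20 to −30 °C range quoted above.

```python
import math

def charpy_energy(temp_c, lower_shelf=5.0, upper_shelf=100.0,
                  dbtt_c=-25.0, width_c=15.0):
    """Absorbed impact energy (J) from an assumed tanh-shaped transition curve.

    All default parameters are hypothetical illustration values.
    """
    mid = (upper_shelf + lower_shelf) / 2.0         # energy at the DBTT itself
    half_range = (upper_shelf - lower_shelf) / 2.0  # distance from mid to either shelf
    return mid + half_range * math.tanh((temp_c - dbtt_c) / width_c)

# Energy absorbed falls as the temperature falls through the transition.
for t in (20, 0, -25, -50):
    print(f"{t:>4} C -> {charpy_energy(t):6.1f} J")
```

Well above the transition the curve sits near the upper shelf (ductile behaviour), and well below it the curve sits near the lower shelf (brittle behaviour).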

Concepts of Failure Analysis and Prevention


Clearly, through the analysis of failures and the implementation of preventive measures,
significant improvements have been realized in the quality of products and systems. This
requires not only an understanding of the role of failure analysis, but also an
appreciation of quality assurance and user expectations.

Quality and User Expectations of Products and Systems

In an era that initially gained global prominence in the 1980s, corporations, plants,
government agencies, and other organizations developed new management systems and
processes aimed at improving quality and customer satisfaction. Some of these systems
include:
• Total Quality Management (TQM)
• Continuous Improvement (CI)
• Six Sigma, which has become prominent more recently

TQM and CI represent full organizational commitment to a system focused on “doing the right
thing right the first time” and on not merely meeting but exceeding customer requirements.
They are focused on process improvements, generally in a production environment. Six Sigma
adopts these themes and extends the “reach” of the system to all levels of organizations,
with a system to achieve, sustain, and maximize business success.

Six Sigma is founded on the use of measurements, facts, and statistics to move
organizations in directions that constantly improve and reinvent business processes.
Companies committed to Six Sigma have reported significant gains in productivity
with simultaneous improvements in organizational culture.
The most positive result of these new management systems is that organizations
have responded to the higher expectations of consumers and users and have
provided higher-quality products and systems, with attendant increases in customer
satisfaction. However, this notion of the quality of a product or system is multifaceted.
Juran described quality as “fitness for use”. TQM defines quality as the ability to
satisfy the needs of a consumer. These characteristics of quality also apply internally
to those in organizations, whether in services or in manufacturing, operating, or
administering products, processes, and systems. The intent is to provide products and
systems that not only garner high customer satisfaction but also increase
productivity, reduce costs, and meet delivery requirements.

In general, high quality refers to products and systems manufactured to higher
standards, in response to higher expectations of consumers and users. These
expectations include such attributes as:
• Greater safety
• Improved reliability
• Higher performance
• Greater efficiency
• Easier maintenance
• Lower life-cycle cost
• Reduced impact on the environment

Users expect improvement in all of these areas simultaneously, which translates to
reduced product failure and a greater likelihood of preventing failures. It is important
to recognize that, with all the gains achieved under these management systems, the full
potential for maximizing these attributes is yet to be achieved. Though the various
improvement systems are each unique, they have two aspects in common: they are all
customer focused, and they are founded on problem solving as a means for improvement.

The main issue is that the failure of any component disrupts service. Failures of mechanical
systems such as a lathe, a boiler, a car, or an aircraft cause a range of service-related
problems, and can lead to:
• loss of life
• loss of property
• reduced reliability and safety
Failures also disrupt production. In line production, a number of machines are installed in
sequence to process the stock material and deliver finished goods at the end.
Raw material enters at stage 1 and is then processed at each subsequent stage; after
stages 1, 2, 3, and 4, the final product emerges in the form of finished goods. A failure
of any machine at any stage therefore stops the whole production process.
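
The effect of a serial line can be put into numbers. The sketch below assumes that stages fail independently and that there are no buffers between them, so the line runs only when every stage is up; the availability figures are hypothetical.

```python
from math import prod

def line_availability(stage_availabilities):
    """Availability of a serial line: every stage must be up at the same time."""
    return prod(stage_availabilities)

def line_is_running(stage_up):
    """The line runs only if no stage has failed."""
    return all(stage_up)

# Hypothetical availability of stages 1 to 4 (fraction of time each is working).
stages = [0.98, 0.95, 0.99, 0.97]
print(f"line availability: {line_availability(stages):.3f}")

# A single failed stage (stage 3 here) stops the whole line.
print(line_is_running([True, True, False, True]))
```

Under these assumptions the line as a whole is less available than its weakest stage, which is why a failure anywhere in the sequence matters.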

To avoid failures, it is important to understand the ways in which a component can fail and
to take precautionary measures against them. When a failure does occur, it is equally
important to understand how the component failed and what caused the failure. To understand
the root causes, every failure must be analyzed properly, and that is the purpose of failure
analysis. Failure analysis follows a systematic approach of investigation: identify the
potential causes of the failure, then determine the most probable ones. This step is also
called root cause analysis (RCA).

The aim is to identify the most probable causes of the failure so that corrective action can
be taken to avoid its recurrence. The analysis involves observation, inspection and, where
the case requires it, extensive laboratory testing of the component from the field, using
material taken from the location of the failure or from a location away from the fractured
zone.

Failure analysis is a multidisciplinary approach: reaching a sound conclusion about the root
causes of a given failure requires the expertise of people from different areas and
disciplines. Failure analysis not only helps to avoid recurrence of the failure, it also
improves product quality, reliability, performance, and customer satisfaction.
When failure analysis is carried out on a failed component, it helps to avoid the recurrence
of that kind of failure, which improves both the quality and the reliability of the product.
A more reliable component performs well for longer, so product performance improves, which
in turn improves customer satisfaction.

What do we do when a failure occurs? How do we identify it?


Whenever a component fails, it gives some indicators, which can be regarded as symptoms. These
indicators arise from certain causes: only when those causes are present does the failing component
give those indications. The causes in turn set in motion the mechanisms that lead to the failure.

Take the example of a machine tool. If the cutting tool fails during machining, it starts to produce
a lot of chatter, noise, and vibration. Noise and vibration are the indicators. They occur because of,
say, excessive flank wear or failure of the cutting edge under unfavorable conditions, such as a
cutting speed, feed, or depth of cut that is too high, or a tool geometry that is unsuitable for the
given cutting parameters and workpiece.

These are the causes that lead to failures in the form of flank wear, crater wear, or cutting-edge
fracture, and these failures can occur in a number of ways. Cutting-edge failure involves fracture,
while flank and crater wear involve wear by abrasion, adhesion, and diffusion. These are the
mechanisms.

The tool geometry must therefore be chosen in light of the workpiece material, and the cutting
parameters (cutting speed, feed, and depth of cut) must be selected properly, so that the tool
performs its intended function and such failures are avoided. These are the kinds of indications
we get, and we need to establish the causes and the mechanism to understand the failure
completely, so that corrective action can be taken to avoid it.
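
The indicator, cause, and mechanism chain from the machine-tool example can be captured in a small record. This is a minimal sketch only; the `FailureRecord` class and its field names are hypothetical and are not part of any standard failure-analysis tool.

```python
from dataclasses import dataclass, field

@dataclass
class FailureRecord:
    """Hypothetical record linking what was observed to why and how it failed."""
    component: str
    indicators: list = field(default_factory=list)  # symptoms observed
    causes: list = field(default_factory=list)      # conditions behind the symptoms
    mechanisms: list = field(default_factory=list)  # how the material actually failed

    def is_fully_characterised(self):
        """True only when indicators, causes, and mechanisms are all recorded."""
        return bool(self.indicators and self.causes and self.mechanisms)

# The cutting-tool example from the text.
tool = FailureRecord(
    component="cutting tool",
    indicators=["chatter", "noise", "vibration"],
    causes=["cutting speed, feed, or depth of cut too high",
            "unsuitable tool geometry"],
    mechanisms=["abrasion", "adhesion", "diffusion", "fracture"],
)
print(tool.is_fully_characterised())
```

A record with indicators but no causes or mechanisms signals that the investigation has not yet reached the level of understanding needed for corrective action.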

Problem-solving model

The major steps in the model define the problem-solving process:

1. Identify: Describe the current situation. Define the deficiency in terms of the
symptoms (or indicators). Determine the impact of the deficiency on the
component, product, system, and customer. Set a goal. Collect data to provide a
measurement of the deficiency.
2. Determine root cause: Analyze the problem to identify the cause(s).
3. Develop corrective actions: List possible solutions to mitigate and prevent
recurrence of the problem. Generate alternatives. Develop implementation plan.
4. Validate and verify corrective actions: Test corrective actions in pilot study.
Measure effectiveness of change. Validate improvements. Verify that problem is
corrected and improves customer satisfaction.
5. Standardize: Incorporate the corrective action into the standards documentation
system of the company, organization, or industry to prevent recurrence in similar
products or systems. Monitor changes to ensure effectiveness.

Consider the example of a butterfly valve that fails in service in a cooling water
system at a manufacturing facility. Recognizing the indicators, causes, mechanisms, and
consequences helps to focus investigative actions:

• Indicators: Monitor these as precursors and symptoms of failures.
• Cause: Focus mitigating actions on these.
• Failure mechanism: These describe how the material failed according to the
engineering textbook definitions. If the analysis is correct, the mechanism will be
consistent with the cause. If the mechanism is not properly understood, then not all
true causes will be identified and corrective action will not be fully effective.
• Consequence: This is what we are trying to avoid.

Over many years, and across a wide variety of mechanical and electronic components and
systems, people have calculated empirical population failure rates as units age over time
and have repeatedly obtained a graph of the same characteristic shape. Because of the
shape of this failure rate curve, it has become widely known as the "bathtub" curve, or
life curve, of the product.

The first part is a decreasing failure rate, known as early failures.
The second part is a constant failure rate, known as random failures.
The third part is an increasing failure rate, known as wear-out failures.

The initial region that begins at time zero when a customer first begins to use the product is
characterized by a high but rapidly decreasing failure rate. This region is known as the Early Failure
Period (also referred to as Infant Mortality Period, from the actuarial origins of the first bathtub curve
plots). This decreasing failure rate typically lasts several weeks to a few months.

Next, the failure rate levels off and remains roughly constant for (hopefully) the majority of the useful life
of the product. This long period of a level failure rate is known as the Intrinsic Failure Period (also
called the Stable Failure Period) and the constant failure rate level is called the Intrinsic Failure Rate.
Note that most systems spend most of their lifetimes operating in this flat portion of the bathtub curve.

Finally, if units from the population remain in use long enough, the failure rate begins to increase as
materials wear out and degradation failures occur at an ever increasing rate. This is the Wearout
Failure Period.
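
The three regions can be reproduced with a simple numerical sketch. The text does not say which model generated the curve, so the code below makes an assumption: it adds three Weibull hazard rates, one decreasing (shape below 1), one constant (shape equal to 1), and one increasing (shape above 1). All shape and scale values are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def bathtub_hazard(t):
    """Sum of three assumed failure populations (hypothetical parameters)."""
    early = weibull_hazard(t, shape=0.5, scale=2000.0)    # early failures: decreasing
    random_ = weibull_hazard(t, shape=1.0, scale=1000.0)  # random failures: constant
    wearout = weibull_hazard(t, shape=4.0, scale=1500.0)  # wear-out failures: increasing
    return early + random_ + wearout

# Failure rate is high at first, levels off, then climbs again as units wear out.
for hours in (1, 10, 100, 500, 1500, 3000):
    print(f"{hours:>5} h  failure rate = {bathtub_hazard(hours):.5f} per hour")
```

With these values the printed rate falls over the first few hundred hours and rises again beyond roughly 1500 hours, which is the bathtub shape described above.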

Root Cause Analysis Process



Root causes fall into three categories: physical causes, human-related causes, and latent
(hidden) causes.

Physical causes arise when the design is not adequate, the material selection is improper,
or the service conditions are unsuitable. These lead to, for example, fracture due to
design deficiencies or to manufacturing or material-related issues.

Human-related causes arise when the procedures themselves are sound, but the people who
manufacture or use the product are not properly trained, or are careless.

Latent causes are hidden and take different forms, such as improper training, poor
motivation, carelessness on the part of the worker, improper calibration, or improper
installation: everything else may be fine, yet the component has not been installed
properly.

What Is Root Cause Analysis?

RCA (Root Cause Analysis) is a method of analyzing defects to identify their causes. We
brainstorm, read, and dig into the defect to identify whether it was due to a “testing miss”,
a “development miss”, or a “requirement or design miss”.

When RCA is done accurately, it helps to prevent defects in later releases or phases. If we find
that a defect was due to a design miss, we can review the design documents and take
appropriate measures. Similarly, if we find that a defect was due to a testing miss, we can review
our test cases or metrics and update them accordingly.

RCA is used not only for defects reported from a customer site, but also for UAT defects, unit
testing defects, business and operational process-level problems, day-to-day problems, etc.
Hence it is used in multiple industries, such as software, manufacturing, health, and banking.

Illustration
Conducting Root Cause Analysis is similar to the work of the
doctor who treats a patient. The doctor will first understand the
symptoms. Then he will refer to laboratory tests to analyze the
root cause of the disease.
If the root cause of the disease is still unknown, the doctor will
refer the patient for scans to understand further. He will continue the
diagnosis and study until he narrows down to the root cause of
the patient’s sickness. The same logic applies to Root Cause
Analysis performed in any industry.
So, RCA is aimed at finding the root cause and not treating the
symptom, by following a specific set of steps and associated
tools. It is different from defect analysis, troubleshooting, and
other problem-solving methods as these methods try to find
the solution for the specific issue, but RCA tries to find the
underlying cause.

Advantages Of Root Cause Analysis


Listed below are some of the benefits of root cause analysis:
• Prevents the recurrence of the same problem in the future.
• Eventually reduces the number of defects reported over time.
• Reduces development costs and saves time.
• Improves the software development process, aiding quick delivery to market.
• Improves customer satisfaction.
• Boosts productivity.
• Finds hidden problems in the system.
• Aids in continuous improvement.

Types Of Root Causes


#1) Human Cause: Human-made error.
Examples:
Under skilled.
Instructions not duly followed.
Performed an unnecessary operation.
#2) Organizational Cause: A process that people use to make decisions that were
not proper.
Examples:
Vague instructions were given from Team Lead to team members.
Picking the wrong person for a task.
Monitoring tools not in place to assess the quality.
#3) Physical Cause: Any physical item failed in some way.
Examples:
The computer keeps restarting.
The server is not booting up.
Strange or loud noises in the system.

Steps To Do Root Cause Analysis


A structured and logical approach is required for an effective root cause analysis.
Hence, it’s necessary to follow a series of steps.

#1) Form RCA Team


Every team should have a dedicated Root Cause Analysis Manager (RCA Manager) who collects
the details from the Support team and initiates the kick-off process for RCA. The RCA
Manager coordinates and allocates the people who need to attend RCA meetings, depending on
the stated problem.
The group that attends the meeting should include the personnel from each team (Requirement,
Design, Testing, Documentation, Quality, Support & Maintenance) who are most familiar with
the problem. It should also include people who are directly linked to the defect,
for example, the Support engineer who gave an immediate fix to the customer.
Share the problem details with the team before the meeting so that they can do some
initial analysis and come prepared. Team members also gather information related to the defect.
Depending on the incident report, each team traces what went wrong in this scenario in
its own phase. Being prepared increases the efficiency of the upcoming discussion.

#2) Define The Problem


Collect the details of the problem, such as incident reports and problem evidence (screenshots,
logs, reports, etc.), then study and analyze the problem by asking the questions below:
•What is the problem?
•What is the sequence of events that led to the problem?
•What systems were involved?
•How long has the problem existed?
•What is the impact of the problem?
•Who was involved, and who should be interviewed?
Use ‘SMART’ rules to define your problem:
•SPECIFIC
•MEASURABLE
•ACTION-ORIENTED
•RELEVANT
•TIME-BOUND

#3) Identify Root Cause


Conduct a brainstorming session within the RCA team to identify the
causes. Use the Fishbone diagram, the 5 Why Analysis method, or both to arrive at
the root cause(s).
The RCA manager should moderate the meeting and set the rules for the brainstorming
session. For example, the rules can be:
• Criticizing or blaming others is not allowed.
• Don’t judge others’ ideas. No idea is bad; wild ideas are encouraged.
• Build on the ideas of others. Think about how you can build on others’ ideas and
make them better.
• Give each participant due time to share their views.
• Encourage out-of-the-box thinking.
• Stay focused.
All ideas should be recorded. The RCA manager should assign a member to record the
minutes of the meeting and update the RCA templates.
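
The 5 Why Analysis mentioned above can be sketched as a walk along a chain of recorded answers. This is an illustrative sketch only: the `five_whys` helper is hypothetical, and the answers are invented for the example rather than taken from a real investigation.

```python
def five_whys(problem, why_answers, max_depth=5):
    """Follow recorded answers to 'why?' and return (chain, root_cause).

    Stops when no further answer is recorded or after max_depth questions.
    """
    chain = [problem]
    current = problem
    for _ in range(max_depth):
        if current not in why_answers:
            break  # nobody could answer the next "why?", so this is the root cause
        current = why_answers[current]
        chain.append(current)
    return chain, current

# Hypothetical answers recorded during a brainstorming session.
answers = {
    "cutting tool fractured": "cutting forces were too high",
    "cutting forces were too high": "feed rate was set above the recommended value",
    "feed rate was set above the recommended value": "operator used the wrong setup sheet",
    "operator used the wrong setup sheet": "setup sheets are not version controlled",
}

chain, root = five_whys("cutting tool fractured", answers)
for depth, statement in enumerate(chain):
    print("  " * depth + statement)
print("root cause:", root)
```

The `max_depth` limit also guards against circular answers, which can appear when a session records causes that point back at each other.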

#4) Implement Root Cause Corrective Action (RCCA)


Corrective action involves fixing the problem at its real root cause.
To facilitate this, a delivery manager has to be present to decide in which
versions the fix has to be implemented and what the delivery date should be.
RCCA should be implemented in such a way that this root cause does not occur again
in the future. The fix given by the support team is temporary, for the customer site
where the issue was reported. When this fix is merged into an ongoing version, do a
proper impact analysis to ensure that no existing feature is broken.
Give the steps to validate the fix, and monitor the implemented solution to check
whether it is effective.

#5) Implement Root Cause Preventive Action (RCPA)


The team needs to come up with a plan for how similar issues can be prevented in
the future. For example: update the instruction manual, improve the skill set, update
the team assessment checklist, etc. Document the preventive actions properly and monitor
whether the team is adhering to them.
Please refer to this research paper on “Defect Analysis and Prevention for Software
Process Quality Improvement” published in the International Journal of Software
Engineering & Applications to get an idea of the types of defects reported in each
software phase and suggested preventive actions for them.
The information gained from RCA can go as input into Failure Mode and Effect Analysis
(FMEA) to identify points where the solution can fail.
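
One common way FMEA ranks failure modes is the risk priority number (RPN), the product of severity, occurrence, and detection scores. The sketch below assumes the usual 1 to 10 scale for each score; the failure modes reuse the cutting-tool example from earlier, and the scores themselves are hypothetical.

```python
def risk_priority_number(severity, occurrence, detection):
    """RPN = severity x occurrence x detection, each scored from 1 to 10."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:
            raise ValueError("scores are expected on a 1-10 scale")
    return severity * occurrence * detection

# (failure mode, severity, occurrence, detection), with hypothetical scores.
failure_modes = [
    ("cutting edge fracture", 8, 4, 3),
    ("flank wear", 5, 7, 4),
    ("crater wear", 4, 5, 6),
]

# Highest RPN first: these modes would get attention first.
ranked = sorted(failure_modes,
                key=lambda mode: risk_priority_number(*mode[1:]),
                reverse=True)
for name, s, o, d in ranked:
    print(f"{name:<22} RPN = {risk_priority_number(s, o, d)}")
```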
Implement Pareto Analysis on the causes identified during RCA over a period, say
half-yearly or quarterly. This helps to identify the top causes that contribute to the
defects and to focus preventive action on them.
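
A Pareto analysis of the recorded root causes can be sketched in a few lines. The cause log below is hypothetical, and the 80% cut-off is the conventional rule of thumb rather than a value given in these slides.

```python
from collections import Counter

def pareto_top_causes(causes, threshold=0.8):
    """Return the most frequent causes that together reach the threshold share."""
    counts = Counter(causes).most_common()  # sorted by frequency, highest first
    total = sum(n for _, n in counts)
    top, cumulative = [], 0
    for cause, n in counts:
        top.append((cause, n))
        cumulative += n
        if cumulative / total >= threshold:
            break
    return top

# Hypothetical root causes logged from RCA reports over one quarter.
log = (["improper training"] * 9 + ["improper installation"] * 6
       + ["improper calibration"] * 3 + ["material defect"] * 2)

for cause, n in pareto_top_causes(log):
    print(f"{cause:<22} {n:>2} ({n / len(log):.0%})")
```

In this made-up log, three of the four causes account for 90% of the recorded failures, so preventive action would start with those three.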