What Is Fault Management - Describe Five Steps Process in Fault Management.

Uploaded by

Ben Beny
Copyright
© All Rights Reserved
What is fault management? Describe the five-step process in fault management.
written 7.4 years ago by teamques10 • modified 2.9 years ago

 telecom network management (/t/telecom network management/)

1 Answer
written 7.4 years ago by teamques10

Fault Management:
A fault in a network is normally associated with the failure of a network component and the
subsequent loss of connectivity. Fault management involves a five-step process:

(1) Fault detection, (2) Fault location, (3) Restoration of service, (4) Identification of root cause of
the problem, and (5) Problem resolution.

i. The fault should be detected as quickly as possible by the centralized management system,
preferably before or at about the same time as the users notice it.

ii. Fault location involves identifying where the problem is located. We distinguish this from
problem isolation, although in practice the two could be the same.

iii. The reason for this distinction is that it is important to restore service to the users as
quickly as possible, using alternative means.

iv. The restoration of service takes priority over diagnosing the problem and fixing it.

v. Identification of the root cause of the problem can be a complex process, which is covered in
the Root Cause Analysis (RCA) section below.

vi. After identifying the source of the problem, a trouble ticket can be generated to resolve the
problem.

vii. In an automated network operations center, the trouble ticket could be generated
automatically by the NMS.

Fault Detection:
i. Fault detection is accomplished either by a polling scheme (the NMS periodically polls
management agents for status) or by the generation of traps (management agents, based on
information from the network elements, send unsolicited alarms to the NMS).

ii. In the polling scheme, an application program in the NMS issues the ping command periodically
and waits for a response. Connectivity is declared broken when a preset number of consecutive
responses are not received.

iii. The frequency of pinging and the preset number for failure detection can be tuned to balance
traffic overhead against the speed with which a failure is detected.

iv. The alternative detection scheme is to use traps. One advantage of traps is that failure
detection is accomplished faster and with less traffic overhead.
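
The polling rule in item ii can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular NMS implements it: the function name `detect_failure`, the list of booleans standing in for ping responses, and the threshold of 3 are all invented for the example.

```python
from typing import Iterable, Optional


def detect_failure(responses: Iterable[bool], threshold: int) -> Optional[int]:
    """Return the 1-based poll number at which connectivity is declared
    broken, or None if the threshold is never reached.

    `responses` is a hypothetical stand-in for ping results:
    True means a reply was received, False means the poll timed out.
    """
    missed = 0  # consecutive polls without a response
    for poll_number, ok in enumerate(responses, start=1):
        # A single reply resets the count: only consecutive misses matter.
        missed = 0 if ok else missed + 1
        if missed >= threshold:
            return poll_number
    return None


# Isolated misses are tolerated; three misses in a row trip the detector.
flaky_link = [True, False, True, False, False, False]
print(detect_failure(flaky_link, threshold=3))
```

A larger threshold or a longer polling interval reduces traffic and false alarms but delays detection, which is the trade-off described in item iii.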

Fault Location and Isolation Techniques:
i. A simple approach to fault location would be to detect all the network components that have
failed. The origin of the problem could then be traced by walking down the topology tree to where
the problem starts.

ii. Thus, if an interface card on a router has failed, all managed components connected to that
interface would indicate failure.

iii. After having located where the fault is, the next step is to isolate the fault (i.e. determine
the source of the problem).

iv. First, we should distinguish between failure of the component and failure of the physical link.
Thus, in the above example, the interface card may be functioning well, but the link to the
interface may be down. We need to use various diagnostic tools to isolate the cause.

v. Let us assume for the moment that the link is not the problem but that the interface card is. We
then proceed to isolate the problem to the layer that is causing it. It is possible that excessive
packet loss is causing disconnection.

vi. We can measure packet loss by pinging, if pinging can be used. We can query the various
Management Information Base (MIB) parameters on the node itself or other related nodes to
further localize the cause of the problem.

vii. For example, error rates calculated from the interface group parameters ifInDiscards,
ifInErrors, ifOutDiscards, and ifOutErrors with respect to the input and output packet rates could
help us isolate the problem in the interface card.
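
Two of the steps above can be sketched in Python: walking down the topology tree to the point where failures start (items i and ii), and computing interface error rates from two samples of the MIB counters (item vii). This is a rough sketch under assumptions, not a definitive implementation. The `Node` class, the function names, and the sample values are hypothetical. The packet totals use `ifInUcastPkts` and `ifOutUcastPkts` only, ignoring non-unicast traffic and counter wrap-around for brevity, and the counters are passed in as plain dictionaries rather than fetched over SNMP.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A managed component in a hypothetical topology tree."""
    name: str
    failed: bool = False
    children: list[Node] = field(default_factory=list)


def topmost_failed(node: Node) -> list[str]:
    """Walk down the tree and return the failed nodes that have no failed
    ancestor: the points where the problem starts."""
    if node.failed:
        return [node.name]  # everything below is a downstream symptom
    origins: list[str] = []
    for child in node.children:
        origins.extend(topmost_failed(child))
    return origins


def error_rates(before: dict[str, int], after: dict[str, int]) -> tuple[float, float]:
    """Return (input, output) fractions of packets discarded or in error
    between two samples of the interface counters."""
    delta = {key: after[key] - before[key] for key in after}
    in_bad = delta["ifInDiscards"] + delta["ifInErrors"]
    # ifInUcastPkts counts delivered packets only, so add the bad ones back.
    in_total = delta["ifInUcastPkts"] + in_bad
    out_bad = delta["ifOutDiscards"] + delta["ifOutErrors"]
    # ifOutUcastPkts already includes packets that were discarded or not sent.
    out_total = delta["ifOutUcastPkts"]
    in_rate = in_bad / in_total if in_total else 0.0
    out_rate = out_bad / out_total if out_total else 0.0
    return in_rate, out_rate


# Hypothetical topology: one failed interface card takes its hosts down too.
card_a = Node("card_a", failed=True,
              children=[Node("host1", failed=True), Node("host2", failed=True)])
card_b = Node("card_b", children=[Node("host3")])
router = Node("router", children=[card_a, card_b])
print(topmost_failed(router))
```

In this example the two hosts report failure only because the card above them has failed, so the walk stops at the card. The error rates would then be computed for that card's interface from two counter samples taken some interval apart.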

Service Restoration:
i. Whenever there is a service failure, it is the responsibility of the network operations center
(NOC) to restore service as soon as possible. This involves detection and isolation of the problem
causing the failure, and restoration of service.

ii. In several failure situations, the network will do this automatically. This network feature is
called self-healing. In other situations, the NMS can detect the failure of components and indicate
it with appropriate alarms.

iii. Restoration of service does not include fixing the cause of the problem. That responsibility
usually rests with the installation and maintenance (I&M) group.

iv. A trouble ticket is generated and followed up for resolution of the problem by the I&M group.

Root Cause Analysis (RCA):
Root Cause Analysis (RCA) is a widely used technique that helps people answer the question of why
the problem occurred in the first place.

It seeks to identify the origin of a problem using a specific set of steps, with associated tools,
to find the primary cause of the problem, so that you can:

1. Determine what happened.

2. Determine why it happened.

3. Figure out what to do to reduce the likelihood that it will happen again.

Problem Resolution:
Problem resolution means correcting the problem by hardware and software techniques: managed
objects are repaired or replaced, and operations are returned to normal. Completing this step
indicates that the problem has been solved.
