1 s2.0 S0951832007002207 Main
Abstract
A simple practical framework for predictive maintenance (PdM)-based scheduling of multi-state systems (MSS) is developed. The
maintenance schedules are derived from a system-perspective using the failure times of the overall system as estimated from its
performance degradation trends.
The system analyzed in this work is a flow transmission water pipe system. The various factors influencing PdM-based scheduling are
identified and their impact on the system reliability and performance are quantitatively studied. The estimated times to replacement of
the MSS may also be derived from the developed model.
The results of the model simulation demonstrate the significant impact of maintenance quality and the criteria for the call for
maintenance (user demand) on the system reliability and mean performance characteristics. A slight improvement in maintenance
quality is found to postpone the system replacement time many times over. The consistency in the quality of maintenance work with
minimal variance is also identified as a very important factor that enhances the system’s future operational and downtime event
predictability.
The studies also reveal that in order to reduce the frequency of maintenance actions, it is necessary to lower the minimum user demand
from the system if possible, ensuring at the same time that the system still performs its intended function effectively.
The model proposed can be utilized to implement a PdM program in the industry with a few modifications to suit the individual
industrial systems’ needs.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Maintenance quality; Markov chain analysis; Multi-state system (MSS); Restoration factor (RF); Time to replacement (TTR); Time to failure
(TTF); Universal generating function (UGF); User demand
0951-8320/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ress.2007.09.003
Cher Ming Tan, N. Raghavan / Reliability Engineering and System Safety 93 (2008) 1138–1150 1139
There are other models to account for imperfect maintenance, and readers may refer to Refs. [14,28] for detailed information. All the above-mentioned models are focused on a binary system which can operate in only two discrete performance states, viz. "functional" or "non-functional". Also, most of the models assume a "single unit" system with negligible maintenance duration and the impact of maintenance quality on system reliability is not included. For practical application, MSS should be considered [29,30], and the study of the impact of maintenance quality is especially useful as it directly impacts the TTR of the system. It is with the motivation to enable PdM model to be applicable in a practical environment that this work is produced.
In this work, we analyze the system from a multi-state perspective for any generic complex n-component system with any topology (series, parallel, bridge structures, etc.). Imperfect PdM is modeled by considering the effect of the mean and variance in the quality of maintenance work separately. The quality of maintenance work is modeled by a parameter called the Restoration Factor (RF) which describes the relative degradation in system performance for every subsequent operation cycle of the system. Being a quality index, RF is assumed to have a normal distribution with its mean ($\mu_{RF}$) and variance ($\sigma^2_{RF}$) giving a clear indication of the skill and consistency of maintenance work performed, respectively. As an initial work, our model assumes no dependency between the elements and considers only a single failure mode.
Although an imperfect maintenance policy for MSS has been proposed earlier in [31] based on the proportional age setback model [21], the maintenance duration is assumed to be negligible, and the user demand is assumed to be constant. Furthermore, no method for the determination of the TTR of the system is discussed. For practical applications, the model proposed in this work considers the finite maintenance duration and its variability, and the system replacement time is determined using a simple and effective approach.
The novelty of the work lies in characterizing system performance variation for different maintenance work quality standards represented by the RF distribution and different user demands. In other words, system's performance capability is being examined from the user's perspective. Unlike other approaches, the system is modeled using a Markov State Diagram in this work, which is found to be a good choice for modeling complex systems [32].
The other type is the task processing system [30] in which performance is described in terms of processing speed or response time. Typical examples include server, control and other software systems where the performance index of interest is the speed of processing data and instructions, expressed in mega bits per second. The analysis of these two types of MSS mentioned above are different owing to the different properties being examined in them and the different nature of their basic functions.
The system examined in this work is a water pipe hydraulic system which is a flow transmission MSS. The topology of the system consists of 3 pipe elements as shown in Fig. 1. Elements 1 and 2 are in parallel with each other and they are collectively in series with element 3. The degradation index or physical performance property of the system is the mass flow rate of water in each pipe element expressed in the unit of tons/min. The system shown in Fig. 1 is the elevation view implying that all the elements 1, 2 and 3 are at the same height level above the ground and thus there is no influence of gravity effect on the pressure or speed of water flow in any particular element.
To simplify the analysis, only one failure mode is assumed for all the elements of the system, and the failure of each element is considered to be independent of every other.
In this work, the water pipe system is analyzed using the Markov process [29,30]. Each element has its own Markov state diagram with different states of performance and failure rates between these performance states as shown in Fig. 2. The parameters $g_{ij}$ represent the $j$th discrete degraded state of performance of the $i$th element in the system. The symbol $\lambda_{i,j}^{(m)}$ is the failure rate of element $m$ where its performance degrades from the $i$th state to the $j$th state. The operational lifespan of the system may be classified into different operation cycles. The $k$th operation cycle is defined as the operating time interval between the $(k-1)$th and $k$th maintenance actions. Referring to the Markov State diagram shown in Fig. 2, the numerical values for the various states of performance of each element and the failure rate ($\lambda$) are given in Table 1, extracted from an example given in [29].
Each performance rate in Table 1 corresponds to a discrete amount of water flow rate that the particular element of the water pipe can transmit through it.
For example, Element 3 has 3 discrete states of performance. It could be transmitting water either at its maximum capacity (performance) of $g_{31} = 4.0$ tons/min or intermediate capacity of $g_{32} = 1.8$ tons/min or at $g_{33} = 0.0$ tons/min implying complete non-functional failure. Elements 1, 2 and 3 each have 2, 2 and 3 discrete states of performance respectively, thus making up a total of $2 \times 2 \times 3 = 12$ discrete system performance rates.
Under some reasonably general conditions, the failures of a complex system have been shown to follow the exponential distribution even though the individual components in the system may follow other failure distributions [32]. Thus, if the water pipe system in Fig. 2 is assumed to be complex, it will follow the exponential failure distribution pattern, thereby justifying the use of a Markov State Diagram, in which the failure rates are all time-independent constants.
During the operation of the system, the elements transit from one state of performance to another in a period of time. Using the concept of hazard rate, the interstate transitions can be described using the failure rate, which is expressed as the number of such state transitions per year (unit time). The values of the failure rate for each element are shown in Table 1. The values of these parameters in Table 1 for various state transitions may be extracted from past maintenance data records and condition monitoring data of previously operated similar systems using the standard Maximum Likelihood Estimate (MLE) procedure for the exponential distribution [33].
For example, if $n$ observations on different identical systems are made and the duration ($t_i$) for transition between consecutive states of performance $g_{i,j}$ and $g_{i,j+1}$ is measured for an arbitrary $r$ number of these systems, then the failure rate $\lambda_{j,j+1}$ could be estimated using the exponential MLE as in (1) where $t_i$ and $t_i^{+}$ are the failure ($F$) and censor ($C$) data values for the duration of state transition. The censor data ($t_i^{+}$) refer to those systems where the transition from state $j$ to $j+1$ has not yet occurred during the period of observation:
$$\hat{\lambda}_{j,j+1} = \frac{r}{\sum_{i \in F} t_i + \sum_{i \in C} t_i^{+}}. \tag{1}$$
Since the impact of RF and user demand ($W$) on the PdM policy is to be studied in this work, let us now discuss these two factors in detail below.
3.1. PdM model parameters
3.1.1. Restoration factor (RF)
To study the impact of the quality of a maintenance work on system performance quantitatively under the PdM policy, a new term called the RF is introduced. It represents the percentage recovery of the system's mean performance in the $k$th operation cycle (after the $k$th maintenance action) relative to its mean performance during the previous $(k-1)$th operation cycle. The $k$th maintenance action (cycle) refers to the downtime duration between the successive $k$th and $(k+1)$th operation cycles. The better the maintaining quality is, the higher the RF.
Based on the definition of RF, the system mean performance during the $k$th operation cycle, denoted by $G_k(t)$, may be expressed in terms of the corresponding mean performance in the $(k-1)$th operation cycle, $G_{k-1}(t)$, using the RF of the preceding $(k-1)$th maintenance cycle, $RF[k-1]$, as follows:
$$G_k(t) = G_{k-1}(t) \cdot RF[k-1]. \tag{2}$$
An RF value of 100% represents the system being maintained to an as-good-as-new (renewal process) condi-
[Figure: Element 1 moves from $g_{11}$ to $g_{12}$ at rate $\lambda_{1,2}^{(1)}$; Element 2 moves from $g_{21}$ to $g_{22}$ at rate $\lambda_{1,2}^{(2)}$; Element 3 moves from $g_{31}$ to $g_{32}$ at rate $\lambda_{1,2}^{(3)}$ and from $g_{32}$ to $g_{33}$ at rate $\lambda_{2,3}^{(3)}$.]
Fig. 2. Markov state diagram of the 3-element water pipe system.
Table 1
Performance rates, transition rates and performance distribution for each element of the pipe system
Element (#) | Performance rates (tons/min) | Initial condition | Degradation rates (yr$^{-1}$) | Performance distribution
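The exponential MLE in Eq. (1) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `exponential_mle_rate` and the sample durations are hypothetical and are not taken from the paper's data.

```python
def exponential_mle_rate(failure_times, censored_times):
    """Estimate a constant transition rate as in Eq. (1).

    failure_times  -- durations t_i of systems where the transition was observed (set F)
    censored_times -- durations t_i+ of systems where it had not yet occurred (set C)
    """
    r = len(failure_times)  # number of observed transitions
    exposure = sum(failure_times) + sum(censored_times)  # total time on test
    if r == 0 or exposure <= 0:
        raise ValueError("need at least one observed transition and positive exposure")
    return r / exposure


# Hypothetical durations in years, for illustration only.
observed = [0.08, 0.12, 0.10, 0.15]
censored = [0.20, 0.25]
rate = exponential_mle_rate(observed, censored)
print(f"estimated transition rate: {rate:.3f} per year")
```

Censored systems contribute to the denominator only, so ignoring them would overestimate the rate.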
Fig. 2, the set of simultaneous differential equations describing the state probability expressions may be extracted. They are listed in Eqs. (6)–(12) below. Every state is described by its corresponding differential equation. In Fig. 2, Elements 1 and 2 each have 2 discrete states of performance while element 3 has 3 discrete states, thus making up a total set of $2+2+3 = 7$ differential equations to be solved simultaneously [29]:
$$\frac{dp_{11}(t)}{dt} = -\lambda_{1,2}^{(1)} p_{11}(t), \tag{6}$$
$$\frac{dp_{12}(t)}{dt} = \lambda_{1,2}^{(1)} p_{11}(t), \tag{7}$$
$$\frac{dp_{21}(t)}{dt} = -\lambda_{1,2}^{(2)} p_{21}(t), \tag{8}$$
$$\frac{dp_{22}(t)}{dt} = \lambda_{1,2}^{(2)} p_{21}(t), \tag{9}$$
$$\frac{dp_{31}(t)}{dt} = -\lambda_{1,2}^{(3)} p_{31}(t), \tag{10}$$
$$\frac{dp_{32}(t)}{dt} = \lambda_{1,2}^{(3)} p_{31}(t) - \lambda_{2,3}^{(3)} p_{32}(t), \tag{11}$$
$$\frac{dp_{33}(t)}{dt} = \lambda_{2,3}^{(3)} p_{32}(t). \tag{12}$$
Each of the performance rates, $g_{ij}$, in the Markov state diagram (Fig. 2) have their respective initial conditions shown in Table 1. The highest performance rates of each element ($g_{11}$, $g_{21}$, $g_{31}$) have an initial probability of 100% and all the other lower performance rates have a zero initial probability at time $t = 0$.
The seven differential equations in (6)–(12) are solved along with their initial conditions listed in Table 1 to obtain the time-dependent state probability expressions for all the 7 states given below in Eqs. (13)–(19). These expressions denote the probability of the element at the various states of performance, for all $t \ge 0$. In general, $p_{ij}(t)$ is the probability that element $i$ is in the $j$th state of performance at any time $t$ [29]:
$$p_{11}(t) = e^{-\lambda_{1,2}^{(1)} t}, \tag{13}$$
$$p_{12}(t) = 1 - e^{-\lambda_{1,2}^{(1)} t}, \tag{14}$$
$$p_{21}(t) = e^{-\lambda_{1,2}^{(2)} t}, \tag{15}$$
$$p_{22}(t) = 1 - e^{-\lambda_{1,2}^{(2)} t}, \tag{16}$$
$$p_{31}(t) = e^{-\lambda_{1,2}^{(3)} t}, \tag{17}$$
$$p_{32}(t) = \frac{\lambda_{1,2}^{(3)}}{\lambda_{1,2}^{(3)} - \lambda_{2,3}^{(3)}} \left( -e^{-\lambda_{1,2}^{(3)} t} + e^{-\lambda_{2,3}^{(3)} t} \right), \tag{18}$$
$$p_{33}(t) = 1 - p_{31}(t) - p_{32}(t). \tag{19}$$
The values for the various failure rates, $\lambda_{i,j}^{(m)}$, in the above expressions are also given in Table 1.
3.3. Universal generating function (UGF): $U(z)$
UGF methodology [30,34] is an essential tool to obtain the performance distribution of the overall system. This UGF is a z-transform-based approach first proposed by Ushakov (1986) [35]. The UGF is an efficient tool for complex MSS reliability assessment as it greatly reduces the problem complexity and computational intensity by modularizing a system into its components and analyzing each component of the system individually, thereby enabling a complex problem to be broken into subproblems, each of which can be solved individually with ease. For the water pipe system in this work, the UGF method reduces the total number of differential equations to only $2+2+3 = 7$ in (6)–(12) as compared to using a single "overall-system" Markov Analysis which would have required $2 \times 2 \times 3 = 12$ differential equations corresponding to 12 discrete system states. As the system becomes more and more complex, the UGF methodology will be increasingly effective and efficient for MSS reliability analysis.
A performance distribution is a probability distribution table listing the various states of performance of the element/system and their corresponding time-varying state probability expressions. The element performance distributions for all the 3 elements of the pipe system are described in Table 1 based on the Markov analysis results in the previous section. The system performance distribution can be obtained from the individual element performance distributions using UGF so that the overall system's performance, $G(t)$, can be characterized.
The U-function [33], $u_i(z)$ for element $i$ is expressed as follows:
$$u_1(z) = p_{11}(t) z^{g_{11}} + p_{12}(t) z^{g_{12}}, \tag{20}$$
$$u_2(z) = p_{21}(t) z^{g_{21}} + p_{22}(t) z^{g_{22}}, \tag{21}$$
$$u_3(z) = p_{31}(t) z^{g_{31}} + p_{32}(t) z^{g_{32}} + p_{33}(t) z^{g_{33}}. \tag{22}$$
To formulate the system's overall performance distribution in terms of the individual element performances, a system structure function, $\varphi$, is constructed. This $\varphi$ function depends on the system topology (series-parallel architecture) and the type of MSS being analyzed (flow transmission or task processing). The system topology of the water pipe in Fig. 2 consists of elements 1 and 2 in parallel to each other and the parallel combination in turn in series with element 3. The net amount of water flow rate (performance), $G_S$, through the overall pipe system can be determined by the minimum of the total amount of water that can flow through the parallel combination of elements {1,2} given by $(G_1+G_2)$ and the serially connected element 3 having a water flow rate represented by the random variable, $G_3$. Based on this configuration, the system
structure function, $\varphi$, for the flow transmission water pipe system in Fig. 2 is given by
$$G_S = \varphi(G_1, G_2, G_3) = \min\{(G_1 + G_2), G_3\}, \tag{23}$$
where $G_1$, $G_2$ and $G_3$ are the discrete random variables representing the performance values (mass flow rate) of pipe elements 1, 2 and 3 respectively and $G_S$ denotes the overall system performance variable.
Eqs. (20)–(22) only describe the u-functions $u_1(z)$, $u_2(z)$ and $u_3(z)$ for each individual element of the system. It is necessary to obtain the system u-function, $U_S(z)$, for the entire system in order to extract the system performance distribution of interest. $U_S(z)$ can be obtained using the composition operator approach, $\Omega_{\varphi}$ [30,34], making use of the individual element u-functions in (20)–(22) and the system structure function, $\varphi$ in (23).
Using the element state probabilities in (13)–(19) and the expressions in (20)–(23), the system UGF represented by $U_S(z)$ is obtained as
$$U_S(z) = \Omega_{\varphi}\{u_1(z), u_2(z), u_3(z)\} = \sum_{c=1}^{3} \sum_{b=1}^{2} \sum_{a=1}^{2} p_{1a}(t)\, p_{2b}(t)\, p_{3c}(t) \cdot z^{\varphi(G_1, G_2, G_3)} = \sum_{c=1}^{3} \sum_{b=1}^{2} \sum_{a=1}^{2} p_{1a}(t)\, p_{2b}(t)\, p_{3c}(t) \cdot z^{\min\{g_{1a} + g_{2b},\, g_{3c}\}}. \tag{24}$$
The number of terms embedded in the summation of (24) is equal to the product of the number of states of performance of elements 1, 2 and 3 which is equal to $2 \times 2 \times 3 = 12$, corresponding to the number of discrete states of the entire system. Combining the coefficients of the terms having the common powers of $z$ and simplifying further using the rules that $p_{11}(t)+p_{12}(t) = 1$; $p_{21}(t)+p_{22}(t) = 1$; $p_{31}(t)+p_{32}(t)+p_{33}(t) = 1$ $\forall t > 0$, we obtain the system UGF, $U_S(z)$ as follows:
$$U_S(z) = \sum_{i=1}^{5} p_i(t)\, z^{g_i}, \tag{25}$$
where the simplified expressions for $p_i(t)$ and $g_i$ in (25) are given in Table 2.
Table 2 completely describes the system performance distribution for the water pipe system where $g_i$ denotes the system performance rate value (mass flow rate) and $p_i(t)$ is the corresponding time-varying state probability expression. Note that although the system has 12 discrete states of performance, only 5 {$g_1$–$g_5$} out of the 12 performance rates are distinct in value for this particular case study. Hence the expression in (25) for the system u-function, $U_S(z)$ has only 5 terms. An obvious property of the probability distribution above is thus:
$$p_1(t) + p_2(t) + p_3(t) + p_4(t) + p_5(t) = 1 \quad \forall t \in \mathbb{R}_0^{+}. \tag{26}$$
In this case, from Table 2, the mass flow rate of $g_1 = 3.5$ tons/min corresponds to the maximum performance rate of the "fully functional" system. On the other hand, the mass flow rate of $g_5 = 0.0$ tons/min represents the total failure event where the pipe system is "completely non-functional" i.e., not able to transmit any water at all. The flow rates of $g_2 = 2.0$, $g_3 = 1.8$ and $g_4 = 1.5$ tons/min correspond to the intermediate degraded states where the system is only "partially functional and efficient" in its performance.
Table 2
System performance distribution of the water pipe system
System performance rate $g_i$ (tons/min) | State probability $p_i(t)$
$g_1 = 3.5$ | $p_1(t) = p_{11}(t)\,p_{21}(t)\,p_{31}(t)$
$g_2 = 2.0$ | $p_2(t) = p_{12}(t)\,p_{21}(t)\,p_{31}(t)$
$g_3 = 1.8$ | $p_3(t) = p_{21}(t)\,p_{32}(t)$
$g_4 = 1.5$ | $p_4(t) = p_{11}(t)\,p_{22}(t)\,[p_{31}(t) + p_{32}(t)]$
$g_5 = 0.0$ | $p_5(t) = p_{12}(t)\,p_{22}(t) + p_{11}(t)\,p_{33}(t) + p_{12}(t)\,p_{21}(t)\,p_{33}(t)$
3.4. System reliability and performance
From the system performance distribution, the Reliability (Survival) Function of the system, $R_1(t)$ for the first operation cycle, can be defined as the probability that the system's performance ($G_S$) is above the minimum user-set demand value, $W$. This is consistent with our earlier definition of failure in Section 3.1.2 where the system is considered to have failed from the user's perspective once its mean performance, $G(t)$, drops below the user demand ($W$). Therefore, $R_1(t)$ is expressed as follows:
$$R_1(t) = P(G_S \ge W) = \left\{ \sum_{i=1}^{5} p_i(t) \;\middle|\; g_i \ge W \right\}, \tag{27}$$
where $W$ is the minimum demand setting representing the minimum user expectation from the system. Eqs. (6)–(27) in the multi-state UGF theory described above are based on an illustration provided in [29].
The mean performance of the system for the first operation cycle, $G_1(t)$, can be modeled from the system performance distribution. Since Table 2 is a probability distribution function (p.d.f.) of a discrete statistical random variable of the system performance, $G_S$, the mean or expectation of $G_S$, denoted by $E(G_S)$, can therefore be expressed as follows:
$$G_1(t) = E(G_S) = \left[ \sum_{i=1}^{5} p_i(t)\, g_i \right]. \tag{28}$$
In (28), $G_1(t)$ is the system mean performance for the first operation cycle; the summation term is the usual statistical definition for "expectation of a random variable".
The performance variation trends of the system for a general $k$th operation cycle, $G_k(t)$, may now be described in terms of $G_1(t)$ by (29) based on the earlier expressions in (2) and (28). The estimated TTF for every operation cycle, $k$, is
found by solving (5) numerically, where $G_k(t)$ is described by
$$G_k(t) = G_1(t) \prod_{r=1}^{k-1} RF[r]. \tag{29}$$
3.5. Maintenance cycle model
Based on the model proposed in the previous section, the time to next failure (TTF) of the system is estimated by solving (5). The duration for different maintenance actions (downtime duration) in any maintenance policy is always a variable due to many factors. For example, the root cause of each failure could be different; the degree of the damages caused by the failures can be different on different occasions too. Some of the failure sites might be externally accessible and maintenance could be performed without dismantling the system, thus requiring less repair time; whereas some others could be situated deep inside the system that requires the system to be opened for failure analysis and restoration work which could end up to be very time-consuming. As a result of all these variations, the maintenance duration needs to be modeled by a random variable with a stochastic distribution. The Weibull and Gamma distributions are commonly used for downtime or repair distributions [36]. Here, we use the Weibull distribution for downtime event modeling. The shape factor $\beta$ is assumed to be 1 to reflect the age-independent randomness in the maintenance durations. Since system availability in general is expected to be around 95%, downtime durations are expected to be around 5% of the operation periods. As the mean operation cycle durations are found to range between 0.05 and 0.15 years as revealed by the simulations of the model developed, the mean downtime ($\eta$) is roughly taken as $5\% \times 0.10 = 0.005$ years in this case study for illustration purposes. Actual values for $\eta$ may in fact be determined based on the analysis of downtime durations in past maintenance records.
As the system continues to age, the degree of failure and extent of damage of the to-be-maintained system becomes more pronounced and severe even under the PdM policy, due to the effect of irreparable wear-and-tear effects. Thus, the later stages of system failures are more difficult and time-consuming to maintain and restore as compared to the initial failures. Therefore, it would be appropriate to model the scale factor, $\eta$, of the downtime distribution as an arbitrary increasing function of the operation cycle, $k$, to represent the increased downtime periods during subsequent repair actions, as the system's rate of degradation increases with time for imperfect maintenance.
With the mathematical model developed, we can now simulate the model using MATLAB and study the effect of the various factors described, on the system reliability and performance characteristics for the MSS PdM policy. Due to the unavailability of real industrial data, the numerical values of the Markov failure rates in this case study of the pipe system are fictitiously assumed for the sake of illustration, hence the magnitudes of the computation results may not reflect the reality. In other words, the results described in the following sections serve only to show the practical usefulness and applicability of the model.
4. Results and discussion
4.1. Impact of demand (W)
Figs. 4(a)–(c) show the system mean performance curves for three demand ($W$) values of 3.0, 2.5 and 2.0 tons/min for a given RF distribution with parameters $\mu_{RF} = 95\%$ and $\sigma_{RF} = 0\%$. One can see from these figures that the higher the demand ($W$), the sooner the TTF and the higher the mean frequency of maintenance actions to be performed. This is because setting a higher demand ($W$) implies that the system's mean performance would degrade below the demand level in a shorter span of time as illustrated earlier in Fig. 3.
TTR is defined as the instant when the degrading system's mean performance can never be restored to above the minimum demand ($W$) anymore in spite of any further maintenance work. In such an event, further repair work is not beneficial because the user's minimum expectations can no longer be satisfied, and replacement of the system (which is an expensive process) is therefore the only alternative option. Fig. 5 clearly illustrates the replacement criteria.
Table 3 shows the computed TTR for different demand values when the RF distribution is fixed at $\mu_{RF} = 95\%$ and $\sigma_{RF} = 0\%$. From Table 3, it can be seen that if the demand $W$ is increased by 0.5 tons/min from 2.0 to 2.5 tons/min, TTR drops by approximately 59.8%. Similarly, further increase in the demand from 2.5 to 3.0 tons/min again causes the replacement time to drop further by around 76.4%. Therefore, it is important for the user not to choose a very high $W$ value close to the maximum performance capability of the system (3.5 tons/min in this case study). Instead, a moderate demand under which the system can still function effectively should be chosen.
4.2. Impact of the mean restoration factor ($\mu_{RF}$)
Figs. 6(a)–(c) show the typical variation of system mean performance curves respectively for various $\mu_{RF}$ values of 90%, 95% and 97.5% keeping the parameters $\sigma_{RF} = 0\%$ and $W = 2.5$ tons/min fixed. The figures show that the higher the $\mu_{RF}$, the higher the average reliability and performance at any point in time as expected.
Large values of $\mu_{RF}$ coupled with low $\sigma_{RF}$ indicate very high quality of repair work and imply large restorations in the system performance during maintenance. As a result, the system's initial performance during the start of every operation cycle is relatively high and it takes more time for the system's performance to degrade to below the minimum demand ($W$) for that operation cycle. This implies extended TTFs and hence prolonged TTR.
Table 4 shows that the TTR of the system increases largely by 33.8% as the $\mu_{RF}$ is increased from 85% to 90%.
Table 3
Variation of system "time to replacement" (TTR) for demand (W) = 3.0, 2.5 and 2.0 tons/min
Demand W (tons/min) | TTR (years)
3.0 | 0.087
2.5 | 0.368
2.0 | 0.915
Table 4
Variation of system "time to replacement" (TTR) for $\mu_{RF}$ ranging from 85% to 97.5%
$\mu_{RF}$ (%) | TTR (years)
85 | 0.145
90 | 0.194
92.5 | 0.274
95 | 0.368
97.5 | 0.730
[Figure: system performance against system operation time (years, 0 to 0.6), with curves labelled SIM A and SIM B and the replacement point marked REPLACE.]
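The way TTF and TTR values of this kind arise from Eq. (29) can be illustrated with a short deterministic Python sketch. Several simplifications are assumed and none of them is confirmed by the paper: Eq. (5) is not reproduced in this excerpt, so the sketch takes the TTF of cycle $k$ to be the time within the cycle at which $G_k(t)$ falls to $W$; RF is constant ($\sigma_{RF} = 0$); the downtime is fixed at 0.005 years instead of being Weibull distributed; and the failure rates are hypothetical placeholders, so the printed numbers will not match Tables 3 and 4.

```python
import math

# Hypothetical failure rates per year (placeholders, not the paper's Table 1 values).
LAM1_12, LAM2_12, LAM3_12, LAM3_23 = 7.0, 10.0, 10.0, 7.0


def g1(t):
    """Mean performance of the first cycle, Eq. (28), using the Table 2 expressions."""
    p11 = math.exp(-LAM1_12 * t)
    p21 = math.exp(-LAM2_12 * t)
    p31 = math.exp(-LAM3_12 * t)
    p32 = LAM3_12 / (LAM3_12 - LAM3_23) * (math.exp(-LAM3_23 * t) - math.exp(-LAM3_12 * t))
    p12, p22 = 1 - p11, 1 - p21
    return (3.5 * p11 * p21 * p31 + 2.0 * p12 * p21 * p31
            + 1.8 * p21 * p32 + 1.5 * p11 * p22 * (p31 + p32))


def time_to_failure(scale, demand, hi=10.0, tol=1e-10):
    """Bisection for the time at which scale * g1(t) drops to the demand W."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if scale * g1(mid) >= demand:
            lo = mid
        else:
            hi = mid
    return lo


def simulate_cycles(demand, rf, downtime=0.005, max_cycles=1000):
    """Return the per-cycle TTFs and the elapsed time at which restoration above W fails."""
    ttfs, clock, scale = [], 0.0, 1.0  # scale is the running product of RF values, Eq. (29)
    while len(ttfs) < max_cycles and scale * g1(0.0) >= demand:
        ttf = time_to_failure(scale, demand)
        ttfs.append(ttf)
        clock += ttf
        scale *= rf  # imperfect restoration, Eq. (2)
        if scale * g1(0.0) >= demand:
            clock += downtime  # maintenance is only worthwhile if W can still be met
    return ttfs, clock


if __name__ == "__main__":
    for w in (3.0, 2.5, 2.0):
        ttfs, ttr = simulate_cycles(demand=w, rf=0.95)
        print(f"W = {w}: {len(ttfs)} cycles, TTR = {ttr:.3f} years")
```

Bisection is adequate here because the mean performance of a system that only degrades is non-increasing in time, so each cycle has a single crossing of the demand level.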
(5) and a reliable estimate of the maintenance (downtime) schedules can be constructed. Table 5 shows a typical example of a maintenance schedule derived from the model for the following four different cases:
(i) $\mu_{RF} = 95\%$; $\sigma_{RF} = 0\%$; $W = 3.0$ tons/min,
(ii) $\mu_{RF} = 95\%$; $\sigma_{RF} = 0\%$; $W = 2.5$ tons/min,
(iii) $\mu_{RF} = 95\%$; $\sigma_{RF} = 0\%$; $W = 2.0$ tons/min,
(iv) $\mu_{RF} = 90\%$; $\sigma_{RF} = 0\%$; $W = 2.5$ tons/min.
4.5. Failure of the virtual age model
Kijima's virtual age model [8,20] is frequently used to describe the effect of maintenance quality on the effective restored age of the system. It is given by (30) where $Age[k]$ is the effective age of the system immediately after the $(k-1)$th maintenance action is completed
$$Age[k] = (Age[k-1] + TTF[k-1])(1 - RF[k-1]). \tag{30}$$
An attempt to make use of this Virtual Age model for the proposed PdM MSS model failed because the definition of "failure" in this study is different from the conventional definition. This is elaborated as follows.
Since "failure" is defined as the instant at which system performance, $G(t)$, drops below the set demand of $W$, as the factor $Age[k-1]$ in (30) increases for progressive system degradation, the values of $TTF[k-1]$ successively decrease due to imperfect restoration. As a result, the sum $Age[k-1] + TTF[k-1]$ is approximately constant at the beginning of every operation cycle. Thus, the value of $Age[k]$ at the beginning of every new operation cycle remains constant, resulting in performance trends being repetitive and indicative of an infinite replacement time (TTR $\to \infty$) which is unrealistic and impossible. Hence, the virtual age model was not considered in favor of the expression in (2) which provides a more realistic outlook to imperfect restoration.
Fig. 8 illustrates the failure of the virtual age model by showing a simulation example for the mean system performance variation when $W = 3.0$ tons/min, $\mu_{RF} = 90\%$ and $\sigma_{RF} = 0\%$.
Fig. 8. System performance variation trends illustrating the failure of the virtual age model in being applicable to the proposed PdM MSS maintenance framework. The replacement time (TTR) tends to infinity, which is unrealistic and impossible.
Table 5
Determination of the downtime schedule from model simulation for four different cases
Downtime # | (i) $W = 3.0$ tons/min, $\mu_{RF} = 95\%$, $\sigma_{RF} = 0\%$ | (ii) $W = 2.5$ tons/min, $\mu_{RF} = 95\%$, $\sigma_{RF} = 0\%$ | (iii) $W = 2.0$ tons/min, $\mu_{RF} = 95\%$, $\sigma_{RF} = 0\%$ | (iv) $W = 2.5$ tons/min, $\mu_{RF} = 90\%$, $\sigma_{RF} = 0\%$
The water pipe system analyzed in this study is considered to have only one failure mode and all the elements' degradation trends are assumed to be independent of one another. In most real-life systems however, significant dependencies exist between the elements due to their close proximity and multiple failure modes have been detected. Therefore, the model needs to be extended to incorporate the effect of dependent multi-modal failure modes which is currently ongoing. Further investigation is
also required to develop possible analytical methods of extracting RF from past maintenance data records. The proposed model, though proven to be useful based on the results shown in Section 4, has not been optimized yet in this study. The use of Genetic Algorithms [31,37] and Simulated Annealing [37] as a tool for optimizing the proposed maintenance model will be taken up.
The PdM model-based maintenance scheduling in this case study is from a "system-perspective". In other words, maintenance schedules are predicted based on the overall performance degradation trends of the "system" and not by analyzing the individual performance trends of each "component" or "element". As a more efficient strategy, the same PdM model could be extended to analyze the system from the "component-perspective" wherein maintenance schedules are devised accounting for the individual degradation trends of each component. In this case, although the maintenance durations are expected to be shorter due to maintenance being performed only on certain components, the frequency of maintenance actions is expected to be much higher, thus having a negative impact on the system availability and production output capacity. A comparative study of the effectiveness of system-based and component-based PdM scheduling is therefore necessary.
Although reduction in the fixed demand has been proposed to be a useful strategy in prolonging the system's operational lifespan as revealed in Section 4.1, this technique may not be applicable in all cases because reduced demand implies that the user has to accept a lower mean efficiency in the system's performance capability in exchange for the longer lifespan, which many users are unlikely to do. Lower user expectations may have negative economic consequences, due to user dissatisfaction. In such cases, the negative economic effects may more than offset the savings incurred by a prolonged lifespan. Therefore, the optimal maintenance schedule and parameter values need to be determined on the basis of a more complete model, which considers the various implicit and explicit cost components involved in any maintenance-related decision. Such an analysis necessitates the combination of "maintenance modeling" with "maintenance economics".
It is expected that, even for very complex systems (>10–15 components), the UGF-based PdM model could
maintenance work quality and user demand ($W$), which represents minimum user expectations, were identified as important PdM parameters and their impacts on the system performance, downtime schedule and replacement time were quantitatively examined.
Using the stochastic model for the RF, system performance variation trends for various $\mu_{RF}$, $\sigma_{RF}$ and $W$ values were simulated and presented graphically. The results clearly indicate the significant impact of $\mu_{RF}$, $\sigma_{RF}$ and $W$ on system reliability. A highly skilled maintenance crew (high $\mu_{RF}$) can help improve the system reliability and maintainability to a large extent, thus saving costs and reducing wear and tear of the system and in turn prolonging its useful lifespan. Consistent performance of maintenance (low $\sigma_{RF}$) is also essential for more accurate predictability of future downtime schedules and times to system replacement (TTR), which in turn assists the management to precisely pre-plan the production activities so as to meet customer market demands on time.
Throughout this study, the model developed and the results shown were all based on the case study of a simple 3-element flow transmission water pipe MSS. However, it is important to take note that the exact same procedure described in this work could be applied to any generic n-element MSS of any type (flow transmission or task processing) with any arbitrary topology to construct the PdM model regardless of the system complexity. The only feature to take note of is that the system structure function, $G_S = \varphi(G_1, G_2, \ldots, G_n)$, will vary for every system depending on its MSS classification and its topology [30]. Therefore, the model and results prescribed in this study are not just confined to the 3-element pipe system examined, but applicable in general to all operating systems in the industry.
A company's long-term financial position hinges largely on its ability to reduce plant operational and maintenance costs, which currently account for as much as around 15–70% of its overall production expenses [1]. Maintenance cost reductions to lower levels can be partly achieved by implementing the new PdM policy proposed in this study and ensuring continuous sustained improvements in $\mu_{RF}$ and $\sigma_{RF}$.
Acknowledgments
The authors would like to thank the management team of
be computationally efficient because it modularizes a the Office of Research of Nanyang Technological University
system into its components and analyzes each component (NTU), Singapore for funding this research work. The useful
as a separate entity, thereby reducing the computational comments provided by our reviewers which helped improve
complexities due to reduction in the number of differential the quality of this work, is very much appreciated.
equations to be solved. Investigations to confirm the
validity of this claim must however be taken up. References
[3] Moya CC. The control of the setting up of a predictive maintenance program using a system of indicators. Int J Manage Sci 2004:57–75.
[4] Dieulle L, Berenguer C, Grall A, Roussignol M. Continuous time predictive maintenance scheduling for a deteriorating system. Annu Reliab Maintainability Symp 2001:150–5.
[5] Jaw L. Putting CBM and EHM in perspective. Scientific Monitoring Inc., Maintenance Technology.
[6] Grall A, Berenguer C, Dieulle L. A condition-based maintenance policy for stochastically deteriorating systems. Reliab Eng System Saf 2002;76(2):167–80.
[7] Nakagawa T. Optimum policies when preventive maintenance is imperfect. IEEE Trans Reliab 1979;28(4):331–2.
[8] Kijima M. Some results for repairable systems with general repair. J Appl Probab 1989;26:89–102.
[9] Wendai W, Daescu DD. Reliability quantification of induction motors–accelerated degradation testing approach. Annu Reliab Maintainability Symp 2002:325–31.
[10] Kane MM. Intelligent motors moving to the forefront of predictive maintenance. In: 47th annual petroleum and chemical industry conference, 2000. p. 217–23.
[11] Pham H, Wang H. Imperfect maintenance. Eur J Oper Res 1996;94(3):425–38.
[12] Nakagawa T. Optimum policies when preventive maintenance is imperfect. IEEE Trans Reliab 1979;28(4):331–2.
[13] Nakagawa T. Imperfect preventive maintenance. IEEE Trans Reliab 1979;28(5):402.
[14] Brown M, Proschan F. Imperfect repair. J Appl Probab 1983;20:851–9.
[15] Block HW, Borges WS, Savits TH. Age dependent minimal repair. J Appl Probab 1985;22:370–85.
[16] Block HW, Borges WS, Savits TH. A general age replacement model with minimal repair. Naval Res Logist 1988;35/5:365–72.
[17] Makis V, Jardine AKS. Optimal replacement policy for a general model with imperfect repair. J Oper Res Soc 1992;43(2):111–20.
[18] Malik MAK. Reliable preventive maintenance policy. AIIE Trans 1979;11(3):221–8.
[19] Chan JK, Shaw L. Modeling repairable systems with failure rates that depend on age and maintenance. IEEE Trans Reliab 1993;42:566–70.
[20] Kijima M, Morimura H, Suzuki Y. Periodical replacement problem without assuming minimal repair. Eur J Oper Res 1988;37/2:194–203.
[21] Martorell S, Sanchez A, Serradell V. Age-dependent reliability model considering effects of maintenance and working conditions. Reliab Eng System Saf 1999;64(1):19–31.
[22] Kijima M, Nakagawa T. Accumulative damage shock model with imperfect preventive maintenance. Naval Res Logist 1991;38:145–56.
[23] Kijima M, Nakagawa T. Replacement policies of a shock model with imperfect preventive maintenance. Eur J Oper Res 1992;57:100–10.
[24] Wang H, Pham H. Optimal maintenance policies for several imperfect maintenance models. Int J Systems Sci 1996.
[25] Wang H, Pham H. Optimal age-dependent preventive maintenance policies with imperfect maintenance. Int J Reliab, Qual Saf Eng 1996.
[26] Wang H, Pham H. A quasi renewal process and its application in the imperfect maintenance. Int J Systems Sci 1996.
[27] Wang H, Pham H. Availability and optimal maintenance of series system subject to imperfect repair. IE Working Paper, Rutgers University, 1996. p. 96–101.
[28] Wang H. A survey of maintenance policies of deteriorating systems. Eur J Oper Res 2002;139(3):469–89.
[29] Levitin G, Lisnianski A. Multi-state system reliability: assessment, optimization and applications. Singapore: World Scientific Publishing Co. Pvt. Ltd.; 2003. Chapters 1 and 4.
[30] Levitin G. The universal generating function for reliability assessment and optimization. Berlin: Springer Series Publishing Co. Pvt. Ltd.; 2004.
[31] Levitin G, Lisnianski A. Optimization of imperfect preventive maintenance for multi-state systems. Reliab Eng System Saf 2000;67(2):193–203.
[32] Drenick RF. The failure law of complex equipment. J Soc Ind Appl Math 1960;8(4):680–90.
[33] Ebeling C. An introduction to reliability and maintainability engineering. International ed. New York: McGraw-Hill Publications; 1997.
[34] Levitin G, Lisnianski A, Ben-Haim H, Elmakis D. Redundancy optimization for series-parallel multi-state systems. IEEE Trans Reliab 1998;47(2):165–72.
[35] Ushakov I. Universal generating function. Sov J Comput System Sci 1986;24(5):118–29.
[36] Source: www.weibull.com: Introduction to repairable systems-downtime distributions.
[37] Mohanta DK, Sadhu PK, Chakrabarti R. Deterministic and stochastic approach for safety and reliability optimization of captive power plant maintenance scheduling using GA/SA-based hybrid techniques: a comparison of results. Reliab Eng System Saf 2007;92(2):187–99.

Cher Ming Tan (M’84–SM’00) was born in Singapore in 1959. He received the B.Eng. degree (Hons.) in electrical engineering from the National University of Singapore in 1984, and the M.A.Sc. and Ph.D. degrees in electrical engineering from the University of Toronto, Toronto, Ont., Canada, in 1988 and 1992, respectively. He joined Nanyang Technological University (NTU) as an academic staff member in 1997, and he is now an Associate Professor in the School of Electrical and Electronic Engineering. His current research areas are reliability data analysis, electromigration reliability physics and test methodology, and quality engineering such as QFD. He also works on silicon-on-insulator structure fabrication technology and power semiconductor device physics.
Dr. Tan was the Chair of the IEEE Singapore Section in 2006. He is also the Chairman of the Certified Reliability Engineer Board of Singapore Quality Institute, and Committee member of the Strategy and Planning Committee of the Singapore Quality Institute. He has also been elected to the Research Board of Advisors of the American Biographical Institute and was elected International Educator of the Year 2003 by the International Biographical Center, Cambridge, UK. He is now appointed as a Fellow of the Singapore Quality Institute and a Fellow of the Singapore Institute of Manufacturing Technology.
He is currently listed in Who’s Who in Science and Engineering as well as Who’s Who in the World due to his achievements in science and engineering.

Nagarajan Raghavan was born in Bangalore, India in 1985. He completed his Bachelor's degree with First Class Honors at the School of Electrical and Electronic Engineering (EEE) in Nanyang Technological University (NTU), Singapore in May 2007. He specializes in the field of semiconductor device reliability, failure physics and reliability and maintenance engineering. He has worked in the past on statistical analysis of reliability and maintenance data under the “Undergraduate Research on Campus (URECA)” program organized and funded by NTU. He is the recipient of the prestigious Nanyang Scholarship and NTU President Research Scholar awards. He has also been offered the Singapore-MIT Alliance (SMA) Graduate Fellowship for pursuing a double Master's degree in Materials Science and Engineering.
His technical interests include reliability statistics, reliability and failure physics, semiconductor device modeling and characterization, quantum physics and predictive maintenance modeling.
He has 7 International Conference papers and 3 Journal Papers to his credit based on his undergraduate research initiatives. He is currently a Member of IEEE and a Student Member of Materials Research Society (MRS).