Methods For Automated Design of
Carl Svärd
Linköping Studies in Science and Technology
Dissertations, No 1448
[email protected]
www.vehicular.isy.liu.se
Division of Vehicular Systems
Department of Electrical Engineering
Linköping University
SE–581 83 Linköping, Sweden
Svärd, Carl
Methods for Automated Design of Fault Detection and Isolation Systems
with Automotive Applications
ISBN 978-91-7519-894-1
ISSN 0345-7524
Popular Science Summary
The purpose of this thesis is to develop methods for automated design of diagnosis
systems that detect and isolate faults in large, complex technical systems. Detecting
and isolating faults is important for guaranteeing the reliability and dependability of a
system. One example is heavy-duty trucks, where the ability to detect and isolate faults is
crucial for achieving and maintaining, for example, low exhaust emissions, high
utilization, high vehicle safety, and efficient repairs.
One way to detect faults in a system is to use so-called model-based residuals.
A model-based residual can be created by forming the difference between an observation
from the system and its virtual counterpart, which is obtained by simulating the
fault-free behavior of the system with a mathematical model. A residual that differs from
zero indicates that there may be a fault in the system. By using residuals based on
observations from different parts of the system, a detected fault can in addition be isolated
to a specific component in the system. This is above all important for efficient repairs.
Designing a complete diagnosis system for a large, complex system is a challenging
task that requires a considerable amount of development work. To obtain an optimal
solution, well-defined requirements are needed regarding, for example, robustness and
the faults to be detected and isolated. In addition, detailed knowledge of the behavior of
the system is needed, both for the fault-free case and, above all, for all conceivable fault
cases. This kind of information is, however, seldom available, at least not at the beginning
of a development process. With an automated design methodology, continuous
improvements of the diagnosis system can be made quickly and efficiently as new
requirements and more knowledge are added. This makes the development process more
systematic and efficient, which in the long run also promotes higher quality.
The thesis develops a number of generic and theoretically well-founded methods for
detecting and isolating faults in complex technical systems by means of model-based
residuals. To support an automated design methodology, the methods are developed to
require minimal user interaction. Large, complex systems place high demands on the
properties of the methods. For example, these systems are often described by large,
dynamic, and non-linear models, which must be handled. Furthermore, the multifaceted
properties and complexity of these systems mean that the models are not always capable
of describing the behavior of the systems in all situations. The methods are developed to
handle these difficulties in a systematic way.
The developed methods, as well as the potential of an automated design methodology,
are evaluated through extensive application studies. The methods are successfully
applied to develop complete diagnosis systems for both a diesel engine in a heavy-duty
truck and a wind turbine. The conclusion is that the methods can be used to design a
diagnosis system with good performance at a very small engineering effort.
Acknowledgments
With this thesis I have accomplished one of my goals in life, namely to write a book. It has
been five years filled with hard but, above all, inspiring and rewarding work. Neither the
writing nor the work would have been possible without a number of people.
First of all, I would like to express my sincere gratitude to my supervisor Mattias
Nyberg for his guidance, devotion, and ability to inspire. His effort and capability to
continuously push things a little bit further have been invaluable. Mattias may be more
of a perfectionist than me, and I did not think that was possible.
This work has been performed as a part of a collaborative industrial research project
between Scania CV AB in Södertälje and the division of Vehicular Systems, Department
of Electrical Engineering, Linköping University.
I would like to thank my assistant supervisors Erik Frisk and Mattias Krysander for
rewarding discussions and valuable comments and input. Special thanks go to Erik for his
support and for helping me structure this thesis, and to Mattias for his alert and astute
comments. I would also like to thank Lars Nielsen for letting me join his research group
Vehicular Systems.
Many thanks also go to all my colleagues at Scania and Vehicular Systems for
contributing to a nice working atmosphere. Special thanks go to Erik Höckerdal for
help with LaTeX issues. Henrik Flemmer is thanked for being a supportive manager.
I also thank my managers Niklas Karpe and Peter Vansölin for letting me be a part
of this project and do research work. My former managers Mats Jennische and Peter
Madsen also deserve acknowledgments. The steering group, with chairman Nils-Gunnar
Vågstedt, is also thanked.
The work has been jointly financed by Scania CV AB and Vinnova, Swedish Govern-
mental Agency for Innovation Systems, who are also acknowledged.
Finally, I thank my family and friends for their support. Special and sincere thanks
go to my parents, Åsa and Kjell, and sister Anna, for their understanding and
encouragement. Last but not least, I would like to express my utmost gratitude and love to
Emma for her great support, patience, and love.
Carl Svärd
Stockholm, April 2012
Contents
1 Introduction
1.1 Background and Motivation
1.2 Objective
1.3 Outline
Publications
A Residual Generators for Fault Diagnosis using Computation Sequences with
Mixed Causality Applied to Automotive Systems
1 Introduction
2 Preliminaries and Background Theory
2.1 Integral and Derivative Causality
2.2 Structure of Equation Sets
2.3 Structural Decomposition
2.4 Differential-Algebraic Equation Systems
3 Sequential Computation of Variables
3.1 BLT Semi-Explicit DAE Form
3.2 Computational Tools
3.3 Computation Sequence
4 Sequential Residual Generation
4.1 Proper Sequential Residual Generator
4.2 Finding Proper Sequential Residual Generators
5 Method for Finding a Computation Sequence
5.1 Illustrative Example
5.2 Summary of the Method
5.3 Algorithm
6 Application Studies
6.1 Implementation and Configuration of the Method
6.2 Performance Measures
6.3 Automotive Diesel Engine
6.4 Hydraulic Braking System
6.5 Realization of a Residual Generator for the Diesel Engine
7 Conclusions
A Proofs of Theorems and Lemmas
References
6 Greedy Selection
6.1 Greedy Heuristic
6.2 Greedy Selection Algorithm
6.3 Properties of the Greedy Selection Algorithm
7 Sequential Residual Generation
7.1 Computation Sequence
7.2 Sequential Residual Generator
7.3 Residual Generation Method
7.4 Fault Sensitivity
7.5 Necessary Realizability Criterion
8 Application Example
8.1 The Automotive Engine System
8.2 Appliance of the MHS-Based Algorithm
8.3 Appliance of the Greedy Algorithm
8.4 Analysis of the Cardinalities of Greedy Solutions
8.5 Case Study of Fault Sensitivity
9 Conclusions
References
Chapter 1
Introduction
complex systems often contain many physical interconnections, which implies that the
effect of a fault may propagate in the system and be visible in many of the sensor
measurements. This, in combination with the small number of sensors, makes fault
isolation in these systems a non-trivial problem. For instance, the problem of fault
decoupling in residual generators must be handled, and it is further complicated by the
properties of the involved models.
Furthermore, the complexity of the systems, in combination with their often many
operating modes, implies that models typically are not able to fully describe the behavior
of the systems in all operating modes. Despite substantial modeling work, this results in
model errors of time-varying nature and magnitude. To detect small faults in a robust
way, model errors and additional uncertainties must be handled. Specifically, this issue
must be handled by the method used for design of residual evaluators.
1.2 Objective
In an industrial context, and with the challenges and difficulties discussed above in mind,
it is clear that design of a complete model-based FDI-system for a complex real-world
system is an intricate task that demands a substantial engineering effort. To obtain an
optimal design, well-defined requirements are needed regarding, for example,
robustness and the faults to detect and isolate. In addition, detailed knowledge is required
of the behavior of the supervised system, both in the no-fault case and, in particular, in
all fault cases. This kind of information is, however, seldom available for
real-world systems, at least not during early stages of the design process. To conform to
this situation, an iterative design process is adopted in this thesis. In this way, continuous
improvements of the FDI-system can be made as more knowledge is obtained and
additional requirements arise along the design process.
The overall objective of the thesis is to develop generic, systematic, and theoretically
sound methods for design of model-based FDI-systems for complex real-world systems.
In addition, in order to facilitate the adopted iterative design process, the methods are
aimed at supporting an automated design methodology and require a minimum amount
of human interaction. By means of an automated design methodology, the FDI-system
can be rapidly redesigned and reconfigured which makes the iterative design process
more efficient and systematic, and also contributes to higher quality. All these issues are
essential in an industrial context.
1.3 Outline
The thesis is divided into two parts. The first part aims at providing the information
necessary for placing the contributions of the second part in a scientific and industrial
context. The first part consists of Chapters 2, 3, and 4. Chapter 2 discusses FDI in
automotive systems with the aim of providing an application-oriented background and
motivation for the work carried out in the thesis. Chapter 3 considers design of
FDI-systems, both in a general and theoretical context and in an industrial context. Finally,
Chapter 4 summarizes the main contributions of the thesis.
The second part consists of five papers enclosed as Papers A–E. Papers A and B
consider residual generation, and Paper C residual evaluation. Papers D and E contain
application studies in the form of an automotive diesel engine system and a wind turbine
system, respectively. These papers demonstrate and evaluate the applicability of the
methods developed in Papers A, B, and C, in particular, and the potential of an automated
design methodology in general.
Chapter 2
Fault Detection and Isolation in Automotive Systems
This chapter discusses fault detection and isolation (FDI) in the context of automotive
systems. The overall aim is to provide an application-oriented background and motivation
for the work carried out in this thesis. The chapter is structured as follows. Section 2.1
presents some automotive systems where FDI is important and discusses those of their
characteristic properties that are significant in this context. Section 2.2 elaborates on the
importance of FDI as a means to fulfill a set of requirements on automotive systems.
Different activities involving FDI, aimed at guaranteeing fulfillment of these requirements,
are also discussed. Finally, Section 2.3 presents a set of requirements for FDI in automotive
systems. This is done from an industrial perspective, taking the properties of automotive
systems in Section 2.1, as well as the properties of the different activities in Section 2.2,
into account.
2.1 Automotive Systems
2.1.1 Examples
A modern automotive vehicle is a complex cyber-physical system that contains electrical,
mechanical, chemical, and thermodynamic subsystems. Of particular interest for
heavy-duty vehicles is the diesel engine, which is frequently used as an application
example in this thesis. To meet requirements in terms of fuel economy, emissions,
and driveability, a modern diesel engine is equipped with, for example, Exhaust Gas
Recirculation (EGR), a Variable Geometry Turbocharger (VGT), and an intake manifold
throttle, see Figures 2.1, 2.2, and 2.3a. To purify exhaust gases, diesel engines interact with,
and are dependent on, one or several advanced after-treatment systems such as a Diesel
Particulate Filter (DPF) and a Selective Catalytic Reduction (SCR) system, see Figure 2.3b.
In addition, to further increase driveability and meet safety requirements, they interact
with other complex systems in the powertrain, such as an automatic gearbox and an
auxiliary hydraulic braking system, see Figure 2.4.
Figure 2.1: A Scania 13-liter, 6-cylinder diesel engine equipped with EGR and VGT.
(Courtesy of Scania CV AB. Illustration by Semcon Informatic Graphic Solutions.)
2.1.2 Faults
All of the above-mentioned systems are, due to their function and complexity, vulnerable
to faults. To investigate which faults to detect and isolate, Failure Mode and Effect Analysis
(FMEA) (Stamatis, 1995) and Fault Tree Analysis (FTA) (Haasl et al., 1981) may be
carried out. For the specific case of automotive engines, emission-critical faults are
of special interest. Much effort is therefore spent on testing engines in test beds
where faults can be injected and emissions measured. Typical emission-critical faults are
faults affecting the fuel-injection system, the cooling system, and the gas-flow system,
faults in all sensors and actuators, and faults affecting after-treatment systems such as the
SCR-system and the DPF. Specific examples are gas leakages in the VGT- or EGR-system,
poor urea quality in the SCR-system, a broken or missing filter substrate in the DPF,
or a bias or gain fault in a sensor. Sensors and actuators are in themselves complex
cyber-physical systems and are particularly sensitive to faults in comparison with, for
example, purely mechanical systems. It is therefore especially important that faults in
sensors and actuators in automotive systems can be detected and isolated.
Figure 2.2: (a) Exhaust Gas Recirculation (EGR). (b) Variable Geometry Turbocharger (VGT).
To meet requirements in terms of fuel economy, emissions, and driveability,
a modern diesel engine is equipped with EGR and VGT. (Courtesy of Scania CV AB.
Illustration by Semcon Informatic Graphic Solutions.)
Figure 2.3: Usage of EGR and/or SCR in diesel engines reduces the generation of NOx.
[Diagram labels: air, recirculated gas, cooled recirculated gas, urea, engine, catalytic
converter, exhaust gas.]
(Courtesy of Scania CV AB. Illustrations by Semcon Informatic Graphic Solutions.)
Few Sensors: Automotive systems are typically designed for low cost and high
functionality, and not primarily to facilitate FDI. Above all, this means that there are
few sensors in general, and in particular that there is limited, or no, hardware
redundancy in the form of multiple sensors measuring the same physical quantity.
Many Operating Modes: Automotive systems are typically designed to operate in a
number of different operating modes, and normal operation usually involves several of
these. For the example of a diesel engine, operating modes are typically determined
by engine torque and engine speed. One operating mode is characterized by low
engine speed and high engine torque, and another mode by high engine speed
but low engine torque.
Highly Interconnected: Automotive systems often contain many physical
interconnections. For example, the exhaust and intake parts of the diesel engine depicted
in Figure 2.1 are coupled by means of the shaft connecting the turbine and the
compressor. This implies that the effect of a fault may propagate in the system and
be visible in many of the measurements.
Complex Models: Typically, physical modeling based on first principles is
utilized for modeling of automotive systems. As a consequence of the inherent
complexity of automotive systems, as well as their multi-domain features, modeling
typically results in large-scale, highly non-linear, differential-algebraic equations.
In addition, due to the many interconnections in the systems, models are often
highly coupled.
Figure 2.4: Scania GR875R 8-speed gearbox with a retarder. The retarder is a hydraulic
braking system used on heavy-duty trucks for long continuous braking, for example
to maintain constant speed down a slope. (Courtesy of Scania CV AB. Illustration by
Semcon Informatic Graphic Solutions.)
Figure 2.5: High vehicle uptime, low exhaust emissions, high vehicle safety, as well as
efficient repair, are important for the dependability of an automotive vehicle.
[Diagram labels: the vehicle properties uptime, emissions, safety, and repair, and the
dependability attributes availability, reliability, safety, integrity, and maintainability.]
• efficient repair,
• high driveability.
High vehicle uptime together with efficient repair, in the sense that the time at the
workshop is minimized, maximizes the possible revenue for a vehicle operator. Good fuel
economy and efficient repair, in the sense that no unnecessary parts are changed,
minimize the vehicle cost. Vehicle uptime, repair, and fuel economy are thus all important
factors in minimizing the overall life-cycle cost of an automotive vehicle. This, in
combination with high safety and high driveability, is of great importance for vehicle
operators. Requirements on low exhaust emissions are mainly driven by legislation.
The properties high vehicle uptime, low exhaust emissions, high safety, and
efficient repair are all examples of the more general dependability (Laprie, 1992; Storey,
1996) attributes availability, reliability, safety, integrity, and maintainability, see Figure 2.5.
A fault in the vehicle or any of its subsystems may lead to a failure in the form of an
impairment of any of the required properties listed above, for instance a vehicle at a
standstill, increased exhaust emissions, or a non-functional braking system. Such
consequences may be prevented, or at least reduced, if the fault can be detected, isolated,
and accommodated. Thus, FDI is a means to achieve the properties above.
To ensure achievement of the required properties, FDI is performed by means of the
three activities:
• legislative on-board diagnosis,
• off-board diagnosis,
• on-board fault accommodation.
For an illustration, see Figure 2.6. These activities may be performed independently,
but typically there are dependencies. For instance, results from legislative on-board
diagnosis may be exploited for off-board diagnosis at the workshop. Nevertheless, the
ability to detect and isolate faults, to some extent, is important for all three
activities. Next, the different activities will be discussed.
Figure 2.6: Legislative on-board diagnosis, off-board diagnosis, and on-board fault
accommodation, are important activities in order to achieve properties such as high
vehicle uptime, low exhaust emissions, high safety, efficient repair, good fuel economy,
and high driveability. All these activities involve fault detection and isolation.
Table 2.1: EU Emission Standards for HD Diesel Engines, g/kWh (smoke in m⁻¹)
isolation (Jensen and Nielsen, 2007; Schwall and Gerdes, 2002; Pernestål and Warnquist,
2012). Examples of additional knowledge and information may be measurements and
on-board diagnosis results from all ECUs in the vehicle, and history from previous
workshop visits, etc. These issues greatly contribute to better and more precise FDI
results. Nevertheless, despite the quite different prerequisites, FDI is of great importance
also in the context of off-board diagnosis.
Existing Hardware: For reasons of cost and space, it is not desirable to mount
additional hardware, for instance in the form of multiple sensors, in order
to detect and isolate faults. Thus, FDI in automotive systems should be performed
by using existing hardware only.
Small Faults: As mentioned, OBD legislation requires detection of all faults that may
lead to increased exhaust emissions. Typically, this requires detection of small faults,
in sensors and actuators in particular. For instance, many emission-related automotive
systems, e.g., the SCR-system, depend on correct sensor values for control
and, as stated in Section 2.1.2, sensors are particularly prone to faults. Even a fault
as small as a 10 % deviation of a sensor value may lead to incorrect control of
these systems, which in turn may lead to increased emissions.
On-Board Implementation: Apart from the particular case of off-board diagnosis, FDI
is to be performed in an on-board environment subject to constraints on
computational power and memory, and in some cases also to strict computational
deadlines, i.e., real-time constraints. Thus, it is desirable that the FDI can be performed
in this environment.
Robustness: The many operating modes of automotive systems, as discussed in
Section 2.1.3, in combination with the need to handle different vehicle
configurations and individual vehicles, pose strict requirements on the robustness
of the FDI.
Systematic Design: In order to obtain an FDI-system of high quality, and at the same
time enable reconfiguration, redesign, and an efficient overall design process, it is
desirable that the methodology used to design the system is systematic.
These requirements will be further considered in the next chapter, in which design of
FDI-systems is considered.
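The bias and gain faults mentioned in Section 2.1.2, and the 10 % sensor deviation used as an example of a small fault above, can be made concrete with a short Python sketch. It is illustrative only: the parameterization "reading = gain × true value + bias" is a common way to model such sensor faults and is an assumption here, and the function names and numerical values are hypothetical.

```python
def sensor_reading(true_value, gain=1.0, bias=0.0):
    """Assumed sensor fault model: gain=1 and bias=0 is the fault-free sensor."""
    return gain * true_value + bias


def relative_deviation_percent(true_value, reading):
    """Deviation of the reading from the true value, in percent of the true value."""
    return 100.0 * abs(reading - true_value) / abs(true_value)


true_pressure = 200.0  # hypothetical true value, arbitrary units

gain_fault = sensor_reading(true_pressure, gain=1.1)   # 10 % gain fault
bias_fault = sensor_reading(true_pressure, bias=20.0)  # constant offset

for name, reading in [("gain fault", gain_fault), ("bias fault", bias_fault)]:
    dev = relative_deviation_percent(true_pressure, reading)
    print(f"{name}: reading {reading:.1f}, deviation {dev:.1f} %")
```

At this particular operating point both faults give the same deviation. At other values of the true signal they differ, since a gain fault scales with the signal while a bias fault does not.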
Chapter 3
Design of Fault Detection and Isolation Systems
Figure 3.1: A typical FDI-system consists of a set of fault detection tests and a fault
isolation scheme.
[Diagram labels: observations, detection tests 1 to n, fault isolation, diagnosis statement.]
to employ hardware redundancy. For instance, if two sensors are used to measure the
same physical quantity, it is possible to test if one of the sensors is faulty by comparing
the values of the sensors. Another approach, which potentially provides better diagnosis
performance and avoids the need for additional, redundant hardware, is to
use detection tests based on residuals. Detection tests based on residuals will be further
discussed in Section 3.2.
\[
\begin{array}{c|ccc}
       & f_1 & f_2 & f_3 \\ \hline
\tau_1 &     & 1   & 1   \\
\tau_2 & 1   &     & 1   \\
\tau_3 & 1   & 1   &
\end{array}
\tag{3.1}
\]
shows which tests are sensitive to which faults, i.e., test τ1 is sensitive to faults f2 and
f3, and so on. Now assume a situation where tests τ1 and τ2, but not τ3, have alarmed.
The outcomes of the detection tests are thus d1 = 1, d2 = 1, and d3 = 0, which combined
with the fault signature matrix (3.1) result in the sub-diagnosis statements D1 = {f2, f3},
D2 = {f1, f3}, and D3 = {f1, f2, f3}. The latter is due to a common convention, saying
that nothing can be deduced regarding the status of the system if a test has not alarmed.
The diagnosis statement D then becomes
\[
D = D_1 \cap D_2 \cap D_3 = \{f_2, f_3\} \cap \{f_1, f_3\} \cap \{f_1, f_2, f_3\} = \{f_3\}.
\]
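The isolation logic of this example can be sketched in a few lines of Python. The sketch is illustrative only and not an implementation from the thesis. The names `FSM`, `ALL_FAULTS`, and `diagnosis_statement` are hypothetical. The rows for τ1 and τ2 follow from the text, while the row for τ3 is an assumption (sensitive to f1 and f2); it does not affect the outcome below because τ3 has not alarmed.

```python
# Fault signature matrix (3.1): test -> set of faults the test is sensitive to.
# The tau3 row is an assumption; the text only fixes the rows of tau1 and tau2.
FSM = {
    "tau1": {"f2", "f3"},
    "tau2": {"f1", "f3"},
    "tau3": {"f1", "f2"},
}
ALL_FAULTS = {"f1", "f2", "f3"}


def diagnosis_statement(alarms, fsm=FSM, all_faults=ALL_FAULTS):
    """Intersect the sub-diagnosis statements of all detection tests.

    `alarms` maps each test name to True (alarmed) or False (not alarmed).
    """
    diagnosis = set(all_faults)
    for test, sensitive_to in fsm.items():
        # An alarmed test points at the faults it is sensitive to. A test that
        # has not alarmed says nothing, so its sub-diagnosis is the full set.
        sub_diagnosis = sensitive_to if alarms[test] else all_faults
        diagnosis &= sub_diagnosis
    return diagnosis


# The situation in the text: tau1 and tau2 have alarmed, tau3 has not.
result = diagnosis_statement({"tau1": True, "tau2": True, "tau3": False})
print(sorted(result))  # ['f3']
```

Representing each row of the matrix as a set keeps the isolation step a plain set intersection, which mirrors the expression for D above.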
Figure 3.2: A residual r (top) and test quantity λ (bottom) created for fault detection in
an automotive diesel engine. The red dashed line is the detection threshold J. A fault
occurs at t = 700 s. Note the non-ideal behavior of the residual and its subtle response to
the fault. By an appropriate residual evaluation by means of the test quantity λ, the fault
can nevertheless be detected.
[Plot axes: residual r, scaled by 10^4 (top); test quantity λ (bottom); time axis from
600 s to 850 s.]
Figure 3.3: An FDI-system with fault detection tests based on residuals by means of
residual generation and residual evaluation.
parity-space methods, e.g., Chow and Willsky (1984); Nyberg and Frisk (2006); Varga
(2003), and frequency domain methods, e.g., Frank and Ding (1994).
Fault Decoupling
To achieve a specific fault signature matrix, for example one similar to (3.1), decoupling
of faults in residuals is needed. The faults that are decoupled are referred to as
non-monitored faults, whereas the faults not decoupled are called monitored faults. In the
example of Section 3.1.1, fault f1 is decoupled in τ1, which means that for τ1, fault f1 is a
non-monitored fault and f2 and f3 are monitored faults. Decoupling of faults in a set of
tests based on residuals means that the residuals must be sensitive to different subsets of
faults.
In the context of fault isolation, fault decoupling is a fundamental problem in residual
generation. In most of the observer-based residual generation methods mentioned
above, decoupling of faults is obtained by transforming the original model into a
sub-model where only the faults of interest are present. In sequential residual generation
methods, the original model is often divided into sub-models with specific properties,
and residual generators are then designed for each sub-model. Since a residual generator
is sensitive only to those faults affecting its corresponding sub-model, all other faults are
decoupled.
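The link between sub-models and decoupled faults can be illustrated with a small Python sketch. It is a sketch under stated assumptions rather than the method of the thesis: the equation names `e1` to `e5`, the mapping from faults to equations, and the three sub-models are all hypothetical, and each fault is assumed to enter the model through exactly one equation.

```python
# Hypothetical model: each fault affects exactly one model equation.
FAULT_EQUATION = {"f1": "e1", "f2": "e2", "f3": "e3"}

# Hypothetical sub-models, each given as the set of equations it contains.
# One residual generator is assumed to be designed per sub-model.
SUB_MODELS = {
    "r1": {"e2", "e3", "e4"},
    "r2": {"e1", "e3", "e5"},
    "r3": {"e1", "e2", "e4", "e5"},
}


def monitored_faults(equations, fault_equation=FAULT_EQUATION):
    """Faults that affect at least one equation of the given sub-model."""
    return {fault for fault, eq in fault_equation.items() if eq in equations}


def fault_signature_matrix(sub_models):
    """Map each residual generator to its set of monitored faults."""
    return {name: monitored_faults(eqs) for name, eqs in sub_models.items()}


fsm = fault_signature_matrix(SUB_MODELS)
for name in sorted(fsm):
    decoupled = sorted(set(FAULT_EQUATION) - fsm[name])
    print(name, "monitors", sorted(fsm[name]), "and decouples", decoupled)
```

With these hypothetical sub-models, the resulting signature has the same pattern as the example in Section 3.1.1. Note that the sketch is purely structural: it shows which faults can influence a residual, not how strongly they do so.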
Uncertainties
Typically, and as illustrated in Figure 3.2, residuals are not perfectly zero in the
no-fault case due to uncertainties in the form of, for example, model errors and measurement
noise. This may decrease the ability to detect faults and also lead to false detections.
The approaches used to design the test quantity and the threshold in (3.2) are thus
important means to handle uncertainties and guarantee good fault detection. For
both statistical and norm-based residual evaluation, adaptive thresholds (Clark, 1989;
Frank, 1994; Sneider and Frank, 1996) are a traditional approach to handling uncertainties.
The non-ideal behavior of the residual r in Figure 3.2 is a direct consequence of
uncertainties in the form of model errors. As illustrated by the fact that the fault can
nevertheless be detected by means of the test quantity λ, these uncertainties are handled by
proper residual evaluation.
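To make the roles of the test quantity λ and the threshold J concrete, the following Python sketch evaluates a synthetic residual. It is only an illustration: it assumes a one-sided CUSUM-type test quantity, which is a standard choice but not necessarily the one used in the thesis, and the residual, the drift parameter, and the threshold are hypothetical values. A one-sided CUSUM of this kind reacts to positive shifts in the residual mean only.

```python
def cusum(residual, drift):
    """One-sided CUSUM test quantity (assumed here for illustration).

    Accumulates the part of the residual that exceeds `drift` and resets
    to zero whenever the accumulated sum would become negative.
    """
    lam, test_quantity = 0.0, []
    for r in residual:
        lam = max(0.0, lam + r - drift)
        test_quantity.append(lam)
    return test_quantity


def first_alarm(test_quantity, threshold):
    """Index of the first sample where the test quantity exceeds the threshold."""
    for k, lam in enumerate(test_quantity):
        if lam > threshold:
            return k
    return None


# Synthetic residual: a ripple of amplitude 0.5 standing in for model errors,
# plus a small constant offset of 0.3 caused by a fault from sample 50 onward.
FAULT_START = 50
residual = [0.5 * (-1) ** k + (0.3 if k >= FAULT_START else 0.0) for k in range(100)]

lam = cusum(residual, drift=0.1)
J = 2.0  # hypothetical detection threshold

print("largest test quantity before the fault:", max(lam[:FAULT_START]))
print("first alarm at sample:", first_alarm(lam, J))
```

The drift parameter absorbs the no-fault ripple so that λ stays small before the fault, while the persistent offset after the fault accumulates until λ crosses J.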
Fault Decoupling
As mentioned earlier, fault decoupling is essential in order to obtain fault isolation. The
fact that automotive systems typically are not equipped with multiple sensors from the
start, in combination with the requirement to use only existing hardware for FDI, implies
that it is necessary to employ analytical redundancy and model-based FDI in order to obtain
good performance. This typically leads to an FDI-system with detection tests based on
model-based residuals, as was considered in Section 3.2.
In addition, the many physical interconnections in an automotive system imply
that the effect of a fault may propagate in the system and be visible in
many of the measurements. This fact, in combination with the small number of sensors,
makes decoupling of faults a non-trivial problem. Thus, it is of great importance that the
methods used to design an automotive FDI-system, in particular the residual generation
method, are able to handle this issue. Regarding the requirement concerning systematic
design, it is important that the residual generation method facilitates fault decoupling in
a systematic manner.
Figure 3.4: The structure of a part of a model of an automotive diesel engine where the
rows correspond to model equations and columns to variables in the model. A black
square in position (i, j) indicates that equation i contains variable j. The red square
illustrates a coupled part of the model corresponding to a differential-algebraic loop. It
may be noted that the loop involves almost 50% of the equations. A fault affecting any of the
equations in the coupled part of the model will influence all other equations in that part.
[Plot axes: equations (rows, up to 200) versus variables (columns, 1 to 200).]
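A structure of the kind shown in Figure 3.4, and the notion of a coupled part, can be sketched in Python for a toy model. The sketch is illustrative only and makes several assumptions: the five-equation model is hypothetical, the assignment of which equation is used to compute which variable (`COMPUTES`) is given rather than derived, and no distinction is made between algebraic and differential dependencies. The coupled parts are found as groups of equations that mutually depend on each other.

```python
# Hypothetical structural model: equation -> unknown variables it contains.
STRUCTURE = {
    "e1": {"x1"},
    "e2": {"x1", "x2", "x3"},
    "e3": {"x2", "x3", "x4"},
    "e4": {"x2", "x4"},
    "e5": {"x4", "x5"},
}
# Assumed assignment: which equation is used to compute which variable.
COMPUTES = {"e1": "x1", "e2": "x2", "e3": "x3", "e4": "x4", "e5": "x5"}


def incidence_matrix(structure):
    """Rows are equations, columns are variables; 1 marks 'equation contains variable'."""
    eqs = sorted(structure)
    variables = sorted(set().union(*structure.values()))
    rows = [[1 if v in structure[e] else 0 for v in variables] for e in eqs]
    return eqs, variables, rows


def coupled_parts(structure, computes):
    """Groups of two or more equations that depend on each other in a loop."""
    producer = {var: eq for eq, var in computes.items()}
    depends_on = {
        eq: {producer[v] for v in variables if v != computes[eq]}
        for eq, variables in structure.items()
    }
    # reach[e]: every equation that e depends on, directly or indirectly.
    reach = {}
    for start in structure:
        seen, stack = set(), [start]
        while stack:
            for nxt in depends_on[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        reach[start] = seen
    parts = []
    for eq in sorted(structure):
        if eq in reach[eq] and not any(eq in part for part in parts):
            parts.append({o for o in structure if eq in reach[o] and o in reach[eq]})
    return parts


eqs, variables, rows = incidence_matrix(STRUCTURE)
for eq, row in zip(eqs, rows):
    print(eq, row)
loops = coupled_parts(STRUCTURE, COMPUTES)
print("coupled part:", sorted(loops[0]))
print("share of equations in the loop:", len(loops[0]) / len(STRUCTURE))
```

In this toy model, equations e2, e3, and e4 form a loop, so a fault affecting any one of them can influence the other two, which is the situation described in the caption above.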
Model Complexity
As mentioned, automotive systems in general, and automotive diesel engines in particular,
yield models in the form of large-scale, non-linear, and coupled differential-algebraic
equations. The methods used in the design of the FDI-system, in particular the residual
generation method, must thus be able to handle such models in a systematic manner. Moreover,
regarding the requirement concerning on-board implementability of automotive
FDI-systems, it is important that the output of the residual generation method, i.e., the set of
residual generators, is suitable for implementation in an on-board environment despite
the complexity of the model used as input.
As mentioned, models of automotive systems are often coupled due to the many
interconnections in these systems. In particular, this results in algebraic and differential loops
or cycles (Blanke et al., 2006; Katsillis and Chantler, 1997) comprised of sets of equations
that contain the same set of unknown variables. This is illustrated in Figure 3.4, which
shows the structure, i.e., which equations contain which unknown variables, of a
part of a model of an automotive diesel engine. It may be noted that the loop shown in
Figure 3.5: Relative model errors for the intercooler manifold pressure pic, intake
manifold pressure pim, and exhaust manifold pressure pem, for a model of an automotive
diesel engine during a part of the World Harmonized Transient Cycle (WHTC). Note
that the magnitudes of the model errors vary with time.
[Plot panels, top to bottom: δpic [%], δpim [%], δpem [%].]
Uncertainties
Due to the inherent complexity of automotive systems, in combination with their many
operating modes, models are typically not capable of capturing the behaviors of systems
in all different operating modes. This results in uncertainties in the form of model
errors, in particular stationary errors (Höckerdal et al., 2011a,b), regardless of substantial
modeling work. In addition, due to the typically unfriendly environment in terms of, for
example, high temperatures in or around automotive systems, there are also uncertainties
in the form of measurement errors and noise in sensors.
Typically, the magnitudes and nature of these uncertainties are different for different
operating modes. For example, the model may be more accurate in one operating mode
than another, and a sensor may be more or less sensitive to noise in different operating
modes. Since the operating mode of the system varies with time, so do the magnitudes
and nature of the uncertainties. This is illustrated in Figure 3.5, which shows relative
model errors for three state-variables in a model of an automotive diesel engine during a
part of the World Harmonized Transient Cycle (WHTC). Clearly, the magnitude of the
model errors varies with time. To meet the posed requirements regarding small faults and
robustness, this issue must be handled by the FDI-system. In particular, uncertainties
may lead to residuals with the non-ideal behavior illustrated in Figure 3.2 and in order to
be able to detect small faults, it is important that uncertainties are handled in the residual
evaluation.
3.4 Automated Design of FDI-Systems
Figure 3.7: The considered two-step approach for design of residual generators.
It is noted that the available amount of fault data typically is substantially lower than
the available amount of no-fault data for a number of reasons. First of all, this is due
to the fact that faults are rare. To create fault data, one alternative is to inject faults in
the real system. This is however considered to be expensive, both in terms of time and
money, since it typically requires hardware modifications and active usage of the system.
Another alternative is to create fault data by simulation. To give realistic results, this
on the other hand requires models capable of describing the faulty system, which in
turn requires detailed knowledge regarding the behavior of the faulty system and possibly
also its environment. This kind of information is seldom available for real applications.
Consequently, it may not be possible to exploit fault data in all stages of the design
methodology, even though this is highly desirable.
Chapter 4
The overall contribution of this thesis is a set of generic and theoretically sound methods
for design of FDI-systems, aimed at supporting an automated design methodology.
Specifically, this thesis contributes to the part of the design methodology enclosed in
the dashed area of Figure 3.6. The developed methods, as well as the overall design
methodology, are evaluated through extensive application studies.
In particular, theoretical and methodological contributions are made in the areas
of model-based residual generation and statistical residual evaluation in form of three
papers enclosed as Paper A, Paper B, and Paper C. Technological contributions, by means
of state-of-practice illustrations and proof-of-concept demonstrations, to the field of
model-based FDI are made in the form of application studies in two papers enclosed as
Paper D and Paper E. In addition, the application studies performed in these two papers
together serve as evaluations of the methods developed in Papers A, B, and C.
In the context of the design challenges discussed in Section 3.3, model complexity
and fault decoupling are considered in Papers A and B, and uncertainties in Paper C.
4.1 Summaries
Brief summaries of the main contributions of Papers A - E are given below.
26 Chapter 4. Summary of Main Contributions
mixed causality is utilized and the analytical properties of the equations in the model, as
well as the available tools for algebraic equation solving, are taken into account.
In the context of the two-step approach for design of residual generators, see Figure 3.7,
additional contributions are made. Firstly, it is proven that the set of residual generators
that can be realized, i.e., created, with the method by necessity is a subset of the set of
candidate residual generators based on all Minimal Structurally Over-determined (MSO)
sets of equations (Krysander et al., 2008; Gelso et al., 2008; Pulido and Alonso-González,
2004; Travé-Massuyès et al., 2006) in the given model. Secondly, it is empirically shown
that the combination of the ability to handle mixed causality and loops substantially
increases the number of realizable candidate residual generators. This is done by means of
application of the method to models of two different automotive systems, a diesel engine
and a hydraulic braking system.
Paper A relies partly on work presented in Svärd and Nyberg (2008a); Svärd and
Nyberg (2008).
model errors and noise. The test quantity used in the method is based on an explicit
comparison of the probability distribution of the residual, estimated online using current
data, with a no-fault residual distribution. The no-fault distribution is based on a set
of a-priori known no-fault residual distributions, and is continuously adapted to the
current situation.
The comparison is done in the framework of statistical hypothesis testing, by means
of the Generalized Likelihood Ratio (GLR). To be suitable for on-line implementation in
an on-board environment, a computationally efficient version of the test quantity is derived
by considering a properly chosen approximation to one of the likelihood maximization
problems in the GLR. As a second contribution, an algorithm is proposed for learning
the required set of no-fault residual distributions off-line from no-fault training data.
This algorithm is based on a formulation of the learning problem as a K-means clustering
problem. The residual evaluation method is demonstrated and extensively evaluated by
application to a residual designed for fault detection in an automotive diesel engine.
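To make the idea of a GLR-based test quantity concrete, the following minimal Python sketch compares a window of residual samples with a single, fixed no-fault distribution. It is an illustration under simplifying assumptions that are not taken from Paper C: the residual is assumed to be Gaussian, the no-fault mean and variance (`mean0`, `var0`) are assumed known, and neither the adaptation of the no-fault distribution nor the approximation of the likelihood maximization is included. All function and variable names are hypothetical.

```python
import math


def gaussian_log_likelihood(samples, mean, var):
    """Log-likelihood of i.i.d. Gaussian samples with the given mean and variance."""
    n = len(samples)
    squared_error = sum((r - mean) ** 2 for r in samples)
    return -0.5 * n * math.log(2.0 * math.pi * var) - squared_error / (2.0 * var)


def glr_test_quantity(samples, mean0, var0):
    """Generalized likelihood ratio of 'any Gaussian' versus the no-fault Gaussian.

    The alternative hypothesis is maximized by the sample mean and the
    (biased) sample variance, which are the Gaussian maximum-likelihood estimates.
    """
    n = len(samples)
    mean1 = sum(samples) / n
    var1 = max(sum((r - mean1) ** 2 for r in samples) / n, 1e-12)  # avoid log(0)
    return 2.0 * (gaussian_log_likelihood(samples, mean1, var1)
                  - gaussian_log_likelihood(samples, mean0, var0))


# Illustrative residual windows (made-up numbers, not measured data).
no_fault_window = [0.1, -0.2, 0.05, -0.1, 0.15, 0.0, -0.05, 0.05]
faulty_window = [r + 0.5 for r in no_fault_window]  # residual shifted by a fault

t_no_fault = glr_test_quantity(no_fault_window, mean0=0.0, var0=0.01)
t_faulty = glr_test_quantity(faulty_window, mean0=0.0, var0=0.01)
```

In this sketch a fault would be signalled when the test quantity exceeds a threshold; how such a threshold is chosen is outside the scope of the example.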
A preliminary version of Paper C was presented in Svärd et al. (2011c).
4.2 Publications
The research work leading to this thesis is presented in the following publications.
Journal Papers
• C. Svärd and M. Nyberg. Residual generators for fault diagnosis using computation
sequences with mixed causality applied to automotive systems. IEEE Transactions
on Systems, Man and Cybernetics, Part A: Systems and Humans, 40(6):1310–1328,
2010 (Paper A)
• C. Svärd and M. Nyberg. Automated design of an FDI-system for the wind turbine
benchmark. Journal of Control Science and Engineering, vol. 2012, 2012. Article ID
989873, 13 pages (Paper E)
Submitted
• C. Svärd, M. Nyberg, and E. Frisk. Realizability constrained selection of residual
generators for fault diagnosis with an automotive engine application. Submitted to
IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans,
2011b (Paper B)
• C. Svärd, M. Nyberg, E. Frisk, and M. Krysander. Data-driven and adaptive
statistical residual evaluation for fault detection with an automotive application.
Submitted to Mechanical Systems and Signal Processing, 2012b (Paper C)
• C. Svärd, M. Nyberg, E. Frisk, and M. Krysander. Automotive engine FDI by
application of an automated model-based and data-driven design methodology.
Submitted to Control Engineering Practice, 2012a (Paper D)
Conference Papers
• C. Svärd, M. Nyberg, and E. Frisk. A greedy approach for selection of residual
generators. In Proceedings of the 22nd International Workshop on Principles of
Diagnosis (DX-11), Murnau, Germany, 2011a
• C. Svärd, M. Nyberg, E. Frisk, and M. Krysander. Residual evaluation for fault
diagnosis by data-driven analysis of non-stationary probability distributions. In
Proceedings of the 50th IEEE Conference on Decision and Control and European
Control Conference (CDC-ECC 2011), 2011c
• C. Svärd and M. Nyberg. Automated design of an FDI-system for the wind turbine
benchmark. In Proceedings of 18th IFAC World Congress, Milano, Italy, 2011
• M. Nyberg and C. Svärd. A service based approach to decentralized diagnosis and
fault tolerant control. In Proceedings of 1st Conference on Control and Fault-Tolerant
Systems (SysTol’10), Nice, France, 2010b
• M. Nyberg and C. Svärd. A decentralized service based architecture for design
and modeling of fault tolerant control systems. In Proceedings of 21st International
Workshop on Principles of Diagnosis (DX-10), Portland, Oregon, USA, 2010a
• C. Svärd and M. Nyberg. A mixed causality approach to residual generation
utilizing equation system solvers and differential-algebraic equation theory. In
Proceedings of 19th International Workshop on Principles of Diagnosis (DX-08), Blue
Mountains, Australia, 2008a
• C. Svärd and M. Nyberg. Observer-based residual generation for linear differential-
algebraic equation systems. In Proceedings of 17th IFAC World Congress, Seoul,
Korea, 2008b
References
M. Abid, W. Chen, S. X. Ding, and A. Q. Khan. Optimal residual evaluation for nonlinear
systems using post-filter and threshold. International Journal of Control, 84(3):526–539,
2011.
I. M. Al-Salami, S. X. Ding, and P. Zhang. Statistical based residual evaluation for fault
detection in networked control systems. In Proceedings of Workshop on Advances Control
and Diagnosis, Nancy, France, November 2006. Nancy Université Henri Poincaré de
Nancy.
M. R. Blas and M. Blanke. Stereo vision with texture learning for fault-tolerant automatic
baling. Computers and Electronics in Agriculture, 75(1):159–168, 2011.
California EPA. Sections 1971.1, 1968.2, and 1971.5 of Title 13, California
Code of Regulations: HD OBD and OBD II regulations.
http://www.arb.ca.gov/msprog/obdprog/hdobdreg.htm, 2010. California
Environmental Protection Agency, Air Resources Board.
J. P. Cassar and M. Staroswiecki. A structural approach for the design of failure detection
and identification systems. In Proceedings of IFAC Control Ind. Syst., pages 841–846,
Belfort, France, 1997.
J. Chen and R. J. Patton. Robust Model-Based Fault Diagnosis for Dynamic Systems.
Kluwer Academic Publishers, 1999.
E. Y. Chow and A. S. Willsky. Analytical redundancy and the design of robust failure
detection systems. IEEE Transactions on Automatic Control, 29(7):603–613, July 1984.
S. X. Ding, P. Zhang, and E. L. Ding. Fault detection system design for a class of
stochastically uncertain systems. In Hong-Yue Zhang, editor, Fault Detection, Supervision and
Safety of Technical Processes 2006, pages 705–710. Elsevier Science Ltd, 2007.
P. Fogh Odgaard, J. Stoustrup, and M. Kinnaert. Fault tolerant control of wind turbines
- a benchmark model. In Proceedings of the 7th IFAC Symposium on Fault Detection,
Supervision and Safety of Technical Processes, pages 155–160, Barcelona, Spain, 2009.
P. M. Frank. Residual evaluation for fault diagnosis based on adaptive fuzzy thresholds.
In Qualitative and Quantitative Modelling Methods for Fault Diagnosis, IEE Colloquium
on, pages 4/1 –411, April 1995. doi:10.1049/ic:19950512.
P. M. Frank and X. Ding. Survey of robust residual generation and evaluation methods
in observer-based fault detection systems. Journal of Process Control, 7(6):403–424,
1997.
Z. Gao and S. X. Ding. Actuator fault robust estimation and fault-tolerant control for a
class of nonlinear descriptor systems. Automatica, 43(5):912–920, 2007.
M. Krysander. Design and Analysis of Diagnosis Systems Using Structural Methods. PhD
thesis, Linköpings universitet, June 2006.
W. Li, Z. Zhu, and S. X. Ding. Fault detection design of networked control systems. IET
Control Theory and Applications, 5(12):1439–1449, 2011.
M. Nyberg and E. Frisk. Residual generation for fault diagnosis of systems described
by linear differential-algebraic equations. IEEE Transactions on Automatic Control, 51
(12):1995–2000, 2006.
M. Nyberg and M. Krysander. Statistical properties and design criterions for AI-based
fault isolation. In Proceedings of the 17th IFAC World Congress, pages 7356–7362, Seoul,
Korea, 2008.
M. Nyberg and C. Svärd. A decentralized service based architecture for design and
modeling of fault tolerant control systems. In Proceedings of 21st International Workshop
on Principles of Diagnosis (DX-10), Portland, Oregon, USA, 2010a.
M. Nyberg and C. Svärd. A service based approach to decentralized diagnosis and fault
tolerant control. In Proceedings of 1st Conference on Control and Fault-Tolerant Systems
(SysTol’10), Nice, France, 2010b.
R. J. Patton and M. Hou. Design of fault detection and isolation observers: A matrix
pencil approach. Automatica, 34(9):1135–1140, 1998.
D. N. Shields. Observer design and detection for nonlinear descriptor systems. Interna-
tional Journal of Control, 67(2):153–168, 1997.
D. H. Stamatis. Failure Mode and Effect Analysis: FMEA from Theory to Execution. ASQ
Quality Press, 1995.
M. Staroswiecki. Fault Diagnosis and Fault Tolerant Control, chapter Structural Analysis
for Fault Detection and Isolation and for Fault Tolerant Control. Encyclopedia of Life
Support Systems, Eolss Publishers, Oxford, UK, 2002.
United Nations. Regulation no. 49: Uniform provisions concerning the measures to
be taken against the emission of gaseous and particulate pollutants from
compression-ignition engines for use in vehicles, and the emission of gaseous pollutants from
positive-ignition engines fuelled with natural gas or liquefied petroleum gas for use in
vehicles, 2008. ECE-R49.
United States EPA. 40 CFR Part 86, 89, et al: Control of air pollu-
tion from new motor vehicles and new motor vehicle engines; final rule.
http://www.epa.gov/obd/regtech/heavy.htm, 2009. United States Environmental
Protection Agency.
A. Varga. On computing least order fault detectors using rational nullspace bases. In
Proc. Safeprocess 2003, pages 229–234, Washington DC, 2003.
A. Willsky and H. Jones. A generalized likelihood ratio approach to the detection and
estimation of jumps in linear systems. IEEE Transactions on Automatic Control, 21(1):
108–112, February 1976.
H. Yang, B. Jiang, and V. Cocquempot. Fault tolerant control design for hybrid systems.
Springer Verlag, 2010.
Residual Generators for Fault Diagnosis using
Computation Sequences with Mixed Causality
Applied to Automotive Systems
Carl Svärd and Mattias Nyberg
Abstract
42 Paper A. Residual Generators for Fault Diagnosis using . . .
1 Introduction
Fault diagnosis of technical systems has become increasingly important with the rising
demand for reliability and safety, driven by environmental and economical incentives.
One example is automotive engines that are by regulations required to have high precision
on-board diagnosis of failures that are harmful to the environment (United Nations,
2008).
To obtain good detection and isolation of faults, model-based fault diagnosis is neces-
sary. In the Fault Detection and Isolation (FDI) approach to model-based fault diagnosis,
residuals are used to detect and isolate faults present in the system, see, e.g., Blanke et al.
(2006). Residuals are signals that are ideally zero in the non-faulty case and non-zero
otherwise, and are typically generated by utilizing a mathematical model of the system and
measurements.
In this paper, we have the view that design of diagnosis systems is a two-step approach,
as elaborated in Nyberg and Krysander (2008); Nyberg (1999). In the first step, a large
number of candidate residual generators are found, and in the second step the residual
generators most suitable to be included in the final diagnosis system are picked out.
Since different residual generators have different properties regarding fault and noise
sensitivities, it is for the second step important that there is a large selection of different
residual generator candidates to choose between. Thus, the initial set of candidate residual
generators should be as large as possible.
A residual generator design approach (Staroswiecki and Declerck, 1989) which has been
shown to be successful in real applications (Dustegor et al., 2004; Izadi-Zamanabadi, 2002;
Cocquempot et al., 1998; Svärd and Wassén, 2006; Hansen and Molin, 2006) is to compute
unknown variables in the model by solving equation sets one at a time in a sequence, i.e.,
according to a computation sequence, and then evaluate a redundant equation to obtain a
residual. To determine from which equations and in which order the unknown variables
should be computed, structural analysis is utilized. In addition to Staroswiecki and
Declerck (1989), similar approaches have been described and exploited in, e.g., Cassar
and Staroswiecki (1997); Staroswiecki (2002); Blanke et al. (2006); Pulido and Alonso-
González (2004); Ploix et al. (2005); Travé-Massuyès et al. (2006).
In the works mentioned above, the approach is to apply either integral or derivative
causality (Blanke et al., 2006) for differential equations. However, as will be illustrated
in this paper through application studies, it is advantageous to allow simultaneous
use of integral and derivative causality, i.e., mixed causality. Furthermore, real-world
applications involve complex models that give rise to algebraic and differential loops
or cycles (Blanke et al., 2006; Katsillis and Chantler, 1997), corresponding to sets of
equations that have to be treated simultaneously. Thus, it is desirable that a method for
residual generation is able to handle mixed causality and equation sets corresponding to
algebraic and differential loops. The intention with the following simple example is to
\begin{align}
e_1 &: \dot{x}_1 - x_2 = 0 \notag\\
e_2 &: \dot{x}_3 - x_4 = 0 \notag\\
e_3 &: \dot{x}_4 x_1 + 2 x_2 x_4 - y_1 = 0 \tag{1}\\
e_4 &: x_3 - y_3 = 0 \notag\\
e_5 &: x_2 - y_2 = 0, \notag
\end{align}
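\begin{equation}
\begin{array}{c|cccc}
    & x_3 & x_4 & x_2 & x_1 \\ \hline
e_4 & \mathbf{1} &            &            &            \\
e_2 & 1          & \mathbf{1} &            &            \\
e_3 &            & 1          & \mathbf{1} & \mathbf{1} \\
e_1 &            &            & \mathbf{1} & \mathbf{1}
\end{array}
\tag{2}
\end{equation}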
This structure reveals the order and from which equations, marked with bold, the un-
known variables should be computed. It is clear that computation of the variables will
involve handling of the differential loop arising in the equation set {e 1 , e 3 }, since to
compute x 2 the value of x 1 is needed and vice versa. Furthermore, computation of the
variables according to (2) will require use of mixed causality: derivative causality when
solving for x 4 in e 2 , and integral causality when solving for x 1 in e 1 .
The main contribution of this paper is a novel method for residual generation that
enables simultaneous use of integral and derivative causality, and is able to handle equa-
tion sets corresponding to algebraic and differential loops in a systematic manner. In this
sense, the proposed method also generalizes previous methods for residual generation,
e.g., Staroswiecki and Declerck (1989); Dustegor et al. (2004); Izadi-Zamanabadi (2002);
Cocquempot et al. (1998); Cassar and Staroswiecki (1997); Staroswiecki (2002); Blanke
et al. (2006); Pulido and Alonso-González (2004); Ploix et al. (2005); Travé-Massuyès
et al. (2006). To achieve this, a formal framework for sequential computation of variables
is presented. In this framework, tools for equation solving and approximate differenti-
ation, as well as analytical and structural properties of the equations in the model, are
essential.
In Section 2 some preliminaries, basic theories and references regarding structural
analysis and differential-algebraic equation systems are given. Section 3 presents the
framework for sequential computation of variables, in which the concepts Block-Lower
Triangular semi-explicit Differential-Algebraic Equation form (BLT semi-explicit DAE
form), tools, and computation sequence are important. Tools, or more precisely algebraic
equation solving tools, are crucial for the ability to handle loops. In Section 4, it is shown
how a computation sequence is utilized for residual generation. The resulting residual
generator is referred to as a sequential residual generator. Motivated by implementation
aspects, the concept of a proper sequential residual generator is introduced as a sequential
residual generator in which no unnecessary variables are computed and in which com-
putations are performed from as small equation sets as possible. A necessary condition
for the existence of a proper sequential residual generator is derived, connecting proper
sequential residual generators with Minimal Structurally Over-determined (MSO) equa-
tion sets (Krysander et al., 2008). An algorithm able to find proper sequential residual
generators, given a model and a set of tools, is outlined. A key step in the algorithm is to
find minimal and irreducible computation sequences, which is considered in Section 5.
In Section 6, the proposed method for residual generation is applied to models of an
automotive diesel engine and an auxiliary hydraulic braking system. The application
studies clearly show the benefits of using a mixed causality approach and handling al-
gebraic and differential loops. Finally, Section 7 concludes the paper. For readability,
proofs to all lemmas and theorems are collected in Appendix A.
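\begin{align*}
\operatorname{var}_X(E') &= \left\{ x_j \in X : \exists e_i \in E',\ \frac{\partial f_i}{\partial x_j} \not\equiv 0 \ \lor\ \frac{\partial f_i}{\partial \dot{x}_j} \not\equiv 0 \right\}, \\
\operatorname{var}_D(E') &= \left\{ \dot{x}_j \in D : \exists e_i \in E',\ \frac{\partial f_i}{\partial \dot{x}_j} \not\equiv 0 \right\}.
\end{align*}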
Consider the model (1) and let X = {x 1 , x 2 , x 3 , x 4 } and D = {ẋ 1 , ẋ 2 , ẋ 3 , ẋ 4 }. For instance,
it holds that
Let G = (E, X, A) be a bipartite graph where E and X are the (disjoint) sets of vertices,
and
the set of arcs. We will call the bipartite graph G the structure of the equation set E with
respect to X. Note that with this representation, there is no structural difference between
the variable x j and the differentiated variable ẋ j . An equivalent representation of G is
the m × n biadjacency matrix B defined as
\[
B_{ij} =
\begin{cases}
1 & \text{if } (e_i, x_j) \in A \\
0 & \text{otherwise}
\end{cases}
\]
Return to the model (1). The structure of the equation set {e1, e2, e3, e4} with respect to
{x1, x2, x3, x4} is given by the biadjacency matrix (2). The result in (5) corresponds to
the third row of (2).
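As an illustration of the biadjacency representation, the following Python sketch builds the matrix for the equations e1 to e4 of model (1). The dictionary `STRUCTURE` is a hand-written encoding of which unknown variables (differentiated or not) occur in each equation, and the function name `biadjacency` is a hypothetical helper, not part of any tool used in the paper.

```python
# Unknown variables occurring in each equation of model (1); a variable and
# its derivative are not distinguished in this structural representation.
STRUCTURE = {
    "e1": {"x1", "x2"},
    "e2": {"x3", "x4"},
    "e3": {"x1", "x2", "x4"},
    "e4": {"x3"},
}


def biadjacency(structure, equations, variables):
    """Return B with B[i][j] = 1 if equation i contains variable j, else 0."""
    return [[1 if var in structure[eq] else 0 for var in variables]
            for eq in equations]


# Row and column order chosen as in (2).
B = biadjacency(STRUCTURE, ["e4", "e2", "e3", "e1"], ["x3", "x4", "x2", "x1"])
for equation, row in zip(["e4", "e2", "e3", "e1"], B):
    print(equation, row)
```

With this ordering of rows and columns the matrix is block-lower triangular, which is the property exploited when a computation order is read off from the structure.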
We will also consider the structure of E with respect to D which refers to the bipartite
graph Ḡ = (E, D, Ā), where
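Figure: Block structure of a DM-decomposition, with column blocks X+, X0, and X−, and
row blocks E+, E0 (containing the diagonal blocks E^0_1, ..., E^0_s), and E−.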
In the example outlined in Section 1, the structure (2), which in fact is the result
of a DM-decomposition, revealed three SCCs which are bold-marked. The SCCs are
({e 4 } , {x 3 }),({e 2 } , {x 4 }), and ({e 1 , e 3 } , {x 1 , x 2 }) of size 1, 1, and 2 respectively. The
latter corresponds to a differential loop.
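The SCCs of a just-determined structure can be found by combining a perfect matching with a standard strongly-connected-components search. The Python sketch below does this for the equations e1 to e4 of model (1), using Kuhn's augmenting-path matching and Tarjan's algorithm. This is one common way to obtain the blocks and is an assumption of this example, not necessarily the procedure used by the authors; the function names are hypothetical.

```python
STRUCTURE = {
    "e1": {"x1", "x2"},
    "e2": {"x3", "x4"},
    "e3": {"x1", "x2", "x4"},
    "e4": {"x3"},
}


def perfect_matching(structure):
    """Match every variable to one equation (Kuhn's augmenting-path algorithm)."""
    match = {}  # variable -> equation

    def try_assign(eq, seen):
        for var in sorted(structure[eq]):
            if var in seen:
                continue
            seen.add(var)
            if var not in match or try_assign(match[var], seen):
                match[var] = eq
                return True
        return False

    for eq in sorted(structure):
        if not try_assign(eq, set()):
            raise ValueError("equation set is not structurally just-determined")
    return match


def strongly_connected_blocks(structure):
    """Return the SCCs as lists of equations, in a valid computation order."""
    match = perfect_matching(structure)
    # eq depends on the equations that are matched to its other variables.
    deps = {eq: sorted({match[v] for v in variables if match[v] != eq})
            for eq, variables in structure.items()}
    index, low, stack, on_stack, blocks = {}, {}, [], set(), []

    def visit(eq):  # Tarjan's algorithm
        index[eq] = low[eq] = len(index)
        stack.append(eq)
        on_stack.add(eq)
        for other in deps[eq]:
            if other not in index:
                visit(other)
                low[eq] = min(low[eq], low[other])
            elif other in on_stack:
                low[eq] = min(low[eq], index[other])
        if low[eq] == index[eq]:
            block = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                block.append(member)
                if member == eq:
                    break
            blocks.append(sorted(block))

    for eq in sorted(structure):
        if eq not in index:
            visit(eq)
    return blocks


blocks = strongly_connected_blocks(STRUCTURE)
```

Tarjan's algorithm emits components whose dependencies have already been emitted, so the returned list can be read as an order in which the blocks may be solved.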
Differential Index
A common approach when analyzing and solving general DAE-systems, is to seek a
reformulation of the original DAE into a simpler and well-structured description with
the same set of solutions (Kunkel and Mehrmann, 2006; Brenan et al., 1989). To classify
how difficult such a reformulation is, the concept of index has been introduced. There
are different index concepts depending on the kind of reformulation that is sought. In
this paper we will use the differential index, which is defined as the minimum number of
times that all or parts of the DAE must be differentiated with respect to time in order to
write the DAE as an explicit Ordinary Differential Equation (ODE), ẋ = g (x, y), see for
example Brenan et al. (1989).
Semi-Explicit DAEs
An important class of DAEs are semi-explicit DAEs
ż = g (z, w, y) (6a)
0 = h (z, w, y) , (6b)
where z and w are vectors of unknown variables, and y a vector of known variables. A
semi-explicit DAE is of index one if and only if (6b) can be (locally) solved for w so that
w = h̃ (z, y), see, e.g., Brenan et al. (1989). An explicit ODE can easily be obtained from
a semi-explicit DAE of index one by substituting w = h̃ (z, y) into (6a).
said in Section 2.4, a semi-explicit DAE of index one can trivially be transformed to an
explicit ODE. Explicit ODEs are suitable for real-time simulation in embedded systems,
for example Engine Control Units (ECUs), because real-time simulation often requires
use of an explicit integration method, e.g., forward Euler (Ascher and Petzold, 1998),
which assumes an explicit ODE. For a detailed discussion regarding real-time simulation,
see Cellier and Kofman (2006).
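The transformation from a semi-explicit DAE of index one to an explicit ODE, followed by forward Euler integration, can be sketched in Python as follows. The functions `g` and `h_tilde` are made-up examples chosen only so that the algebraic constraint can be solved by hand; they do not come from any model in the paper.

```python
def h_tilde(z, y):
    """Algebraic part 0 = w - 0.5 * z solved for w (hypothetical example)."""
    return 0.5 * z


def g(z, w, y):
    """Differential part, dz/dt = g(z, w, y) (hypothetical example)."""
    return -z + w + y


def simulate_forward_euler(z0, y_samples, step):
    """Integrate dz/dt = g(z, h_tilde(z, y), y) with the forward Euler method."""
    z = z0
    trajectory = [z]
    for y in y_samples:
        w = h_tilde(z, y)          # substitute w = h_tilde(z, y) ...
        z = z + step * g(z, w, y)  # ... and take one explicit Euler step
        trajectory.append(z)
    return trajectory


# Constant known input y = 1 for 20 s with a 10 ms step.
trajectory = simulate_forward_euler(z0=0.0, y_samples=[1.0] * 2000, step=0.01)
print(trajectory[-1])
```

Each step only evaluates explicit expressions, which is the property that makes this kind of scheme attractive for fixed-step execution in an embedded environment.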
Motivated by these arguments, we consider a partitioning of the equation set so that a
block-lower triangular form is achieved, where each block corresponds to a semi-explicit
DAE of index one.
ż1 = g1 (z1 , w1 , y)
ż2 = g2 (z1 , z2 , w1 , w2 , y) (7)
⋮
żs = gs (z1 , z2 , . . . , zs , w1 , w2 , . . . , ws , y)
where
for i = 1, 2, . . . , s, and where z i and w i are vectors of unknown variables, all pairwise
disjoint, and y a vector of known variables, is in Block-Lower Triangular semi-explicit
Differential-Algebraic Equation form (BLT semi-explicit DAE form).
Note that it is not necessary that both z i and w i are present in (7) for every i =
1, 2, . . . , s. In particular, the system
w1 = h1 (y)
w2 = h2 (w1 , y)
⋮
ws = hs (w1 , w2 , . . . , ws−1 , y) ,
where w1 = (w11 , w12 ) and w2 = (w21 , w22 ), which is in BLT semi-explicit DAE form with
s = 2 and p 1 = p 2 = 2. By studying the system (12), we can deduce some properties of the
BLT semi-explicit DAE form:
Mixed Causality The form generalizes the use of integral and derivative causality, since
for example integral causality is used in (12a) and derivative causality in (12e).
Blocks are DAEs of Index One or Zero Each block, e.g. (12a)-(12c), corresponds to a
semi-explicit DAE of, at most, index one with respect to the unknown variables in each
block, i.e., z1 and w1 in the first block and z2 and w2 in the second block. Note that in
accordance with the note above, vectors z1 , z2 , w1 , and w2 need not all be present in (12).
If, for instance, w1 is missing and hence also (12b) and (12c), the first block is an explicit
ODE, i.e., a DAE of index zero. If both z1 and w1 are present, the first block corresponds
to a semi-explicit DAE of index one.
ż1 = g1 (z1 , w1 , y)
= g1 (z1 , [h11 (z1 , y) , h21 (z1 , h11 (z1 , y))] , y)
= g̃1 (z1 , y) ,
and then repeat the procedure for the second block to obtain
Blocks are SCCs Each block in the BLT semi-explicit DAE form is a SCC of the
structure of the corresponding equations with respect to the unknown variables in that
block. This can be seen by studying the structure1 of the equations in (12) with respect
to the variables {z1 , w11 , w12 , z2 , w21 , w22 }, which is shown in (13). In this structure, the
equation in (12a) has been named e 1 , the equation in (12b) has been named e 2 , and so
forth.
z1 w11 w12 z2 w21 w22
e1 1 1 1
e2 1 1
e3 1 1 1 (13)
e4 1 1 1 1 1 1
e5 1 1 1 1 1
e6 1 1 1 1 1 1
Efficiency Recall the discussion regarding efficiency in the beginning of Section 3.1.
As a consequence of the previous property, the original set of equations is partitioned in
as small blocks as possible, in the sense that there are no dependencies between blocks,
i.e., no loops occur.
tools are essential for handling models containing algebraic loops. If, for example, the
available algebraic equation solving tool only can solve scalar equations, loops cannot
be handled.
More precisely, an algebraic equation solving tool (AE tool) is a function taking a
set of variables V i ⊆ X ∪ D and a set of equations E i ⊆ E as arguments, and returning a
function g i , which can be a symbolic expression or numeric algorithm, taking variables
from {X ∪ D} ∖ V i and Y as arguments and returning a vector corresponding to the
elements in V i . Now assume that g i is the function returned by an AE tool when V i
and E i are used as arguments, and that the equation set Ē i corresponds to v i = g i (u i , y),
where v i is a vector of the elements in V i , u i a vector of the elements in U i ⊆ {X ∪ D}∖Vi ,
and y a vector of known variables. A natural assumption regarding an AE tool, whatever
algorithm or method it corresponds to, is that the AE tool should not introduce new
solutions. That is, a solution to Ē i should also be a solution to the original equation set
E i . Moreover, an AE tool should not remove solutions either, i.e., solutions to E i must also
be solutions to Ē i . Furthermore, motivated by the idea of using sequential computation
of variables for residual generation, we are interested in unique solutions. This discussion
justifies the following assumption.
Assumption 1. Given U i and y, the solution sets of Ē i , obtained from the AE tool, and E i ,
with respect to V i , are equal and unique.
AE tools giving unique solutions generally assume that the given set of equations
contains as many equations as unknown variables. One example is Newton iteration,
which is a common numerical method for solving non-linear equations, see, e.g., Ortega
and Rheinboldt (2000). In addition, under- and over-determined sets of equations
for which a unique analytical solution exists are rare. This motivates the following
assumption.
Assumption 2. An AE tool requires that its arguments V i and E i correspond to a just-
determined equation set.
In this work, we assume that tools for algebraic equation solving are available through
existing standard software packages like, e.g., Maple or Mathematica, and design and
implementation of such tools will not be considered. For solving algebraic loops, also
tearing (Steward, 1965; Kron, 1963) can be a successful approach. In the following, we
also assume that AE tools fulfill the properties stated in Assumptions 1 and 2.
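As a small illustration of a numeric AE tool, the Python sketch below applies Newton iteration to a just-determined set of two coupled algebraic equations, i.e., an algebraic loop that a scalar solver could not handle. The equations, the starting guess, and the function names are hypothetical and chosen only for this example. Note that a Newton iteration finds one solution near the starting guess, so uniqueness in the sense of Assumption 1 is not guaranteed by the iteration itself.

```python
def newton_ae_tool(residual, jacobian, guess, tol=1e-12, max_iter=50):
    """Solve residual(v1, v2) = (0, 0) for two unknowns with Newton iteration."""
    v1, v2 = guess
    for _ in range(max_iter):
        f1, f2 = residual(v1, v2)
        if abs(f1) + abs(f2) < tol:
            return v1, v2
        (a, b), (c, d) = jacobian(v1, v2)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("singular Jacobian")
        # Newton step J * delta = -f, solved with Cramer's rule.
        v1 += (-f1 * d + b * f2) / det
        v2 += (-a * f2 + c * f1) / det
    raise RuntimeError("Newton iteration did not converge")


U = 3.0  # value of a known variable (assumed for the example)


def loop_residual(v1, v2):
    """Two equations in two unknowns: v1**2 + v2 - U = 0 and v1 - v2 + 1 = 0."""
    return v1 * v1 + v2 - U, v1 - v2 + 1.0


def loop_jacobian(v1, v2):
    """Partial derivatives of loop_residual with respect to (v1, v2)."""
    return (2.0 * v1, 1.0), (1.0, -1.0)


v1, v2 = newton_ae_tool(loop_residual, loop_jacobian, guess=(0.5, 0.5))
print(v1, v2)
```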
physical systems, object-oriented modeling tools, e.g., Modelica (Mattson et al., 1998),
are frequently used to build models. Often, this leads to models in which state variables
correspond to physical quantities such as pressures and temperatures and then initial
conditions may have clear physical interpretations. For example, in an engine model a
variable corresponding to the intake manifold pressure should be equal to the ambient
pressure when the engine starts.
If all equilibrium points of the considered ODE are (globally) asymptotically stable,
or by using, e.g., state-feedback (Khalil, 2002) can be made so, the effect of the initial
conditions is negligible. However, the computed trajectory will in this case differ from
the true trajectory for some time due to transients.
Recall from Section 3.1 that each block in a BLT semi-explicit DAE system can be
transformed to an explicit ODE. In the following, we assume that differential equation
solving tools are always available and that an explicit ODE can be solved, i.e., that
trajectories of the state variables in the ODE can be computed, if the initial conditions of
the state variables are known and consistent. Of course, this assumption is not always
valid and numerical solving of ODEs involves difficulties and problems such as stability
and stiffness, but this is not in the scope of this paper.
Differentiation Tools
A differentiation tool is, for example, an implementation of a method for approximate
differentiation of known variables. There are several approaches, e.g., low-pass filtering
or smoothing spline approximation (Wei and Li, 2006). An extensive survey of methods
can be found in Barford et al. (1999). Methods for approximate differentiation are not within
the scope of this paper, and will not be further considered.
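For orientation only, a differentiation tool based on low-pass filtering could look like the Python sketch below. It implements the filtered derivative s / (tau * s + 1), discretized with the backward Euler method. The time constant, the sampling step, and the test signal are assumptions of this example and are not taken from the paper.

```python
import math


def filtered_derivative(samples, step, time_constant):
    """Approximate the derivative of a sampled signal with a first-order
    low-pass filtered differentiator (backward Euler discretization)."""
    estimate = 0.0
    estimates = [estimate]
    for previous, current in zip(samples, samples[1:]):
        estimate = ((time_constant * estimate + (current - previous))
                    / (time_constant + step))
        estimates.append(estimate)
    return estimates


STEP = 1e-3                                   # sampling interval in seconds
times = [k * STEP for k in range(5001)]
signal = [math.sin(t) for t in times]         # known variable y(t) = sin(t)
derivative = filtered_derivative(signal, STEP, time_constant=0.01)

# After the initial transient, the estimate should be close to cos(t).
error_at_end = abs(derivative[-1] - math.cos(times[-1]))
```

A larger time constant suppresses more measurement noise but increases the lag of the estimate, which is the usual trade-off for this kind of filter.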
In the following, we assume that differentiation of a set of known variables either is
possible or not possible. That is, if a tool for approximate differentiation is available, we
assume that the quality of the measurements of the involved variables is good enough
to support the tool.
One alternative to differentiating variables directly is to propagate unknown differentiated
variables through a set of equations so that these can be expressed as derivatives of
measured variables only. Assume for example that we want to compute the derivative ẋ 1
and we also have that x 1 = y 1 . To compute ẋ 1 , we use a differentiation tool to compute ẏ 1
and then use ẋ 1 = ẏ 1 .
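As a rough sketch of the propagation idea above, the following code estimates ẏ 1 from samples of y 1 and uses it as ẋ 1. The differentiation tool here is an assumption: a backward difference followed by a first-order low-pass filter, with the hypothetical name `filtered_derivative` and an arbitrarily chosen filter constant `alpha`. The test signal y 1 = sin(t) is also an assumption made only for illustration.

```python
import math


def filtered_derivative(samples, dt, alpha):
    """Backward difference followed by a first-order low-pass filter."""
    estimate = [0.0]
    for k in range(1, len(samples)):
        raw = (samples[k] - samples[k - 1]) / dt          # raw difference quotient
        estimate.append(estimate[-1] + alpha * (raw - estimate[-1]))
    return estimate


dt = 0.01
times = [k * dt for k in range(1000)]
y1 = [math.sin(t) for t in times]          # assumed measured signal, x1 = y1

# Since x1 = y1, the derivative of x1 is obtained as the derivative of y1.
x1_dot = filtered_derivative(y1, dt, alpha=0.2)

# Compare with the exact derivative cos(t) after the filter transient.
worst_error = max(abs(x1_dot[k] - math.cos(times[k])) for k in range(200, 1000))
print(worst_error)
```

The filter introduces a small lag, so the estimate is only approximate; how much lag is acceptable depends on the measurement quality, which is the assumption discussed above.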
For instance, we have that Diff ({ẋ 1 , x 2 }) = {ẋ 1 , ẋ 2 } and unDiff ({ẋ 1 , x 2 }) = {x 1 , x 2 }.
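The two operators can be sketched in a few lines. The representation is an assumption made for illustration: a variable is a pair (name, is_differentiated), so ẋ 1 is ("x1", True) and x 2 is ("x2", False). The function names `diff` and `undiff` are hypothetical stand-ins for Diff and unDiff.

```python
def diff(variables):
    """Return the differentiated counterpart of every variable in the set."""
    return {(name, True) for name, _ in variables}


def undiff(variables):
    """Return the undifferentiated counterpart of every variable in the set."""
    return {(name, False) for name, _ in variables}


# The set {ẋ1, x2} from the example in the text.
example = {("x1", True), ("x2", False)}
print(sorted(diff(example)))    # corresponds to {ẋ1, ẋ2}
print(sorted(undiff(example)))  # corresponds to {x1, x2}
```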
54 Paper A. Residual Generators for Fault Diagnosis using . . .
Now consider the model M(E, X, Y), where E is the set of equations specified in (3),
X the set of unknown variables, and Y the set of known variables.
Definition 3 (Computation Sequence). Given a set of variables X′ ⊆ X, an AE tool T ,
and an ordered set
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k )) ,
\begin{align}
x_3 &= y_3 \tag{17a} \\
x_4 &= \dot{x}_3 \tag{17b} \\
\dot{x}_1 &= x_2 \tag{17c} \\
x_2 &= \frac{-\dot{x}_4 x_1 + y_1}{2 x_4}, \tag{17d}
\end{align}
is obtained by sequentially calling T with elements from C as arguments.
Note that the obtained BLT semi-explicit DAE system (17) has three blocks; the first
block corresponds to (17a), the second to (17b), and the third to (17c) and (17d). Also
note that the equation set {e 1 , e 3 }, containing a differential loop, corresponds to a semi-
explicit DAE of index one given by (17c) and (17d). Furthermore, derivative causality is
used in (17b) and (17d), and integral causality in (17c).
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k ))
be a computation sequence for the variables X′ ⊆ X with the AE tool T , and let Ē′ be the
set of equations in BLT semi-explicit DAE form obtained from C with the AE tool T . Then
the solution sets of Ē′ and E1 ∪ E2 ∪ . . . ∪ E k , with respect to V1 ∪ V2 ∪ . . . ∪ V k , are equal
and unique.
With this lemma, the following important result can be proved.
Theorem 1. Let M(E, X, Y) be a model, T an AE tool, and
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k )) ,
Return to the model (1) in Section 1. Consider the last two equations in the model,
e4 ∶ x3 − y3 = 0
e5 ∶ x 2 − y 2 = 0,
for {x 2 , x 3 } with T is minimal. The resulting BLT semi-explicit DAE form is given by
x3 = y3 (19a)
x2 = y2 . (19b)
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k )) ,
for X′ with T is irreducible, if no element (V i , E i ) ∈ C can be partitioned as V i = V i1 ∪ V i2
and E i = E i1 ∪ E i2 , such that
C ′ = ((V1 , E1 ) , . . . , (V i1 , E i1 ) , (V i2 , E i2 ) , . . . , (V k , E k ))
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k )) ,
then the equation set E1 ∪ E2 ∪ . . . ∪ E k ∪ e i is an MSO set with respect to varX (E1 ∪ E2 ∪
. . . ∪ Ek ∪ e i )
Note that Theorem 2 establishes a link between structural and analytical methods.
This is done without the use of any assumptions of generic equations as in, e.g., Krysander
et al. (2008); instead, assumptions have been placed on the tools.
Recall again the model (1) and consider the computation sequence C, given by (16),
with the corresponding BLT semi-explicit DAE form (17). The computation sequence
C together with the equation e 5 is a sequential residual generator for the model (1), if
we assume that the initial condition of x 1 is known and consistent and the derivatives
ẋ 3 and ẋ 4 can be computed with the available differentiation tools. As a matter of fact,
the residual generator is a proper sequential residual generator since the computation
sequence C for varX (e 5 ) = {x 2 } with the ideal AE tool T is minimal and irreducible.
Hence, we can by Theorem 2 conclude that the equation set E = {e 1 , e 2 , e 3 , e 4 , e 5 } is an
MSO set.
 1: function findResidualGenerators(E, X, T )
 2:     R ∶= ∅
 3:     MSOs ∶= findAllMSOs(E, X)
 4:     for all Ē ∈ MSOs do
 5:         X′ ∶= varX (Ē)
 6:         for all e i ∈ Ē do
 7:             E′ ∶= Ē ∖ e i
 8:             C ∶= findComputationSequence(E′ , X′ , T )
 9:             if C ≠ ∅ then
10:                 R ∶= R ∪ {(T (C) , e i )}
11:             end if
12:         end for
13:     end for
14:     return R
15: end function
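The loop structure of findResidualGenerators can be sketched in Python as follows. This is only an illustration under stated assumptions: the MSO search, the computation-sequence search, and the operator varX are passed in as callables, and the toy stand-ins `toy_find_all_msos` and `toy_find_computation_sequence` below are hypothetical and not the paper's algorithms. The sketch also stores the computation sequence C itself, whereas the listing stores T (C), i.e., the result of applying the tool.

```python
def find_residual_generators(equations, unknowns, tool,
                             find_all_msos, find_computation_sequence, var_x):
    """For every MSO set, try each equation as the residual equation."""
    generators = []
    for mso in find_all_msos(equations, unknowns):
        x_prime = var_x(mso)
        for residual_eq in sorted(mso):
            reduced = mso - {residual_eq}      # just-determined part of the MSO set
            sequence = find_computation_sequence(reduced, x_prime, tool)
            if sequence:                       # an empty sequence means failure
                generators.append((sequence, residual_eq))
    return generators


# Toy structure: three equations in two unknowns (hypothetical example).
structure = {"e1": {"x1"}, "e2": {"x1", "x2"}, "e3": {"x2"}}


def var_x(eqs):
    return set().union(*(structure[e] for e in eqs))


def toy_find_all_msos(eqs, unknowns):
    # Assumption: the whole equation set is the only MSO set.
    return [frozenset(eqs)]


def toy_find_computation_sequence(eqs, variables, tool):
    # Assumption: the toy tool cannot solve any set that contains e3.
    if "e3" in eqs:
        return []
    return [(eq,) for eq in sorted(eqs)]


found = find_residual_generators(set(structure), {"x1", "x2"}, None,
                                 toy_find_all_msos,
                                 toy_find_computation_sequence, var_x)
print(found)
```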
tool.
First identify the SCCs, recall Section 2.3, of the structure of E = {e 1 , e 2 , . . . , e 7 } with
respect to X, and order the corresponding partitions of the equation and variable sets
accordingly
x4 x3 x2 x5 x6 x7 x1
e7 1
e3 1 1 1
e2 1 1
(20)
e4 1 1 1
e6 1 1 1
e1 1 1 1 1 1
e5 1 1 1 1 1 1
E = ({e 7 } , {e 2 , e 3 } , {e 4 } , {e 1 , e 5 , e 6 })
and
X = ({x 4 } , {x 2 , x 3 } , {x 5 } , {x 1 , x 6 , x 7 }) ,
where each element in E is a SCC with respect to the corresponding element in X , e.g.,
({e 2 , e 3 } , {x 2 , x 3 }). The SCCs are marked with bold in (20).
The first SCC, ({e 7 }, {x 4 }), contains one linear algebraic equation. Under the assumption
that our AE tool can handle such equations, e 7 is solved for x 4 and we obtain
x4 = y6 . (21)
Then consider the next SCC, ({e 2 , e 3 }, {x 2 , x 3 }) which contains two differential
equations. The permuted structure of {e 2 , e 3 } with respect to the differentiated variables
{ẋ 2 , ẋ 3 } is
        ẋ 3    ẋ 2
e3       1
e2              1          (22)
As seen, the structure (22) contains two SCCs of size one, ({e 3 }, {ẋ 3 }) and ({e 2 }, {ẋ 2 }).
Assuming our AE tool admits it, we then solve e 3 for ẋ 3 and e 2 for ẋ 2 and obtain
ẋ 3 = −x 3 + x 2 x 4 − y 2 (23)
ẋ 2 = −x 2 x 3 − y 1 .
The next SCC, ({e 4 }, {x 5 }), contains a differential equation. However, since x 5 is
the variable we intend to compute from the equation, we can handle e 4 as an algebraic
equation and solve it for x 5 ,
x 5 = x 2 + ẋ 4 − y 3 . (24)
The SCC ({e 1 , e 5 , e 6 }, {x 1 , x 6 , x 7 }) contains the differential equation e 1 and the two
algebraic equations e 5 and e 6 . By analyzing the equations we see that x 6 and x 7 are
algebraic variables contained in both e 5 and e 6 and that x 1 is a differentiated variable
present in e 1 . We then solve e 1 for ẋ 1 and obtain
\[ \dot{x}_1 = -x_1 x_6 + x_5^2 x_7 + x_3. \tag{25} \]
The structure of {e 5 , e 6 } with respect to {x 6 , x 7 } reveals a SCC of size two, see (26).
x6 x7
e5 1 1 (26)
e6 1 1
Under the assumption that our AE tool can handle it, we solve the equation system
{e 5 , e 6 } for {x 6 , x 7 } and obtain
\begin{align}
x_6 &= 2x_3^2 + x_1 - x_2 x_3 - x_4 + 2y_5 - y_4 \tag{27} \\
x_7 &= x_1 - x_2 x_3 - x_4 + x_3^2 + y_5 - y_4. \notag
\end{align}
Collecting the equations (21), (23), (24), (25), and (27) gives
\begin{align}
x_4 &= y_6 \tag{28a} \\
\dot{x}_3 &= -x_3 + x_2 x_4 - y_2 \tag{28b} \\
\dot{x}_2 &= -x_2 x_3 - y_1 \tag{28c} \\
x_5 &= x_2 + \dot{x}_4 - y_3 \tag{28d} \\
\dot{x}_1 &= -x_1 x_6 + x_5^2 x_7 + x_3 \tag{28e} \\
x_6 &= 2x_3^2 + x_1 - x_2 x_3 - x_4 + 2y_5 - y_4 \tag{28f} \\
x_7 &= x_1 - x_2 x_3 - x_4 + x_3^2 + y_5 - y_4, \tag{28g}
\end{align}
which is a system in BLT semi-explicit DAE form with four blocks. The equation (28a)
corresponds to the first block, which only contains an algebraic equation. The second
block is given by (28b) and (28c), and corresponds to an explicit ODE with respect to
the variables {x 2 , x 3 }. Hence, integral causality is used in this block. The third block
contains (28d), which is a differential equation in which derivative causality is used. The
equations (28e)–(28g) constitute the fourth and last block. This block corresponds to a
semi-explicit DAE of index one, with respect to the variables {x 1 , x 6 , x 7 }.
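To make the block structure concrete, the following sketch evaluates the system (28) for one forward Euler step. The right-hand sides are copied from (28a)–(28g); everything else is an assumption made for illustration: the function name `blt_step`, the dictionary layout of the measurements, the use of forward Euler for the states x 1 , x 2 , x 3 , and the fact that ẋ 4 = ẏ 6 is supplied by some external differentiation tool as the argument `y6_dot`.

```python
def blt_step(state, y, y6_dot, h):
    """Evaluate the blocks of (28) in order and take one forward Euler step."""
    x1, x2, x3 = state
    x4 = y["y6"]                                              # (28a), algebraic
    dx3 = -x3 + x2 * x4 - y["y2"]                             # (28b), integral causality
    dx2 = -x2 * x3 - y["y1"]                                  # (28c), integral causality
    x5 = x2 + y6_dot - y["y3"]                                # (28d), derivative causality
    x6 = 2 * x3**2 + x1 - x2 * x3 - x4 + 2 * y["y5"] - y["y4"]  # (28f)
    x7 = x1 - x2 * x3 - x4 + x3**2 + y["y5"] - y["y4"]          # (28g)
    dx1 = -x1 * x6 + x5**2 * x7 + x3                          # (28e)
    new_state = (x1 + h * dx1, x2 + h * dx2, x3 + h * dx3)
    algebraic = {"x4": x4, "x5": x5, "x6": x6, "x7": x7}
    return new_state, algebraic


# Arbitrary numbers chosen only to exercise the function.
measurements = {"y1": 1.0, "y2": 1.0, "y3": 1.0, "y4": 1.0, "y5": 1.0, "y6": 2.0}
new_state, algebraic = blt_step((1.0, 2.0, 3.0), measurements, y6_dot=0.0, h=0.1)
print(new_state, algebraic)
```

Note that the states of the second block are updated independently of the fourth block, which mirrors the ordering of the blocks in the BLT form.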
The resulting computation sequence for {x 1 , x 2 , . . . , x 7 } with the given AE tool is,
1. Find the SCCs of the structure of the equation set with respect to the unknown
variables. No distinction is made between a variable and its derivative.
2. For each SCC, split the equations into one set of differential equations and one
set of algebraic equations, and the variables into one set of differentiated variables
and one set of algebraic variables.
3. For the differential equations, find the SCCs of the structure of the differential
equations with respect to the differentiated variables. For each SCC, try to solve
the differential equations for the intended differentiated variables with the AE tool.
Note that due to the assumption that each differential equation only contains one
differentiated variable, all SCCs are of size one.
4. For the algebraic equations, find the SCCs of the structure of the algebraic equations
with respect to the algebraic variables. For each SCC, try to solve the algebraic
equations for the intended algebraic variables with the AE tool.
5.3 Algorithm
The method is formally described in the function findComputationSequence below.
The function takes a just-determined equation set E′ ⊆ E, a set of unknown variables
X′ ⊆ X, and an AE tool T as input, and returns an ordered set C as output. The function
findAllSCCs is assumed to return an ordered set of equation and variable pairs, where
each pair corresponds to a SCC of the structure of the equation set with respect to the
variable set. The order of the SCCs returned by findAllSCCs is assumed to be the
one depicted in Figure 1; for more information regarding the ordering of SCCs, see
Murota (1987). There are efficient algorithms for finding SCCs in directed graphs, see
for example Tarjan (1972). The DM-decomposition (Dulmage and Mendelsohn, 1958)
can also be utilized. In Matlab, the DM-decomposition is implemented in the function
dmperm, from which also the order of the SCCs, according to Figure 1, easily can be
obtained. Other functions used in findComputationSequence are:
• Diff and unDiff take a variable set as input and return its differentiated and
undifferentiated counterparts, respectively, see (14) and (15).
• isToolSolvable determines if the given AE tool can solve the given equations
for the given set of variables.
• Append takes an ordered set and an element as input and simply appends the
element to the end of the set.
• The operator ∣ ⋅ ∣, taking a set as input, is assumed to return the number of elements
in the set, and the notation A (i) is used to refer to the i-th element of the ordered
set A.
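One possible way to realize a function like findAllSCCs is sketched below in plain Python. This is an assumption-laden illustration, not the implementation used in the paper and not a reimplementation of dmperm: it first finds a perfect matching with an augmenting-path search and then applies Tarjan's algorithm to the induced dependency graph between equations, which returns the SCCs in an order where each SCC only depends on earlier ones. The names `maximum_matching` and `ordered_sccs` are hypothetical. The example structure consists of the equations e 7 , e 3 , e 2 , e 4 with the variables that appear in (21), (23), and (24).

```python
def maximum_matching(structure):
    """Augmenting-path matching; returns a dict variable -> equation."""
    match = {}

    def try_assign(eq, seen):
        for var in sorted(structure[eq]):
            if var in seen:
                continue
            seen.add(var)
            if var not in match or try_assign(match[var], seen):
                match[var] = eq
                return True
        return False

    for eq in structure:
        try_assign(eq, set())
    return match


def ordered_sccs(structure):
    """Return the SCCs as (equations, variables) pairs in computation order."""
    match = maximum_matching(structure)
    all_vars = set().union(*structure.values())
    if len(match) != len(structure) or set(match) != all_vars:
        raise ValueError("structure is not just-determined")
    eq_to_var = {eq: var for var, eq in match.items()}
    # An equation depends on the equations matched to its other variables.
    deps = {eq: [match[v] for v in sorted(vs) if match[v] != eq]
            for eq, vs in structure.items()}
    index, low, stack, on_stack, result = {}, {}, [], set(), []

    def visit(eq):  # Tarjan's algorithm, emitting dependencies first
        index[eq] = low[eq] = len(index)
        stack.append(eq)
        on_stack.add(eq)
        for other in deps[eq]:
            if other not in index:
                visit(other)
                low[eq] = min(low[eq], low[other])
            elif other in on_stack:
                low[eq] = min(low[eq], index[other])
        if low[eq] == index[eq]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == eq:
                    break
            eqs = sorted(component)
            result.append((eqs, sorted(eq_to_var[e] for e in eqs)))

    for eq in structure:
        if eq not in index:
            visit(eq)
    return result


structure = {"e7": {"x4"}, "e3": {"x4", "x3", "x2"},
             "e2": {"x3", "x2"}, "e4": {"x4", "x2", "x5"}}
print(ordered_sccs(structure))
```

For this structure the sketch returns the SCCs ({e 7 }, {x 4 }), ({e 2 , e 3 }, {x 2 , x 3 }), and ({e 4 }, {x 5 }), in that order.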
That the ordered set C returned by findComputationSequence, indeed, is a mini-
mal and irreducible computation sequence is verified in the following theorem.
Theorem 4. Let E′ ⊆ E be a just-determined set of equations with respect to the variables
X′ ⊆ X, and T an AE tool. If E′ , X′ , and T are used as arguments to findComputation-
Sequence and a non-empty C is returned, then C is a minimal and irreducible computation
sequence for X′ with T .
6 Application Studies
The objective of this section is to empirically show the benefits of the method for finding
sequential residual generators proposed in Sections 4.2 and 5.3. This is done by applying
the method to models of an automotive diesel engine and an auxiliary hydraulic braking
system. In addition, we illustrate how a sequential residual generator for the diesel engine,
found with the proposed method, can be realized. The realized residual generator is then
evaluated using real measurements from a truck.
 1: function findComputationSequence(E′ , X′ , T )
 2:     C ∶= ∅
 3:     S ∶= findAllSCCs(E′ , X′ )
 4:     for i = 1, 2, . . . , ∣S∣ do
 5:         (E_i , X_i ) ∶= S (i)
 6:         D_i ∶= Diff(X_i )
 7:         Z_i ∶= varD (E_i ) ∩ D_i
 8:         W_i ∶= X_i ∖ unDiff(Z_i )
 9:         if not isInitCondKnown(Z_i ) then
10:             return ∅
11:         end if
12:         E_{Z_i} ∶= getDifferentialEquations(E_i , Z_i )
13:         E_{W_i} ∶= E_i ∖ E_{Z_i}
14:         S_{Z_i} ∶= findAllSCCs(E_{Z_i} , Z_i )
15:         for j = 1, 2, . . . , ∣S_{Z_i}∣ do
16:             (E_{Z_i}^j , Z_i^j ) ∶= S_{Z_i} ( j)
17:             if isToolSolvable(Z_i^j , E_{Z_i}^j , T ) then
18:                 Append(C, (Z_i^j , E_{Z_i}^j ))
19:             else
20:                 return ∅
21:             end if
22:         end for
23:         if isJustDetermined(E_{W_i} , W_i ) then
24:             S_{W_i} ∶= findAllSCCs(E_{W_i} , W_i )
25:             for j = 1, 2, . . . , ∣S_{W_i}∣ do
26:                 (E_{W_i}^j , W_i^j ) ∶= S_{W_i} ( j)
27:                 if isToolSolvable(W_i^j , E_{W_i}^j , T ) then
28:                     Append(C, (W_i^j , E_{W_i}^j ))
29:                 else
30:                     return ∅
31:                 end if
32:             end for
33:         else
34:             return ∅
35:         end if
36:     end for
37:     return C
38: end function
         D     I     DI    SD    SI    SDI
SCC                        x     x     x
IC             x     x           x     x
DC       x           x     x           x
Figure 2: Cutaway of a Scania 13-liter, 6-cylinder diesel engine equipped with EGR and
VGT. Illustration by Semcon Informatic Graphics Solutions.
SDI, i.e., with mixed causality and the ability to handle loops, in comparison with any
other configuration of findComputationSequence.
The Scania auxiliary hydraulic braking system, called retarder, is used on heavy duty
trucks for long continuous braking, for example to maintain constant speed down a
slope. By using the retarder, the brake discs can be saved for short-duration braking.
The model of the hydraulic braking system contains 49 equations, 44 unknown
variables, and 9 known variables. It is a non-linear DAE system and contains 4 differential
equations and 45 algebraic equations.
The model contains 125 MSO sets, which can be arranged into 83 MSO classes. The
total number of possible residual generators for the model of the hydraulic braking
system is, theoretically, 4607.
Table 3 and Figure 4 show, for each configuration of the method, how many of the
MSO sets and MSO classes could be used and the total number of residual generators
found for the model of the hydraulic braking system. As seen, a significantly larger
fraction of the MSO sets and MSO classes could be used and more residual generators
could be found with configuration SDI, in comparison with any other configuration.
[Bar chart. Vertical axis: %. Groups: MSO Sets, MSO Classes, Residual Generators. One bar per configuration: D, I, DI, SD, SI, SDI.]
Figure 3: The bars to the left and in the middle show the fractions of the total number
of MSO sets and MSO classes in which a residual generator could be found with each
configuration of the method. The bars to the right show the fractions of the number of
potential residual generators that could be found with each configuration of the method.
[Bar chart. Groups: MSO Sets, MSO Classes, Residual Generators. One bar per configuration: D, I, DI, SD, SI, SDI.]
Figure 4: The bars to the left and in the middle show the fractions of the total number
of MSO sets and MSO classes in which a residual generator could be found with each
configuration of the method. The bars to the right show the fractions of the number of
potential residual generators that could be found with each configuration of the method.
[Structure plot. Vertical axis: Equations. Horizontal axis: Variables.]
Figure 5: Structure of the 203 equations in the considered computation sequence, with
respect to the 203 unknown variables. The SCCs of the structure, corresponding to the
elements in the computation sequence, are marked with squares. The large SCC contains
102 equations.
The considered computation sequence originates from an MSO set containing in total
204 equations, 203 unknown variables, and 8 known variables. Thus, the computation se-
quence contains 203 equations and 203 unknown variables. In total 33 residual generators
were found in the MSO class to which the MSO set belongs. All 33 residual generators
were found with configuration SDI of findComputationSequence.
The computation sequence contains 102 elements. All elements but the last one
contain one equation and one variable. The last element contains 102 equations and
102 variables and corresponds to a SCC of size 102. The structure of the 203 equations
contained in the computation sequence, with respect to the 203 unknown variables, is
shown in Figure 5. The SCCs of the structure, corresponding to the elements in the
computation sequence, are marked with squares in Figure 5.
The residual equation used in the residual generator, i.e., the equation removed from
the MSO set when the corresponding computation sequence was found, compares the
measured and computed pressure in the intake manifold of the diesel engine.
\begin{equation}
\begin{aligned}
w_1 &= h_1(y) \\
w_2 &= h_2(w_1, y) \\
&\;\;\vdots \\
w_{64} &= h_{64}(w_1, w_2, \ldots, w_{63}, y) \\
w_{65} &= h_{65}(\dot{w}_{64}, w_1, \ldots, w_{64}, y) \\
w_{66} &= h_{66}(w_1, w_2, \ldots, w_{65}, y) \\
&\;\;\vdots \\
w_{76} &= h_{76}(w_1, w_2, \ldots, w_{75}, y) \\
w_{77} &= h_{77}(\dot{w}_{76}, w_1, \ldots, w_{76}, y) \\
w_{78} &= h_{78}(w_1, w_2, \ldots, w_{77}, y) \\
&\;\;\vdots \\
w_{100} &= h_{100}(w_1, w_2, \ldots, w_{99}, y) \\
\dot{z}_{101} &= g_{101}(w_1, \ldots, w_{101}, y) \\
w_{101}^{1} &= h_{101}^{1}(z_{101}, w_1, \ldots, w_{100}, y) \\
w_{101}^{2} &= h_{101}^{2}(z_{101}, w_1, \ldots, w_{100}, w_{101}^{1}, y) \\
&\;\;\vdots \\
w_{101}^{99} &= h_{101}^{99}(z_{101}, w_1, \ldots, w_{100}, w_{101}^{1}, \ldots, w_{101}^{98}, y),
\end{aligned}
\tag{29}
\end{equation}
where $w_{101} = (w_{101}^{1}, w_{101}^{2}, \ldots, w_{101}^{99})$, and $z_{101}$ is of dimension three and all $w_i$, $w_i^j$ of
dimension one. The largest block, denoted 101 in (29), is a semi-explicit DAE of index
one with three differential equations with variables $z_{101}$ and 99 algebraic equations with
variables $w_{101}^{1}, \ldots, w_{101}^{99}$, corresponding to a differential loop and a SCC of size 102. Since
the block is a semi-explicit DAE of index one, integral causality is used in this block. In
two of the blocks, denoted 66 and 77 in (29), derivative causality is used. The remaining
blocks, denoted 1 - 65, 67 - 76, and 78 - 100 correspond to algebraic equations. In total,
the BLT semi-explicit DAE system contains five differential equations and 198 algebraic
equations.
Implementation Issues
The residual generator, i.e., the obtained BLT semi-explicit DAE system and the residual
equation, was implemented in Matlab. To compute the values of the unknown variables,
the approach described in Section 3.1 was used. To solve the resulting explicit ODE, Euler
forward with fixed step-size was utilized. All state variables in the residual generators
represent physical quantities, hence initial conditions were easy to obtain from the
available measurements.
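A minimal sketch of how a semi-explicit DAE block of index one can be simulated with forward Euler and a fixed step size is given below. The paper's implementation was done in Matlab and is not shown, so everything here is an assumption for illustration: the function `simulate_semi_explicit_dae`, the toy dynamics `g`, and the toy algebraic relation `h_alg`. The algebraic variable is evaluated explicitly from the current state, after which the state is advanced one step.

```python
def simulate_semi_explicit_dae(g, h_alg, z0, measurements, step):
    """Simulate z' = g(z, w, y), w = h_alg(z, y) with fixed-step forward Euler."""
    z = z0
    trajectory = []
    for y in measurements:
        w = h_alg(z, y)               # algebraic part, solved explicitly
        trajectory.append((z, w))
        z = z + step * g(z, w, y)     # forward Euler update of the state
    return trajectory


def g(z, w, y):
    # Hypothetical state equation.
    return -z + w


def h_alg(z, y):
    # Hypothetical algebraic equation, already solved for w.
    return 0.5 * z + y


# Constant measurement y = 1 over 20 s with step size 0.01 s.
trajectory = simulate_semi_explicit_dae(g, h_alg, 0.0, [1.0] * 2000, 0.01)
z_end, w_end = trajectory[-1]
print(z_end, w_end)
```

For this toy system the state converges towards z = 2 and w = 2. A fixed step size keeps the computational load per sample constant, but the step must be small enough for the Euler scheme to remain stable.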
Results
Real measurements of the known variables in the engine model were collected by driving
a truck on the road. Two sets of measurements were collected, one with a fault-free
engine and one with an implemented fault. The implemented fault was a constant bias in
the sensor measuring the pressure in the intake manifold of the diesel engine.
The residual generator was run off-board by using the collected measurements. The
residual was then low-pass filtered to remove some measurement noise and finally scaled.
In Figure 6, the resulting residual is shown. During the first 100 seconds, the measure-
ments are fault-free. The remaining time, the measurements contain the implemented
bias fault. It is obvious that the residual can be used to detect the injected fault.
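The post-processing of the residual can be sketched as follows. The data are synthetic and all numbers are assumptions chosen for illustration (sample time, noise shape, bias size, filter constant, threshold); they are not the measurements or settings used in the study. The sketch low-pass filters a raw residual that gets a constant bias after 100 seconds and reports the first sample at which the filtered residual exceeds a threshold.

```python
import math


def low_pass(signal, alpha):
    """First-order discrete low-pass filter with smoothing factor alpha."""
    filtered, state = [], 0.0
    for value in signal:
        state += alpha * (value - state)
        filtered.append(state)
    return filtered


def first_alarm(signal, threshold):
    """Index of the first sample whose magnitude exceeds the threshold."""
    for k, value in enumerate(signal):
        if abs(value) > threshold:
            return k
    return None


ts = 0.1              # assumed sample time [s]
fault_index = 1000    # bias fault appears at 100 s

# Synthetic raw residual: a sinusoidal disturbance plus a unit bias after the fault.
raw = [0.3 * math.sin(2 * math.pi * k * ts) + (1.0 if k >= fault_index else 0.0)
       for k in range(2000)]

filtered = low_pass(raw, alpha=0.1)
alarm_index = first_alarm(filtered, threshold=0.5)
print(alarm_index * ts)
```

In this sketch the alarm is raised within about a second after the fault appears, and no alarm is raised before it, since the filtered disturbance stays well below the threshold.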
7 Conclusions
We have in Section 1 concluded that it is important that there is a large selection of
different candidate residual generators to choose between when designing diagnosis
systems. In this spirit we have in this paper presented a method for deriving residual
generators with the key property that it is able to find a large number of different residual
generators. This property is firstly due to the fact that the method belongs to a class
of methods that we refer to as sequential residual generation. This class of methods
has in earlier works been shown to be powerful for real non-linear systems (Dustegor
et al., 2004; Izadi-Zamanabadi, 2002; Cocquempot et al., 1998; Svärd and Wassén, 2006;
Hansen and Molin, 2006). Secondly, which is the key contribution of the paper, we have
extended these earlier methods by handling mixed causality and also, in a systematic
manner, equation sets containing differential and algebraic loops.
[Plot of the residual. Horizontal axis: time [s].]
Figure 6: The residual obtained from the constructed residual generator. No fault is
present during the first 100 seconds. During the remaining 100 seconds, there is a bias fault in
the sensor measuring the pressure in the intake manifold. The dashed lines suggest how
thresholds could be chosen in order to detect the fault.
The method has been presented as an algorithm utilizing an assumed given toolbox
of, e.g., algebraic equation solvers. We have proven, in Theorem 1, that the algorithm
really finds residual generators and, in Theorems 3 and 4, that the residual generators, or
rather sequential residual generators, found are proper. Properness guarantees that the
residual generator does not contain unnecessary computations and that computations
are performed from as small equation sets as possible. We have also proven, in Theorem 2,
that proper sequential residual generators are always found within MSO sets. This fact
has been utilized in the algorithm since there is no need to look for sequential residual
generators in other equation sets than MSO sets. Furthermore, this theorem provides a
link between structural and analytical methods without the use of any assumptions of
generic equations, such as in, e.g., Krysander et al. (2008).
In the empirical study in Section 6, we have evaluated our method on models of two
real automotive systems. The results obtained are compared to results from the special
cases of using solely derivative or integral causality, or only handling scalar equations.
It is evident that our more general method outperforms the other alternatives. Since the
two systems have quite different characteristics, e.g., in the number of redundant sensors,
we believe that these results are representative also for a larger class of systems.
Acknowledgment
This work was sponsored by Scania CV AB and VINNOVA (Swedish Governmental
Agency for Innovation Systems).
Proof of Theorem 1. Consider the model M(E, X, Y) and assume that ỹ ∈ O (M). Due to
the definition of O (M) in (4), we know that given ỹ there exists at least one trajectory of
the variables in X that satisfies the equations in E. Since E1 ∪ E2 ∪ . . . ∪ E k ⊆ E,
it holds that the trajectory ỹ also belongs to the observation set of the sub-model of
M(E, X, Y) given by E1 ∪ E2 ∪ . . . ∪ E k , i.e., the equation set contained in the computation
sequence C. Hence, given ỹ, there exists a trajectory x̃ of the variables in varX (E1 ∪
E2 ∪ . . . ∪ E k ) that satisfies E1 ∪ E2 ∪ . . . ∪ E k . By Lemma 1 we know that x̃ is a unique
solution that also satisfies the equations of the BLT semi-explicit DAE system obtained
by sequentially applying the tool T to the computation sequence C.
As said in Section 3.1, a BLT semi-explicit DAE system can be transformed to an
explicit ODE, with the exception that the ODE will contain derivatives of known
variables. Furthermore, following the discussion in Section 3.2, an explicit ODE can
always be solved if initial conditions are available. From this it follows that given
ỹ, consistent initial conditions of the states in the BLT semi-explicit DAE system, i.e.,
z i in (7), and the ability to compute all needed derivatives, the trajectory x̃ can be
computed from the BLT semi-explicit DAE system. Since e i ∈ E ∖ (E1 ∪ E2 ∪ . . . ∪ E k ) and
varX (e i ) ⊆ X′ ⊆ varX (E1 ∪ E2 ∪ . . . ∪ E k ), the trajectory x̃ will also satisfy e i . We then
have that $f_i(\dot{\tilde{x}}, \tilde{x}, \tilde{y}) = 0$. Hence, with r = f i (ẋ, x, y), ỹ ∈ O (M) implies r = 0 and we can
use r as residual. Thus the BLT semi-explicit DAE system obtained from the computation
sequence C with T , together with e i is a residual generator for M(E, X, Y).
Proof. From Definition 3, we have that a system in BLT semi-explicit DAE form can
be obtained by sequentially calling T with arguments V i and E i for every (V i , E i ) ∈ C.
From this fact, it follows that each variable x j ∈ unDiff (V i ) is present in some vector
z k or w l in the obtained BLT semi-explicit DAE system. Since the set of all vectors of
known variables in a BLT semi-explicit DAE system by Definition 2 is pairwise disjoint,
it follows that {unDiff (V i )} is pairwise disjoint and we have shown the first claim. For
the second claim, we start by noting that V i ⊆ varX (E i ) ∪ varD (E i ) due to Definition 3.
Since a system in BLT semi-explicit DAE form can be obtained from C and, according to
Lemma 1, the solution sets of E1 ∪ E2 ∪ . . . ∪ E k and the BLT semi-explicit DAE system,
with respect to V1 ∪ V2 ∪ . . . ∪ V k , are equal and unique, it holds that each unknown
variable in E1 ∪ E2 ∪ . . . ∪ E k , differentiated or undifferentiated, must be present in some
V i . From this fact and by the definitions of the operators unDiff () and varX (), it must
also hold that unDiff (V1 ∪ V2 ∪ . . . ∪ V k ) = varX (E1 ∪ E2 ∪ . . . ∪ E k ).
For the next proof, we need some additional graph theoretical concepts, see, e.g., Asratian
et al. (1998); Murota (1987). Consider the bipartite graph G = (E, X, A)
describing the structure of E with respect to X, see Section 2.2. A path on the graph G is
a sequence of distinct vertices v 1 , v 2 , . . . , v n such that (v i , v i+1 ) ∈ A and v i ∈ E ∪ X. An
alternating path is a path in which the edges belong alternately to a matching and not to
the matching. A vertex is said to be free, if it is not an endpoint of an edge in a matching.
Proof of Theorem 2. In this proof we will use a characterization of an MSO set given
in Krysander et al. (2008), saying that an equation set E is an MSO set if and only if E is
a Proper Structurally Over-determined (PSO) set and E contains one redundant equation.
Furthermore, an equation set E is a PSO set if E = E+ , where E+ is the structurally over-
determined part obtained from the DM-decomposition, recall Section 2.3, or equivalently
the equations e ∈ E such that, for any maximum matching, there exists an alternating path
between at least one free equation and e.
Returning to our case, we must show that E1 ∪ E2 ∪ . . . ∪ E k ∪ e i is a PSO set and
contains one redundant equation, with respect to the variables varX (E1 ∪ E2 ∪ . . . ∪ E k ).
We begin with the second property, i.e., that E1 ∪ E2 ∪ . . . ∪ E k ∪ e i contains a redundant
equation. Since S = (T (C) , e i ) is a proper sequential residual generator, it follows from
Definition 7 that C is a minimal and irreducible computation sequence for varX (e i ) with
T . If we let
C = ((V1 , E1 ) , (V2 , E2 ) , . . . , (V k , E k )) , (30)
we have from Definition 3 that a system in BLT semi-explicit DAE form is obtained by
sequentially calling the AE tool T with arguments $V_i$ and $E_i$ for every $(V_i, E_i) \in C$. This
and Assumption 2 imply that $|V_i| = |E_i|$ for every $(V_i, E_i) \in C$ and hence
$\sum_{i=1}^{k} |V_i| = \sum_{i=1}^{k} |E_i|$. By the definition of the operator unDiff () in (15), we can conclude that
$|V_i| = |\mathrm{unDiff}(V_i)|$ and therefore it also holds that $\sum_{i=1}^{k} |\mathrm{unDiff}(V_i)| = \sum_{i=1}^{k} |E_i|$. By Lemma 2
we have that $\{\mathrm{unDiff}(V_i)\}$ is pairwise disjoint, which implies that
$\sum_{i=1}^{k} |\mathrm{unDiff}(V_i)| = |\mathrm{unDiff}(V_1) \cup \mathrm{unDiff}(V_2) \cup \ldots \cup \mathrm{unDiff}(V_k)| = |\mathrm{unDiff}(V_1 \cup V_2 \cup \ldots \cup V_k)|$.
Definition 3 states that also $\{E_i\}$ is pairwise disjoint and therefore
$|E_1 \cup E_2 \cup \ldots \cup E_k| = \sum_{i=1}^{k} |E_i|$. Thus, it holds that
$|\mathrm{unDiff}(V_1 \cup V_2 \cup \ldots \cup V_k)| = |E_1 \cup E_2 \cup \ldots \cup E_k|$. By
Lemma 2, we have that unDiff (V1 ∪ V2 ∪ . . . ∪ V k ) = varX (E1 ∪ E2 ∪ . . . ∪ E k ) and there-
fore it also holds that ∣E1 ∪ E2 ∪ . . . ∪ E k ∣ = ∣varX (E1 ∪ E2 ∪ . . . ∪ E k )∣, i.e., E1 ∪E2 ∪. . .∪E k
contains as many equations as unknowns. Since C is a computation sequence for varX (e i )
with T , we have from Definition 3 that varX (e i ) ⊆ unDiff (V1 ∪ V2 ∪ . . . ∪ V k ) = varX (E1 ∪
E2 ∪ . . . ∪ E k ), where the last equality follows from Lemma 2, implying that adding e i to
E1 ∪E2 ∪. . .∪E k will not introduce any new unknown variables, i.e., e i is redundant. Hence,
the equation set E1 ∪E2 ∪. . .∪E k ∪e i contains one more equation than unknown variables,
since ∣E1 ∪ E2 ∪ . . . ∪ E k ∪ e i ∣ = ∣E1 ∪ E2 ∪ . . . ∪ E k ∣ + ∣e i ∣ = ∣varX (E1 ∪ E2 ∪ . . . ∪ E k )∣ + 1.
We will now show that E1 ∪ E2 ∪ . . . ∪ E k ∪ e i is a PSO set with respect to varX (E1 ∪
E2 ∪ . . . ∪ E k ∪ e i ). To show this, we must show that for any maximum matching on
the bipartite graph describing the structure of E1 ∪ E2 ∪ . . . ∪ E k ∪ e i , with respect to
varX (E1 ∪ E2 ∪ . . . ∪ E k ∪ e i ), there exists an alternating path between a free equation and
every equation in E1 ∪ E2 ∪ . . . ∪ E k ∪ e i . We start by constructing a maximum matching
and finding a free equation. Consider the computation sequence C described by (30)
and recall that C, given by (30), is a minimal and irreducible computation sequence for
varX (e i ) with T . The irreducibility of C implies that for each element (V i , E i ) ∈ C, it
holds that the structure of E i with respect to unDiff (V i ) corresponds to a SCC. To see
this, assume that (V i , E i ) does not correspond to a SCC. This implies that it is possible to
partition V i and E i into V i = V i1 ∪ V i2 ∪ . . . ∪ V i s and E i = E i1 ∪ E i2 ∪ . . . ∪ E i s so that
C ′ = ((V1 , E1 ) , . . . , (V i1 , E i1 ) , . . . , (V i s , E i s ) , . . . , (V k , E k )) ,
is also a computation sequence for varX (e i ) with T , due to Assumption 3. This con-
tradicts the irreducibility of C and hence (V i , E i ) must be a SCC. From this property
it follows, by the definition of a SCC, that there exists a maximum matching Γi on the
bipartite graph describing the structure of E i with respect to unDiff (V i ). This implies that a
maximum matching, let it be denoted Γ, in the structure of E1 ∪ E2 ∪ . . . ∪ E k with respect
to unDiff (V1 ∪ V2 ∪ . . . ∪ V k ) can be constructed as $\Gamma = \bigcup_{i=1}^{k} \Gamma_i$, see, e.g., Murota (1987).
By Lemma 2, we have that unDiff (V1 ∪ V2 ∪ . . . ∪ V k ) = varX (E1 ∪ E2 ∪ . . . ∪ E k ) and
therefore Γ is also a maximum matching in the structure of E1 ∪ E2 ∪ . . . ∪ E k with
respect to varX (E1 ∪ E2 ∪ . . . ∪ E k ). In the first part of this proof, we concluded that the
equation e i is redundant and therefore Γ is also a maximum matching on the structure
of E1 ∪ E2 ∪ . . . ∪ E k ∪ e i with respect to varX (E1 ∪ E2 ∪ . . . ∪ E k ∪ e i ) and e i is a free
equation, since it is not contained in Γ.
Since there trivially exists a path between e i and e i , it is sufficient to show that there
exists an alternating path between the free equation e i and every equation in E1 ∪ E2 ∪
. . . ∪ E k . Due to the fact that each (V i , E i ) ∈ C corresponds to a SCC, there exists an
alternating path between any two vertices, i.e., equations or variables, in the bipartite
graph describing the structure of E i with respect to unDiff (V i ), see, e.g., Asratian et al.
(1998). Moreover, the minimality of C implies that for (V k , E k ) ∈ C there exists at least
one variable x m ∈ unDiff (V k ) such that x m ∈ varX (e i ), since otherwise C ′ = C∖(V k , E k )
is a computation sequence for varX (e i ) and C is not minimal. With the same argument,
we have that for (V i , E i ) ∈ C, i = 1, 2, . . . , k − 1, there exists at least one variable x m ∈
unDiff (V i ) such that either x m ∈ varX (e i ), or else x m ∈ varX (E j ) where (V j , E j ) ∈ C
and j ∈ {i + 1, i + 2, . . . , k}. This means that there exists an alternating path between at
least one variable in each (V i , E i ) ∈ C to e i , either directly or via one or several other
(V j , E j ) ∈ C. Thus, there exists an alternating path between e i and every equation in
E1 ∪ E2 ∪ . . . ∪ E k . We have by this shown that E1 ∪ E2 ∪ . . . ∪ E k ∪ e i is a PSO set.
Due to line 23, we know that the equation set EW i is just-determined with respect to
W i , and hence the structure of EW i with respect to W i can be uniquely partitioned into
SCCs. On line 24 these SCCs are computed and as above, the ordered set of SCCs can be
written as
\[ S_{W_i} = \left( (W_i^1, E_{W_i}^1), (W_i^2, E_{W_i}^2), \ldots, (W_i^{p_i}, E_{W_i}^{p_i}) \right). \tag{34} \]
Furthermore, as in the case with the set S in (31), the ordering of the SCCs in SW i implies
that
\[ \mathrm{var}_X(E_{W_i}^{j}) \cap \{ W_i^{j+1} \cup W_i^{j+2} \cup \ldots \cup W_i^{p_i} \} = \emptyset, \tag{35} \]
where every $(Z_i^j, E_{Z_i}^j) \in C$ and $(W_i^j, E_{W_i}^j) \in C$ corresponds to a SCC.
We will now utilize Definition 3 to show that the ordered set C in (36) is a
computation sequence for X′ with T . First note that $Z_i^j \subseteq \mathrm{var}_D(E_{Z_i}^j)$ and $W_i^j \subseteq \mathrm{var}_X(E_{W_i}^j)$.
When the structure of a just-determined equation set with respect to a set of variables
is decomposed into its SCCs, unique partitions of the equation and variable sets are
also obtained, see for example Dulmage and Mendelsohn (1958) and Figure 1 for an
illustration. From this fact it follows that every equation in E′ is present in some E i in (31)
only once. When the equations in E i are split into differential equations EZ i and algebraic
equations EW i on line 13, it is guaranteed that EZ i ∩ EW i = ∅. Moreover, again due to
the fact that a decomposition into SCCs gives a unique partition of the equation and
variable set, we have that every equation in $E_{Z_i}$ is present in some equation set $E_{Z_i}^j$ in (33)
only once and that every equation in $E_{W_i}$ is present in some $E_{W_i}^j$ in (34) only once. Thus,
we can conclude that each equation in E′ is contained in only one equation set in C, that
is, all equation sets in C are disjoint. Hence, the ordered set C fulfills the prerequisites
in Definition 3. According to conditions 1) and 2) in Definition 3, C is a computation
sequence for X′ with T if
\[ X' \subseteq \bigcup_{i=1}^{s} \left( \bigcup_{j=1}^{s_i} \mathrm{unDiff}(Z_i^j) \cup \bigcup_{j=1}^{p_i} W_i^j \right) \tag{37} \]
and a system in BLT semi-explicit DAE form is obtained by sequentially calling the tool
$T$, with arguments $Z_i^j$ and $E_{Z_i}^j$ for every element $(Z_i^j, E_{Z_i}^j) \in C$, and with arguments $W_i^j$
and $E_{W_i}^j$ for every element $(W_i^j, E_{W_i}^j) \in C$.
We start by showing condition 1), i.e., (37). From the fact mentioned above that a
decomposition of a structure into its SCCs also induces a partitioning of the corresponding
equation and variable sets, it follows that every variable in $X'$ is present in some $X_i$ in (31).
That is, we have that $X' = \bigcup_{i=1}^{s} X_i$. When the variables in $X_i$ are split into differentiated
variables $Z_i$ and undifferentiated variables $W_i$, it holds that $X_i = \mathrm{unDiff}(Z_i) \cup W_i$. In
addition, it holds that every variable in $Z_i$ is present in some variable set $Z_i^j$ in (33)
and that every variable in $W_i$ is present in some $W_i^j$ in (34), so that
$Z_i = \bigcup_{j=1}^{s_i} Z_i^j$ and $W_i = \bigcup_{j=1}^{p_i} W_i^j$. Hence,
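\[
\begin{aligned}
X' &= \bigcup_{i=1}^{s} X_i = \bigcup_{i=1}^{s} \left( \mathrm{unDiff}(Z_i) \cup W_i \right) \\
   &= \bigcup_{i=1}^{s} \left( \mathrm{unDiff}\Bigl( \bigcup_{j=1}^{s_i} Z_i^j \Bigr) \cup \bigcup_{j=1}^{p_i} W_i^j \right) \\
   &= \bigcup_{i=1}^{s} \left( \bigcup_{j=1}^{s_i} \mathrm{unDiff}(Z_i^j) \cup \bigcup_{j=1}^{p_i} W_i^j \right),
\end{aligned}
\tag{38}
\]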
where the last equality trivially follows from the definition of unDiff () in (15). The
property (37) and thus condition 1) has then been verified.
Condition 2) of Definition 3 will now be verified, that is, that $C$ can be used to obtain
a system in BLT semi-explicit DAE form. Consider an element $(Z_i^j, E_{Z_i}^j) \in C$. Since
$E_{Z_i}^j \subseteq E_{Z_i} \subseteq E_i$, and we have that $(X_i, E_i) \in S$, the property (32) implies that
\[ \mathrm{var}_X(E_{Z_i}^j) \cap \{ X_{i+1} \cup X_{i+2} \cup \ldots \cup X_s \} = \emptyset, \tag{39} \]
for $i = 1, 2, \ldots, s-1$. From lines 17-21 in the algorithm, it follows that the AE tool $T$ can
be used to solve the equations in $E_{Z_i}^j$ for the variables in $Z_i^j$. Since we have assumed that
each differential equation contains at most one differentiated variable and (39) holds, we
can use $(Z_i^j, E_{Z_i}^j) \in C$ and the AE tool $T$ to obtain
\[ \dot{z}_i^j = g_i^j(x_1, x_2, \ldots, x_i, y), \tag{40} \]
where $y$ is a vector of the known variables in $E'$, and $g_i^j$ is a function returned by $T$ when
the arguments are $Z_i^j$ and $E_{Z_i}^j$. From the elements $(Z_i^j, E_{Z_i}^j) \in C$, $j = 1, 2, \ldots, s_i$, we can
thus, by using (40) and also that $X_i = \mathrm{unDiff}(Z_i) \cup W_i$, obtain
\[ \dot{z}_i = g_i(z_1, z_2, \ldots, z_i, w_1, w_2, \ldots, w_i, y), \tag{41} \]
where $z_i = (z_i^1, z_i^2, \ldots, z_i^{s_i})$ is a vector of the variables in $Z_i$, $w_i$ a vector of the variables
in $W_i$, $y$ a vector of the known variables in $E'$, and $g_i = (g_i^1, g_i^2, \ldots, g_i^{s_i})$.
Now instead consider an element $(W_i^j, E_{W_i}^j) \in C$. Since also $(W_i^j, E_{W_i}^j) \in S_{W_i}$, where
$S_{W_i}$ is given by (34), the property (35) holds. Since $E_{W_i}^j \subseteq E_{W_i} \subseteq E_i$, and $(X_i, E_i) \in S$, we
also have that
\[ \mathrm{var}_X(E_{W_i}^j) \cap \{ X_{i+1} \cup X_{i+2} \cup \ldots \cup X_s \} = \emptyset, \tag{42} \]
for $i = 1, 2, \ldots, s-1$. By using that the AE tool $T$ can solve $E_{W_i}^j$ for $W_i^j$ due to lines 27-31,
that $X_i = \mathrm{unDiff}(Z_i) \cup W_i$ and $\mathrm{var}_D(E_{W_i}) \cap Z_i = \emptyset$ due to lines 6-8 and 12-14, and then
utilizing (35) and (42), we can obtain
Note that the absence of the vectors $\dot{z}_i$ in (43) is a direct implication of the assumption that
each differentiated variable is present in only one equation in the original model and
therefore also in the BLT semi-explicit DAE system. Since $\dot{z}_i$ is present in (41),
it cannot be present in (43).
from the elements $(W_i^j, E_{W_i}^j) \in C$, $j = 1, 2, \ldots, p_i$. Comparing (41) and (44) with the
system in Definition 2 shows that the elements $(Z_i^j, E_{Z_i}^j) \in C$, $j = 1, 2, \ldots, s_i$, and
$(W_i^j, E_{W_i}^j) \in C$, $j = 1, 2, \ldots, p_i$, correspond to the $i$:th block of a BLT semi-explicit
DAE form. Applying the above arguments for $i = 1, 2, \ldots, s$ then implies that the ordered
set $C$ in (36) can be used to obtain a system in BLT semi-explicit DAE form with $s$ blocks.
Thus, $C$ is a computation sequence for $X'$ with $T$.
It now remains to show that $C$ is a minimal and irreducible computation sequence
for $X'$ with $T$. We begin with the irreducibility of $C$. In the beginning of this proof, we
showed that all elements of $C$, given by (36), correspond to SCCs. We have also concluded
that due to the assumptions regarding the model in Section 1, all elements $(Z_i^j, E_{Z_i}^j) \in C$
are of size one, i.e., trivially irreducible. Now consider an element $(W_i^j, E_{W_i}^j) \in C$ and
assume that we partition $W_i^j$ as $W_i^j = W_{i1}^j \cup W_{i2}^j$ and $E_{W_i}^j$ as
$E_{W_i}^j = E_{W_{i1}}^j \cup E_{W_{i2}}^j$ and form the two new elements $(W_{i1}^j, E_{W_{i1}}^j)$ and
$(W_{i2}^j, E_{W_{i2}}^j)$. Due to the fact that $(W_i^j, E_{W_i}^j)$
corresponds to a SCC, $E_{W_i}^j$ is a dependent equation set with respect to the variables in
$W_i^j$. This implies that when applying $T$ to the elements $(W_{i1}^j, E_{W_{i1}}^j)$ and $(W_{i2}^j, E_{W_{i2}}^j)$,
we obtain the two equations
\[
\begin{aligned}
w_{i1}^j &= h_{i1}^{1}(\ldots, w_{i2}^j, \ldots) \\
w_{i2}^j &= h_{i2}^{1}(\ldots, w_{i1}^j, \ldots),
\end{aligned}
\]
which clearly do not have the structure of equations contained in a BLT semi-explicit DAE
system, due to the cyclic dependence between the equations. Hence, a system in BLT
semi-explicit DAE form cannot be obtained when the element $(W_i^j, E_{W_i}^j) \in C$ is partitioned,
which violates condition 2) in Definition 3. We can then conclude that no elements of $C$
can be further partitioned and hence $C$ is an irreducible computation sequence for $X'$
with $T$.
The minimality of C for X′ with T follows from the fact that (38) holds. Since
(38) is fulfilled, all elements in C are needed to compute the variables in X′ . This implies
that any attempt to form a computation sequence for X′ with T by using a proper subset of C
will violate condition 1) in Definition 3. This completes the proof.
References
U. M. Ascher and L. R. Petzold. Computer Methods for Ordinary Differential Equations
and Differential-Algebraic Equations. SIAM, 1998.
A. S. Asratian, T. M. J. Denley, and R. Häggkvist. Bipartite Graphs and their Applications.
Cambridge University Press, 1998.
L. Barford, E. Manders, G. Biswas, P. Mosterman, V. Ram, and J. Barnett. Derivative
estimation for diagnosis. Technical report, HP Labs Technical Reports, 1999.
M. Blanke, M. Kinnaert, J. Lunze, and M. Staroswiecki. Diagnosis and Fault-Tolerant
Control. Springer, second edition, 2006.
K. E. Brenan, S. L. Campbell, and L. R. Petzold. Numerical Solution of Initial-Value
Problems in Differential-Algebraic Equations. SIAM, 1989.
R. W. Brockett. Finite-Dimensional Linear Systems. Wiley, New York, 1970.
J. P. Cassar and M. Staroswiecki. A structural approach for the design of failure detection
and identification systems. In Proceedings of IFAC Control Ind. Syst., pages 841–846,
Belfort, France, 1997.
F. E. Cellier and H. Elmqvist. Automated formula manipulation supports object-
oriented continuous-system modeling. IEEE Control Systems Magazine, 13(2):28–38,
April 1993.
F. E. Cellier and E. Kofman. Continuous System Simulation. Springer, 2006.
V. Cocquempot, R. Izadi-Zamanabadi, M. Staroswiecki, and M. Blanke. Residual
generation for the ship benchmark using structural approach. In Proceedings of the
UKACC International Conference on Control ’98, pages 1480–1485, September 1998.
C. De Persis and A. Isidori. A geometric approach to nonlinear fault detection and
isolation. IEEE Transactions on Automatic Control, 46:853–865, 2001.
A. L. Dulmage and N. S. Mendelsohn. Coverings of bi-partite graphs. Canadian Journal
of Mathematics, 10:517–534, 1958.
D. Dustegor, V. Cocquempot, and M. Staroswiecki. Structural analysis for residual
generation: Towards implementation. In Proceedings of the 2004 IEEE Inter. Conf. on
Control App., pages 1217–1222, 2004.
E. Frisk, M. Krysander, M. Nyberg, and J. Åslund. A toolbox for design of diagnosis
systems. In Proceedings of IFAC Safeprocess’06, Beijing, China, 2006.
P. Fritzson. Principles of Object-Oriented Modeling and Simulation with Modelica 2.1.
IEEE Press, 2004.
E. Hairer and G. Wanner. Solving Ordinary Differential Equations II - Stiff and
Differential-Algebraic Problems. Springer, 2002.
Realizability Constrained Selection of Residual
Generators for Fault Diagnosis with an
Automotive Engine Application
Carl Svärd, Mattias Nyberg, and Erik Frisk
Abstract
This paper considers the problem of selecting a set of residual generators, fulfilling
requirements regarding fault isolability and minimal cardinality, for inclusion in a
model-based FDI-system. Two novel algorithms for solving the selection problem are
proposed. The first one provides an exact solution fulfilling both requirements and is
suitable for small problems. The second one, which constitutes the main contribution, is
suitable for large problems and provides an approximate solution by means of a greedy
heuristic that relaxes the minimal cardinality requirement. The foundation for the
algorithms is a novel formulation of the selection problem which enables an efficient
reduction of the search-space by taking the realizability properties of the model, with
respect to the considered residual generation method, into account. Both algorithms are
general in the sense that they are aimed at supporting any computerized residual
generation method. In a case study the greedy selection algorithm is successfully applied
to the complex problem of finding a suitable set of residual generators for detection and
isolation of faults in an automotive engine system. In this study a previously known
sequential residual generation method is considered.
1 Introduction
Model-based Fault Detection and Isolation (FDI) systems typically contain three
sub-systems: residual generation, residual evaluation, and fault isolation, see, e.g., Blanke
et al. (2006). In this work, as in for example Nyberg (1999); Krysander (2006); Nyberg
and Krysander (2008); Svärd and Nyberg (2010), design of the residual generation
sub-system is considered to be a two-step approach. In the first step, a large set of
candidate residual generators is found. In general, it may be possible to find thousands
of candidate residual generators for large models, and with regard to implementation aspects
such as complexity and computational load it is infeasible, or even impossible, to use
all of these in the FDI-system. In addition, it is often possible to meet stated requirements
with a, possibly small, subset of all residual generators. Therefore, in the second step, the
candidate residual generators most suitable to be included in the FDI-system are
selected. The topic of this paper is the selection problem emerging in the second step.
The selection problem is formulated by considering two different requirements on
the final set of residual generators. Firstly, it is required that the set of residual generators
fulfills an isolability requirement stating which faults should be isolated from each
other. Motivated by the implementation aspects mentioned above, a set of residual
generators of low cardinality is preferred over a set of high cardinality, given that the
two sets have equal isolability properties. Therefore, secondly, it is required that the set
of residual generators is of minimal cardinality.
Two novel algorithms for solving the selection problem are proposed in this paper.
The first one provides an exact solution fulfilling both the isolability and the minimal
cardinality requirements and is suitable for small problems. The second one, which
is the main contribution, relaxes the minimal cardinality requirement and provides
an approximate solution by means of a greedy heuristic. This algorithm is suitable
for large, real-world, problems for which the approach used in the first algorithm is
intractable. Both algorithms are general in the sense that they are aimed at supporting
any computerized residual generation method.
In general, not all candidate residual generators found in the first step of the design
process are realizable, i.e., it is not possible to create residual generators from all
found candidate residual generators. Typically, evaluation of realizability is a
computationally demanding task. Therefore, in those cases where the number of found candidate
residual generators is large, it may not be feasible to first evaluate the realizability of all
found candidate residual generators and then make the selection. To handle this, the
proposed algorithms exploit a novel formulation of the selection problem which takes
the realizability aspect into account. This, in addition, enables an efficient reduction of
the search-space, which typically is quite large for practical problems. In this formulation,
which in fact is an optimization problem, isolability and realizability properties are stated
in terms of attributes of subsets of the model equations.
In Section 2, a motivating industrial application example is presented. Section 3
presents preliminaries regarding realizability and fault isolability, given a residual
generation method. The residual generator selection problem is formalized in Section 4.
The first selection algorithm is presented and discussed in Section 5. The second, greedy,
algorithm is presented and justified in Section 6. Section 7 briefly describes the residual
generation method (Svärd and Nyberg, 2010) which is used in the application example.
In Section 8 the greedy selection algorithm is used to solve the industrial application
problem described in Section 2. The paper is concluded in Section 9.
2 Motivating Application Example
Figure 1: Overview of the automotive engine system. Considered faults are marked with red
arrows.
Figure 2: Fault sensitivity for a small subset of the 14,242 candidate residual generators
found for the automotive engine system (vertical axis: Fault, 1 to 12; horizontal axis:
Candidate Residual Generator, 1 to 70). A square in position (i, j) denotes that the
residual generator corresponding to column j is sensitive to the fault corresponding to
row i.
residual generators would be sufficient in order to isolate the 12 considered faults from
each other. Thus, a set of 12 residual generators, capable of isolating the 12 faults, should
be selected from the set of 14,242 candidate residual generators which means that the
search-space is quite large.
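To give a rough feeling for the size of this search-space, the following sketch counts the subsets of size 12 that can be drawn from 14,242 candidates. It uses only the two numbers quoted above; the variable names are illustrative, and the count ignores realizability and isolability, so it is an upper bound on the number of sets that an exhaustive search would have to inspect for this single cardinality.

```python
import math

# Numbers taken from the text: 14,242 candidates, 12 residual generators sought.
n_candidates = 14_242
n_selected = 12

# Number of distinct subsets of size 12 (the order of selection does not matter).
n_subsets = math.comb(n_candidates, n_selected)

# Report only the order of magnitude.
print(f"about 10^{len(str(n_subsets)) - 1} subsets of size {n_selected}")
```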
The fault sensitivity for a small subset of the found candidate residual generators, with
respect to the 12 considered faults, is shown in Figure 2. According to the figure, most
residual generators are sensitive to most faults and it is therefore not straightforward
to perform the selection. In addition, as stated in Section 1, the sought set of residual
generators should be realizable and preferably of minimal cardinality. Due to the vast
number of candidate residual generators it is not possible to perform a complete search in
order to find the set of residual generators, which makes the selection problem non-trivial.
In Section 8 this selection problem will be reconsidered and solved.
3 Preliminaries
The purpose of this section is to formally introduce the notions of realizability and
isolability, given a residual generation method, and ultimately derive necessary and
sufficient conditions for fault isolability in terms of properties of model equation subsets.
Consider a model, M = (E, X, Y, F), containing an equation set E relating the un-
known variables X, known variables Y, and fault variables F. Without loss of generality,
the following is assumed regarding the model.
Assumption 1. Each fault f ∈ F is contained in one, and only one, of the equations in the
model M.
Note that if a fault f ∈ F is contained in more than one equation, the fault f can be
replaced with a new variable x f in these equations, and the equation x f = f added to the
equation set E. This added equation will then be the only equation where f occurs.
Given a model, a residual generator is formally defined as follows.
Definition 1 (Residual Generator). Let M = (E, X, Y, F) be a model. A system R with
input Y and output r is a residual generator for M, and r is a residual, if f = 0 implies
r = 0 for all f ∈ F.
Definition 2 (Fault Sensitivity). Let R be a residual generator for the model M. Then R is
sensitive to fault f ∈ F if f ≠ 0 implies r ≠ 0.
Note that in practice, residuals typically deviate from zero even in the case when
all faults are zero due to for example unknown initial conditions, changes in operating
conditions, and uncertainties such as modeling errors and noise. Therefore, residuals are
often thresholded as a part of the residual evaluation mentioned in Section 1, where the
aim is to detect changes in the residual behavior caused by faults.
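As a minimal illustration of such thresholding, the sketch below raises an alarm when any residual sample deviates from zero by more than a fixed threshold. The function name, the sample values, and the threshold J are hypothetical and chosen only for illustration; practical residual evaluation is typically more elaborate (filtering, adaptive thresholds, statistical tests).

```python
def exceeds_threshold(residual_samples, threshold):
    """Return True if any sample deviates from zero by more than the threshold."""
    return any(abs(r) > threshold for r in residual_samples)

# Hypothetical residual samples: small deviations caused by noise only ...
fault_free = [0.02, -0.01, 0.03, -0.02]
# ... and a clear offset appearing after a fault has occurred.
faulty = [0.02, 0.41, 0.39, 0.43]

J = 0.1  # hypothetical threshold, chosen so that noise alone raises no alarm

print(exceeds_threshold(fault_free, J))
print(exceeds_threshold(faulty, J))
```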
The notions of residual generator and fault sensitivity can be made more
precise and formal, see for example Blanke et al. (2006); Patton et al. (2000); Chen and
Patton (1999), and references therein. This is however not necessary in the context of
this work, for which the above definitions are sufficient.
3.1 Realizability
The method used for design of residual generators plays a central role in this work. A
residual generation method is formally defined as follows.
Definition 4 (Realizability with method M). Let S be an equation set and M a residual
generation method. Then S is realizable with M if M (S) ≠ ∅.
For an example, consider a model containing the following set of differential and
algebraic equations
\[
\begin{aligned}
e_1 &: \dot{x}_1 = -x_1 + u + f_1 \\
e_2 &: y_1 = x_1 + f_2 \\
e_3 &: y_2 = x_1 + f_3,
\end{aligned}
\tag{1}
\]
Consider again the model in (1) and the linear, static, residual generation method
$\mathcal{M}'$ with which the equation set $\{e_2, e_3\}$ is realizable. Due to this fact and since
$e_{f_2} = e_2 \in \{e_2, e_3\}$, $e_{f_3} = e_3 \in \{e_2, e_3\}$, and $e_{f_1} = e_1 \notin \{e_2, e_3\}$, it can be deduced from
Proposition 2 that faults $f_2$ and $f_3$ are both isolable from fault $f_1$ with the residual
generator $R' = \mathcal{M}'(\{e_2, e_3\})$.
Note that even though additive faults were considered in the example above, the
framework in this paper is general and independent of the fault model, i.e., also
multiplicative faults are allowed.
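The example can be checked numerically. The sketch below simulates model (1) with a forward Euler scheme and evaluates the static residual r = y1 - y2, which is one residual that can be formed from {e2, e3}. The concrete form of the residual, the input u, the step size, and the fault magnitudes are assumptions made for illustration and are not taken from the paper.

```python
def simulate(f1=0.0, f2=0.0, f3=0.0, u=1.0, dt=0.01, steps=500):
    """Forward Euler simulation of model (1); returns the sensor signals y1, y2."""
    x1 = 0.0
    y1, y2 = [], []
    for _ in range(steps):
        x1 += dt * (-x1 + u + f1)  # e1: state equation, affected by f1
        y1.append(x1 + f2)         # e2: first sensor, affected by f2
        y2.append(x1 + f3)         # e3: second sensor, affected by f3
    return y1, y2


def residual(y1, y2):
    """Assumed static residual from {e2, e3}: r = y1 - y2 (equals f2 - f3)."""
    return [a - b for a, b in zip(y1, y2)]


def peak(r):
    """Largest absolute residual value over the simulation."""
    return max(abs(v) for v in r)


for name, faults in [("no fault", {}), ("f1", {"f1": 0.5}),
                     ("f2", {"f2": 0.5}), ("f3", {"f3": 0.5})]:
    print(name, peak(residual(*simulate(**faults))))
```

The residual reacts to f2 and to f3 but stays at zero for f1, in line with the conclusion that f2 and f3 are isolable from f1 with a residual generator based on {e2, e3}.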
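such that $S_{f_i f_j}$ is realizable with $\mathcal{M}$, and for which $e_{f_i} \in S_{f_i f_j}$ and $e_{f_j} \notin S_{f_i f_j}$. Given the
equation subsets $S_{f_i f_j}$, a set of residual generators fulfilling $\mathcal{F}$ can be constructed as
\[ \mathcal{R} = \{ \mathcal{M}(S_{f_i f_j}) : \forall (f_i, f_j) \in \mathcal{F} \}. \tag{2} \]
\[ \mathcal{I}_{f_i f_j} = \{ S \in \mathcal{S}_{\mathcal{M}} : e_{f_i} \in S \wedge e_{f_j} \notin S \}. \tag{3} \]
\[ \mathcal{I} = \{ \mathcal{I}_{f_i f_j} : \forall (f_i, f_j) \in \mathcal{F} \}, \tag{4} \]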
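\[ \mathcal{R} = \{ \mathcal{M}(S) : \forall S \in \mathcal{S} \}, \tag{5} \]
if and only if
\[ \forall I \in \mathcal{I}, \quad \mathcal{S} \cap I \neq \emptyset. \tag{6} \]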
Proof. Assume first that $\mathcal{F}$ is fulfilled with $\mathcal{R}$ defined according to (5). First note that
this implies that for each $(f_i, f_j) \in \mathcal{F}$ there exists a residual generator $R \in \mathcal{R}$ such that $f_i$
is isolable from $f_j$ with $R$. This, Proposition 2, and (5) imply that for each $(f_i, f_j) \in \mathcal{F}$
there exists an $S \in \mathcal{S}$ such that $R = \mathcal{M}(S) \in \mathcal{R}$, $e_{f_i} \in S$, and $e_{f_j} \notin S$. This implies, since
$S \in \mathcal{S}$ and $\mathcal{S} \subseteq \mathcal{S}_{\mathcal{M}}$, that $\mathcal{S} \cap \mathcal{I}_{f_i f_j} \neq \emptyset$ where $\mathcal{I}_{f_i f_j}$ is defined according to (3). Hence,
for each $(f_i, f_j) \in \mathcal{F}$ there exists $S \in \mathcal{S}$ such that $\mathcal{S} \cap \mathcal{I}_{f_i f_j} \neq \emptyset$. Since (4) holds, this
implies that (6) is satisfied and the first part of the proof is complete. For the converse,
assume that (6) is satisfied. This, (3), and (4) imply that for each $(f_i, f_j) \in \mathcal{F}$ there exists
$S \in \mathcal{S}$ such that $e_{f_i} \in S$ and $e_{f_j} \notin S$. This and the fact that all $S \in \mathcal{S}$ are realizable with
$\mathcal{M}$ imply via Proposition 2 that for each $(f_i, f_j) \in \mathcal{F}$ there exists $S \in \mathcal{S}$ such that $f_i$
is isolable from $f_j$ with $R = \mathcal{M}(S)$. Thus, if $\mathcal{R} = \{ \mathcal{M}(S) : \forall S \in \mathcal{S} \}$ there exists $R \in \mathcal{R}$
such that $f_i$ is isolable from $f_j$ with $R$ for each $(f_i, f_j) \in \mathcal{F}$ and the proof is complete.
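The isolability classes in (3) and (4) and the condition (6) can be sketched in a few lines. In the sketch below, equation sets are represented as frozensets of equation names, and the fault-to-equation map is the one of model (1). The three candidate equation sets are hypothetical and assumed to be realizable; the function names are chosen for illustration only.

```python
from itertools import permutations


def isolability_classes(candidates, fault_eq, requirement):
    """Classes per (3) and (4): for each pair (fi, fj), the candidate equation
    sets that contain the equation of fi but not the equation of fj."""
    return {
        (fi, fj): frozenset(S for S in candidates
                            if fault_eq[fi] in S and fault_eq[fj] not in S)
        for fi, fj in requirement
    }


def fulfills(selection, classes):
    """Condition (6): the selection intersects every isolability class."""
    return all(selection & I for I in classes.values())


# Fault-to-equation map of model (1); candidate sets are hypothetical.
fault_eq = {"f1": "e1", "f2": "e2", "f3": "e3"}
candidates = {frozenset({"e1", "e2"}), frozenset({"e1", "e3"}),
              frozenset({"e2", "e3"})}
requirement = list(permutations(fault_eq, 2))  # isolate every fault from every other

classes = isolability_classes(candidates, fault_eq, requirement)
print(fulfills(candidates, classes))
print(fulfills({frozenset({"e1", "e2"}), frozenset({"e2", "e3"})}, classes))
```

With these candidates, each isolability class contains a single equation set, so all three candidates are needed and dropping any one of them violates (6).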
For the set of residual generators R to fulfill also the stated minimal cardinality
requirement, the cardinality of the set S in Lemma 1 should be minimized. Thus, the
residual generator selection problem can be stated as the problem of finding the smallest
set within SM which satisfies (6). To conclude, the selection problem is stated as the
minimization problem
also fulfill the minimal cardinality requirement (7a), $\mathcal{S}$ should be a hitting set for $\mathcal{I}$
of minimal cardinality, i.e., a so-called minimal cardinality hitting set. By necessity, a
minimal cardinality hitting set is a minimal hitting set, i.e., a hitting set of which no
proper subset is a hitting set.
This fact suggests the following naive, but nevertheless simple, approach for solving
the selection problem (7). First find the collection of all minimal hitting sets for $\mathcal{I}$,
denoted $\mathcal{H}$, and then find the smallest set $H \in \mathcal{H}$ in which all candidate equation sets $S \in H$
are realizable.
• findCES finds all candidate equation sets for the method M given a model M
and a necessary realizability criterion for M.
• isolClasses returns the set of all isolability classes of a set of candidate equation
sets SM for the isolability requirement F according to (3) and (4).
• findMHS finds all minimal hitting sets for the collection of sets I given as input.
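The overall idea of the minimal hitting set based approach can be sketched as follows. Note that this is not Algorithm 1 itself: instead of first computing all minimal hitting sets with findMHS, the sketch enumerates subsets of candidate equation sets by increasing cardinality and returns the first hitting set in which all members are realizable, which yields a set of the same minimal cardinality. The predicate `realizable` is a hypothetical stand-in for the test M(S) ≠ ∅, and the candidate sets and classes are made up for illustration.

```python
from itertools import combinations


def mhs_selection(candidates, classes, realizable):
    """Smallest set of realizable candidates hitting every isolability class.

    Subsets are enumerated by increasing cardinality, so the first hit is of
    minimal cardinality. Exponential in len(candidates): small problems only.
    """
    pool = sorted(candidates, key=sorted)  # deterministic enumeration order
    for size in range(1, len(pool) + 1):
        for subset in combinations(pool, size):
            hits_all = all(any(S in I for S in subset) for I in classes)
            if hits_all and all(realizable(S) for S in subset):
                return set(subset)
    return set()  # the requirement cannot be fulfilled


# Hypothetical candidate equation sets and isolability classes.
A = frozenset({"e1", "e2"})
B = frozenset({"e1", "e3"})
C = frozenset({"e2", "e3"})
classes = [frozenset({A, B}), frozenset({B, C}), frozenset({A, C})]

# Assume, for illustration, that B turns out not to be realizable.
print(mhs_selection({A, B, C}, classes, lambda S: S != B))
```

Any two of the three candidates form a minimal hitting set here, but only the pair without B passes the realizability test.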
Proof. Consider first the claim concerning the isolability requirement $\mathcal{F}$ and assume
that $\mathcal{R} \neq \emptyset$. Due to rows 10-17 in Algorithm 1, and the fact that $\mathcal{R} \neq \emptyset$, it holds that
$\mathcal{R}$ equals (5) and consequently there is an $\mathcal{S} \in \mathcal{H}$ where all $S \in \mathcal{S}$ are realizable with $\mathcal{M}$.
From rows 4-6 and 7 and the definition of $\mathcal{I}$, see (3) and (4), it can also be deduced that
$\mathcal{S} \subseteq \mathcal{S}_{\mathcal{M}}$. Hence, $\mathcal{S}$ fulfills the prerequisites of Lemma 1. Further, due to rows 4-6, it
can be concluded that $\mathcal{S}$ is a (minimal) hitting set for $\mathcal{I}$ and thus $\mathcal{S}$ fulfills (6). From
Lemma 1 it then follows that this property of $\mathcal{S}$ is equivalent to $\mathcal{F}$ being fulfilled with $\mathcal{R}$
which, according to Proposition 1, is equivalent to $\mathcal{F}$ being fulfilled in $M$ with $\mathcal{M}$.
If instead $\mathcal{R} = \emptyset$, rows 4-7 and 10-17 imply that there is no minimal hitting set in
$\mathcal{H}$ where all candidate equation sets are realizable with $\mathcal{M}$. Hence, there is no $\mathcal{S} \subseteq \mathcal{S}_{\mathcal{M}}$,
where all $S \in \mathcal{S}$ are realizable with $\mathcal{M}$, that fulfills (6). This is, due to Lemma 1, equivalent
to $\mathcal{F}$ not being fulfilled with $\mathcal{R}$, which is equivalent to $\mathcal{F}$ not being fulfilled in $M$ with
$\mathcal{M}$, due to Proposition 1. This completes the part of the proof considering the isolability
requirement.
Regarding the cardinality of R, or equivalently S, it is first noted that a minimal
cardinality hitting set also is a minimal hitting set, that is, a hitting set of which no
proper subset is a hitting set. Thus, a minimal cardinality hitting set is by necessity
found within the collection H of all minimal hitting sets computed in row 6. Since the
search for a realizable minimal hitting set in H, rows 7-22, is exhaustive and performed
by considering the sets in H in increasing order with respect to cardinality, row 8, it is
guaranteed that the first found, and then returned, realizable minimal hitting set is of
minimal cardinality.
The minimal hitting set problem, or the equivalent minimal set covering prob-
lem (Ausiello et al., 1980), is unfortunately known to be NP-complete, see, e.g., Karp
(1972); Aho et al. (1974); Garey and Johnson (1979). Thus, for large problems, that is, cases
when the number of candidate equation sets ∣SM ∣, as well as the number of isolability
classes ∣I∣, is large, it may be impossible, or at least intractable, to obtain the collection
of all minimal hitting sets for I. Two possible improvements of Algorithm 1, which may
overcome this complexity issue, are discussed below.
There are several algorithms that give approximate solutions, typically in the form of a
subset of all minimal hitting sets, to the NP-complete minimal hitting set problem, see
for example Abreu and van Gemund (2009) and references therein. A complicating issue
is however that for large and complex models, typically, only a fraction of the candidate
equation sets are realizable. Indeed, this situation applies to the automotive engine system
considered in Section 8. Typical causes of non-realizability are non-invertible functions
in the model, see for example Svärd and Nyberg (2010), but also numerical issues or
instability. For Algorithm 1, this implies that a vast amount of the found minimal hitting
sets, possibly all, would be discarded since only a fraction of the found minimal hitting
sets contain realizable candidate equation sets. To maximize the possibilities of finding a
minimal hitting set in which all candidate equation sets are realizable, it is important
to start with as many minimal hitting sets as possible. The reduced number of minimal
hitting sets found by an approximate algorithm may therefore not be large enough.
An alternative approach is to find the realizable subset of all candidate equation sets,
$\mathcal{S}'_{\mathcal{M}} = \{ S \in \mathcal{S}_{\mathcal{M}} : \mathcal{M}(S) \neq \emptyset \}$, calculate $\mathcal{I}'$ according to (3) and (4) using $\mathcal{S}'_{\mathcal{M}}$ instead
of $\mathcal{S}_{\mathcal{M}}$, and then apply a minimal hitting set algorithm to $\mathcal{I}'$ to obtain $\mathcal{S}$. In general, it
holds that $|\mathcal{S}'_{\mathcal{M}}| < |\mathcal{S}_{\mathcal{M}}|$ and $|\mathcal{I}'| < |\mathcal{I}|$, and therefore it is more likely that the set of all
minimal hitting sets can be computed for $\mathcal{I}'$ than for $\mathcal{I}$. The set $\mathcal{S}'_{\mathcal{M}}$ can be computed
by applying $\mathcal{M}(\cdot)$ to each $S \in \mathcal{S}_{\mathcal{M}}$. However, realization of an equation set may be a
computationally demanding task, see Section 8.2 for an example. It is therefore desirable
to keep the number of realizations, or realization attempts, at a minimum. Consequently,
this approach may not be preferable if $\mathcal{S}_{\mathcal{M}}$ is a large set.
It should however be noted that for small problems, where all minimal hitting sets
can be found, Algorithm 1 works satisfactorily and in those cases it provides an exact,
yet straightforward and simple, solution to the selection problem.
6 Greedy Selection
Taking into account the complexity issues associated with finding all minimal hitting
sets, and the desire to keep the number of realizations at a minimum, a more appealing
approach is instead to build the set of candidate equation sets S iteratively, and only
realize those candidate equation sets that are likely to be part of S. To employ this iterative
approach, a heuristic is needed for identifying and selecting a candidate equation set in
each iteration.
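\[ \sigma_{\mathcal{I}}(\mathcal{S}) = \{ I \in \mathcal{I} : \exists S \in \mathcal{S},\ S \in I \}. \tag{8} \]
Basically, $\sigma_{\mathcal{I}}(\mathcal{S})$ states which of the isolability classes in $\mathcal{I}$ are covered by the
candidate equation sets in $\mathcal{S}$.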
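The coverage notion in (8) translates directly into a set comprehension. In the sketch below, isolability classes are frozensets of candidate equation sets; the sets A, B, C and the three classes are hypothetical and serve only to illustrate the definition.

```python
def coverage(selection, classes):
    """Isolability class coverage (8): the classes that contain at least one
    candidate equation set from the selection."""
    return {I for I in classes if any(S in I for S in selection)}


# Hypothetical candidate equation sets and isolability classes.
A = frozenset({"e1", "e2"})
B = frozenset({"e1", "e3"})
C = frozenset({"e2", "e3"})
classes = {frozenset({A, B}), frozenset({B, C}), frozenset({A, C})}

print(len(coverage({A}, classes)))          # classes covered by A alone
print(coverage({A, C}, classes) == classes)  # True when every class is covered
```

A selection whose coverage equals the full set of classes hits every isolability class.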
Complete Solution
A complete solution to the selection problem is characterized as a set of candidate
equation sets S that fulfills (7b) and (7c). The hitting set requirement (7c) can with the
isolability class coverage notion be formulated as σI (S) = I.
Utility Function
The aim is to fulfill the isolability requirement, formalized by (7b) and (7c), with as few
candidate equation sets as possible (7a). In line with this, the following utility function
will be used to evaluate a specific candidate equation set,
reflecting how many of the isolability classes in I are covered by the candidate
equation set S ∈ SM . According to the greedy approach the candidate equation set
that maximizes µI (S), i.e., covers the most isolability classes, should be selected in each
iteration.
The procedures findCES and isolClasses are the same as in Algorithm 1 and
described in Section 5.2. The procedure pickCES, taking a set H containing candidate
equation sets as input, returns one of the equation sets in H. This function enables usage
of an additional, user-provided, heuristic for selecting one single candidate equation set
among candidate equation sets of equal utility by analyzing both structural and analytical
properties of equation sets. For instance, pickCES can be used to pick the candidate
equation set of lowest cardinality, i.e., containing fewest equations or to pick a candidate
equation set not containing a troublesome non-linearity.
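The greedy idea described in this section can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not a transcription of Algorithm 2: the utility of a candidate is taken to be the number of still uncovered isolability classes it belongs to, `realize` is a hypothetical stand-in for the residual generation method (returning None when a set is not realizable), and the default tie-break in `pick` mimics a pickCES that prefers the set with fewest equations.

```python
def greedy_selection(candidates, classes, realize, pick=None):
    """Greedy sketch: repeatedly try the candidate covering most uncovered classes."""
    # Default tie-break: fewest equations first, then lexicographic order.
    pick = pick or (lambda H: min(H, key=lambda S: (len(S), sorted(S))))
    remaining = set(candidates)
    uncovered = set(classes)
    selected, generators = [], []
    while uncovered and remaining:
        # Utility: number of uncovered classes each remaining candidate covers.
        utility = {S: sum(S in I for I in uncovered) for S in remaining}
        best = max(utility.values())
        if best == 0:
            break  # no remaining candidate contributes anything
        S = pick({S for S, u in utility.items() if u == best})
        remaining.discard(S)
        R = realize(S)  # realization is attempted only for the chosen candidate
        if R is not None:
            selected.append(S)
            generators.append(R)
            uncovered = {I for I in uncovered if S not in I}
    return selected, generators, uncovered


# Hypothetical data: three candidates, of which B is assumed non-realizable.
A = frozenset({"e1", "e2"})
B = frozenset({"e1", "e3"})
C = frozenset({"e2", "e3"})
classes = {frozenset({A, B}), frozenset({B, C}), frozenset({A, C})}
attempts = []


def realize(S):
    attempts.append(S)
    return None if S == B else "r(" + ",".join(sorted(S)) + ")"


selected, generators, uncovered = greedy_selection({A, B, C}, classes, realize)
print(selected, generators, uncovered, len(attempts))
```

In this sketch a non-empty third return value lists the isolability classes that could not be covered with realizable candidates.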
Note that the complexity of Algorithm 2 is linear in the number of elements of
SM, in contrast to Algorithm 1, whose search for all minimal hitting sets is an
NP-complete problem. For a further complexity analysis of Algorithm 2, the
complexity of the procedure findCES is of most interest. The complexity of findCES is
however dependent on the actual method used for residual generation. For the method
employed in Section 8, the procedure corresponding to findCES has favorable complexity
properties (Krysander et al., 2008).
Proof. According to rows 5, 6, 14, and 21, and rows 4, 7, 16, and 18, there are two different
termination conditions in Algorithm 2; either $\mathcal{I} = \emptyset$ or $\mathcal{S}_{\mathcal{M}} = \emptyset$.
Consider first the case when Algorithm 2 terminates because of the condition on
row 6, i.e., $\mathcal{I} = \emptyset$, and let $n$ denote the total number of iterations performed by
Algorithm 2 in which the condition on row 11 is met. Further let $\mathcal{S}_i$, $\mathcal{R}_i$, $\mathcal{I}_i$, $S_i^*$, and $R_i$ denote
the values of the variables $\mathcal{S}$, $\mathcal{R}$, $\mathcal{I}$, $S^*$, and $R$, respectively, after iteration $i$. By assumption,
and due to row 6, it holds that $\mathcal{I}_n = \emptyset$. Further, it holds that $\mathcal{S}_0 = \mathcal{R}_0 = \emptyset$, and $\mathcal{I}_0 = \mathcal{I}$. By
assumption also $\mathcal{R} \neq \emptyset$ and therefore $\mathcal{R}_n \neq \emptyset$ and $\mathcal{S}_n \neq \emptyset$, due to rows 12 and 13. In fact,
due to rows 10-12, it can be concluded that $\mathcal{R}_n = \bigcup_{i=1}^{n-1} \{R_i\}$ and $\mathcal{S}_n = \bigcup_{i=1}^{n-1} \{S_i^*\}$, where
$R_i = \mathcal{M}(S_i^*)$, and thus each $S_i^* \in \mathcal{S}_n$ is realizable with $\mathcal{M}$ and the relation between $\mathcal{R}_n$
and $\mathcal{S}_n$ is the same as between $\mathcal{R}$ and $\mathcal{S}$ in (5). Moreover, due to rows 7-9, it holds that
each $S_i^* \in \mathcal{S}_n$ is contained in $\mathcal{S}_{\mathcal{M}}$ and therefore $\mathcal{S}_n$ fulfills the prerequisites of Lemma 1.
From row 14 it can be deduced that $\mathcal{I}_0 = \bigcup_{i=1}^{n-1} \sigma_{\mathcal{I}}(\{S_i^*\})$. From (8), it follows that for
$i = 1, 2, \ldots, n-1$ and for all $I \in \sigma_{\mathcal{I}}(\{S_i^*\})$ it holds by definition that $S_i^* \in I$. Therefore,
since $\mathcal{S}_n = \bigcup_{i=1}^{n-1} \{S_i^*\}$, it holds that $\mathcal{S}_n \cap I \neq \emptyset$ for all $I \in \mathcal{I}_0 = \bigcup_{i=1}^{n-1} \sigma_{\mathcal{I}}(\{S_i^*\})$. According
to Lemma 1, this property of $\mathcal{S} = \mathcal{S}_n$ is equivalent to $\mathcal{F}$ being fulfilled with $\mathcal{R} = \mathcal{R}_n$
which, due to Proposition 1, is equivalent to $\mathcal{F}$ being fulfilled in $M$ with $\mathcal{M}$.
Consider now instead the case when Algorithm 2 terminates because of the condition
on row 7 and let $n$ denote the total number of iterations in which the condition on row 11
is met. With similar arguments and notations as above, it holds that $\mathcal{R}_n = \bigcup_{i=1}^{n-1} \{R_i\}$ and
$\mathcal{S}_n = \bigcup_{i=1}^{n-1} \{S_i^*\}$, where $R_i = \mathcal{M}(S_i^*)$. Since termination of Algorithm 2 by assumption
was due to the condition on row 7, it holds that
$\mathcal{I}_n = \mathcal{I}_0 \setminus \left\{ \bigcup_{i=1}^{n-1} \sigma_{\mathcal{I}}(\{S_i^*\}) \right\} \neq \emptyset$.
Thus, there exists $I \in \mathcal{I}_0$ such that $\mathcal{S}_n \cap I = \emptyset$ and consequently, by Lemma 1, it can
be deduced that $\mathcal{F}$ is not fulfilled with $\mathcal{R} = \mathcal{R}_n$. However, if $\mathcal{I}' = \bigcup_{i=1}^{n-1} \sigma_{\mathcal{I}}(\{S_i^*\})$
and $\mathcal{F}' = \{ (f_i, f_j) \in \mathcal{F} : \mathcal{I}_{f_i f_j} \in \mathcal{I}' \}$, Lemma 1 implies that $\mathcal{F}'$ is fulfilled with $\mathcal{R}$. By
assumption and row 7, it holds that $\mathcal{S}_{\mathcal{M}}^n = \emptyset$. Therefore, there are no $S \in \mathcal{S}_{\mathcal{M}}^n$ that can be
used to isolate the fault pairs in $\mathcal{F} \setminus \mathcal{F}'$ and thus $\mathcal{F}'$ is the maximum attainable isolability
for $M$ with $\mathcal{M}$.
Note that if the isolability requirement cannot be fulfilled, the MHS-based
Algorithm 1 will return an empty set due to the non-existence of minimal hitting sets.
Algorithm 2 will instead provide the best possible solution, in terms of fault isolability,
with regard to the given method. However, if the output from Algorithm 2 is an empty
set, there are no realizable candidate equation sets that contribute to fulfilling the stated
isolability requirement.
The problem (11) is referred to as a set covering problem, and can be shown to be equivalent
to the previously considered minimal hitting set problem
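$$\min_{\mathcal{S} \subseteq \mathcal{S}_M} |\mathcal{S}|, \quad \text{s.t. } \forall I \in \mathcal{I},\ \mathcal{S} \cap I \neq \emptyset, \qquad (12)$$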
that is, the selection problem (7) with the realizability condition (7b) relaxed. In fact, if
$\mathcal{U}^*$ is a solution to the set covering problem (11), a solution $\mathcal{S}^*$ to the minimal hitting
set problem (12) can be constructed by finding, for each $U \in \mathcal{U}^*$, an $S \in \mathcal{S}_M$ such that
$\sigma_{\mathcal{I}}(\{S\}) = U$. The converse is given by (10) with $\mathcal{U}_M$ and $\mathcal{S}_M$ replaced by $\mathcal{U}^*$ and $\mathcal{S}^*$,
respectively.
Consider now solving (11) approximately with a greedy heuristic equivalent to the
one described in Section 6. Namely, in each iteration, until all isolability classes in
$\mathcal{U}_M$ are covered, select the one $U \in \mathcal{U}_M$ that covers most uncovered isolability classes,
i.e., the $U \in \mathcal{U}_M$ of highest cardinality. Denote the resulting solution $\mathcal{U}$. It can be
shown (Johnson, 1974; Lovász, 1975), that
$$\frac{|\mathcal{U}|}{|\mathcal{U}^*|} \le \sum_{j=1}^{k} \frac{1}{j} \le \ln k + 1, \qquad (13)$$
where $\mathcal{U}^*$ is an exact solution to (11) and $k$ is the cardinality of the largest set in $\mathcal{U}_M$.
As said, the greedy heuristic described above for solving problem (11) coincides with
the heuristic described in Section 6 for solving problem (12). Since the two problems are
equivalent, it can be concluded that the worst case bound (13) also holds for approximate
solutions to (12) obtained by use of the greedy heuristic described in Section 6. This
fact is summarized in the following result.
$$\frac{|\mathcal{R}|}{|\mathcal{R}^*|} \le \sum_{j=1}^{k} \frac{1}{j} \le \ln k + 1, \qquad (14)$$
where R∗ is the exact solution to the residual generator selection problem, and k is the
cardinality of the largest set in UM , defined according to (10).
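The greedy heuristic and the harmonic bound can be illustrated with a minimal Python sketch. The instance below (six isolability classes and four candidate sets named U1 to U4) is hypothetical and chosen only for illustration, and the function names are not taken from the thesis.

```python
from math import log


def greedy_set_cover(universe, subsets):
    """Pick, in each iteration, the subset that covers most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Subset with the largest number of still-uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            break  # the remaining elements cannot be covered at all
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen, uncovered


def harmonic_bound(k):
    """Return sum_{j=1}^{k} 1/j, the middle term of the bound (13)."""
    return sum(1.0 / j for j in range(1, k + 1))


# Hypothetical instance: six isolability classes and four candidate sets.
classes = {1, 2, 3, 4, 5, 6}
candidates = {
    "U1": {1, 2, 3, 4},
    "U2": {1, 2, 5},
    "U3": {3, 4, 6},
    "U4": {5, 6},
}

cover, missed = greedy_set_cover(classes, candidates)
k = max(len(s) for s in candidates.values())
print(cover, missed, harmonic_bound(k), log(k) + 1)
```

For this instance no single candidate covers all classes, so an exact solution has cardinality 2, and the greedy solution attains it. The bound only guarantees a ratio of at most $\sum_{j=1}^{4} 1/j \approx 2.08$.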
$$\dot{x}_1 = -x_1 + u \qquad (16a)$$
$$r = y_1 - x_1. \qquad (16b)$$
In fact, also $C_2 = ((\{x_1\}, \{e_2\}))$ and $C_3 = ((\{x_1\}, \{e_3\}))$ are computation sequences for
$x_1$. For instance, the sequential residual generator corresponding to $C_2$ and the residual
equation $e_3$ is
$$x_1 = y_1 \qquad (17a)$$
$$r = y_2 - x_1. \qquad (17b)$$
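The two residual generators can be sketched in Python as follows. The sketch assumes that model (1), which is not restated in this section, consists of $e_1$: $\dot{x}_1 = -x_1 + u$, $e_2$: $y_1 = x_1$, and $e_3$: $y_2 = x_1$, as suggested by (16) and (17). The forward Euler discretization, the additive sensor faults, and all function names are assumptions made for illustration.

```python
def simulate(u, dt, y1_fault=0.0, y2_fault=0.0, fault_start=None):
    """Simulate x1' = -x1 + u with forward Euler; return the sensor signals y1, y2."""
    x1 = 0.0
    y1, y2 = [], []
    for k, uk in enumerate(u):
        faulty = fault_start is not None and k >= fault_start
        # Assumed sensor equations y1 = x1 and y2 = x1, with additive faults.
        y1.append(x1 + (y1_fault if faulty else 0.0))
        y2.append(x1 + (y2_fault if faulty else 0.0))
        x1 += dt * (-x1 + uk)
    return y1, y2


def residual_16(u, y1, dt):
    """Residual generator (16): integrate (16a) from u, then r = y1 - x1."""
    x1_hat = 0.0
    r = []
    for uk, y1k in zip(u, y1):
        r.append(y1k - x1_hat)
        x1_hat += dt * (-x1_hat + uk)
    return r


def residual_17(y1, y2):
    """Residual generator (17): x1 = y1, then r = y2 - x1."""
    return [y2k - y1k for y1k, y2k in zip(y1, y2)]


# Hypothetical experiment: constant input, bias fault of 0.5 in sensor y2.
dt = 0.01
u = [1.0] * 100
y1, y2 = simulate(u, dt, y2_fault=0.5, fault_start=50)
r16 = residual_16(u, y1, dt)
r17 = residual_17(y1, y2)
```

In this experiment the residual from (16) stays at zero, since it does not use $y_2$, whereas the residual from (17) jumps to the fault size when the fault occurs.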
In Svärd and Nyberg (2010, Theorem 2), it is shown that the equations in a minimal and
irreducible computation sequence together with a redundant residual equation, in fact
correspond to a Minimal Structurally Overdetermined (MSO) set, see Krysander et al.
(2008). As said above, a non-empty computation sequence returned by
findComputationSequence in Algorithm 3 is indeed minimal and irreducible. Thus, if an equation
set S is realizable with the sequential residual generation method then S is an MSO set.
Consequently, a necessary realizability criterion for the method is that the equation set
used as input is an MSO set and hence an MSO set is a candidate equation set for the
method. There are efficient algorithms for finding all MSO sets in a large set of equations,
see, e.g., Krysander et al. (2008).
For the model (1), it is possible to find in total three MSO sets. These are given by S 1 =
{e 1 , e 2 }, S 2 = {e 1 , e 3 }, and S 3 = {e 2 , e 3 }. In fact, the sequential residual generators (16)
and (17) are created from the MSO sets S 1 and S 3 , respectively.
As a side remark, note that the maximum number of sequential residual generators
that can be constructed from an MSO set equals the number of equations in the set. All
residual generators created from the same MSO set however have equal fault sensitivity
properties according to Assumption 2. Nevertheless, their actual fault sensitivity may differ
due to, for example, different sensitivity to noise. To make the final selection of which
of the residual generators created from an MSO set should be included in the final
diagnosis system, evaluation by means of execution using real measurements from
different fault cases might be needed. For this purpose, Algorithm 3 can be trivially
modified to return all residual generators that can be created from the MSO set used
as input, and not only one.
Fault              Description
$f_{W_{ic}}$       Leakage, intercooler
$f_{W_{im}}$       Leakage, intake manifold
$f_{W_{em}}$       Leakage, exhaust manifold
$f_{u_{xth}}$      Fault, throttle position actuator
$f_{u_{xegr}}$     Fault, EGR-valve position actuator
$f_{u_{xvgt}}$     Fault, VGT-valve position actuator
$f_{y_{pamb}}$     Fault, ambient pressure sensor
$f_{y_{Tamb}}$     Fault, ambient temperature sensor
$f_{y_{pic}}$      Fault, intercooler pressure sensor
$f_{y_{pim}}$      Fault, intake manifold pressure sensor
$f_{y_{Tim}}$      Fault, intake manifold temperature sensor
$f_{y_{pem}}$      Fault, exhaust manifold pressure sensor
8 Application Example
In this section, the selection algorithms presented in Sections 5 and 6 are applied to the
automotive engine system introduced in Section 2. The residual generation method
considered in this study is briefly outlined in Section 7.
Consider again the Scania truck diesel engine system introduced in Section 2, which
is shown in Figure 1. The main incentive for diagnosis of this system is the stricter
emission legislation requirements for heavy-duty trucks, which in turn implies stricter
on-board diagnosis (OBD) legislation requirements. The OBD-legislation states that all
manufactured vehicles must be equipped with a diagnosis system capable of detecting and
isolating faults in all components that, if broken, result in emissions above pre-defined
OBD-thresholds during a specified test cycle.
For the considered system, emission critical components include all actuators and
sensors, and to meet the OBD-requirements it is desirable that, at least, single faults in
these can be detected and isolated. Other emission critical components are pipes and
hoses. In particular, a broken pipe or hose may lead to gas-leakage which may increase
emissions. Leakages in or near the intercooler, intake manifold, and exhaust manifold are
particularly critical. It is desirable that these leakages can be detected and isolated, from
each other, but also from all sensor and actuator faults. In total, there are 12 emission
critical components and consequently 12 faults that should be isolated from each other
in the system. All 12 considered faults, along with their descriptions,
can be found in Table 1.
The Model
The model of the system used in this work is described in Wahlström and Eriksson
(2011) and relies on both fundamental first principle physics and gray-box modeling. The
model describes the behavior of the system in the no-fault case, i.e., it is a nominal model.
To incorporate fault information in the nominal model, faults are modeled as additive
signals in corresponding equations. For example, fault f y pim , representing a fault in the
intake manifold pressure sensor y pim , is modeled by simply adding f y pim to the equation
describing the relation between the sensor value y pim and the actual intake manifold
pressure pim according to y pim = pim + f y pim .
The model contains in total 46 equations, 43 unknown variables, 11 known variables,
and the 12 faults in Table 1. Of the 11 known variables, 3 are actuators, 6 are sensors, and
2 are control inputs. Of the 46 equations, 5 are differential equations and the rest are
algebraic equations. The model contains several non-linear functions.
[Plot: $|H|$ on a logarithmic axis ($10^0$ to $10^5$) versus $|F|$ (2 to 7).]
Figure 3: The total number of minimal hitting sets, $|H|$, as function of the cardinality of
the set of considered faults, $|F|$. The number of minimal hitting sets grows rapidly with
the number of faults.
One simple way to reduce the size of the selection problem is to consider only a subset
of the faults in Table 1, and then calculate F and I for this smaller set of faults. For
each cardinality number, several randomized subsets of faults were chosen from the
set of 12 faults. Figure 3 presents, in logarithmic scale, the mean cardinality of the set
of all minimal hitting sets, ∣H∣, as a function of the cardinality of the set of considered
faults, ∣F∣. The minimal hitting sets were computed using a C++ implementation of the
algorithm presented in de Kleer and Williams (1987). From Figure 3 it can be seen that
the number of minimal hitting sets grows rapidly with the number of faults, and that the
total number of minimal hitting sets is over 30,000 already for 7 faults. Given this, it is
not that surprising that the problem with 12 faults was not possible to solve.
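To make the notion of "all minimal hitting sets" concrete, the following Python sketch enumerates them by brute force for a small, hypothetical collection of sets. It is not the algorithm of de Kleer and Williams (1987) used in the study. It only illustrates what is being counted, and its exhaustive search over subsets also hints at why the computation becomes infeasible for larger problems.

```python
from itertools import combinations


def minimal_hitting_sets(collection):
    """Enumerate all minimal hitting sets by increasing size (brute force)."""
    elements = sorted(set().union(*collection))
    found = []
    for size in range(1, len(elements) + 1):
        for cand in combinations(elements, size):
            cand = frozenset(cand)
            if any(h <= cand for h in found):
                continue  # contains a smaller hitting set, hence not minimal
            if all(cand & s for s in collection):
                found.append(cand)
    return found


# Hypothetical collection: each set lists the candidates that hit one class.
collection = [{"S1", "S2"}, {"S2", "S3"}, {"S1", "S3"}]
hitting_sets = minimal_hitting_sets(collection)
print(len(hitting_sets), [sorted(h) for h in hitting_sets])
```

Here no single element hits all three sets, while every pair does, so the minimal hitting sets are exactly the three pairs.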
is non-invertible non-linear functions in the automotive engine model, see Svärd et al.
(2011) for a discussion of a similar result regarding a similar model.
By using the set of 59 realizable candidate equation sets, the size of the selection
problem is substantially reduced. Even for this smaller problem, it was unfortunately
not possible to compute the set of all minimal hitting sets within feasible time (no
termination after 24 h), using the same C++ implementation as above of the minimal
hitting set algorithm (de Kleer and Williams, 1987).
The other improvement of Algorithm 1 suggested in Section 5.2 is to use an approximate
MHS-algorithm to compute a subset of all minimal hitting sets. This
approach did not succeed either, since it was impossible to find a realizable minimal hitting set
within feasible time due to the large number of non-realizable candidate equation sets.
[Table: residual generators $R_1$ to $R_{11}$ (rows) versus the faults $f_{y_{pamb}}$, $f_{y_{Tamb}}$, $f_{u_{xegr}}$, $f_{y_{pem}}$, $f_{u_{xvgt}}$, $f_{W_{em}}$, $f_{y_{pim}}$, $f_{y_{Tim}}$, $f_{W_{im}}$, $f_{u_{xth}}$, $f_{y_{pic}}$, and $f_{W_{ic}}$ (columns). $R_1$ is marked (x) for 9 of the faults and each of $R_2$ to $R_{11}$ for 10 of the faults.]
[Plot: $|R|$ versus $|F|$ (2 to 12) for the Exact Solution and the Greedy Solution.]
Figure 4: Median cardinalities of exact and greedy solutions, as functions of the cardinality
of the set of considered faults, to the automotive engine selection problem.
[Plot: Time [s] on a logarithmic axis ($10^{-2}$ to $10^{6}$) versus $|F|$ (2 to 12) for the Exact Algorithm and the Greedy Algorithm.]
Figure 5: Mean execution times for the exact and greedy minimal cardinality hitting
sets algorithms, as functions of the cardinality of the set of considered faults, for the
automotive engine selection problem.
[Table: 12 x 12 matrix with the 12 considered faults as both rows and columns. The rows for $f_{y_{pamb}}$ and $f_{u_{xvgt}}$ contain 2 marks (x) each, and the rows for $f_{W_{ic}}$, $f_{W_{im}}$, $f_{W_{em}}$, $f_{y_{Tamb}}$, $f_{y_{pic}}$, $f_{y_{pim}}$, $f_{y_{Tim}}$, $f_{y_{pem}}$, $f_{u_{xth}}$, and $f_{u_{xegr}}$ contain 3 marks each.]
along with the median cardinalities of the greedy solutions are shown in Figure 6, for the
same instances of the selection problem used above. It can be seen that the cardinalities
of the greedy solution differ substantially from the worst-case bound. From this and the
fact that the cardinalities of the greedy solutions are more or less equal to the cardinalities
of the exact solutions, according to Figure 4, it can be concluded that for the automotive
engine selection problem, the bound (14) is very conservative.
[Plot: $|R|$ (10 to 45) versus $|F|$ (2 to 12) for the Greedy Solution and the Worst-Case Bound.]
Figure 6: The median cardinalities of the greedy solution to the truck diesel engine
selection problem compared with the worst-case bound provided in Theorem 3.
and run off-line. As input data, a set of measurements from an engine test bed during a
World Harmonized Test Cycle (WHTC) was used. In two separate runs, faults in the
intake manifold pressure sensor pim and intercooler pressure sensor pic were injected.
Both faults were in the form of a 20% positive gain of the corresponding pressure sensor
signal, i.e., y pim = 1.2 ⋅ pim and y pic = 1.2 ⋅ pic where pim and pic are the actual intake
manifold pressure and intercooler pressure signals, respectively.
The residuals obtained as output from the residual generators R 2 and R 4 , for each
of the faults f y pim and f y pic , are shown in Figure 7. From the figure it can be seen that
residual generator R 2 (top figure) responds to the fault f y pim but not to fault f y pic , and that
residual generator R 4 (bottom figure) responds to fault f y pic but not to fault f y pim . Clearly,
for these fault cases, R 2 is indeed sensitive to f y pim but not to f y pic , and R 4 sensitive to
f y pic but not to f y pim . Thus, fault f y pim is isolable from fault f y pic and vice versa, with the
residual generators R 2 and R 4 .
9 Conclusions
Two novel algorithms for solving the residual generator selection problem have been pro-
posed. The foundation for both algorithms was a formulation of the selection problem, in
the form of an optimization problem, where the isolability requirement was equivalently
stated in terms of properties of subsets of the model equations. The formulation enabled
an efficient reduction of the search-space by taking the realizability properties of equation
subsets, with respect to the considered residual generation method, into account. Both
algorithms are general in the sense that they are aimed at supporting any computerized
[Plots: residuals $R_2$ (top) and $R_4$ (bottom) versus Time [s] (1610 to 1690) for the fault cases $f_{y_{pim}}$ and $f_{y_{pic}}$.]
Figure 7: Residuals from residual generator $R_2$ (top figure) and residual generator $R_4$
(bottom figure) for the fault cases $f_{y_{pim}}$ (solid lines) and $f_{y_{pic}}$ (dashed lines). Both faults
are injected at t = 1630 s. The dash-dotted lines suggest how thresholds may be set in
order to detect the faults.
Acknowledgment
This work was sponsored by Scania and VINNOVA (Swedish Governmental Agency for
Innovation Systems).
References
R. Abreu and A. J. C. van Gemund. A low-cost approximate minimal hitting set algorithm
and its application to model-based diagnosis. In V. Bulitko and J. C. Beck, editors,
Proceedings of the Eighth Symposium on Abstraction, Reformulation, and Approximation,
pages 2–9, Lake Arrowhead, California, USA, September 2009.
A. V. Aho, J. E. Hopcroft, and J. D. Ullman. The Design and Analysis of Computer
Algorithms. Addison-Wesley, 1974.
G. Ausiello, A. D’Atri, and M. Protasi. Structure preserving reductions among convex
optimization problems. Journal of Computer and System Sciences, 21(1):136–153, 1980.
doi:10.1016/0022-0000(80)90046-X.
P. E. Black. Greedy algorithm. Dictionary of Algorithms and Data Struc-
tures (online), U.S. National Institute of Standards and Technology, February 2005.
http://tinyurl.com/3x5zzpp, Accessed: 2010-09-13.
M. Blanke, M. Kinnaert, J. Lunze, and M. Staroswiecki. Diagnosis and Fault-Tolerant
Control. Springer, second edition, 2006.
J. P. Cassar and M. Staroswiecki. A structural approach for the design of failure detection
and identification systems. In Proceedings of IFAC Control Ind. Syst., pages 841–846,
Belfort, France, 1997.
J. Chen and R. J. Patton. Robust Model-Based Fault Diagnosis for Dynamic Systems.
Kluwer Academic Publishers, 1999.
V. Chvatal. A greedy heuristic for the set-covering problem. Mathematics of Operations
Research, 4(3):233–235, 1979.
J. de Kleer. Hitting set algorithms for model-based diagnosis. In Proceedings of 22nd
International Workshop on Principles of Diagnosis (DX-11), Murnau, Germany, 2011.
J. de Kleer and B. C. Williams. Diagnosing multiple faults. Artificial Intelligence, 32(1):
97–130, 1987.
M. R. Garey and D. S. Johnson. Computers and Intractability – A Guide to the Theory of
NP-Completeness. W.H. Freeman and Company, 1979.
E. R. Gelso, S. M. Castillo, and J. Armengol. An algorithm based on structural analysis
for model-based fault diagnosis. Artificial Intelligence Research and Development, 184:
138–147, 2008.
D. S. Johnson. Approximation algorithms for combinatorial problems. Journal of
Computer and System Sciences, 9:256–278, 1974.
R. M. Karp. Reducibility among combinatorial problems. In R. E. Miller and J. W.
Thatcher, editors, Complexity of Computer Computation, pages 85–103, New York, 1972.
Plenum Press.
M. Krysander. Design and Analysis of Diagnosis Systems Using Structural Methods. PhD
thesis, Linköpings universitet, June 2006.
M. Krysander, J. Åslund, and M. Nyberg. An efficient algorithm for finding minimal
over-constrained sub-systems for model-based diagnosis. IEEE Trans. on Systems, Man,
and Cybernetics – Part A: Systems and Humans, 38(1):197–206, 2008.
L. Lovász. On the ratio of optimal integral and fractional covers. Discrete Math, 1975.
M. Nyberg. Automatic design of diagnosis systems with application to an automotive
engine. Control Engineering Practice, 7(8):993–1005, 1999.
M. Nyberg and M. Krysander. Statistical properties and design criterions for AI-based
fault isolation. In Proceedings of the 17th IFAC World Congress, pages 7356–7362, Seoul,
Korea, 2008.
R. J. Patton, P. M. Frank, and R. N. Clark, editors. Issues of Fault Diagnosis for Dynamic
Systems. Springer, 2000.
S. Ploix, M. Desinde, and S. Touaf. Automatic design of detection tests in complex
dynamic systems. In Proceedings of 16th IFAC World Congress, Prague, Czech Republic,
2005.
B. Pulido and C. Alonso-González. Possible conflicts: a compilation technique for
consistency-based diagnosis. IEEE Trans. on Systems, Man, and Cybernetics. Part B:
Cybernetics, Special Issue on Diagnosis of Complex Systems, 34(5):2192–2206, 2004.
M. Staroswiecki and P. Declerck. Analytical redundancy in non-linear interconnected
systems by means of structural analysis. In Proceedings of IFAC AIPAC’89, pages 51–55,
Nancy, France, 1989.
C. Svärd and M. Nyberg. Residual generators for fault diagnosis using computation
sequences with mixed causality applied to automotive systems. IEEE Transactions on
Systems, Man and Cybernetics, Part A: Systems and Humans, 40(6):1310–1328, 2010.
C. Svärd, M. Nyberg, E. Frisk, and M. Krysander. Residual evaluation for fault diagnosis
by data-driven analysis of non-stationary probability distributions. In Proceedings of
the 50th IEEE Conference on Decision and Control and European Control Conference
(CDC-ECC 2011), pages 95–102, 2011.
L. Travé-Massuyès, T. Escobet, and X. Olive. Diagnosability analysis based on
component-supported analytical redundancy. IEEE Trans. on Systems, Man, and Cyber-
netics – Part A: Systems and Humans, 36(6):1146–1160, 2006.
J. Wahlström and L. Eriksson. Modeling diesel engines with a variable-geometry
turbocharger and exhaust gas recirculation by optimization of model parameters for
capturing non-linear system dynamics. Proceedings of the Institution of Mechanical
Engineers, Part D: Journal of Automobile Engineering, 225(7), July 2011.
Paper C
☆ A revised version has been submitted to Mechanical Systems and Signal Processing,
2012.
Data-Driven and Adaptive Statistical Residual
Evaluation for Fault Detection with an
Automotive Application
Carl Svärd, Mattias Nyberg, Erik Frisk, and Mattias Krysander
Abstract
1 Introduction
Fault diagnosis is becoming more and more important with the increasing demand for
dependable technical systems, driven mostly by economic, environmental, and safety
incentives. One example is automotive systems, where good fault diagnosis is essential
in order to meet customer demands regarding up-time, efficient repair and maintenance,
and also to fulfill on-board diagnosis (OBD) legislative regulations.
Model-based fault diagnosis typically comprises fault detection and isolation (Blanke
et al., 2006), and the fault detection part contains the essential steps residual generation
and residual evaluation. In the first step, a model of the system is used together with
measurements to generate residuals. In the second step, the residuals are evaluated with
the aim to detect changes in the residual behavior caused by faults in the system. This
work concerns the second step, residual evaluation.
Ideally, residuals are signals that are zero when no faults are present in the system, and
non-zero otherwise. Due to the presence of uncertainties and disturbances, caused by
for instance modeling errors, measurement noise, and unmodeled phenomena, residuals
however typically deviate from zero even in the no-fault case. Moreover, due to changes in
the operating mode of the system, the magnitude of these uncertainties and disturbances
is time-varying, causing the behavior of residuals to be non-stationary. An illustration
is given by Figure 1, where a residual for fault detection in the gas-flow system of a
truck diesel engine is shown. Clearly, the residual is not zero in the no-fault case, and
it is obvious that the residual exhibits non-stationary features. It can also be noted
that the difference between the residual in the no-fault and fault cases is time-varying.
Nevertheless, the fact that there is a difference implies that the present fault is potentially
detectable.
There are two main approaches (Ding et al., 2007) for residual evaluation: statistical
(Willsky and Jones, 1976; Gertler, 1998; Basseville and Nikiforov, 1993; Peng et al., 1997;
Al-Salami et al., 2006; Blas and Blanke, 2011; Wei et al., 2011) and norm-based
(Emami-Naeini et al., 1988; Frank, 1995; Frank and Ding, 1997; Sneider and Frank, 1996; Chen
and Patton, 1999; Zhang et al., 2002; Zhong et al., 2007; Ingimundarson et al., 2008;
Al-Salami et al., 2010; Li et al., 2011; Abid et al., 2011). Statistical approaches exploit the
framework of statistical hypothesis testing in order to detect changes in some parameter
of the probability distribution of the residual, typically by means of likelihood ratio
testing (Gustafsson, 2000). In norm-based approaches, residual evaluation is typically
done by adaptive or constant thresholding of some norm of the residual.
Apparently, when encountering a residual as the one depicted in Figure 1, neither
statistical-based approaches assuming stationary probability distributions, nor norm-
based approaches using constant thresholds, would be successful. A potential solution is
to consider adaptive thresholds (Clark, 1989; Frank, 1994), and use a-priori knowledge,
either qualitative (Ingimundarson et al., 2008; Zhang et al., 2002; Höfling and Isermann,
1996; Emami-Naeini et al., 1988) or quantitative (Sneider and Frank, 1996; Frank, 1995;
Nyberg and Stutte, 2004), to derive non-constant thresholds to take the time-varying
uncertainties and disturbances into account. This paper instead proposes an adaptive
statistical residual evaluation method, which exploits quantitative a-priori knowledge in
the form of data.
[Plot: Residual [K] for the No Fault and Fault cases.]
Figure 1: A residual for fault detection in the gas-flow system of a heavy-duty truck diesel
engine in the no-fault (solid) and fault (dashed) cases.
2 Problem Formulation
The residual evaluation problem, as considered in this work, is formally stated in this
section.
[Figure 2: block diagram of the System, with input u and output y, and the Residual Generator.]
2.1 Prerequisites
A residual, r, is considered to be the output from a residual generator, taking
measurements from a system as input. Typically, the measurements consist of the input u and
output y, see Figure 2. The system is considered to be subject to faults, and the intention
is to detect if any fault is present in the system by monitoring the behavior of the residual.
Note that if a set of residuals sensitive to different faults is used, faults can also be isolated,
see for example Blanke et al. (2006).
The system typically operates in a number of different operating modes, and normal
operation usually involves several of these modes. For an example, consider a heavy-duty
truck diesel engine, for which a residual is shown in Figure 1. Naturally, this system is
designed to operate in a number of different operating modes typically characterized by
engine torque, engine speed, ambient temperature, ambient pressure, etc.
The setup depicted in Figure 2 most often contains uncertainties in the form of
measurement noise or, in the case of a model-based residual generator, modeling errors.
Typically, the magnitudes and nature of the uncertainties are different for different
operating modes of the system. For example, a sensor may be more or less sensitive to
noise in different operating modes, and a model may be more accurate in one operating
mode than another. Since the operating mode of the system varies in time, so do the
magnitudes and nature of the uncertainties. This is the cause of the non-ideal residual
behavior illustrated in Figure 1.
It is assumed that during on-line operation, the current operating mode of the system
is unknown. In addition, it is also assumed that the probability that the system is in a
specific mode is unknown. In this sense, the system can be considered to be subject to
an unknown, i.e., unmeasurable, input signal, determining the current operating mode.
Regarding in particular the first assumption, it is considered to be hard to quantify and
measure all factors, internal and external, that determine the current operating mode
of a system. Furthermore, these factors may be different for different individuals of the
system, or may change over time. However, even if it is possible to determine a set of
measured signals that determines the operating mode, all signals may not be available
for the residual evaluation scheme due to for example fault decoupling principles, or
architectural constraints in the control system software. In addition, even if all signals
are available, they may as well be subject to faults. The second assumption is mainly
motivated by the fact that the operation of a system differs between different individuals
of the same system, and may change over time or due to external unmeasurable factors.
$$\theta_{ij} \ge 0, \ j = 1, 2, \ldots, M, \qquad \sum_{j=1}^{M} \theta_{ij} = 1. \qquad (2)$$
Under the assumption that there is in total K operating modes, the probability that
R = r can be characterized by the K-component mixture distribution given by the pmf
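$$p(r \mid \alpha, \theta) = \sum_{i=1}^{K} \alpha_i \, p(r \mid \theta_i) \qquad (3)$$
with $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_K)$ and
$$\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \\ \vdots \\ \theta_K \end{pmatrix} = \begin{pmatrix} \theta_{11} & \theta_{12} & \cdots & \theta_{1M} \\ \theta_{21} & \theta_{22} & \cdots & \theta_{2M} \\ \vdots & \vdots & \vdots & \vdots \\ \theta_{K1} & \theta_{K2} & \cdots & \theta_{KM} \end{pmatrix}, \qquad (4)$$
$$\alpha_i \ge 0, \ i = 1, 2, \ldots, K, \qquad \sum_{i=1}^{K} \alpha_i = 1. \qquad (5)$$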
In the context of this work, the mixture weight α i specifies the probability that the
system is in mode i. As said in Section 2.1, the probability that the system is in a specified
operating mode is considered to be unknown. Consequently, α i , i = 1, 2, . . . , K, are
assumed to be unknown and will in the following be considered as nuisance parameters.
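As an illustration of the mixture pmf (3), the following Python sketch evaluates $p(r = x_j \mid \alpha, \theta)$ for all $M$ residual values. It assumes that row $i$ of $\theta$ is the pmf of component $i$, i.e., $p(r = x_j \mid \theta_i) = \theta_{ij}$, which is consistent with (2) but is defined outside this section. The numerical values are hypothetical.

```python
def mixture_pmf(alpha, theta):
    """Return [p(r = x_j | alpha, theta) for j = 1..M] according to (3).

    Assumes p(r = x_j | theta_i) = theta[i][j], i.e. each row of theta is the
    pmf of one operating mode.
    """
    K, M = len(theta), len(theta[0])
    return [sum(alpha[i] * theta[i][j] for i in range(K)) for j in range(M)]


# Hypothetical example with K = 3 modes and M = 4 residual values.
alpha = [1 / 3, 1 / 3, 1 / 3]
theta = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.6],
]
p = mixture_pmf(alpha, theta)
print(p)
```

Since every row of $\theta$ fulfills (2) and $\alpha$ fulfills (5), the resulting values are non-negative and sum to one.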
[Plots: (a) Residual: Residual [-] (0.2 to 0.6) versus Time [s] (10 to 90); (b) Distribution of the Residual: $p(r = x_i \mid \alpha, \theta)$ (0 to 0.1) versus $x_i$ (0.18 to 0.67). Legend in both panels: $\theta_1$, $\theta_2$, $\theta_3$.]
Figure 3: Example of a sample from a mixture distribution in the form (3) with 3 components
$\theta_1$, $\theta_2$, and $\theta_3$, and mixture weights $\alpha_1 = \alpha_2 = \alpha_3 = \frac{1}{3}$.
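$$\begin{aligned} H_0 &: \theta = \theta^{NF}, \ \alpha \in \Upsilon \\ H_1 &: \theta \neq \theta^{NF}, \ \alpha \in \Upsilon \end{aligned} \qquad (7)$$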
where the null hypothesis H 0 corresponds to the no-fault case, i.e., when no fault is
present in the system, and the alternative hypothesis H 1 to the faulty case, i.e., when
one or several faults are present in the system. The next section deals with the problem of
designing a test statistic for the hypotheses (7).
where L (θ, α∣R) is the likelihood function of α and θ, given the set R of residual samples,
and
$$\Theta = \left\{ \theta \in \mathbb{R}^{K \times M} : \theta_{ij} \ge 0, \ \sum_{j=1}^{M} \theta_{ij} = 1 \right\}, \qquad (9)$$
denotes the space of the distribution parameter θ as specified by (2). The GLR test
statistic becomes
where p (R∣θ, α) is the joint pmf for the residual samples in R. In the general case, the
expression for the joint pmf is cumbersome to deal with. To make subsequent derivations
tractable, or even possible, it is necessary to pose the following assumption.
Assumption 1. Samples from (3) are independent and identically distributed (iid).
Note that Assumption 1 may not be valid in the general case, since residuals often
are obtained as output from dynamic systems and thereby exhibit Markovian properties.
It can however often be fulfilled in practice by sampling the residual at a sufficiently
low rate. In addition, residuals based on innovation filters (Gustafsson, 2000), e.g., the
Kalman Filter, fulfill the assumption. The residual evaluation approach developed in
this paper has also been shown to be applicable in practical settings, for example in the
application example presented in Section 6.
By using Assumption 1, the joint pmf can be written as
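where $p(\cdot \mid \alpha, \theta)$ is given by (3). By using (12), the likelihood (11) takes the form
$L(\alpha, \theta \mid \mathcal{R}) = \prod_{r_k \in \mathcal{R}} p(r_k \mid \alpha, \theta)$.
Next, let $c_j$ denote the number of samples in $\mathcal{R}$ that have the value $x_j$, i.e.,
$$c_j = \left| \{ r_k \in \mathcal{R} : r_k = x_j, \ x_j \in \mathcal{X} \} \right|, \quad j = 1, 2, \ldots, M. \qquad (13)$$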
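The counts (13) and the resulting log-likelihood can be computed as in the following Python sketch. It assumes, in line with the objective function of (17), that the log-likelihood under Assumption 1 is $\sum_{j=1}^{M} c_j \log [\sum_{i=1}^{K} \alpha_i \theta_{ij}]$. The sample values, parameters, and function names are hypothetical.

```python
from collections import Counter
from math import log


def counts(samples, values):
    """Return c_j, the number of samples equal to x_j, for each x_j in values."""
    tally = Counter(samples)
    return [tally[x] for x in values]


def log_likelihood(c, alpha, theta):
    """Return sum_j c_j * log(sum_i alpha_i * theta_ij) for iid samples."""
    total = 0.0
    for j, cj in enumerate(c):
        if cj == 0:
            continue  # values that were never observed do not contribute
        pj = sum(a * row[j] for a, row in zip(alpha, theta))
        total += cj * log(pj)  # raises ValueError if pj == 0 for an observed value
    return total


# Hypothetical quantized residual values and samples.
values = [0.2, 0.4, 0.6]
samples = [0.2, 0.4, 0.4, 0.6, 0.4, 0.2]
c = counts(samples, values)

alpha = [0.5, 0.5]
theta = [[0.5, 0.3, 0.2], [0.1, 0.5, 0.4]]
ll = log_likelihood(c, alpha, theta)
print(c, ll)
```

The counts are a sufficient summary of the sample set here: the log-likelihood depends on the samples only through $c_1, \ldots, c_M$.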
in the denominator of (8). Under Assumption 1, and by using the log-likelihood func-
tion (15) as well as the structure of the parameter spaces (9) and (6), the MLE problem (16)
can be equivalently stated as
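$$\begin{aligned} \max_{\alpha \in \mathbb{R}^{K},\ \theta \in \mathbb{R}^{K \times M}} \quad & \sum_{j=1}^{M} c_j \log \left[ \sum_{i=1}^{K} \alpha_i \theta_{ij} \right] \\ \text{subject to} \quad & \alpha_i \ge 0, \ i = 1, 2, \ldots, K, \\ & \theta_{ij} \ge 0, \ i = 1, 2, \ldots, K, \ j = 1, 2, \ldots, M, \\ & \sum_{i=1}^{K} \alpha_i = 1, \\ & \sum_{j=1}^{M} \theta_{ij} = 1, \ i = 1, 2, \ldots, K, \end{aligned} \qquad (17)$$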
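$$\begin{aligned} \max_{\phi \in \mathbb{R}^{M}} \quad & \prod_{j=1}^{M} \phi_j^{c_j} && (20a) \\ \text{subject to} \quad & \phi_j \ge 0, \ j = 1, 2, \ldots, M && (20b) \\ & \sum_{j=1}^{M} \phi_j = 1. && (20c) \end{aligned}$$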
Proof. First note that by (20b) and Assumption 2 it holds that $\phi_j \ge 0$ and $c_j > 0$ for
$j = 1, 2, \ldots, M$. Furthermore, by the definition of $c_j$ in (13), it is also noted that $\sum_{j=1}^{M} c_j = N$.
Consider now the weighted arithmetic and geometric means of the quantities
$\frac{\phi_j}{c_j} \ge 0$ with weights $c_j > 0$ for $j = 1, 2, \ldots, M$. According to the inequality of weighted
arithmetic and geometric means, see, e.g., Hardy et al. (1934), it then holds that
$$\frac{1}{N} \left( \sum_{j=1}^{M} \frac{\phi_j}{c_j} \cdot c_j \right) \ge \sqrt[N]{\prod_{j=1}^{M} \left( \frac{\phi_j}{c_j} \right)^{c_j}}, \qquad (21)$$
with equality if and only if $\frac{\phi_1}{c_1} = \frac{\phi_2}{c_2} = \cdots = \frac{\phi_M}{c_M}$. For the left hand side of (21), it holds
that $\frac{1}{N} \left( \sum_{j=1}^{M} c_j \cdot \frac{\phi_j}{c_j} \right) = \frac{1}{N} \sum_{j=1}^{M} \phi_j = \frac{1}{N}$ due to (20c). Exploiting this fact and re-writing
the right hand side of (21) as
$\sqrt[N]{\prod_{j=1}^{M} \left( \frac{\phi_j}{c_j} \right)^{c_j}} = \sqrt[N]{\frac{\prod_{j=1}^{M} \phi_j^{c_j}}{\prod_{j=1}^{M} c_j^{c_j}}}$,
the inequality (21) can be equivalently stated as
$$\prod_{j=1}^{M} \phi_j^{c_j} \le \frac{1}{N^N} \prod_{j=1}^{M} c_j^{c_j} = \prod_{j=1}^{M} \left( \frac{c_j}{N} \right)^{c_j}. \qquad (22)$$
Now assume that equality holds in (21), and let $C = \frac{\phi_1}{c_1} = \frac{\phi_2}{c_2} = \cdots = \frac{\phi_M}{c_M}$. Under (20c),
it then holds that $1 = \sum_{j=1}^{M} \phi_j = \sum_{j=1}^{M} C \cdot c_j = C \sum_{j=1}^{M} c_j = C \cdot N$, which is equivalent to
$C = \frac{1}{N}$. Hence, for the objective function $\prod_{j=1}^{M} \phi_j^{c_j}$ in (20a) it holds that
$\prod_{j=1}^{M} \phi_j^{c_j} \le \prod_{j=1}^{M} \left( \frac{c_j}{N} \right)^{c_j}$ under (20b), with equality under (20c) if and only if
$\frac{\phi_j}{c_j} = \frac{1}{N} \Leftrightarrow \phi_j = \frac{c_j}{N}$,
$j = 1, 2, \ldots, M$. This completes the proof.
Note that since $\log[\cdot]$ is a strictly increasing function, Lemma 1 is also applicable
to the problem of maximizing the function $\log \prod_{j=1}^{M} \phi_j^{c_j} = \sum_{j=1}^{M} c_j \log \phi_j$ subject to the
conditions (20b) and (20c).
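The statement of Lemma 1, as used in the proof above, is that $\phi_j = c_j / N$ maximizes (20a) under (20b) and (20c). A simple numerical sanity check in Python, using hypothetical counts and randomly drawn feasible points, is sketched below. It is an illustration only and does not replace the proof.

```python
import random
from math import log


def objective(phi, c):
    """Logarithm of the objective (20a): sum_j c_j * log(phi_j)."""
    return sum(cj * log(pj) for cj, pj in zip(c, phi))


def closed_form_maximizer(c):
    """Candidate maximizer phi_j = c_j / N, with N the total number of samples."""
    n = sum(c)
    return [cj / n for cj in c]


# Hypothetical counts with c_j > 0, as required by Assumption 2.
c = [2, 3, 1]
phi_star = closed_form_maximizer(c)
best = objective(phi_star, c)

# Compare against random points that fulfill (20b) and (20c).
random.seed(0)
for _ in range(1000):
    w = [random.random() + 1e-9 for _ in c]
    s = sum(w)
    phi = [wj / s for wj in w]
    assert objective(phi, c) <= best + 1e-12
```

None of the randomly drawn feasible points gives a larger objective value than $\phi_j = c_j / N$, in agreement with (22).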
By using Lemma 1, a condition for a solution to the maximization problem (17), and
thereby the MLE problem (16), can be obtained.
Proof. Assumption 1 implies that the joint distribution of R is given by (12). With c j
defined according to (13), the likelihood (11) can be written as (14) and by exploiting the
structure of the parameter spaces (6) and (9), it trivially follows that the MLE problem (16)
can be equivalently reformulated as the maximization problem (17). From Lemma 1, and
130 Paper C. Data-Driven and Adaptive Statistical Residual Evaluation . . .
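the fact that log [⋅] is a strictly increasing function, it follows that any α⋆ ∈ Υ and θ⋆ ∈ Θ
that satisfy $\frac{c_j}{N} = \sum_{i=1}^{K} \alpha_i^\star \theta_{ij}^\star$, $j = 1, 2, \ldots, M$, is a solution to the maximization problem
\[
\begin{aligned}
\max_{\alpha \in \mathbb{R}^{K},\ \theta \in \mathbb{R}^{K \times M}} \quad & \sum_{j=1}^{M} c_j \log \left[\sum_{i=1}^{K} \alpha_i \theta_{ij}\right] \\
\text{subject to} \quad & \sum_{i=1}^{K} \alpha_i \theta_{ij} \geq 0, \quad j = 1, 2, \ldots, M, \\
& \sum_{j=1}^{M} \sum_{i=1}^{K} \alpha_i \theta_{ij} = 1.
\end{aligned} \tag{24}
\]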
Now note that (24) has the same objective function as (17) and that the feasible set of (17) is
contained in the feasible set of (24), since $\alpha_i \geq 0$ and $\theta_{ij} \geq 0$ implies $\sum_{j=1}^{M} \sum_{i=1}^{K} \alpha_i \theta_{ij} \geq 0$,
and $\sum_{i=1}^{K} \alpha_i = 1$ and $\sum_{j=1}^{M} \theta_{ij} = 1$ implies that $\sum_{j=1}^{M} \sum_{i=1}^{K} \alpha_i \theta_{ij} = \sum_{i=1}^{K} \alpha_i \sum_{j=1}^{M} \theta_{ij} =
\sum_{i=1}^{K} \alpha_i \cdot 1 = 1$, since θ ∈ Θ. Clearly, (α⋆, θ⋆) is contained in the feasible set of problem (17)
and it follows that (α⋆, θ⋆) is a solution also to (17). It now remains to show that
(α⋆, θ⋆) is a global solution to (17). Since log [⋅] is a non-decreasing concave function,
and $\sum_{i=1}^{K} \alpha_i \theta_{ij}$ is a linear function, it holds that $\log \left[\sum_{i=1}^{K} \alpha_i \theta_{ij}\right]$ is a concave function.
Therefore, the objective function in (17) is a convex sum of concave functions, since c j > 0
due to Assumption 2, and hence a concave function. Since all constraints in (17) are
linear, it follows that (17) is a concave optimization problem. Thus, the solution (α ⋆ , θ ⋆ )
is a global maximizer to (17) and hence a solution to the MLE problem (16).
\[
\begin{aligned}
\text{subject to} \quad & \alpha_i \geq 0, \quad i = 1, 2, \ldots, K, \\
& \sum_{i=1}^{K} \alpha_i = 1.
\end{aligned} \tag{26}
\]
In the general case, it is unfortunately not possible to find an explicit expression for a
solution to the maximization problem (26), or equivalently the MLE problem (25), as
was the case with the MLE problem (16). There are however several efficient numerical
approaches, see, e.g., Nocedal and Wright (2006).
By using similar arguments as in the proof of Theorem 1, it can be shown that also (26)
is a concave maximization problem. The concavity property facilitates the numerical
solving since it implies that if a local maximum can be found, then it is also a global
maximum.
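One simple numerical scheme for a problem of this type is the classical EM update for mixture weights with fixed component distributions. The sketch below is an illustration under assumptions, not the solver used in this work: it assumes that the objective of (26) is $\sum_j c_j \log [\sum_i \alpha_i \theta^{NF}_{ij}]$, i.e., the objective of (24) with θ fixed to θ NF, and the data and the function name `em_mixture_weights` are hypothetical.

```python
def em_mixture_weights(counts, theta, iterations=500):
    """Maximize sum_j c_j*log(sum_i alpha_i*theta[i][j]) over the simplex (EM sketch)."""
    k = len(theta)
    n = sum(counts)
    alpha = [1.0 / k] * k                       # strictly positive starting point
    for _ in range(iterations):
        # Mixture pmf phi_j = sum_i alpha_i * theta_ij for the current alpha.
        phi = [sum(a * row[j] for a, row in zip(alpha, theta))
               for j in range(len(counts))]
        # Multiplicative EM update; the new weights again sum to one.
        # Assumes phi_j > 0 for every bin with c_j > 0.
        alpha = [a * sum(c * row[j] / phi[j]
                         for j, c in enumerate(counts) if c > 0) / n
                 for a, row in zip(alpha, theta)]
    return alpha

theta_nf = [[0.7, 0.2, 0.1],    # hypothetical no-fault pmfs, one per row (K = 2, M = 3)
            [0.1, 0.2, 0.7]]
counts = [25, 20, 55]           # hypothetical counts c_j, N = 100
alpha_o = em_mixture_weights(counts, theta_nf)
```

In this example the empirical distribution c_j/N equals 0.25·θ_1 + 0.75·θ_2, so the iteration should approach α = (0.25, 0.75). Because the problem is concave, the point the iteration settles at is a global maximizer.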
However, since K < M, see Section 2.2, (27) corresponds to an overdetermined set of
equations which in general has no solution. Motivated by this discussion, it makes sense
to choose α so that each $\sum_{i=1}^{K} \alpha_i \theta^{NF}_{ij}$ is as close as possible to $\frac{c_j}{N}$ for $j = 1, 2, \ldots, M$. Thus,
the following relaxation of the problem (26) is considered
\[
\begin{aligned}
\min_{\alpha \in \mathbb{R}^{K}} \quad & \frac{1}{2} \left\| \sum_{i=1}^{K} \alpha_i \theta^{NF}_{i} - \phi^\star \right\|_2^2 \\
\text{subject to} \quad & \alpha_i \geq 0, \quad i = 1, 2, \ldots, K, \\
& \sum_{i=1}^{K} \alpha_i = 1,
\end{aligned} \tag{28}
\]
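The relaxed problem (28) is a least-squares problem over the probability simplex. As a minimal sketch of how such a problem can be solved, the code below uses projected gradient descent with a sort-based Euclidean projection onto the simplex. This is a swapped-in textbook technique chosen for illustration, not the tailored solver used in this work. The example data and the names `project_to_simplex` and `solve_relaxed` are hypothetical, and ϕ⋆ is taken to be the vector with elements c_j/N.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum_i a_i = 1}."""
    u = sorted(v, reverse=True)
    cumulative, tau = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:          # holds for j = 1, ..., rho; the last hit gives tau
            tau = t
    return [max(x - tau, 0.0) for x in v]

def solve_relaxed(theta, phi_star, iterations=500):
    """Minimize 0.5*||sum_i alpha_i*theta_i - phi_star||^2 over the simplex."""
    k, m = len(theta), len(phi_star)
    # Step size 1/L, where L is bounded by the trace of theta * theta^T.
    step = 1.0 / sum(x * x for row in theta for x in row)
    alpha = [1.0 / k] * k
    for _ in range(iterations):
        residual = [sum(alpha[i] * theta[i][j] for i in range(k)) - phi_star[j]
                    for j in range(m)]
        grad = [sum(theta[i][j] * residual[j] for j in range(m)) for i in range(k)]
        alpha = project_to_simplex([a - step * g for a, g in zip(alpha, grad)])
    return alpha

theta_nf = [[0.7, 0.2, 0.1],     # hypothetical no-fault pmfs (K = 2, M = 3)
            [0.1, 0.2, 0.7]]
phi_star = [0.25, 0.20, 0.55]    # hypothetical c_j / N
alpha_r = solve_relaxed(theta_nf, phi_star)
```

Note that the cost per iteration depends only on K and M, not on the number N of residual samples, since the samples enter only through ϕ⋆.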
In order to compare the fault detection properties of the residual evaluation tests
based on the relaxed problem (28) and the original MLE problem (26), the following
result is given.
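Lemma 2. Let $c_1, c_2, \ldots, c_M$ fulfill Assumption 2, let $\theta^{NF} \in \Theta$, and
\[
\Phi^{NF} = \left\{\phi : \phi = \sum_{i=1}^{K} \alpha_i \theta^{NF}_{i},\ \forall \alpha \in \Upsilon \right\}. \tag{29}
\]
Further, let $\phi^\star \in \Phi^{NF}$, and let $\alpha^O$ and $\alpha^R$ be solutions to the original problem (26) and
relaxed problem (28), respectively. Then, it holds that
\[
\sum_{i=1}^{K} \alpha^{O}_{i} \theta^{NF}_{i} = \sum_{i=1}^{K} \alpha^{R}_{i} \theta^{NF}_{i} = \phi^\star. \tag{30}
\]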
is non-empty. Assume that $\Upsilon^{NF} \neq \emptyset$ and consider first the optimization problem (26).
Since $\Upsilon^{NF} \neq \emptyset$, it follows from Lemma 1, and the fact that log [⋅] is an increasing function,
that any optimal solution to (26) is contained in $\Upsilon^{NF}$. In particular, this holds for $\alpha^O$ and
thus $\phi^\star = \sum_{i=1}^{K} \alpha^{O}_{i} \theta^{NF}_{i}$. Consider next the optimization problem (28). Again, $\Upsilon^{NF} \neq \emptyset$
implies that any optimal solution to (28), in particular $\alpha^R$, is contained in $\Upsilon^{NF}$. Hence,
$\phi^\star = \sum_{i=1}^{K} \alpha^{R}_{i} \theta^{NF}_{i}$ and the proof is complete.
Consider the hypotheses in (7) and the GLR test statistic λ (R) defined by (10)
and (8). Define the test statistic
\[
\lambda_R(R) = -2 \log \frac{L\left(\alpha^R, \theta^{NF} \mid R\right)}{L\left(\alpha^\star, \theta^\star \mid R\right)}, \tag{32}
\]
where (α ⋆ , θ ⋆ ) is a solution to the original MLE problem (16) as present in (8), but where
α R is a solution to the relaxed numerator MLE problem (28).
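To make the definition (32) concrete, the following sketch computes the statistic directly from the histogram counts. It assumes that the likelihood has the product form $\prod_j \phi_j^{c_j}$ with $\phi_j = \sum_i \alpha_i \theta_{ij}$, as suggested by the objective in (24), and that the denominator is attained at ϕ_j = c_j/N. The function names and example data are hypothetical.

```python
import math

def log_likelihood(counts, phi):
    """Return sum_j c_j*log(phi_j); bins with c_j = 0 contribute nothing."""
    return sum(c * math.log(p) for c, p in zip(counts, phi) if c > 0)

def relaxed_glr_statistic(counts, alpha_r, theta_nf):
    """Statistic -2*log(L(alpha_r, theta_nf | R) / L(alpha*, theta* | R))."""
    n = sum(counts)
    m = len(counts)
    # Numerator: mixture pmf under H0 for the relaxed solution alpha_r.
    phi_r = [sum(a * row[j] for a, row in zip(alpha_r, theta_nf)) for j in range(m)]
    # Denominator: unconstrained maximizer phi_j = c_j / N.
    phi_star = [c / n for c in counts]
    return -2.0 * (log_likelihood(counts, phi_r) - log_likelihood(counts, phi_star))

theta_nf = [[0.7, 0.2, 0.1],     # hypothetical no-fault pmfs
            [0.1, 0.2, 0.7]]
counts = [25, 20, 55]            # hypothetical counts, N = 100
lam_match = relaxed_glr_statistic(counts, [0.25, 0.75], theta_nf)
lam_mismatch = relaxed_glr_statistic(counts, [1.0, 0.0], theta_nf)
```

The statistic is close to zero when the no-fault mixture reproduces the empirical distribution and grows as the two distributions diverge.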
The power of the residual evaluation test λ (R) > J can be quantified by the power
function (Casella and Berger, 2001)
β λ (α, θ) = Pr (reject H 0 ∣α, θ) = Pr (λ (R) > J∣α, θ) , (33)
where J is a fixed threshold. If α ∈ Υ and θ = θ NF in (33), i.e., under H 0 , the power
function gives the probability of false detection, or Type I error. Otherwise, the power
function gives the probability of detection for fixed α and θ, or equivalently the probability
of missed detection or Type II error, by 1 − β λ (α, θ).
Consider now the power function
β λR (α, θ) = Pr (λR (R) > J∣α, θ) , (34)
for the residual evaluation test λR (R) > J, based on the relaxed problem (28). The
relation between the power functions (33) and (34) is given by the following result.
Proof. Define $\phi = \sum_{i=1}^{K} \alpha_i \theta_i$ and note that from (7), it can be deduced that $\phi \in \Phi^{NF}$ is
equivalent to that α ∈ Υ and $\theta = \theta^{NF}$, i.e., that H0 in (7) is valid. Thus, by assumption,
it holds that $\phi \in \Phi^{NF}$. Consider now $\phi^\star$ and note that due to the invariance property
(Casella and Berger, 2001) of maximum likelihood estimates it holds that if (α⋆, θ⋆)
are the MLE of (α, θ), which indeed is true by assumption, then $\phi^\star = \sum_{i=1}^{K} \alpha^\star_i \theta^\star_i$ is the
MLE of ϕ. Lemma 5 (found in Appendix A) then implies that
\[
\lim_{N \to \infty} \Pr\left(\left|\phi^\star - \phi\right| \geq \varepsilon\right) = 0,
\]
for all ε > 0 and ϕ ∈ Φ′ , with Φ′ defined by (70). Since it holds that ϕ ∈ ΦNF by
assumption, it therefore holds that ϕ⋆ ∈ ΦNF when N → ∞. Since ϕ⋆ ∈ ΦNF holds, (36)
holds with equality which is equivalent to that λR (R) = λ (R). By (33) and (34) this is
equivalent to β λR (α, θ) = β λ (α, θ), and thus (37) holds.
Figure 4: Comparison of test quantities λR (R) and λ(R) under hypothesis H 0 , by means
of the quantity λ(R)/λR (R), for different values of the size N of the residual sample R.
Note that for use with sequential residual data, the samples in R may be collected by
using a sliding window, i.e., at sampling instant t the set of residual samples
\[
R_t = \{r_{t-N+1}, r_{t-N+2}, \ldots, r_t\},
\]
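A sliding window of this kind can be maintained incrementally, so that the histogram counts c_j are updated in constant time per new sample. The sketch below assumes that each residual sample has already been quantized to a bin index; the class name `SlidingHistogram` and the example values are hypothetical.

```python
from collections import deque

class SlidingHistogram:
    """Keep the counts c_j of the N most recent quantized residual samples."""

    def __init__(self, window_size, num_bins):
        self.window = deque()
        self.window_size = window_size
        self.counts = [0] * num_bins

    def push(self, bin_index):
        """Add the newest sample and drop the oldest once the window is full."""
        self.window.append(bin_index)
        self.counts[bin_index] += 1
        if len(self.window) > self.window_size:
            oldest = self.window.popleft()
            self.counts[oldest] -= 1

hist = SlidingHistogram(window_size=3, num_bins=4)
for b in [0, 1, 1, 3, 2]:        # hypothetical bin indices arriving over time
    hist.push(b)
```

After the five pushes the window holds the last three samples, and `hist.counts` reflects only those.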
Parameter Choices
The parameters involved in the residual evaluation are the number N of residual samples
in R, the detection threshold J, and the no-fault distribution parameter θ NF . The first two
parameters, N and J, are discussed below. The parameter θ NF is the topic of Section 5.
According to Theorem 3, the relaxation (28) of the MLE problem (25) is justified in
terms of the probability for false detection if N is sufficiently large. The actual meaning
of “sufficiently large” is application dependent and must be evaluated from case to case.
This can for example be done by comparing the test quantities λR (R) and λ(R), under
hypothesis H 0 , for different values of N in the same manner as in Figure 4.
In general, given that N is large enough to justify the relaxation, the choice of N
is a trade-off between detection performance and complexity. A large N will give the
test statistic smoothed, low-pass, characteristics. This makes it possible to detect small
changes in the residual, but on the other hand a large N may increase the detection time.
Computational and memory aspects will be discussed in Section 4.3.
The choice of detection threshold J is a trade-off between detection time, and test
power, in terms of probability of false detection and probability of missed detection. The
higher the threshold, the longer the detection time, the lower the probability of false
detections but the higher the probability of missed detection. The actual selection of
threshold may be aided by the fact that the test statistic based on the GLR, ideally, is
Chi-squared distributed (Willsky and Jones, 1976).
histogram. The most crucial parameter of these two is K, which in this sense should
be kept as low as possible. Implications of the value of this quantity, in the context of
residual evaluation performance, are further discussed in Section 5.
It is worth noting that the complexity of the problem (28) does not depend on the
number N of residual samples in R. This is favorable since it is only justified to consider
the relaxed problem (28) instead of the MLE problem (25) if N is sufficiently large, see
Section 4.1.
fault detection performance, either in the form of missed detections or false alarms,
depending on the strategy used when setting the alarm threshold.
In conclusion, the choice of K and θ NF is a trade-off between fault detection perfor-
mance and computational effort. However, in order to take the fault detection perfor-
mance into account, training data from a set of representative fault cases is needed. In
the context of this work it is however assumed that only no-fault training data is available
due to a number of reasons. First of all, the amount of available no-fault data is typically
substantially larger than the available amount of fault data, since faults are rare. To create
fault data, one alternative is to inject faults in the real system. This is however considered
to be expensive, both in terms of time and money, since it typically requires hardware
modifications and active usage of the system. Another alternative is to create fault data
by simulation. To give realistic results, this on the other hand requires models capable of
describing the faulty system, which in turn require detailed knowledge regarding the
behavior of the faulty system and possibly also its environment. This kind of information
is seldom available for real applications.
Motivated by this discussion, fault detection performance will not be explicitly
considered in the learning of K and θ NF . Instead, the learning problem will be formulated
as a trade-off between the ability of K and θ NF to characterize the set of all no-fault
residual distributions, i.e., model fit, and computational effort. The main motivation
for this choice is that a good characterization of the no-fault case will hopefully make
it possible to detect deviations from the no-fault case, meaning good fault detection
performance. The resulting fault detection performance is however empirically studied
in Section 6.
A general approach for enabling a trade-off between goodness of model fit and model
complexity when identifying parameters in a model is to combine the model fit metric,
in the present case V (T , θ), with some metric that reflects the model complexity (Ljung,
1999; Söderström and Stoica, 1989).
In the context of this work, required computational effort rather than model com-
plexity is of direct interest. As said in Section 4.2, the required computational effort for
the residual evaluation algorithm presented in Section 4.2 is strongly dependent on the
dimension K × M of θ NF , and in particular the value of K. Since the larger the value of
K, the higher the computational requirements, a function C (K) that increases with K
is suitable for quantification of the computational effort. Typically, the actual choice of
C (K) is implementation dependent. In general, there are many options, see, e.g., Ljung
(1999); Söderström and Stoica (1989). One alternative is to exploit the information criteria
due to Akaike (Akaike, 1974).
Given V (T , θ) and C (K), the learning problem as stated in Section 5.1 can be
formulated as the problem
where the notation Θ(K) for the space defined in (9) is introduced to stress the
dependency between the space and K. The topic of the remainder of this section is to derive a
suitable metric V (T , θ) for quantification of model fit.
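\[
\begin{aligned}
p\left(R_k \mid \alpha_k, \theta\right) &= \prod_{r_p \in R_k} p\left(r_p \mid \alpha_k, \theta\right) \\
&= \prod_{r_p \in R_k} \sum_{i=1}^{K} \alpha_{ki}\, p\left(r_p \mid \theta_i\right),
\end{aligned} \tag{42}
\]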
Let $c_{kj}$ denote the total number of residual samples in $R_k$ that take the value $x_j$, cf. (13).
The log-likelihood of θ and $\alpha_k$, $k = 1, 2, \ldots, N_T$, given T , can then be written as
Under the assumption that each R k ∈ T contains residual samples from only one
operating mode, it holds that each α k , k = 1, 2, . . . , NT , contains one and only one
will be used to quantify how well the set of distributions characterized by a given param-
eter θ is able to describe a data set T .
Method Outline
The basic idea of the proposed approach for finding θ (K) is to first calculate a distribution
parameter θ k ∈ Θ(1) for each R k ∈ T by exploiting Theorem 1 and form the set
Ψ = (θ 1 , θ 2 , . . . , θ NT ) , (49)
where
Algorithm
The general algorithm for finding a solution to (48) is given below. The input to the
algorithm is a set of residual samples D and constants n and K. The output is a distribution
parameter θ (K) .
In the algorithm, D (p (r∣θ k ) ∥p (r∣θ ⋆i )) denotes the Kullback-Leibler (KL) diver-
gence (Kullback and Leibler, 1951) between the probability distributions characterized
by p (r∣θ k ) and p (r∣θ ⋆i ). The KL-divergence is one way to quantify the similarity of
probability distributions and is properly defined in Section 5.4.
Step 1: Let T be defined by (40).
Step 2: Let Ψ be defined by (49).
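Step 3: Partition Ψ into $P^\star = (P_1, P_2, \ldots, P_K)$ such that
\[
P^\star = \arg\min_{P} \sum_{i=1}^{K} \sum_{\theta_k \in P_i} D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta^\star_i)\right), \tag{51}
\]
where
\[
\theta^\star_i = \frac{1}{|P_i|} \sum_{\theta_k \in P_i} \theta_k, \quad i = 1, 2, \ldots, K. \tag{52}
\]
Step 4: Let
\[
\theta^{(K)} =
\begin{pmatrix} \theta^\star_1 \\ \theta^\star_2 \\ \vdots \\ \theta^\star_K \end{pmatrix}
=
\begin{pmatrix}
\theta^\star_{11} & \theta^\star_{12} & \cdots & \theta^\star_{1M} \\
\theta^\star_{21} & \theta^\star_{22} & \cdots & \theta^\star_{2M} \\
\vdots & \vdots & \vdots & \vdots \\
\theta^\star_{K1} & \theta^\star_{K2} & \cdots & \theta^\star_{KM}
\end{pmatrix}. \tag{53}
\]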
The most crucial part of the above algorithm is Step 3, in which a particular partition
of the set Ψ should be computed. This problem in fact corresponds to a hard K-means
clustering problem (Bishop, 2006), for which efficient heuristic methods exist (Lloyd,
1982). Implementation issues are discussed in Section 5.5.
The justification of the algorithm, in terms of its ability to provide a solution to the
problem (48), is given in the next section.
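Step 3 can be approached with a Lloyd-style iteration in which the KL divergence replaces the usual squared Euclidean distance in the assignment step and the block mean (52) is used in the update step. The sketch below is a heuristic illustration with a naive initialization (the first K elements of Ψ), which is an assumption made here for brevity. It is not claimed to be the implementation used in this work, and all names and data are hypothetical.

```python
import math

def kl_divergence(p, q):
    """D(p || q) for two pmfs given as lists; infinite if q_j = 0 where p_j > 0."""
    total = 0.0
    for pj, qj in zip(p, q):
        if pj == 0.0:
            continue
        if qj == 0.0:
            return math.inf
        total += pj * math.log(pj / qj)
    return total

def mean_pmf(pmfs):
    """Element-wise mean of a non-empty list of pmfs, cf. (52)."""
    return [sum(column) / len(pmfs) for column in zip(*pmfs)]

def kl_kmeans(psi, K, iterations=100):
    """Partition psi into K blocks; return the block labels and block means."""
    centroids = [list(p) for p in psi[:K]]        # naive initialization
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest block mean in the KL sense.
        new_labels = [min(range(K), key=lambda i: kl_divergence(p, centroids[i]))
                      for p in psi]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: recompute the mean of every non-empty block.
        for i in range(K):
            members = [p for p, lab in zip(psi, labels) if lab == i]
            if members:
                centroids[i] = mean_pmf(members)
    return labels, centroids

psi = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8],          # hypothetical parameters theta_k
       [0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
labels, theta_K = kl_kmeans(psi, K=2)
```

As with any Lloyd-type method, the result depends on the initialization and is only guaranteed to be a local solution, so several restarts may be worthwhile in practice.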
Figure 5: Illustration of the proposed learning algorithm. Figure 5a shows the residual
sample sets R1 , R2 , . . . , R9 in T and Figure 5b the corresponding distribution parameters
θ1 , θ2 , . . . , θ9 in Ψ.
Figure 6: The distribution parameters θ1 , θ2 , and θ3 learned from the training data in
Figure 5a, shown as p(r = xi |α, θ) against xi .
then
\[
V\left(T, \theta^{(K)}\right) = \max_{\theta \in \Theta^{(K)}} V\left(T, \theta\right), \tag{56}
\]
Proof. It is first noted that by Assumption 1, the joint pmf for the samples in R k ∈ T
is given by (42), which is equivalent to (12). From (15), and the fact that θ ∈ Θ(1) due
to (55), which implies that K = 1 and α 1 = 1 in (15), it holds that
\[
l\left(\theta \mid R_k\right) = \sum_{j=1}^{M} c_{kj} \log \theta_j, \tag{57}
\]
for p = 1, 2, . . . , K. Due to (58) it holds that for each T i ∈ T and for each R k ∈ T i
\[
\max_{p \in \{1, 2, \ldots, K\}} \sum_{j=1}^{M} c_{kj} \log \theta^\star_{pj} = \sum_{j=1}^{M} c_{kj} \log \theta^\star_{ij}. \tag{59}
\]
p∈{1,2,...,K} j=1 j=1
By definition (55), it holds that θ ⋆i ∈ Θ(1) , i = 1, 2, . . . , K, and therefore that θ (K) ∈ Θ(K)
with θ (K) defined by (56). To show that (56) is satisfied, it is sufficient to show that
V (T , θ⋆) is a maximum value. Since $\theta^\star_i = (\theta^\star_{i1}, \theta^\star_{i2}, \ldots, \theta^\star_{iM})$ is present only in the term
\[
\sum_{R_k \in T_i} \sum_{j=1}^{M} c_{kj} \log \theta^\star_{ij}, \tag{61}
\]
The implication of Theorem 4 is that solving (48) can be reduced to finding a
partition T = (T1 , T2 , . . . , TK ) of the set T , defined according to (40), that fulfills (54).
The next result establishes a relation between the sought partition T of T and a partition P
of the set Ψ computed in Step 2 of the algorithm.
To this end, KL-divergence needs to be properly defined. In general, for two distributions
of a discrete random variable R with range X that are characterized by the pmfs
$f_1(r)$ and $f_2(r)$, the KL-divergence between $f_1(r)$ and $f_2(r)$ is defined as
\[
D\left(f_1(r) \,\|\, f_2(r)\right) = \sum_{x_k \in X} f_1(x_k) \log \frac{f_1(x_k)}{f_2(x_k)}. \tag{62}
\]
It follows that $D\left(f_1(r) \,\|\, f_2(r)\right) \geq 0$, with equality if and only if $f_1(r) \equiv f_2(r)$.
A transformation of the sufficient condition in Theorem 4 on a partition T of T to a
partition P of the set Ψ is given by the following lemma.
Lemma 3. Let P i ⊆ Ψ, let
T i = {R k ∈ T ∶ θ k ∈ P i } (63)
and let all residual samples in all R k ∈ T i fulfill Assumption 1. Then, for any θ p , θ q ∈ Θ(1)
and for each R k ∈ T i it holds that
l (θ p ∣R k ) ≥ l (θ q ∣R k ) , (64)
if and only if for each θ k ∈ P i it holds that
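\[
D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta_p)\right) \leq D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta_q)\right). \tag{65}
\]
Moreover, it holds that
\[
\arg\max_{\theta \in \Theta^{(1)}} \sum_{R_k \in T_i} l\left(\theta \mid R_k\right) = \arg\min_{\theta \in \Theta^{(1)}} \sum_{\theta_k \in P_i} D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta)\right). \tag{66}
\]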
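The equivalence stated in Lemma 3 can be made plausible by a short calculation. The sketch below assumes that $\theta_k$ is the single-mode estimate with elements $\theta_{kj} = c_{kj}/n$ obtained from $R_k$ by means of Theorem 1, where $n = \sum_{j} c_{kj}$ is the number of samples in $R_k$. The entropy symbol $H$ is introduced only for this illustration.

```latex
\begin{align*}
l(\theta \mid R_k)
  &= \sum_{j=1}^{M} c_{kj} \log \theta_j
   = n \sum_{j=1}^{M} \theta_{kj} \log \theta_j \\
  &= -n \left[ \sum_{j=1}^{M} \theta_{kj} \log \frac{\theta_{kj}}{\theta_j}
       - \sum_{j=1}^{M} \theta_{kj} \log \theta_{kj} \right] \\
  &= -n \left[ D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta)\right) + H(\theta_k) \right],
  \qquad H(\theta_k) = -\sum_{j=1}^{M} \theta_{kj} \log \theta_{kj}.
\end{align*}
```

Since neither $n$ nor $H(\theta_k)$ depends on $\theta$, a larger log-likelihood corresponds to a smaller KL-divergence, which is the relation between (64) and (65). When every $R_k$ contains the same number $n$ of samples, summing over the block gives the relation (66).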
for i = 1, 2, . . . , K. Moreover, for each block P i ∈ P⋆ and for each element θ k ∈ P i it holds
that
D (p (r∣θ k ) ∣∣p (r∣θ ⋆i )) ≤ D (p (r∣θ k ) ∣∣p (r∣θ ⋆j )) , (68)
for j = 1, 2, . . . , K.
With help of Theorem 4, Lemma 3, and Lemma 4, it can be proved that the output
from the algorithm in Section 5.3 indeed is a solution to the problem (48).
Proof. Due to Step 3 in the algorithm, it is clear that the partition $P^\star = (P_1, P_2, \ldots, P_K)$
fulfills (51) and that $\theta^\star_i$, $i = 1, 2, \ldots, K$, fulfills (52). Lemma 4 then implies that (68)
holds for each block $P_i \in P^\star$ and for each element $\theta_k \in P_i$, and that $\theta^\star_i$, $i = 1, 2, \ldots, K$,
fulfills (67). Now define T = (T1 , T2 , . . . , TK ) with $T_i$ according to (63) for $i = 1, 2, \ldots, K$.
Note that due to (49) and (63), it follows that there is a block $T_i \in T$ and an element $R_k \in T_i$
for each element $\theta_k \in P_i$ and for each block $P_i \in P^\star$, and vice versa. The fact that $P^\star$ is a
partition of Ψ then implies that T is a partition of T . Application of Lemma 3 to each block
$P_i \in P^\star$ then asserts that the partition T satisfies $l(\theta^\star_i \mid R_k) \geq l(\theta^\star_j \mid R_k)$ for each block
$T_i \in T$ and for each element $R_k \in T_i$, for all $j = 1, 2, \ldots, K$. Further, since (67) holds
for $\theta^\star_i$ and due to (66) in Lemma 3, it follows that $\theta^\star_i = \arg\max_{\theta \in \Theta^{(1)}} \sum_{R_k \in T_i} l(\theta \mid R_k)$,
$i = 1, 2, \ldots, K$. The claim then follows directly from Theorem 4.
As a remark, it is noted that the assignment and update steps in fact (Bishop, 2006)
correspond to the Expectation and Maximization steps, respectively, in the EM-algorithm
(Dempster et al., 1977). Thus, when the K-means algorithm is employed for solving
the clustering problem in Step 3, the learning algorithm in a sense resembles the EM-
algorithm.
It is also noted that in a practical implementation of the learning algorithm, the
training data set T is preferably split into an estimation data set E and a validation data
set V, in order to avoid over-fitting, see, e.g., Ljung (1999). In this setting, the estimation
data set E is used when solving (48) to obtain θ (K) , for a fixed K, and then the validation
data set V is used to evaluate if the obtained solution θ (K) and K satisfy (41).
Parameter Choices
The only parameter involved in the learning problem (41) is n, the number of residual
samples used in each R k when calculating the set T according to (40), which is done in
Step 1 of the algorithm.
The choice of n is determined by the properties of the considered system. As said
in Section 5.2, n should be chosen so that each R k ∈ T contains residual samples from
only one operating mode of the system. In order to achieve this, n should be chosen so
that the time it takes to collect a set of n residual samples is less than the average time
that the system spends in one operating mode.
Before learning the parameter θ NF , the quantization M of the residual, i.e., the size
of the residual range space and thereby the resolution of the residual distribution (1),
must be determined and the training data in D formatted accordingly. Choosing M, in
fact, corresponds to the well-studied, but nevertheless difficult, problem of choosing the
number of bins in a regular histogram given a sample of data. Numerous approaches for
solving this problem exist, see for example Davies et al. (2009) and references therein.
Regardless of the method used to solve the problem, the choice of M is a trade-off
between accuracy and computational complexity, in terms of time and storage. A larger
M results in a more accurate discretization of the residual and higher resolution of the
probability distributions. On the other hand, a large M requires more memory and
involves more computations. The choice of M is also related to the choice of n and N,
since a small n, or N, together with a large M will result in an inadequate estimation of
the distribution, i.e., a sparse histogram.
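The quantization step can be sketched as a mapping from a continuous residual value to one of M equally wide bins, followed by counting. The sketch assumes a regular histogram over a fixed range with out-of-range values clipped to the end bins; the range limits, the example values, and the function names are hypothetical.

```python
def quantize(residual, r_min, r_max, num_bins):
    """Map a residual value to a bin index in 0..num_bins-1 (uniform bins, clipped)."""
    width = (r_max - r_min) / num_bins
    index = int((residual - r_min) // width)
    return min(max(index, 0), num_bins - 1)

def histogram_counts(residuals, r_min, r_max, num_bins):
    """Return the counts c_j of the residual samples falling in each bin."""
    counts = [0] * num_bins
    for r in residuals:
        counts[quantize(r, r_min, r_max, num_bins)] += 1
    return counts

# Hypothetical range of 240 units split into M = 80 bins, i.e. a bin width of 3.
counts = histogram_counts([-5.0, 0.0, 2.9, 3.0, 250.0],
                          r_min=0.0, r_max=240.0, num_bins=80)
```

With few samples and many bins, most entries of `counts` stay at zero, which is the sparse-histogram situation described above.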
The resolution of the residual also affects the fault detection performance in the
sense that if the resolution is high, small deviations of the residual can be perceived and
thereby small faults can be detected. As a guideline, the resolution of the residual can be
matched to the size of the smallest fault that should be possible to detect.
6 Application Example
The proposed residual evaluation approach has been applied to the problem of fault
detection in the gas-flow system of a Scania 6 cylinder, 13 liter, truck diesel engine
equipped with Exhaust Gas Recirculation (EGR), Variable Geometry Turbine (VGT),
and intake throttle. The overall purpose of the study was to evaluate and demonstrate
the proposed on-line residual evaluation algorithm, as well as the off-line algorithm
for learning no-fault residual distributions, using measurement data. In addition, it is
also illustrated how the fault detection performance of the residual evaluation test is
influenced by different values of the involved parameters, in particular the size N of the
residual sample set R, and the number K of no-fault distribution parameters in θ NF .
Parameter Values
The value of the parameter M, i.e., the quantization of the residual samples, was chosen
to be M = 80. This makes it theoretically possible to detect faults that cause deviations
of the residual of about 3 kelvin. For this application, this is a good trade-off between
complexity, in terms of required memory and computational effort, and accuracy.
Figure 7: Evaluation of model fit metrics V (E , θ (K) ) (dashed, black) and V (V , θ (K) )
(solid, red) for different values of K.
From a brief analysis of the residual samples, the minimum time that the
gas-flow system spends in one operating mode appears to be approximately 4 s. This can be seen in
Figure 1, which in fact shows a subset of the residual samples used in this study. Since
the sampling interval is 0.1 s, a duration of 4 s corresponds to 4/0.1 = 40 samples. The
parameter n, which specifies the number of residual samples
in each R k in the set T calculated in Step 1 of the algorithm, should therefore be chosen to satisfy
n < 40, see Section 5.5. Based on this, the parameter was chosen to be n = 32.
Results
The algorithm for learning no-fault distribution parameters described in Section 5.3,
was implemented in Matlab. To solve the involved clustering problem, the K-means
algorithm (MacQueen, 1967; Lloyd, 1982) was employed. The algorithm was run with
K ∈ {1, 2, . . . , 79}.
Figure 7 shows the model fit metric (47) evaluated for the estimation data set E and
validation data set V, and with the parameters θ (K) , K ∈ {1, 2, . . . , 79}, obtained as output
from the algorithm. In Figure 7 it can first of all be seen that the quantitative behaviors
of V (E , θ (K) ) and V (V , θ (K) ) are similar, but that V (E , θ (K) ) always is larger than
V (V , θ (K) ). The latter seems natural since the data set E indeed was used as input to
the learning algorithm. Second, it can also be noticed that the improvement in model fit
as a function of K is larger for smaller K.
Based on the above observations, and with respect to the trade-off between model
fit and required computational effort stated by (41), K = 10 was chosen. The 10 no-fault
distribution parameters, i.e., the rows of θ (10) , are shown in Figure 8. Note that the
characteristics of the learned distribution parameters are quite different, some are
multi-modal and some have only one single mode. In addition, the distribution parameters are
overlapping.
Figure 8: The 10 learned no-fault distribution parameters θ1 , θ2 , . . . , θ10 , i.e., the rows of
θ (10) , each plotted against xi .
Considered Fault
The fault considered in the evaluation is a fault in the boost pressure sensor. The relation
between the boost pressure sensor signal $y_{pim}$ and the considered residual is dynamic,
and the residual value r depends on the derivative of the boost pressure sensor signal,
as well as the actual sensor signal, i.e., $r = F(y_{pim}, \dot{y}_{pim}, \ldots)$, where F(⋅) is a non-linear
function. The considered fault scenario is a gain fault in the boost pressure sensor, that is,
the sensor signal $y_{pim}$ fed to the residual generator is $y_{pim} = \delta \cdot p_{im}$, where $p_{im}$ is the actual
boost pressure, and δ ≠ 1 indicates a gain fault. Gain faults in the range δ ∈ [0.2, 1.8]
were implemented off-line by modification of the sensor signal.
for the test λR (R) > J, defined in Section 4. Note that δ = 1 in the power function (69)
corresponds to that α ∈ Υ and θ = θ NF in the power function (34).
To study another important aspect of the detection performance, the Mean Time
to Detection (MTD) will also be considered. Note that the choices of the values of the
parameters N and J, i.e., the size of the residual sample set R and the detection threshold,
respectively, are a trade-off between the metrics measured by the power function and
the MTD, see Section 4.2.
To give an indication of the relative performance of the proposed
residual evaluation approach, it will be compared to the norm-based residual evaluation
approach often used in practice, built upon the test statistic $s(R) = \frac{1}{N} \sum_{\bar{r}_k \in \bar{R}} \bar{r}_k^2$,
where $\bar{R} = (\bar{r}_1, \bar{r}_2, \ldots, \bar{r}_N)$ is a low-pass filtered version of the sample R. Note that
the purpose of this comparison merely is to give a feeling of the relative performance
of the proposed residual evaluation approach, and the comparison is not claimed to
be exhaustive. The low-pass filtering was in this study performed with a first-order
Butterworth filter and for comparison, four different cut-off frequencies, f 1 = 0.005
Hz, f 2 = 0.05 Hz, f 3 = 0.5 Hz, and f 4 = 4.5 Hz, were used. The corresponding test
statistics are denoted s 1 , s 2 , s 3 , and s 4 . Recall that the residual is sampled with an interval of 0.1 s,
corresponding to a frequency of f s = 10 Hz.
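The norm-based statistic can be sketched as follows. The filter is a first-order Butterworth low-pass filter discretized with the bilinear transform and frequency pre-warping, started from a zero initial state. Those discretization details, the function names, and the constant test signal are assumptions made for this illustration, not details taken from the study.

```python
import math

def butterworth_lowpass_1st(samples, cutoff_hz, sample_rate_hz):
    """First-order Butterworth low-pass filter (bilinear transform, zero initial state)."""
    k = math.tan(math.pi * cutoff_hz / sample_rate_hz)   # pre-warped cut-off
    b = k / (1.0 + k)                                    # feed-forward coefficient
    a = (k - 1.0) / (k + 1.0)                            # feedback coefficient
    filtered, x_prev, y_prev = [], 0.0, 0.0
    for x in samples:
        y = b * (x + x_prev) - a * y_prev
        filtered.append(y)
        x_prev, y_prev = x, y
    return filtered

def norm_statistic(samples, cutoff_hz, sample_rate_hz=10.0):
    """Mean of the squared, low-pass filtered residual samples."""
    filtered = butterworth_lowpass_1st(samples, cutoff_hz, sample_rate_hz)
    return sum(v * v for v in filtered) / len(filtered)

constant_residual = [2.0] * 200                          # hypothetical residual samples
filtered_tail = butterworth_lowpass_1st(constant_residual, 4.5, 10.0)[-1]
s4 = norm_statistic(constant_residual, cutoff_hz=4.5)
```

For a constant input the filter output settles at the input value, so the statistic approaches the squared residual level once the initial transient has decayed.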
Implementation Details
The residual evaluation algorithm described in Section 4.2, was implemented in Matlab.
To solve the optimization problem (28), a tailored solver was generated using the soft-
ware tool CVXGEN (Mattingley and Boyd, 2012), see Section 4.3. With this solver, the
optimization problem (28) in the setting of this study, could be solved in the time scale
of $10^{-4}$ s. Solving the corresponding problem using the Matlab optimization toolbox
results in solving times of the magnitude of $10^{-3}$ s. Solving the original numerator MLE
problem (25) using the Matlab optimization toolbox however renders solving times of
magnitude $10^{-1}$ s.
As said in Section 4, it is only justified, in terms of the probability of false detection,
to consider the relaxed problem (28) instead of the original MLE problem (25) if the size
N of the set of residual samples R is sufficiently large. To investigate the meaning of
sufficiently large in the context of this study, Figure 9 shows a comparison of the solutions
to the respective problems, as well as a comparison of the corresponding test statistics, for
different values of N in the no-fault case. Figure 9a shows a comparison of the solution
$\alpha^R$ to the relaxed problem (28) and the solution $\alpha^O$ to the original MLE problem (25),
by means of the quantity $\|\phi_R - \phi_O\|_2^2$, where $\phi_R = \sum_{i=1}^{K} \alpha^R_i \theta^{NF}_i$ and $\phi_O = \sum_{i=1}^{K} \alpha^O_i \theta^{NF}_i$.
Figure 9b shows a comparison of the test statistics λR (R), based on the relaxed problem,
and λ(R), based on the original MLE problem, by means of the quantity λ(R)/λR (R). The
results shown in Figure 9 are the average of 150,000 runs. Based on Figure 9, it was
concluded that in the context of this study, N > 1000 is good enough to justify the
switch to the relaxed problem. Recall from Section 4.3 that the complexity of the relaxed
problem, in terms of computational time and memory, is independent of N.
Figure 9: Investigation of how the relation between the solutions α R and α O to the
relaxed (28) and original (25) MLE problems, respectively, as well as the corresponding
test quantities, λR (R) and λ(R), changes with the size N of the residual sample R.
(a) Comparison of α R and α O by means of $\|\phi_R - \phi_O\|_2^2$ as a function of N.
(b) Comparison of λR (R) and λ(R) by means of λ(R)/λR (R) as a function of N.
The thresholds J for the test λR (R) > J, as well as the thresholds for the norm-based
tests, were computed based on the estimation data set used in the learning of the no-fault
distribution parameters. All thresholds were computed in order to give a probability of
false detection of 5 %. All residual sample sets were taken from the validation data set by
using a sliding window, see Section 4.2.
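A threshold giving a prescribed probability of false detection can be estimated empirically from test statistic values computed on no-fault data. The sketch below picks the smallest observed value that at most 5 % of the no-fault values exceed. The order-statistic rule, the function name, and the example data are assumptions for illustration, and the exact procedure used in this study is not specified here.

```python
import math

def empirical_threshold(no_fault_statistics, false_alarm_rate=0.05):
    """Return an observed value J exceeded by at most the given fraction of values."""
    ordered = sorted(no_fault_statistics)
    # Number of values allowed to lie strictly above the threshold.
    allowed = math.floor(false_alarm_rate * len(ordered))
    return ordered[len(ordered) - allowed - 1]

stats = [float(v) for v in range(1, 101)]    # hypothetical no-fault statistic values
J = empirical_threshold(stats)
false_alarms = sum(1 for v in stats if v > J) / len(stats)
```

Computing the threshold on estimation data and checking the resulting false detection rate on separate validation data guards against an overly optimistic setting.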
Power as Function of N
To illustrate how the power of the test λR (R) > J varies with the number N of residual
samples in R, Figure 11 shows the power function for the test for different values of N
and parameter θ NF = θ (10) . Figure 11 clearly shows that the power of the test increases
with N.
Figure 10: Residual r (top), test statistic λR (R) (middle), and test statistic s 1 (R) (bottom),
when an abrupt fault occurs at t = 450 s. The fault is a 10 % gain fault in the boost pressure
sensor, which corresponds to δ = 1.1.
Figure 11: Power function β λR (δ) as a function of the fault size δ for the test λR (R) > J,
for different sizes N of the sample R (N = 64, 128, 256, 512, 1024, 2048, 4096, 8192).
The power increases with N.
In Figure 11, it can be seen that faults as small as δ ≈ 0.95 and δ ≈ 1.05, corresponding
to gain faults in the boost pressure sensor of about ± 5 %, may be possible to detect if N
is sufficiently large. To further illustrate this, Figure 12 shows the Receiver Operating
Characteristic (ROC) curve for different values of N, for a test case with δ = 1.05. The
ROC curve shows the relation between the True Positive Rate (TPR) of detection (y-axis),
and the False Positive Rate (FPR) of detection (x-axis), i.e., the relation between correct
detections and false detections, when the detection threshold J is varied. Figure 12 again
shows that the detection performance increases with N, but also that the rate of false
detections can be made lower than the rate of actual detections even for moderate values
of N.
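An ROC curve of this kind can be traced by sweeping the threshold J over the observed values of the test statistic and recording the fraction of alarms in the no-fault data (FPR) and in the fault data (TPR). The sketch below does this for two small lists of hypothetical statistic values; the function name is also hypothetical.

```python
def roc_curve(no_fault_stats, fault_stats):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold J downwards."""
    thresholds = sorted(set(no_fault_stats) | set(fault_stats), reverse=True)
    thresholds.append(float("-inf"))          # final threshold: every sample alarms
    points = []
    for j in thresholds:
        # An alarm is raised when the statistic strictly exceeds the threshold.
        fpr = sum(1 for v in no_fault_stats if v > j) / len(no_fault_stats)
        tpr = sum(1 for v in fault_stats if v > j) / len(fault_stats)
        points.append((fpr, tpr))
    return points

points = roc_curve([1.0, 2.0, 3.0, 4.0],      # hypothetical no-fault statistic values
                   [3.5, 5.0, 6.0, 7.0])      # hypothetical statistic values with a fault
```

The curve starts at (0, 0) for a threshold above all values and ends at (1, 1) when every sample raises an alarm. The closer the intermediate points lie to the upper left corner, the better the two cases are separated.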
Power as Function of K
To analyze how the power of the test λR (R) > J varies with different values of the
parameter θ NF = θ (K) , specifying the set of no-fault residual distributions, or more
specifically with K, i.e., the number of operating modes of the system, Figure 13 shows
the power function for the test for different values of K. All considered parameters θ (K)
were obtained by means of the algorithm described in Section 5. To also see how the
power of the test depends on the relation between K and N, Figure 13 shows how the
power function depends on K for different values of N.
The general conclusion from the evaluation shown in Figure 13, is that for a given
256 ≤ N ≤ 1024, the power of the test λR (R) > J is almost equal for all considered K.
For small N, e.g., N = 64, however, the power increases with K and for large N, e.g.,
N = 4096, the power increases as K decreases. The liable rationale behind this is that a
small K results in a generic and averaged, in terms of operating modes, description of
Figure 12: ROC for test λR (R) > J when δ = 1.05 for different sizes N of the sample R
(N = 64, 128, 256, 512, 1024, 2048, 4096, and 8192), with TPR on the y-axis and FPR on
the x-axis.
the set of no-fault residual distributions. A large set of residual samples typically means
residual samples from a variety of operating modes, while a small set of residual samples
on the other hand means residual samples from only a few operating modes. This means
that a parameter θ NF corresponding to a small K, typically can describe the distribution
of a large set of no-fault residual samples, i.e., a large N, better than the distribution of
a small set of no-fault residual samples, i.e., a small N. An accurate description of the
no-fault residual distribution makes it possible to distinguish it from a faulty residual
distribution, which in turn gives good detection power.
Comparison of Tests
Figure 14 shows a comparison of the power functions for the tests based on the test
statistics λR (R), s 1 (R), s 2 (R), s 3 (R), and s 4 (R), for different values of the parameter
N, which specifies the number of residual samples in R. For the test statistic λR (R), the
parameter θ NF = θ (10) illustrated in Figure 8 was used.
Figure 14 shows that the powers of all tests increase with N and that the differences
between the powers of the tests seem to decrease with increasing N. It can also be seen
that the power function for the test based on λR (R) is nearly symmetric for all N, while
the power functions for the other tests are asymmetric and tend to be less powerful for
fault sizes δ < 1. The difference in power for δ < 1 is, for example, significant for N = 64.
The mean time to detection (MTD) for each of the tests based on λR (R), s 1 (R),
s 2 (R), s 3 (R), and s 4 (R), is shown in Figure 15, for different sizes N of the sample R.
In order to get comparable results, the MTD was computed as the mean of the
detection time for the two largest faults, corresponding to δ = 0.2 and δ = 1.8, since all
Figure 13: Comparison of power functions for the test based on λR (R) for a set of no-fault
distribution parameters θ (K) with different values of K (K = 3, 10, 22, 30, 48, and 64).
The four panels show the power β(δ) against the fault size δ for N = 64, N = 256,
N = 1024, and N = 4096.
considered test statistics are able to detect these faults to some extent, see Figure 14. Each
fault was injected in the test sequence at 10 time instances.
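The MTD computation can be illustrated by the following minimal sketch, which averages the delay from each fault injection to the first subsequent alarm. The function name, the handling of undetected injections, and the toy alarm sequence are assumptions made for illustration; the source does not specify these details.

```python
def mean_time_to_detection(alarm, injection_times):
    """Mean delay, in samples, from fault injection to the first alarm.

    alarm: sequence of 0/1 detection signals, one per sample.
    injection_times: sorted sample indices at which a fault is injected.
    Each injection is searched only up to the next injection, and
    injections that are never detected are left out of the mean
    (both are assumptions of this sketch).
    """
    delays = []
    for idx, t0 in enumerate(injection_times):
        if idx + 1 < len(injection_times):
            t_end = injection_times[idx + 1]
        else:
            t_end = len(alarm)
        for t in range(t0, t_end):
            if alarm[t]:
                delays.append(t - t0)
                break
    return sum(delays) / len(delays) if delays else None


# Hypothetical alarm sequence with faults injected at samples 1 and 5.
mtd = mean_time_to_detection([0, 0, 1, 0, 0, 0, 0, 1, 0, 0], [1, 5])
```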
In Figure 15, it can be seen that the MTDs for all tests increase for N > 256. For
N < 256, however, the MTD decreases with N for the norm-based tests and increases
with N for the test based on λR (R). It is worth noting that, for all N, the MTD for the
test based on λR (R) is smaller than the MTDs for all other tests.
7 Conclusions
As illustrated by Figure 1, residuals in practice often deviate from zero even in the
no-fault case due to uncertainties and disturbances caused by for example modeling
errors, measurement noise, and unmodeled phenomena. In addition, due to changes
in the operating mode of the underlying system, the magnitude of uncertainties and
disturbances is time-varying, causing the behavior of residuals to be non-stationary. To
handle these issues, a novel statistical residual evaluation approach has been proposed.
The main contribution is to base the residual evaluation on an explicit comparison of
the probability distribution of the residual, estimated on-line using current data, with a
no-fault residual distribution. The no-fault distribution is based on a set of a-priori known
Figure 14: Comparison of power functions for the tests based on λR (R) (solid with dot
markers), s 1 (R) (solid), s 2 (R) (dashed), s 3 (R) (dash-dotted), and s 4 (R) (dotted), for
different sizes N of the sample R. The four panels show the power β(δ) against the fault
size δ for N = 64, N = 256, N = 1024, and N = 4096.
Figure 15: Comparison of the Mean Time to Detection (MTD), in samples, for the tests
based on λR (R) (solid with dot markers), s 1 (R) (solid), s 2 (R) (dashed), s 3 (R)
(dash-dotted), and s 4 (R) (dotted), for different sizes N of the sample R.
no-fault distributions, and is continuously adapted to the current operating mode of the
system by means of the likelihood maximization problem (26). A computationally efficient
version of the residual evaluation test statistic suitable for on-line implementation has
been derived by considering a properly chosen approximation (28) to the maximization
problem (26). The fault detection properties of the resulting residual evaluation test have
been analyzed by means of Theorems 2 and 3.
As a second contribution, a method has been proposed for learning the required set
of no-fault residual distributions off-line from training data. Thus, by using this method,
the overall residual evaluation method is data-driven and no assumptions regarding the
properties of the probability distribution of the residual, nor the properties of the faults
to be detected, are needed. The method was given by means of an algorithm based on
K-means clustering, and was theoretically justified in Theorem 5.
The proposed residual evaluation method has been evaluated with measurement
data on a residual for fault detection in the gas-flow system of a Scania truck diesel
engine. The proposed test statistic performs well despite non-conventional properties
of the considered residual. For instance, the method outperforms regular norm-based
methods using constant thresholding in the sense that small faults can be detected in cases
where these methods fail. It has been empirically investigated how the fault detection
performance of the proposed method is influenced by different values of the involved
parameters.
Acknowledgment
This work was sponsored by Scania and VINNOVA (Swedish Governmental Agency for
Innovation Systems).
Proof. According to (Casella and Berger, 2001, Theorem 10.1.6), (71) holds if the following
regularity conditions on p (r∣ϕ) are satisfied: i) r 1 , r 2 , . . . , r N are iid samples from p (r∣ϕ);
ii) the parameter ϕ is identifiable, i.e., if ϕ ≠ ϕ′ , then p (r∣ϕ) ≠ p (r∣ϕ′ ); iii) the densities
p (r∣ϕ), for all ϕ ∈ Φ′ , have common support, and p (r∣ϕ) is differentiable in ϕ; iv) the
parameter space Φ′ contains an open set φ of which the true parameter ϕ is an interior
point. It is first noted that condition i) is trivially satisfied by assumption. For condition ii),
assume that ϕ ≠ ϕ′ . This implies that there exists k ∈ {1, 2, . . . , M} such that ϕ k ≠ ϕ′k ,
and it holds that
and hence p (r∣ϕ) ≠ p (r∣ϕ′ ). Regarding condition iii), it is recalled that the support of a
function is the set of points where the function is non-zero. Thus, the first part of
condition iii) is trivially satisfied due to the form of the pmf p (r∣ϕ) in (1) and the prop-
erties of the parameter space Φ′ defined by (70). Considering next the differentiability, it
holds that
\[
\frac{\partial}{\partial \phi_k}\, p(x_j \mid \phi) =
\begin{cases}
1, & k = j, \\
0, & k \neq j,
\end{cases}
\]
for j = 1, 2, . . . , M, and hence condition iii) is satisfied. For condition iv), it is noted that
the parameter space Φ′ is an open set. Therefore every ϕ ∈ Φ′ is an interior point of an
open set and condition iv) is satisfied. This completes the proof.
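\[
\sum_{i=1}^{K} \alpha_i^{\star} \theta_{ij}^{\star} = \frac{c_j}{N}, \tag{72}
\]
where $N = \sum_{j=1}^{M} c_j$. Then, for each $\alpha \in \Upsilon$ and $\theta \in \Theta^{(K)}$ it holds that
\[
D\left(p(r \mid \alpha^{\star}, \theta^{\star}) \,\|\, p(r \mid \alpha, \theta)\right)
= \frac{1}{N} \log \frac{L(\alpha^{\star}, \theta^{\star} \mid R)}{L(\alpha, \theta \mid R)}, \tag{73}
\]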
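\begin{align}
D\left(p(r \mid \alpha^{\star}, \theta^{\star}) \,\|\, p(r \mid \alpha, \theta)\right)
&= \sum_{j=1}^{M} p(x_j \mid \alpha^{\star}, \theta^{\star})
   \log \frac{p(x_j \mid \alpha^{\star}, \theta^{\star})}{p(x_j \mid \alpha, \theta)} \notag \\
&= \sum_{j=1}^{M} \left( \sum_{i=1}^{K} \alpha_i^{\star} \theta_{ij}^{\star} \right)
   \log \frac{\sum_{i=1}^{K} \alpha_i^{\star} \theta_{ij}^{\star}}{\sum_{i=1}^{K} \alpha_i \theta_{ij}} \notag \\
&= \sum_{j=1}^{M} \frac{c_j}{N} \log \frac{c_j / N}{\sum_{i=1}^{K} \alpha_i \theta_{ij}}. \tag{74}
\end{align}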
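Consider next the right hand side of (73). Due to the prerequisites of the lemma, the
log-likelihood l (⋅, ⋅∣R) is given by (15). With this, the right hand side of (73) can be written
as
\begin{align*}
\frac{1}{N} \log \frac{L(\alpha^{\star}, \theta^{\star} \mid R)}{L(\alpha, \theta \mid R)}
&= \frac{1}{N} \left( l(\alpha^{\star}, \theta^{\star} \mid R) - l(\alpha, \theta \mid R) \right) \\
&= \frac{1}{N} \left( \sum_{j=1}^{M} c_j \log \left[ \sum_{i=1}^{K} \alpha_i^{\star} \theta_{ij}^{\star} \right]
   - \sum_{j=1}^{M} c_j \log \left[ \sum_{i=1}^{K} \alpha_i \theta_{ij} \right] \right) \\
&= \frac{1}{N} \sum_{j=1}^{M} c_j \log \frac{\sum_{i=1}^{K} \alpha_i^{\star} \theta_{ij}^{\star}}{\sum_{i=1}^{K} \alpha_i \theta_{ij}} \\
&= \frac{1}{N} \sum_{j=1}^{M} c_j \log \frac{c_j / N}{\sum_{i=1}^{K} \alpha_i \theta_{ij}},
\end{align*}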
which equals (74).
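The identity (73) can also be checked numerically. The sketch below, given under the assumption that the log-likelihood has the form l = Σ_j c_j log Σ_i α_i θ_ij used in the derivation above, compares the Kullback-Leibler divergence with the scaled log-likelihood ratio for a mixture whose bin probabilities equal c_j /N as in (72). All numerical values and function names are hypothetical.

```python
import math


def mixture(alpha, theta):
    """Bin probabilities sum_i alpha_i * theta_ij of a K-component mixture."""
    n_bins = len(theta[0])
    return [sum(a * th[j] for a, th in zip(alpha, theta)) for j in range(n_bins)]


def log_likelihood(counts, probs):
    """l = sum_j c_j log p_j; bins with c_j = 0 contribute nothing."""
    return sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def kl_divergence(p, q):
    """D(p || q) = sum_j p_j log(p_j / q_j) for discrete distributions."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)


counts = [5, 3, 2]                     # hypothetical bin counts c_j
N = sum(counts)
p_star = [c / N for c in counts]       # bin probabilities satisfying (72)

alpha = [0.6, 0.4]                     # arbitrary mixture weights
theta = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]
q = mixture(alpha, theta)

lhs = kl_divergence(p_star, q)                                        # left side of (73)
rhs = (log_likelihood(counts, p_star) - log_likelihood(counts, q)) / N  # right side of (73)
```

The two quantities `lhs` and `rhs` agree up to floating-point rounding, in line with the lemma.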
Proof of Lemma 3. First note that (63) implies that for each θ k ∈ P i there is an element
R k ∈ T i , and vice versa. By using the same arguments as in the proof of Theorem 4, it
holds that the log-likelihood l (θ k ∣R k ) is given by (57). Thus, each MLE problem in (49)
is equivalent to (16) if K = 1 and α 1 = 1, or equivalently (17), and Theorem 1 is applicable.
From Theorem 1 it then follows that $\theta_{kj} = c_{kj}/n$, j = 1, 2, . . . , M, for each θ k ∈ Ψ. From
Lemma 6, again with K = 1 and α 1 = 1, it follows that
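\begin{align}
D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta)\right)
&= \frac{1}{n} \log \frac{L(\theta_k \mid R_k)}{L(\theta \mid R_k)} \notag \\
&= \frac{1}{n} \left( l(\theta_k \mid R_k) - l(\theta \mid R_k) \right), \tag{75}
\end{align}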
for any θ ∈ Θ(1) . Consider now the inequality (65). By exploiting (75), the inequality (65)
can be written as
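\begin{align*}
& D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta_p)\right) \leq D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta_q)\right) \\
\iff\; & \frac{1}{n} \left( l(\theta_k \mid R_k) - l(\theta_p \mid R_k) \right)
        \leq \frac{1}{n} \left( l(\theta_k \mid R_k) - l(\theta_q \mid R_k) \right) \\
\iff\; & l(\theta_p \mid R_k) \geq l(\theta_q \mid R_k),
\end{align*}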
and equivalence between (65) and (64) has been established. Consider now (66). By
again using (75) and (63), it follows that
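\[
\arg\min_{\theta \in \Theta^{(1)}} \sum_{\theta_k \in P_i} D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta)\right)
= \arg\min_{\theta \in \Theta^{(1)}} \sum_{R_k \in T_i} \frac{1}{n} \left( l(\theta_k \mid R_k) - l(\theta \mid R_k) \right). \tag{76}
\]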
Since θ appears only in the term l (θ∣R k ) in (76), and due to the minus sign in front
of this term, (76) can be written as
\[
\arg\min_{\theta \in \Theta^{(1)}} \sum_{R_k \in T_i} \frac{1}{n} \left( l(\theta_k \mid R_k) - l(\theta \mid R_k) \right)
= \arg\max_{\theta \in \Theta^{(1)}} \sum_{R_k \in T_i} l(\theta \mid R_k),
\]
for i = 1, 2, . . . , K. Now note that due to the properties of the log-likelihood function (57)
it holds that
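\begin{align}
l\Big(\theta \,\Big|\, \bigcup_{R_k \in T_i} R_k\Big)
&= \sum_{j=1}^{M} \sum_{R_k \in T_i} c_{kj} \log \theta_j \notag \\
&= \sum_{R_k \in T_i} \sum_{j=1}^{M} c_{kj} \log \theta_j \tag{79} \\
&= \sum_{R_k \in T_i} l(\theta \mid R_k) \notag
\end{align}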
for i = 1, 2, . . . , K. From (80), the claim (67) follows directly via (66) in Lemma 3. Now
turn to the claim (68) and denote
\[
M(P^{\star}) = \sum_{i=1}^{K} \sum_{\theta_k \in P_i} D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta_i^{\star})\right), \tag{81}
\]
for i = 1, 2, . . . , K. To show that (68) holds by contradiction, assume that there exists
θ p ∈ P i , for some P i ∈ P⋆ , such that
and note that, due to (83), it holds that M̄ < M(P⋆ ). Define a new partition P′ of Ψ by
moving θ p from block P i ∈ P⋆ to block P j ∈ P⋆ , i.e., let P′ = (P′1 , P′2 , . . . , P′K ) where
\[
P'_l =
\begin{cases}
P_i \setminus \{\theta_p\}, & l = i, \\
P_j \cup \{\theta_p\}, & l = j, \\
P_l, & \text{else.}
\end{cases} \tag{85}
\]
Form
\[
M(P') = \sum_{l=1}^{K} \sum_{\theta_k \in P'_l} D\left(p(r \mid \theta_k) \,\|\, p(r \mid \theta'_l)\right) \tag{86}
\]
where
for l = 1, 2, . . . , K. It is first noted that due to Lemma 3, and a similar argument as above
including (79), (78), and (77), the distribution parameters θ ′l , l = 1, 2, . . . , K, satisfy (52).
Consider now the quantity M̄ − M(P′ ), which by using (84) and (86) can be written as
Due to (81) and the properties of the partition P′ as given by (85), it holds that
and therefore (89) implies that M̄ − M(P′ ) ≥ 0, or equivalently that M(P′ ) ≤ M̄. Thus,
it holds that M(P′ ) ≤ M̄ < M(P⋆ ), which contradicts the statement (51). Hence, (83)
cannot hold and consequently (68) holds and the proof is complete.
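The role of the pooled sample in the argument above can be illustrated numerically. The following minimal sketch, assuming that all samples R k in a block have the same size n and that θ k = c k /n, evaluates the summed Kullback-Leibler divergence for the pooled bin frequencies and for a few alternative distributions. The counts and function names are hypothetical and serve only as an illustration of the centroid property used in the proof.

```python
import math


def kl(p, q):
    """D(p || q) for discrete distributions given as lists."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)


def pooled_frequencies(samples):
    """Bin frequencies of the union of the samples (pooled counts / total)."""
    n_bins = len(samples[0])
    totals = [sum(s[j] for s in samples) for j in range(n_bins)]
    n_all = sum(totals)
    return [t / n_all for t in totals]


def kl_cost(samples, theta):
    """Sum over samples of D(p(r|theta_k) || p(r|theta)), theta_k = c_k / n."""
    cost = 0.0
    for s in samples:
        n = sum(s)
        cost += kl([c / n for c in s], theta)
    return cost


# Hypothetical bin counts for three samples in one block, each with n = 10.
block = [[6, 3, 1], [4, 4, 2], [5, 2, 3]]
centroid = pooled_frequencies(block)

alternatives = [[0.4, 0.4, 0.2], [0.6, 0.2, 0.2], [1 / 3, 1 / 3, 1 / 3]]
costs = [kl_cost(block, theta) for theta in alternatives]
best = kl_cost(block, centroid)
```

In this example the cost at the pooled frequencies is lower than at each of the alternative distributions, which is the behavior the equivalence between (76) and the pooled log-likelihood predicts.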
References
M. Abid, W. Chen, S. X. Ding, and A. Q. Khan. Optimal residual evaluation for nonlinear
systems using post-filter and threshold. International Journal of Control, 84(3):526–539,
2011.
Å. Björck. Numerical Methods for Least Squares Problems. SIAM, Philadelphia, PA,
1996.
M. R. Blas and M. Blanke. Stereo vision with texture learning for fault-tolerant automatic
baling. Computers and Electronics in Agriculture, 75(1):159–168, 2011.
G. Casella and R. L. Berger. Statistical Inference. Duxbury Press, second edition, 2001.
J. Chen and R. J. Patton. Robust Model-Based Fault Diagnosis for Dynamic Systems.
Kluwer, Boston, MA, 1999.
R. N. Clark. State estimation schemes for instrument fault detection. In R. J. Patton,
P. M. Frank, and R. N. Clark, editors, Fault Diagnosis in Dynamic Systems: Theory and
Application, chapter 2, pages 21–45. Prentice Hall, 1989.
L. Davies, U. Gather, D. Nordman, and H. Weinert. A comparison of automatic
histogram constructions. ESAIM: Probability and Statistics, 13:181–196, 2009.
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete
data via the EM algorithm. Journal of the Royal Statistical Society, 39(1):1–38, 1977.
S. X. Ding, P. Zhang, and E. L. Ding. Fault detection system design for a class of
stochastically uncertain systems. In Hong-Yue Zhang, editor, Fault Detection, Supervision
and Safety of Technical Processes 2006, pages 705–710. Elsevier Science Ltd, 2007.
A. Emami-Naeini, M. M. Akhter, and S. M. Rock. Effect of model uncertainty on failure
detection: the threshold selector. IEEE Transactions on Automatic Control,
33(12):1106–1115, 1988.
P. M. Frank. Enhancement of robustness in observer-based fault-detection. International
Journal of Control, 59(4):955–981, 1994.
P. M. Frank. Residual evaluation for fault diagnosis based on adaptive fuzzy
thresholds. In IEE Colloquium on Qualitative and Quantitative Modelling Methods for
Fault Diagnosis, pages 401–411, 1995.
P. M. Frank and X. Ding. Survey of robust residual generation and evaluation methods
in observer-based fault detection systems. Journal of Process Control, 7(6):403–424,
1997.
J. J. Gertler. Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker, 1998.
F. Gustafsson. Adaptive Filtering and Change Detection. Wiley, 2000.
G. H. Hardy, J. E. Littlewood, and G. Polya. Inequalities. Cambridge, 1934.
K. H. Haskell and R. J. Hanson. An algorithm for linear least squares problems with
equality and nonnegativity constraints. Mathematical Programming, 21(1):98–118, 1981.
T. Höfling and R. Isermann. Fault detection based on adaptive parity equations and
single-parameter tracking. Control Engineering Practice, 4(10):1361–1369, 1996.
M. Inaba, N. Katoh, and H. Imai. Applications of weighted Voronoi diagrams and
randomization to variance-based k-clustering (extended abstract). In Proceedings of
the Tenth Annual Symposium on Computational Geometry, SCG ’94, pages 332–339, New
York, NY, USA, 1994. ACM.
C. Svärd, M. Nyberg, E. Frisk, and M. Krysander. Residual evaluation for fault diagnosis
by data-driven analysis of non-stationary probability distributions. In Proceedings of
the 50th IEEE Conference on Decision and Control and European Control Conference
(CDC-ECC 2011), 2011.
A. Willsky and H. Jones. A generalized likelihood ratio approach to the detection and
estimation of jumps in linear systems. IEEE Transactions on Automatic Control,
21(1):108–112, 1976.
S. J. Wright. Primal-Dual Interior-Point Methods. SIAM, Philadelphia, PA, 1997.
Automotive Engine FDI by Application of an
Automated Model-Based and Data-Driven
Design Methodology
Carl Svärd, Mattias Nyberg, Erik Frisk, and Mattias Krysander
Abstract
1 Introduction
Emission-related legislation (United Nations, 2008; European Parliament, 2009;
California EPA, 2010; United States EPA, 2009) requires on-board diagnosis (OBD) of all
faults in automotive engines that may lead to increased exhaust emissions. In addition,
fault accommodation by, e.g., fault-tolerant control (FTC) (Blanke et al., 2006), and
off-board diagnosis, are means to meet dependability requirements in the form of high
vehicle uptime, high safety, and efficient repair. A prerequisite for both diagnosis and fault
accommodation is fault detection and isolation (FDI).
Automotive engines pose several challenges and difficulties when it comes to design
of FDI-systems. Typically, engines are optimized for low-cost and high functionality, and
not for FDI, which means that there is no hardware redundancy in the form of multiple
sensors. To obtain good detection and isolation of faults it is therefore necessary to
employ analytical redundancy and model-based FDI. Due to the inherent complexity of
automotive engines, as well as their multi-domain features due to chemical, mechanical,
and thermodynamic subsystems, modeling results in large-scale, dynamic, and highly
non-linear systems (Wahlström and Eriksson, 2011). Thus, such models must be handled
by the methods used in the design of the FDI-system.
As a consequence of the complexity of automotive engines, in combination with their
wide operating range, models are typically not fully capable of capturing their behavior
in all operating modes. This results in model errors, and in particular stationary model
errors (Höckerdal et al., 2011a,b), regardless of substantial modeling work. In addition, a
model may be more accurate in one operating mode than another and since the operating
mode of the engine varies in time, so does the magnitude and nature of the model errors.
These aspects must be taken into account in the design of the FDI-system.
It is clear that design of a complete model-based FDI-system for an automotive
engine, and for large-scale real-world systems in general, is an intricate task that
demands a substantial engineering effort. An optimal solution in general requires detailed
knowledge of the behavior of the system and well-defined requirements, which typically
are not available during early design stages. In order to make the overall design process
more systematic and efficient, and in this way enable re-design or re-configuration, and
eventually higher quality, a generic automated methodology for design of FDI-systems
has been developed.
The design methodology relies on previously developed methods for sequential
residual generation (Svärd and Nyberg (2010), Paper B), and statistical residual evalua-
tion (Paper C). The residual generation methods described in Svärd and Nyberg (2010)
and Paper B are together able to design residual generators for fault detection and isola-
tion in systems described by complex large-scale models. This was demonstrated in Svärd
and Nyberg (2012), where they were combined with a residual evaluation approach based
on the Kullback-Leibler divergence (Kullback and Leibler, 1951) and applied to the Wind
Turbine Benchmark (Fogh Odgaard et al., 2009). The residual evaluation approach
employed in Svärd and Nyberg (2012) was however not able to fully handle the issue
concerning time-varying uncertainties related to model errors and operating modes
discussed above. In this work, the automated design methodology is refined by means
of the data-driven statistical residual evaluation approach described in Paper C, which
[Figure: schematic of the engine showing the EGR-cooler, EGR-valve, intake manifold,
cylinders, exhaust manifold, and turbine, with the variables xegr , pamb , Tamb , Wegr , ρ,
ne , xvgt , pim , Tim , Wei , Weo , pem , Tem , Wt , ωt , Wth , xth , pbc , Tbc , pic , and Wc .]
the intake manifold, the air is mixed with recirculated exhaust gases, whose mass-flow is
denoted Wegr , before it enters the cylinders. The amount of recirculated gas is controlled
by the EGR-valve, whose position is denoted xegr . The total mass-flow of the gas entering
the cylinders is denoted Wei .
In the cylinders, the gas is mixed with fuel and then combusted. The amount of fuel
injected into the cylinders is given by ρ, and the rotational speed of the engine is denoted
ne . After the combustion, the gas enters the exhaust manifold. The mass-flow of the
exhaust gas is denoted Weo , and the pressure and temperature of the gas in the exhaust
manifold pem and Tem , respectively. The exhaust gas then passes the turbine side of the
VGT, whose rotational speed is given by ωt , and leaves the system with mass-flow Wt .
The geometry of the VGT is controlled with the VGT-valve, whose position is denoted
xvgt .
2.3 Faults
Faults in all sensors and actuators in Table 1, except in actuator u ρ and sensor y ne ,
are considered. All faults along with their description can be found in Table 2. The
Table 1: Actuator and sensor signals.
Signal      Description
u xth       Throttle position actuator
u xegr      EGR-valve position actuator
u xvgt      VGT-valve position actuator
u ρ         Injected fuel actuator
y ne        Engine speed sensor
y pamb      Ambient pressure sensor
y Tamb      Ambient temperature sensor
y pic       Intercooler pressure sensor
y pim       Intake manifold pressure sensor
y Tim       Intake manifold temperature sensor
y pem       Exhaust manifold pressure sensor
Modeling of Faults
The faults are modeled as additive signals in the corresponding equations of the nominal
model presented in the next section. For example, fault ∆ y pim , representing a fault in the
intake manifold pressure sensor y pim , is modeled by simply adding ∆ y pim to the equation
describing the relation between the sensor value y pim and the actual intake manifold
pressure pim , i.e., y pim = pim + ∆ y pim .
The main argument for using this fault modeling approach is that it is considered
to be hard, or even impossible, to know how a faulty component behaves in reality and
data for evaluation and validation of a more detailed fault model is seldom available.
Moreover, modeling faults in this way also results in a minimum of fault modes, which
gives a smaller model. This is beneficial since a smaller model simplifies several steps in
model-based diagnosis, for example residual generation or fault isolation. The final
argument is simplicity, since extending the nominal model with additive fault
signals is straightforward. Despite its simplicity, the approach has been shown to provide
good results (Svärd and Nyberg, 2012).
The adopted approach is nonetheless general, and no assumptions are made regarding
for example the time-behavior of faults. Note for example that the approach is able
to handle multiplicative faults even though the fault signal is assumed to be additive.
Consider for example a multiplicative fault in y pim given by y pim = δ ⋅ pim , δ ≠ 1, which
can be equivalently described by ∆ y pim = pim (δ − 1).
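As a small worked check of this equivalence, the sketch below compares the sensor value under a multiplicative fault with the value obtained from the additive model. The pressure value and the function name are hypothetical and chosen only for illustration.

```python
def additive_equivalent(p_im, delta):
    """Additive fault signal equivalent to the gain fault y = delta * p_im."""
    return p_im * (delta - 1.0)


p_im = 2.0e5    # hypothetical intake manifold pressure [Pa]
delta = 1.1     # multiplicative fault size

y_multiplicative = delta * p_im                            # y = delta * p_im
y_additive = p_im + additive_equivalent(p_im, delta)       # y = p_im + Delta
```

Note that the equivalent additive fault signal depends on pim and is therefore time-varying whenever the pressure varies, which the additive model allows since no assumption is made on the time-behavior of faults.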
2.4 Model
The model of the automotive diesel engine can be found in Appendix A. The model
contains in total 46 equations, 43 unknown variables, 11 known variables, of which 4 are
actuators and 7 sensors, and 9 faults. Of the 46 equations, 5 are differential equations
Table 2: Considered faults.
Fault        Description
∆ y pamb     Fault, ambient pressure sensor
∆ y Tamb     Fault, ambient temperature sensor
∆ y pic      Fault, intercooler pressure sensor
∆ y pim      Fault, intake manifold pressure sensor
∆ y Tim      Fault, intake manifold temperature sensor
∆ y pem      Fault, exhaust manifold pressure sensor
∆ u xth      Fault, throttle position actuator
∆ u xegr     Fault, EGR-valve position actuator
∆ u xvgt     Fault, VGT-valve position actuator
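\[
\Psi_{th}(\Pi_{th}) =
\begin{cases}
\Psi_{th}^{*}(\Pi_{th}) & \text{if } \Pi_{th} \leq \Pi_{th,lin}, \\[4pt]
\Psi_{th}^{*}(\Pi_{th,lin}) \, \dfrac{1 - \Pi_{th}}{1 - \Pi_{th,lin}} & \text{if } \Pi_{th} > \Pi_{th,lin},
\end{cases} \tag{1}
\]
where
\[
\Psi_{th}^{*}(\Pi_{th}) = \sqrt{\frac{2\gamma_{th}}{\gamma_{th} - 1}
\left( \Pi_{th}^{2/\gamma_{th}} - \Pi_{th}^{1 + 1/\gamma_{th}} \right)},
\]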
Given the model M of the system and the diagnosis requirement F, the two steps
illustrated in Figure 4 are conducted. In the first step, a large set of candidate residual
generators, in the form of subsets of the model equations, is found. This step is done in
an exhaustive manner, in the sense that all model equation subsets that can be used as
input to the sequential residual generation method (Svärd and Nyberg, 2010) are found.
For this particular method, it can be shown (Svärd and Nyberg, 2010) that candidate
residual generators by necessity should be based on Minimal Structural Overdetermined
(MSO) sets of equations. There exist efficient algorithms for finding all MSO sets, given
a model, see, e.g., Krysander et al. (2008).
In general, not all candidate residual generators found in the first step are realizable,
i.e., it is not possible to create residual generators from all found candidate residual
generators with the considered method. Therefore, in the second step, a set of realizable
candidate residual generators that fulfills the diagnosis requirement F is selected and
the final set of residual generators R 1 , R 2 , . . . , R n is created.
\begin{align}
\dot{z} &= f(z, w_1, w_2, \ldots, w_m, y) \tag{2a} \\
w_1 &= g_1(z, y) \tag{2b} \\
w_2 &= g_2(z, w_1, \dot{w}_1, y) \tag{2c} \\
&\;\;\vdots \notag \\
w_m &= g_m(z, w_1, \dot{w}_1, w_2, \dot{w}_2, \ldots, w_{m-1}, \dot{w}_{m-1}, y) \tag{2d}
\end{align}
not already isolable faults in the given diagnosis requirement F is selected, and added to
the solution if it is realizable. This procedure is repeated until F is fulfilled, or no useful
candidate residual generators remain.
In addition to making the selection problem tractable, the greedy selection algorithm
has further useful properties. Specifically, it can be shown (Paper B) that if, and
only if, the given diagnosis requirement can be fulfilled for the given model with the
method (Svärd and Nyberg, 2010), then the algorithm will provide a solution.
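The greedy procedure described above can be sketched as follows. This is a minimal illustration of the selection loop only, not the algorithm of Paper B: the representation of the diagnosis requirement as a set of ordered fault pairs, the function names, and the realizability callback are all assumptions made here.

```python
def greedy_select(candidates, required_pairs, is_realizable):
    """Greedy selection of residual generator candidates.

    candidates: dict mapping candidate name -> set of faults it may be sensitive to.
    required_pairs: set of (f1, f2), meaning f1 should be isolable from f2.
    is_realizable: callable(name) -> bool (hypothetical realizability check).
    Returns the selected names and the pairs that could not be fulfilled.
    """
    remaining = set(required_pairs)
    pool = dict(candidates)
    selected = []

    def gain(sensitive):
        # Pairs (f1, f2) isolated by a candidate sensitive to f1 but not to f2.
        return {(a, b) for (a, b) in remaining if a in sensitive and b not in sensitive}

    while remaining and pool:
        best = max(sorted(pool), key=lambda name: len(gain(pool[name])))
        isolated = gain(pool.pop(best))
        if not isolated:
            break  # no remaining candidate is useful
        if is_realizable(best):
            selected.append(best)
            remaining -= isolated
    return selected, remaining


# Hypothetical candidates and requirement, for illustration only.
cands = {"R1": {"f1", "f2"}, "R2": {"f1"}, "R3": {"f2", "f3"}}
pairs = {("f1", "f2"), ("f2", "f1"), ("f1", "f3"), ("f3", "f1")}
chosen, unmet = greedy_select(cands, pairs, lambda name: True)
```

A candidate that turns out not to be realizable is simply discarded, after which the remaining candidates are ranked again against the still unfulfilled part of the requirement.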
Test Statistic
The obtained no-fault residual distributions are then used to create a residual evaluator Ti
for each of the residuals r 1 , r 2 , . . . , r n . The residual evaluator Ti , with the binary detection
signal d i as output, comprises a fault detection test
\[
d_i = T_i(R_i) =
\begin{cases}
1 & \text{if } \lambda_i(R_i) > J_i, \\
0 & \text{else,}
\end{cases} \tag{3}
\]
where λ i is a test statistic, R i is a set of discretized samples from residual r i , and J i is a
constant detection threshold.
The test statistic λ i in each fault detection test is designed with the method developed
in Paper C and based on the Generalized Likelihood Ratio (GLR) test. Given a set
\[
\lambda_i(R_i) = -2 \log \frac{\max\limits_{\alpha} L(\alpha, \theta_i^{NF} \mid R_i)}
{\max\limits_{\alpha, \theta} L(\alpha, \theta \mid R_i)}, \tag{4}
\]
where L (α, θ∣R i ) denotes the likelihood of the parameters α and θ, given the residual
samples in R i . The parameters α and θ fully specify the probability distribution of the
samples in R i . In this sense, the quantity in the denominator of (4) corresponds to the
most likely distribution of the samples in R i , and the quantity in the numerator to the
most likely no-fault residual distribution.
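A minimal numerical sketch of a test of the form (3) with a statistic of the form (4) is given below. It rests on several assumptions that are not taken from the source: the residual samples are represented by bin counts c_j, the likelihood has the mixture form with log-likelihood Σ_j c_j log Σ_i α_i θ_ij, the unconstrained maximum in the denominator is attained at the empirical bin frequencies c_j /N, and the maximization over α in the numerator is carried out with plain EM updates. The method in Paper C instead uses an approximation suited for on-line use, so this sketch is an illustration rather than that method.

```python
import math


def log_lik(counts, probs):
    """Log-likelihood sum_j c_j log p_j of bin counts under bin probabilities."""
    return sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def mix(alpha, theta):
    """Bin probabilities sum_i alpha_i * theta_ij of the mixture."""
    return [sum(a * th[j] for a, th in zip(alpha, theta)) for j in range(len(theta[0]))]


def max_loglik_over_alpha(counts, theta_nf, iters=500):
    """Maximize the log-likelihood over alpha with theta_nf fixed (EM updates).

    Assumes every bin with a non-zero count has positive probability under
    at least one no-fault component, so that no logarithm of zero occurs.
    """
    K, N = len(theta_nf), sum(counts)
    alpha = [1.0 / K] * K
    for _ in range(iters):
        q = mix(alpha, theta_nf)
        alpha = [
            sum(c * alpha[i] * theta_nf[i][j] / q[j]
                for j, c in enumerate(counts) if c > 0) / N
            for i in range(K)
        ]
    return log_lik(counts, mix(alpha, theta_nf))


def glr_statistic(counts, theta_nf):
    """Statistic of the form (4), with the denominator at the frequencies c_j / N."""
    N = sum(counts)
    l_den = log_lik(counts, [c / N for c in counts])
    l_num = max_loglik_over_alpha(counts, theta_nf)
    return -2.0 * (l_num - l_den)


def detect(counts, theta_nf, J):
    """Binary detection signal as in (3)."""
    return 1 if glr_statistic(counts, theta_nf) > J else 0


# Hypothetical no-fault distributions (K = 2, M = 3) and bin counts.
theta_nf = [[0.7, 0.2, 0.1], [0.2, 0.6, 0.2]]
counts_ok = [45, 40, 15]       # matches an equal-weight mixture of the two
counts_faulty = [10, 20, 70]   # cannot be explained by any mixture of the two
```

With these values the statistic is essentially zero for `counts_ok` and large for `counts_faulty`, so a moderate threshold J separates the two cases.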
(2008). An MSO set by definition contains one more equation than unknown variables.
Given an MSO set, a sequential residual generator is created by removing one equation
and then finding a computation sequence for the unknown variables in the remaining
just-determined set of equations. The number of candidate residual generators that can
be created from a single MSO set thus equals the number of equations in the MSO set.
This explains the total of 14,242 candidate residual generators.
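The counting argument can be sketched as follows, using hypothetical MSO sets given as sets of equation names. The function name and the data are assumptions for illustration and are unrelated to the engine model.

```python
def enumerate_candidates(mso_sets):
    """List (mso_index, removed_equation, remaining_equations) triples.

    Each equation of an MSO set is removed in turn, which leaves a
    just-determined set for which a computation sequence is then sought.
    """
    candidates = []
    for idx, mso in enumerate(mso_sets):
        for eq in sorted(mso):
            candidates.append((idx, eq, frozenset(mso) - {eq}))
    return candidates


# Hypothetical MSO sets, for illustration only.
msos = [{"e1", "e2", "e3"}, {"e2", "e4"}]
cands = enumerate_candidates(msos)
```

The number of candidates equals the sum of the sizes of the MSO sets, here 3 + 2 = 5.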
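Table 3: Fault signature matrix for the 8 selected residual generators with respect to the
faults in Table 2.
Columns: ∆ y pamb , ∆ y Tamb , ∆ u xegr , ∆ y pem , ∆ u xvgt , ∆ y pim , ∆ y Tim , ∆ u xth , ∆ y pic
R1 x x x x x x x
R2 x x x x x x x x
R3 x x x x x x x x
R4 x x x x x x x x
R5 x x x x x x x x
R6 x x x x x x x x
R7 x x x x x x x x
R8 x x x x x x x x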
Fault Detectability
Table 3 shows the fault signature matrix (FSM) for the 8 selected residual generators
with respect to the faults in Table 2. In this context, the FSM contains an “x” in position
(R i , ∆ x ) if the equation containing fault ∆ x is used in the computation sequence on
which the residual generator R i is based. An “x” thus indicates that residual
generator R i may be sensitive to fault ∆ x , i.e., that it may respond to the fault. The
sensitivity of residual generator R i to the fault ∆ x however strongly depends on the
properties of R i , the size and temporal properties of ∆ x , and also on for example the
current operating mode of the system. In order to verify that R i is indeed sensitive to
∆ x , it is necessary to implement and run R i using representative data from relevant fault
cases. This will be done in Section 6.
Clearly, assuming that Table 3 reflects the fault sensitivity, there is more than one
residual generator that is sensitive to each of the 9 considered faults and thus all 9 faults
can, in theory, be detected.
Table 4: Isolability matrix for the 8 selected residual generators.
Columns: ∆ y pamb , ∆ y Tamb , ∆ u xegr , ∆ y pem , ∆ u xvgt , ∆ y pim , ∆ y Tim , ∆ u xth , ∆ y pic
∆ y pamb x x
∆ y Tamb x x x
∆ y pic x x x
∆ y pim x x x
∆ y Tim x x x
∆ y pem x x x
∆ u xth x x x
∆ u xegr x x x
∆ u xvgt x x
Fault Isolability
In general, given a set of residual generators, a fault ∆ x is said to be isolable from a fault
∆ y if the set contains a residual generator that is sensitive to fault ∆ x but not to fault ∆ y ,
see for example Paper B. As seen in Table 3, all 8 residual generators may be sensitive to
the faults ∆ y pamb and ∆ u xvgt . This is also indicated in Table 4, which shows the resulting
isolability matrix for the 8 selected residual generators. In Table 4, for instance, the “x” in
position (∆ y Tamb , ∆ y pamb ) denotes that fault ∆ y Tamb is not isolable from fault ∆ y pamb using
the residual generators R 1 , R 2 , . . . , R 8 .
Clearly, according to Table 4, the diagnosis requirement F in (5) is not met since,
for example, ∆ y Tamb is not isolable from ∆ y pim . Nevertheless, due to the properties of
the greedy selection algorithm discussed in Section 3.3, Table 4 shows the maximum
attainable isolability for the engine model, given the method for residual generation
considered in this work. The cardinality of the set of selected residual generators may
however not be minimal. See Paper B for more details.
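The relation between a fault signature matrix and the corresponding isolability matrix, as defined above, can be sketched as follows. The small fault signature matrix used here is hypothetical and is not Table 3; the function name is likewise an assumption.

```python
def isolability_matrix(fsm, faults):
    """Return a dict (fx, fy) -> True if fx is NOT isolable from fy.

    fsm: dict mapping residual generator name -> set of faults it may be
    sensitive to. Fault fx is isolable from fy if some residual generator
    is sensitive to fx but not to fy. A True entry corresponds to an "x".
    """
    marks = {}
    for fx in faults:
        for fy in faults:
            isolable = any(fx in s and fy not in s for s in fsm.values())
            marks[(fx, fy)] = not isolable
    return marks


# Hypothetical fault signature matrix: fault "a" affects every residual.
fsm = {"R1": {"a", "b"}, "R2": {"a", "c"}}
marks = isolability_matrix(fsm, ["a", "b", "c"])
```

In this example no fault is isolable from "a", since every residual generator may be sensitive to "a", whereas "a" is isolable from both "b" and "c". This mirrors the situation noted above for faults to which all selected residual generators may be sensitive.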
Columns 2 and 3 in Table 5 show whether the corresponding residual generator uses
integral causality (IC) and/or derivative causality (DC), respectively. Clearly, 5 out of 8
residual generators employ mixed causality. Column 4 shows the number of equations
contained in the computation sequence on which the corresponding residual generator
is based, and the value in parentheses shows how many of those equations are differential
equations. Recalling that the model contains in total 46 equations, of which 5 are
differential equations, it can be concluded that all residual generators use a substantial
part of the complete model in spite of the above mentioned heuristic. This issue is further
illustrated by column 5 in Table 5, which shows how many of the 11 available signals in
Table 1 each residual generator uses as input.
Columns 4 and 5 explain why most of the 8 selected residual generators may be
sensitive to most of the 9 faults, as illustrated in Table 3. In fact, this property holds
for all candidate residual generators, which on average use about 40 equations, and is a
direct consequence of the properties of the automotive engine system. Specifically, the
system contains many physical interconnections, for example due to the shaft connecting
the turbine and the compressor and thus the intake and the exhaust parts of the engine,
see Figure 1. This leads to a model with coupled equations, in the sense that there are
sets of equations containing the same set of unknown variables. As a result,
a fault affecting one of these equations influences a large number of the other model
equations. This, in combination with the relatively small number of sensors, makes
fault decoupling non-trivial and results in the situation shown in Table 3.
Table 6: Isolability matrix when using only integral causality.
Columns: ∆ y pamb , ∆ y Tamb , ∆ u xegr , ∆ y pem , ∆ u xvgt , ∆ y pim , ∆ y Tim , ∆ u xth , ∆ y pic
∆ y pamb x x x x x
∆ y Tamb x x x x x
∆ y pic x x x x x x
∆ y pim x x x x x x
∆ y Tim x x x x x
∆ y pem x x x x x x
∆ u xth x x x x x x
∆ u xegr x x x x x
∆ u xvgt x x x x x
14,242 candidate residual generators for not being realizable in the mixed causality case,
and 7,133 candidates in the integral causality case. The corresponding numbers in terms
of MSO sets are 91 and 135, respectively, out of 270. In the derivative causality case,
all candidate residual generators were discarded due to non-realizability. It can
be concluded that mixed causality improves realizability, in the sense that considerably
more candidate residual generators can be realized, which implies that more faults can be
isolated. This can be seen by comparing Table 4 with Table 6, which shows the resulting
isolability matrix when using only integral causality.
The large number of discarded candidate residual generators, independent of the
causality assumption, arises because no computation sequence can be found for these
candidate residual generators. This in turn is to a large extent caused by non-invertible
non-linear functions in the model. To illustrate this aspect, consider the equation

\[ e_7: \quad W_{th} = \frac{p_{ic} A_{th,max}}{\sqrt{T_{im} R_a}} \, \Psi_{th}^{\gamma_{th}}(\Pi_{th}) \, f_{th}(x_{th}) \]

where $W_{th}$, $p_{ic}$, $T_{im}$, $\Pi_{th}$, and $x_{th}$ are unknown variables, $A_{th,max}$ and $R_a$ are parameters,
and $\Psi_{th}^{\gamma_{th}}(\cdot)$ and $f_{th}(\cdot)$ are non-linear functions, with $\Psi_{th}^{\gamma_{th}}(\Pi_{th})$ given by (1). Clearly, the
function $\Psi_{th}^{\gamma_{th}}(\Pi_{th})$ is not invertible with respect to $\Pi_{th}$, which implies that the variable
$\Pi_{th}$ cannot be computed from the equation $e_7$. The same holds for the variable $x_{th}$,
since the function $f_{th}(\cdot)$ is non-invertible with respect to $x_{th}$. This implies that only the
variables $W_{th}$, $p_{ic}$, and $T_{im}$ can be computed from equation $e_7$. Most of the equations in
the diesel engine model exhibit this property, and this substantially limits how unknown
variables in the model can be computed, which in turn explains the large number of
non-realizable, and thus discarded, candidate residual generators.
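As a worked illustration of which variables equation $e_7$ can be solved for, the rearrangements below isolate $p_{ic}$ and $T_{im}$. This is only a sketch under the assumptions that $W_{th} \neq 0$ and that the product $\Psi_{th}^{\gamma_{th}}(\Pi_{th}) f_{th}(x_{th})$ is non-zero and known; these conditions are not stated in the text.

```latex
% Equation e7 as given in the appendix:
%   W_th = p_ic A_th,max / sqrt(T_im R_a) * Psi(Pi_th) * f(x_th)
\begin{align*}
p_{ic} &= \frac{W_{th}\,\sqrt{T_{im} R_a}}
               {A_{th,max}\,\Psi_{th}^{\gamma_{th}}(\Pi_{th})\, f_{th}(x_{th})}, \\
T_{im} &= \frac{1}{R_a}
          \left( \frac{p_{ic}\, A_{th,max}\,\Psi_{th}^{\gamma_{th}}(\Pi_{th})\, f_{th}(x_{th})}
                      {W_{th}} \right)^{2}.
\end{align*}
```

No corresponding closed-form expression exists for $\Pi_{th}$ or $x_{th}$, since the functions in which they appear cannot be inverted.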
Stability Analysis
In comparison, only a fraction of the discarded candidate residual generators were
discarded due to not being stable. Nevertheless, the stability analysis is an important part
[Figure: residual r5 (scale ×10^4) versus time [s], shown for 100 to 300 s.]
[Figure: panels θ1, ..., θ20, each plotted against xi.]
[Plot for Figure 8: vertical axis ℓ (θ NF ∣Y), scale ×10^6.]
Figure 8: Fit of the set of estimated no-fault distributions for different values of K, i.e.,
for different numbers of distributions in the set, to the estimation and validation data sets.
The figure shows the average of the fit for all 8 residuals.
For this application, 20 distributions per residual is a good trade-off between model
fit and complexity since the gain in model fit obtained when choosing a higher number
is marginal in comparison with the corresponding increase in computational effort. This
is illustrated in Figure 8, which shows the model fit in the form of the log-likelihood
ℓ (θ NF ∣Y) of the distributions in θ NF given the no-fault data Y. The quantity shown in
Figure 8 is the averaged model fit for all 8 residuals, evaluated for different numbers of
distributions and for both the estimation and validation data.
For each of the residuals r 1 , r 2 , . . . , r 8 , a residual evaluator Ti in the form (3) was created.
The sampling of residual values for the sets R i , i = 1, 2, . . . , 8, was done by means of
a sliding window. The number of samples in each sliding window was chosen to be
1024. The choice of this number is a trade-off between detection performance and
computational complexity. For a thorough discussion of this issue, see Paper C.
To solve the relaxed version of the MLE problem in the numerator of (4), see Section 3.4,
a tailored solver was generated using the software tool CVXGEN (Mattingley and Boyd,
2012). The detection thresholds J i , i = 1, 2, . . . , 8, were computed to give a
probability of false detection of 1%, using the validation data set from Section 5.1.
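One simple way to obtain a threshold with a prescribed false detection probability from no-fault data is to take an empirical quantile of the test statistic. The sketch below illustrates that idea only; the function name, the use of a plain order statistic, and the synthetic data are assumptions and are not taken from the thesis.

```python
import math


def detection_threshold(no_fault_stats, false_detection_prob=0.01):
    """Return an observed value J such that at most the given fraction of
    no-fault test-statistic values lies strictly above J.

    Hypothetical helper: an empirical-quantile rule, assumed for illustration.
    """
    ordered = sorted(no_fault_stats)
    n = len(ordered)
    # Number of no-fault samples allowed to exceed the threshold.
    allowed = math.floor(false_detection_prob * n + 1e-9)
    return ordered[n - allowed - 1]


# Usage with synthetic no-fault test-statistic values (hypothetical data).
validation_stats = [float(k) for k in range(1000)]
J = detection_threshold(validation_stats, 0.01)
exceed_fraction = sum(s > J for s in validation_stats) / len(validation_stats)
print(J, exceed_fraction)
```

With real data, the values in `validation_stats` would be the test statistic evaluated over sliding windows of no-fault residual samples.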
190 Paper D. Automotive Engine FDI by Application of an Automated Design . . .
6 Experimental Evaluation
This section presents an experimental evaluation of the designed FDI-system. The
evaluation consists of two parts, with different purposes. The first part, presented in
Section 6.1, focuses on the fault detection performance of the individual residual generators
and residual evaluators, whereas the second part, presented in Section 6.2, focuses on the
detection and isolation performance of the complete FDI-system.
Metrics
The fault detection performance is studied by means of the statistical power of the fault
detection tests, for different sizes of the considered faults in Table 2. To quantify the
power of a test, the power function (Casella and Berger, 2001) will be used. In this context,
the power function for the fault detection test (3) for residual r i is defined as

\[ \beta_i(\delta) = P\left( \lambda_i(R_i) \geq J_i \mid \delta \right), \qquad (7) \]

where λ i is the test statistic, R i a set of samples from residual r i , J i is the detection
threshold, and δ is a fixed fault size. In the no-fault case, i.e., when δ corresponds to a
fault of size zero, the power function (7) gives the probability of false detection, or Type
I error (Casella and Berger, 2001). Otherwise, the power function gives the probability
of detection for fixed δ. Equivalently, the probability of missed detection, or Type II
error, is given by 1 − β i (δ).
In order to obtain a scalar metric for the detection performance of a specific detection
test with respect to a set D of different fault sizes, the quantity
\[ \frac{1}{|D|} \sum_{\delta \in D} \beta_i(\delta), \qquad (8) \]
will also be considered, where β i (δ) is the power function for detection test i. The
quantity (8) in some sense reflects the average detection performance of the detection
test. It may be noted that for an ideal test, i.e., one whose probability of detection is one for
all fault sizes, the quantity (8) is equal to one.
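The power function and the averaged quantity (8) can be estimated from data by counting threshold crossings. The following is a minimal sketch, assuming that a detection is counted whenever the test statistic reaches the threshold and that test-statistic values are available per fault size; the function names and the numbers are hypothetical.

```python
def empirical_power(test_stats, threshold):
    """Estimate the power for one fault size as the fraction of
    test-statistic values at or above the threshold."""
    return sum(s >= threshold for s in test_stats) / len(test_stats)


def averaged_power(stats_by_fault_size, threshold):
    """Estimate quantity (8): the mean of the estimated power over
    all fault sizes in the set D (the keys of the dictionary)."""
    powers = [empirical_power(stats, threshold)
              for stats in stats_by_fault_size.values()]
    return sum(powers) / len(powers)


# Hypothetical test-statistic values for two fault sizes delta.
stats_by_delta = {
    0.8: [1.0, 5.0, 7.0, 9.0],
    1.2: [6.0, 7.0, 8.0, 9.0],
}
print(averaged_power(stats_by_delta, threshold=6.0))
```

An ideal test would give an estimated power of one for every fault size, so the averaged value would also be one.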
Setup
In total, 5 data sets were used in the evaluation. The data are not the same as the data
described in Section 5. Each data set contains measurements collected during a drive
on the Swedish west coast. The data sets contain measurements from approximately
2.5 hours of driving in total, and include both highway and city driving under different
conditions.
The considered fault type is gain fault. In the case of, for example, sensor fault ∆ y pamb ,
this means that the sensor signal y pamb fed to the residual generators is y pamb = δ ⋅ pamb
where δ ≠ 1 indicates a fault. The gain faults were implemented off-line by modification
of the corresponding sensor or actuator measurement signals.
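Off-line injection of a gain fault amounts to scaling a recorded signal by δ. The sketch below shows one way to do this, assuming the signal is stored as a list of samples and that the fault becomes active at a given sample index; the function name and the `start_index` parameter are illustrative assumptions.

```python
def inject_gain_fault(signal, delta, start_index=0):
    """Return a copy of the recorded signal in which every sample from
    start_index onward is multiplied by the gain delta (delta != 1 is a fault)."""
    return [delta * value if k >= start_index else value
            for k, value in enumerate(signal)]


# Hypothetical recorded ambient-pressure samples; gain fault with delta = 0.5
# becoming active at the third sample.
y_pamb = [1.0, 2.0, 3.0, 4.0]
print(inject_gain_fault(y_pamb, delta=0.5, start_index=2))
```

Since the original list is left unchanged, the same recorded data set can be reused for several fault sizes.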
Table 7: Averaged test powers.
      ∆y_pamb  ∆y_Tamb  ∆y_pic  ∆y_pim  ∆y_Tim  ∆y_pem  ∆u_xth  ∆u_xegr  ∆u_xvgt
T1    .01      .03      0       .17     .18     .74     .01     .05      .11      .22
T2    .38      .06      .62     .32     .01     .38     .09     .01      .11      .28
T3    .07      .05      .75     .40     .07     .25     .06     .01      .16      .23
T4    .01      0        .83     .47     .02     0       .11     0        .02      .49
T5    .01      0        .75     .53     .12     .44     .16     .01      0        .41
T6    .09      .10      .81     .01     .40     .86     .11     .06      .34      .35
T7    .01      .04      0       .19     .13     .39     .01     .07      .12      .16
T8    .65      .41      .02     .45     .59     .84     .12     .05      .33      .43
      .31      .12      .76     .36     .26     .56     .11     .07      .20
and λ 8 respond clearly to the injected fault. However, the test statistic λ 7 does not cross the
detection threshold. It is noted that this indeed corresponds to a typical situation and is
taken into account in the fault isolation scheme, see Section 5.3. It may be noted that a
traditional column matching approach (Gertler, 1998) is not sufficient for this typical
case.
For the fault ∆ u xth , Table 3 states that residuals r 1 and r 7 should not be sensitive
to the fault. Again, this is hard to tell from Figure 10a but Figure 10b clearly shows
that test statistics λ 1 and λ 7 do not respond to the fault. As also seen in Figure 10b,
the response from the test statistic λ 3 is weak and it only barely crosses the detection
threshold. The responses from test statistics λ 2 and λ 8 are even weaker and they do not
cross the detection thresholds at all. This issue will be further discussed in Sections 6.1
and 6.2.
[Figures: two sets of panels showing residuals r1, ..., r8 and test statistics λ1, ..., λ8 over the time interval 660 to 800.]
[Plot for Figure 11: panels β1 (δ), ..., β8 (δ) plotted against the fault size δ.]
Figure 11: Power functions β i (δ), i = 1, 2, . . . , 8, for faults ∆ y pic and ∆ u xth .
in best overall averaged test power than all other faults. Faults ∆ y Tamb , ∆ u xth , and ∆ u xegr ,
result in quite poor test power in comparison. This can also be seen in Figure 11.
The correspondence between the FSM in Table 3 and the averaged test powers in
Table 7 when it comes to non-sensitive residual generators is good, in the sense that an
empty entry in Table 3 always corresponds to a zero, or almost zero, entry in Table 7.
However, the converse is not always true, since there are zero, or almost zero, entries in
Table 7 where there is an “x” in Table 3. In particular, this holds for faults ∆ y pamb and
∆ u xvgt . According to Table 3, all residual generators may be sensitive to faults ∆ y pamb and
∆ u xvgt . However, as indicated by Table 7, not all tests respond to these faults.
Metrics
To this end, the following metrics are considered.
Detection Time (DT): Time from fault injection to first detection by any test that may
be sensitive to the fault.
Isolation Time (IT): Time from fault injection to first correct fault isolation statement.
Missed Detection Rate (MDR): The fraction of test runs for which the injected fault
is not detected by any of the tests that may be sensitive to the fault.
Missed Isolation Rate (MIR): The fraction of test runs for which a correct fault
isolation statement is not obtained.
False Detection Rate (FDR): The fraction of samples for which the injected fault is
detected by a test that should not be sensitive to the fault, or a fault is detected by
any test in a no-fault condition.

Table 8: Fault specifications.
Fault      Specification
∆y_pamb    y_pamb = 0.5 ⋅ p_amb
∆y_Tamb    y_Tamb = 1.3 ⋅ T_amb
∆y_pic     y_pic = 1.2 ⋅ p_ic
∆y_pim     y_pim = 0.9 ⋅ p_im
∆y_Tim     y_Tim = 0.7 ⋅ T_im
∆y_pem     y_pem = 0.8 ⋅ p_em
∆u_xth     u_xth = 0.3 ⋅ u_xth
∆u_xegr    u_xegr = 0.4 ⋅ u_xegr
∆u_xvgt    u_xvgt = 0.5 ⋅ u_xvgt

Note that all metrics are defined with respect to the complete FDI-system, and not
in the context of the individual tests. This means, for instance, that a run in which
only one out of several sensitive tests responds will not be regarded as a missed detection.
A situation where only one out of several possible tests responds falsely will, on the
other hand, be counted as a false detection. Also note that missed detections and missed
isolations are counted on a test run basis, whereas false detections are counted on a sample
basis.
Moreover, note that a correct fault isolation statement means an isolation
statement in accordance with the isolability matrix in Table 4. That is, when fault ∆ y pamb
has occurred, the correct fault isolation statement is that either of the faults ∆ y pamb or
∆ u xvgt has occurred.
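The detection-related metrics above can be computed directly from the alarm sequences of the tests. The sketch below is one possible reading of the definitions, assuming each test delivers one boolean alarm per sample, that the set of tests that may be sensitive to the injected fault is known, and that the sample time is given; the data structures and names are hypothetical.

```python
def detection_time(alarms, sensitive, fault_index, sample_time=1.0):
    """Time from fault injection to the first alarm raised by any test that
    may be sensitive to the fault, or None if no such alarm occurs."""
    n = len(next(iter(alarms.values())))
    for k in range(fault_index, n):
        if any(alarms[test][k] for test in sensitive):
            return (k - fault_index) * sample_time
    return None


def missed_detection_rate(detection_times):
    """Fraction of test runs without detection (counted per test run)."""
    return sum(dt is None for dt in detection_times) / len(detection_times)


def false_detection_rate(alarms, sensitive, fault_index):
    """Fraction of samples with a false detection (counted per sample):
    any alarm before fault injection, or an alarm from a test that should
    not be sensitive to the fault after injection."""
    n = len(next(iter(alarms.values())))
    false_samples = 0
    for k in range(n):
        if k < fault_index:
            tests = list(alarms)
        else:
            tests = [t for t in alarms if t not in sensitive]
        if any(alarms[t][k] for t in tests):
            false_samples += 1
    return false_samples / n


# Hypothetical run with two tests; only T1 may be sensitive to the fault,
# which is injected at sample index 2.
run = {"T1": [False, False, False, True, True],
       "T2": [False, True, False, False, True]}
print(detection_time(run, {"T1"}, fault_index=2))
print(false_detection_rate(run, {"T1"}, fault_index=2))
```

The isolation metrics IT and MIR would additionally require the isolation statements produced by the isolation scheme, which are not modeled in this sketch.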
Setup
In total, 12 different data sets were used in this part of the evaluation. As in the previous
study, the data sets contain measurements from drives with both highway and city parts
under different conditions. Each fault specified in Table 8 was injected abruptly after
a fixed time, one at a time, in each of the 12 data sets. This means that there were in
total 12 test runs per fault. The sizes of the faults as specified in Table 8 were chosen in
consultation with experienced engineers in order to be realistic for the considered diesel
engine.
Table 9: Results
            ∆y_pamb  ∆y_Tamb  ∆y_pic  ∆y_pim  ∆y_Tim  ∆y_pem  ∆u_xth  ∆u_xegr  ∆u_xvgt
DT   Mean   49.1     78.4     33.2    41.1    86.5    39.2    66.5    75.0     90.9
     Min    5.0      2.3      18.7    18.7    4.8     11.9    9.4     2.9      6.1
     Max    83.6     35.9     72.5    115.0   290.5   61.3    166.8   116.9    144.3
IT   Mean   -        -        221.0   149.0   -       523.0   308.8   -        -
     Min    -        -        97.0    96.6    -       261.4   227.2   -        -
     Max    -        -        437.9   223.8   -       784.7   369.5   -        -
MDR         0        0        0       0       0       0       0       0        0
MIR         1        1        0.75    0.67    1       0.83    0.75    1        1
FDR         0.043    0.076    0.057   0.067   0.043   0.049   0.056   0.051    0.043
First of all, Table 9 shows that all faults can be detected within reasonable
time, meaning that there were no missed detections. As seen, however, ideal isolation
statements were not obtained for all faults. Nevertheless, the injected fault was contained
in each of the obtained isolation statements. The occurrence of missed isolations can be
explained by the fact that the FSM in Table 3 used in the isolation scheme, see Section 5.3,
does not completely reflect the fault sensitivity of the tests in the FDI-system. This was
illustrated in Figures 9b and 10b and will be further considered in the next section.
The conclusion in Section 6.1 regarding the ability to
detect the pressure sensor faults ∆ y pic , ∆ y pim , and ∆ y pem in a reliable way is supported
by Table 9. All of these faults result in comparatively short detection times and low rates of
false detections, and can in addition be isolated to a higher extent than the other faults.
The same holds for the conclusions in Section 6.1 regarding the faults ∆ y Tamb and ∆ u xegr ,
which according to Table 9 result in longer detection times and higher rates of false
detection.
The absolute values of the metrics in Table 9 depend mainly on the value of the
detection thresholds. The higher the detection thresholds, the lower the rate of false
detection, the higher the rate of missed detection, and the longer the detection and
isolation times, and vice versa. In addition, as said in Section 5.2, the detection and
isolation times are affected by the size of the sliding windows used to collect samples for
the residual evaluation.
∆ y pamb
∆ y Tamb
∆ u xegr
∆ y pem
∆ u xvgt
∆ y pim
∆ y Tim
∆ u xth
∆ y pic
T1 x x x x x x
T2 x x x x x x x
T3 x x x x x x x x
T4 x x x
T5 x x x x x
T6 x x x x x x x x
T7 x x x x x x
T8 x x x x x x x x
similar to those depicted in Figures 9b and 10b, where a test responds but the response
is not sufficient for the test statistic to cross the threshold. However, this would
also increase the number of false detections. In addition, the situation where a test does
not respond at all to a fault is not handled.
The second approach is to instead adjust the FSM so that it indeed represents the actual
fault sensitivity of the tests. This can for example be done by exploiting the averaged
test powers in Table 7. The benefit of this approach is that, in addition, the detection
thresholds can be adjusted in order to achieve desired detection times, and desired
rates of false and missed detections. The main drawback is that it may affect the overall
detectability and isolability properties of the FDI-system, due to additional zeros in the
adjusted FSM. See (Krysander, 2006, Chapter 11) for a more general treatment of this
issue. Moreover, it should be noted that the adjustment of the FSM typically relies on
estimated test power, which strongly depends on the features of the available data.
Results
Both approaches were applied. However, the first approach did not give satisfactory
results. Despite detection thresholds resulting in fault detection rates on the order
of 30-40 %, the resulting missed isolation rates were not lower for all faults.
Using the second approach, the averaged powers of the residual evaluation tests as
given in Table 7 were used in order to adjust the entries of the FSM in Table 3. Specifically,
each “x” in the FSM in Table 3 was removed if the corresponding entry in Table 7 was
lower than 0.02. The removed entries are marked with bold in Table 7. The adjusted
FSM, now for residual evaluators instead of the residual generators, is given in Table 10.
The resulting isolability matrix is shown in Table 11, which should be compared with
the original isolability matrix given in Table 4. It can be noted that the isolability in fact
has increased in the sense that a larger fraction of the diagnosis requirement F in (5) is
fulfilled. Specifically, 58 of the 72 fault pairs in F can now be isolated from each other, in
comparison with 56 before.
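The adjustment rule and its effect on isolability can be sketched in a few lines. The example below assumes that the FSM is stored as a mapping from each test to the set of faults it may be sensitive to, and that a fault fi is considered isolable from fj if some test is sensitive to fi but not to fj; this isolability criterion is a common structural definition and is assumed here, since the requirement (5) is not restated in this section. All names and numbers are hypothetical.

```python
def adjust_fsm(fsm, power, min_power=0.02):
    """Remove each (test, fault) mark whose averaged test power is lower
    than min_power; fsm maps test -> set of faults, power maps
    test -> {fault: averaged power}."""
    return {test: {f for f in faults if power[test][f] >= min_power}
            for test, faults in fsm.items()}


def isolable_pairs(fsm, faults):
    """Ordered pairs (fi, fj), fi != fj, for which some test is sensitive
    to fi but not to fj (assumed isolability criterion)."""
    return {(fi, fj)
            for fi in faults for fj in faults
            if fi != fj and any(fi in marked and fj not in marked
                                for marked in fsm.values())}


# Hypothetical three-fault example in which both tests are marked as
# sensitive to every fault, but T1 barely responds to f2 in practice.
faults = ["f1", "f2", "f3"]
fsm = {"T1": {"f1", "f2", "f3"}, "T2": {"f1", "f2", "f3"}}
power = {"T1": {"f1": 0.50, "f2": 0.01, "f3": 0.30},
         "T2": {"f1": 0.30, "f2": 0.40, "f3": 0.20}}

adjusted = adjust_fsm(fsm, power)
print(len(isolable_pairs(fsm, faults)), len(isolable_pairs(adjusted, faults)))
```

With 9 faults there are 9 ⋅ 8 = 72 ordered fault pairs, which agrees with the number of pairs mentioned above. Removing marks does not always increase the number of isolable pairs, which is the drawback noted for this approach.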
Results in accordance with Table 9 are given for the FDI-system with the adjusted
FSM in Table 12. The same detection thresholds and data were used as in the evaluation
∆ y pamb
∆ y Tamb
∆ u xegr
∆ y pem
∆ u xvgt
∆ y pim
∆ y Tim
∆ u xth
∆ y pic
∆ y pamb x x x x x
∆ y Tamb x x x
∆ y pic x x
∆ y pim x
∆ y Tim x x
∆ y pem x
∆ u xth x
∆ u xegr x x x x x
∆ u xvgt x x x
Table 12: Results for the FDI-system with the adjusted FSM.
            ∆y_pamb  ∆y_Tamb  ∆y_pic  ∆y_pim  ∆y_Tim  ∆y_pem  ∆u_xth  ∆u_xegr  ∆u_xvgt
DT   Mean   48.1     82.9     33.2    41.1    87.0    39.2    66.5    77.8     90.7
     Min    5.0      2.3      18.7    18.7    4.8     11.9    9.4     2.9      6.1
     Max    83.6     35.9     72.5    115.0   290.5   61.3    166.8   116.9    144.3
IT   Mean   168.7    228.6    47.2    148.0   142.7   190.4   246.8   315.7    430.5
     Min    45.5     173.3    28.5    96.6    142.7   57.1    62.0    5.3      129.8
     Max    346.3    283.2    94.0    223.8   142.7   784.7   329.6   545.8    612.8
MDR         0        0        0       0       0       0       0       0        0
MIR         0.42     0.75     0       0.58    0.83    0.25    0.42    0.67     0.67
FDR         0.11     0.082    0.064   0.067   0.053   0.049   0.056   0.063    0.069
presented in Table 9.
It can be seen in Table 12 that the missed isolation rate (MIR) is lower for all faults,
in comparison with Table 9. In addition, the isolation times are lower for all faults, and
for some faults, e.g., ∆ y pic , the difference is significant. Furthermore, the detection times
are identical, or comparable, to those given in Table 9. It may be noted that there is
a slight increase in false detection rate. This is a direct consequence of the additional
empty entries in the adjusted FSM shown in Table 10. Every detection of a fault by a
test whose corresponding entry in Table 10 has been removed now counts as a false
detection.
7 Conclusions
It has been illustrated how an FDI-system for an automotive diesel engine can be
designed by application of a generic automated design methodology. No specific adaptation
of the methodology to the automotive diesel engine system was made. Through the
application, it has been empirically shown that employment of mixed causality substantially
increases the number of realizable residual generators. Foremost, this leads to increased
fault isolability, as is evident by comparison of Tables 4 and 6. Moreover, it has been
demonstrated how model errors of time-varying nature and magnitude can be handled
in the framework of statistical likelihood-based residual evaluation. Illustrations are
given in Figures 9 and 10.
The FDI-system, and thus the potential of the automated design methodology, has
been evaluated using road and test-bed measurements. The overall performance of the
FDI-system is good in comparison with the required design effort. The fault sensitivities
of the individual fault detection tests have been investigated by means of the estimated
averaged test power (8). It was concluded that the fault sensitivity indicated in the FSM
in Table 3 did not fully correspond to the fault sensitivity as given by the averaged test
powers shown in Table 7. Specifically, this results in high missed isolation rates. It has
been illustrated that an adjustment of the original FSM by utilization of the averaged
test powers, resulting in the adjusted FSM in Table 10, gives an FDI-system capable
of isolating more faults from each other, as can be seen by a comparison of Tables 11
and 4. In addition, this also resulted in increased fault isolation performance, in terms of
substantially lower missed isolation rates and lower isolation times, in comparison with
the original FSM, which can be seen by a comparison of Tables 12 and 9.
Acknowledgment
This work was sponsored by Scania and VINNOVA (Swedish Governmental Agency for
Innovation Systems).
A Model Equations
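e_1: \dot{p}_{ic} = \frac{R_a T_{im}}{V_{ic}} (W_c - W_{th})
e_2: \dot{p}_{im} = \frac{R_a T_{im}}{V_{im}} (W_{th} + W_{egr} - W_{ei})
e_3: \dot{p}_{em} = \frac{R_e T_{em}}{V_{em}} (W_{eo} - W_{egr} - W_t) + \frac{R_e}{V_{em} c_{ve}} \left( W_{in} c_{ve} (T_{em,in} - T_{em}) + R_e (T_{em,in} W_{in} - T_{em} W_{out}) \right)
e_4: \dot{T}_{em} = \frac{R_e T_{em}}{p_{em} V_{em} c_{ve}} \left( W_{in} c_{ve} (T_{em,in} - T_{em}) + R_e (T_{em,in} W_{in} - T_{em} W_{out}) \right)
e_5: W_{in} = \max(W_{eo}, 0) + \max(-W_{egr}, 0) + \max(-W_t, 0)
e_6: W_{out} = \max(-W_{eo}, 0) + \max(W_{egr}, 0) + \max(W_t, 0)
e_7: W_{th} = \frac{p_{ic} A_{th,max}}{\sqrt{T_{im} R_a}} \Psi_{th}^{\gamma_{th}}(\Pi_{th}) f_{th}(x_{th})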
e_{29}: k_{c1} = k_{c11} (\min(Ma, Ma_{max}))^2 + k_{c12} \min(Ma, Ma_{max}) + k_{c13}
e_{30}: k_{c2} = k_{c21} (\min(Ma, Ma_{max}))^2 + k_{c22} \min(Ma, Ma_{max}) + k_{c23}
e_{31}: k_{c3} = k_{c31} (\min(Ma, Ma_{max}))^2 + k_{c32} \min(Ma, Ma_{max}) + k_{c33}
e_{32}: Ma = \frac{R_c \omega_t}{\sqrt{\gamma_a R_a T_{bc}}}
e_{33}: \Psi_c = \frac{2 c_{pa} T_{bc} \left( \Pi_c^{1 - 1/\gamma_a} - 1 \right)}{R_c^2 \omega_t^2}
e_{34}: p_{bc} = p_{amb}
e 35 ∶ Tbc = Tamb
e 36 ∶ y pamb = pamb + ∆ y pamb
e 37 ∶ y Tamb = Tamb + ∆ y Tamb
e 38 ∶ y pic = pic + ∆ y pic
e 39 ∶ y pim = pim + ∆ y pim
e 40 ∶ y Tim = Tim + ∆ y Tim
e 41 ∶ y pem = pem + ∆ y pem
e 42 ∶ u xth = xth + ∆ u xth
e 43 ∶ u xegr = xegr + ∆ u xegr
e 44 ∶ u xvgt = xvgt + ∆ u xvgt
e 45 ∶ uδ = δ
e 46 ∶ y n e = ne
References
I. M. Al-Salami, S. X. Ding, and P. Zhang. Statistical based residual evaluation for
fault detection in networked control systems. In Proceedings of Workshop on Advanced
Control and Diagnosis, Nancy, France, November 2006. Nancy University.
M. R. Blas and M. Blanke. Stereo vision with texture learning for fault-tolerant automatic
baling. Computers and Electronics in Agriculture, 75(1):159 – 68, 2011.
California EPA. Sections 1971.1, 1968.2, and 1971.5 of Title 13,
California Code of Regulations: HD OBD and OBD II regulations.
http://www.arb.ca.gov/msprog/obdprog/hdobdreg.htm, 2010. California
Environmental Protection Agency, Air Resources Board.
G. Casella and R. L. Berger. Statistical Inference. Duxbury Press, second edition, 2001.
J. P. Cassar and M. Staroswiecki. A structural approach for the design of failure detection
and identification systems. In Proceedings of IFAC Control Ind. Syst., pages 841–846,
Belfort, France, 1997.
P. Fogh Odgaard, J. Stoustrup, and M. Kinnaert. Fault tolerant control of wind turbines
- a benchmark model. In Proceedings of the 7th IFAC Symposium on Fault Detection,
Supervision and Safety of Technical Processes, pages 155–160, Barcelona, Spain, 2009.
J. Gertler. Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker, 1998.
X. Wei, H. Liu, and Y. Qin. Fault diagnosis of rail vehicle suspension systems by using
GLRT. In Control and Decision Conference (CCDC), 2011 Chinese, pages 1932–1936, May
2011.
A. Willsky and H. Jones. A generalized likelihood ratio approach to the detection and
estimation of jumps in linear systems. IEEE Transactions on Automatic Control, 21(1):
108 – 112, 1976.
Paper E
Automated Design of an FDI-System for the
Wind Turbine Benchmark
Abstract
1 Introduction
Wind turbines account for a growing share of power production. The demands for reliability
are high, since wind turbines are expensive and their downtime should be minimized. One
potential way to meet the reliability demands is to adopt fault tolerant control (FTC),
i.e., prevent faults from developing into failures by taking appropriate actions. A typical
action is reconfiguration of the control system. An essential part of an FTC-system is
the fault detection and isolation (FDI) system, see, e.g., Blanke et al. (2006). To obtain
good detection and isolation of faults, model-based FDI is often necessary.
Design of a complete model-based FDI-system is a complex task and involves by
necessity several decisions, for example, method choices, tuning of parameters, and
assumptions regarding noise distributions and the nature of the faults to be diagnosed. In
general, an optimal solution requires detailed knowledge of the behavior of the considered
system, something that is rarely available for real applications. In this paper, inspired
by work with real industrial applications, we propose an automated design method that
minimizes the number of required human decisions and assumptions. Furthermore, we
investigate the potential of designing an FDI-system for the wind turbine benchmark,
see Fogh Odgaard et al. (2009), using this automated method.
The design method is composed of three main steps. In the first step, a large set of
candidate residual generators is generated using the algorithm described in Krysander
et al. (2008). In the second step, the residual generators most suitable to be included in
the final FDI-system are selected and realized by means of a greedy selection algorithm,
based on ideas elaborated in Svärd et al. (2011). The realization, or construction, of
residual generators is done by use of the algorithms presented in Svärd and Nyberg
(2010). In the third and final step, we design diagnostic tests based on the residuals
obtained as output from the selected set of residual generators. The diagnostic tests
rely on a novel methodology based on a comparison of the probability distributions of
no-fault residuals, estimated offline using no-fault training data, and the distributions of
residuals estimated online using current data.
As it turns out, the proposed FDI-system performs well when evaluated on the test
sequence described in Fogh Odgaard et al. (2009). A tailor-made FDI-system perfectly
tuned for the wind turbine benchmark would probably perform better than the one
we propose. However, in relation to the minimal effort required for application of the
automated design method, and in spite of no extra tuning or specific adaptation to
the benchmark, the performance of the FDI-system is satisfactory; all faults in the test
sequence can be detected within feasible time, and there are no false or missed detections.
Further, all faults, except a double fault, can also be isolated.
The wind turbine benchmark model and the strategy used for modeling of faults,
are described in Section 2. Section 3 presents an overview of the design method. The
method for constructing residual generators is described in Section 4, and the approach
used for selecting residual generators is described in Section 5. The method for design
of diagnostic tests, and the fault isolation scheme is considered in Section 6. Some
implementation specific details are discussed in Section 7. The performance of the
designed FDI-system is evaluated and discussed in Section 8, and Section 9 concludes
the paper.
2. The Wind Turbine Model 211
[Figure: block diagram of the wind turbine benchmark model, with blocks Blade & Pitch System, Drive Train, Generator & Converter, and Controller.]
where ζ, ω n are parameters, and x β i1 , x β i2 state variables. Using the same approach, the
relation between converter reference τ g,r and output τ g can be written as
Signal    Description
v_w       Wind speed
v_w,m     Wind speed measurement
β_r       Pitch angle reference
β_m       Pitch angle measurement
ω_r       Angular rotor speed
ω_r,m     Angular rotor speed measurement
ω_g       Generator rotor speed
ω_g,m     Generator rotor speed measurement
τ_r       Rotor torque
τ_g       Generator torque
τ_g,r     Generator torque reference
τ_g,m     Generator torque measurement
P_r       Power reference
P_g       Generator power
where ∆β 1 , ∆β 2 , ∆β 3 , and ∆τ g are actuator faults, ∆ω g a system fault, and ∆β 1,m1 , ∆β 1,m2 ,
∆β 2,m1 , ∆β 2,m2 , ∆β 3,m1 , ∆β 3,m2 , ∆ω r,m1 , ∆ω r,m2 , ∆ω g,m1 , and ∆ω g,m2 , sensor faults.
To incorporate fault information in the nominal model, we have chosen to model
all faults as additive signals in corresponding equations. Thus, we are not taking into
account all information regarding the nature of faults given in Fogh Odgaard et al. (2009).
Consider for example fault ∆β 1 which represents an actuator fault in pitch system 1, see (1),
resulting in changed dynamics of β 1 due to dropped main line pressure or high air content
in the oil. One possible way to model this fault would be as a deviation in parameters
ω n and ζ in (1a) and (1b). With the chosen approach, the fault is instead modeled as an
additive signal in (1c) for i = 1, i.e., β 1 = x β 11 + ∆β 1 .
Note that the adopted fault modeling approach is general and no assumptions are
made regarding for example the time-behavior of faults. Thus, the approach is able to
handle for example multiplicative faults even though the fault signal is assumed to be
additive. Consider for example a multiplicative fault in β 1 given by β 1 = δ ⋅ x β 11 where
δ ≠ 1, which can be equivalently described by β 1 = x β 11 + ∆β 1 , where ∆β 1 = x β 11 (δ − 1).
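The equivalence between the multiplicative and the additive fault description can be checked numerically. The sketch below does this for a short, hypothetical sequence of state values, with the function name chosen for illustration only.

```python
import math


def additive_equivalent(x, delta):
    """Additive fault signal that reproduces the gain fault delta * x,
    i.e., the value of x * (delta - 1)."""
    return x * (delta - 1.0)


# Hypothetical time-varying state values and a gain fault of size 0.8.
x_values = [10.0, 12.5, 7.0]
delta = 0.8
for x in x_values:
    faulty_multiplicative = delta * x
    faulty_additive = x + additive_equivalent(x, delta)
    # Both descriptions give the same faulty signal (up to rounding).
    assert math.isclose(faulty_multiplicative, faulty_additive)
```

Note that the additive fault signal is time-varying whenever the underlying state is, which is why no assumption on the time-behavior of the fault signal can be made.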
The main argument for using this, more general, approach is that we consider it
hard, or even impossible, to know exactly how a faulty component behaves in reality.
Furthermore, data from all fault-cases for evaluation and validation of a more detailed
model are seldom available. Modeling faults in this way also results in a minimum of
fault modes. This is beneficial since it gives a smaller model which simplifies several steps
in model-based diagnosis, e.g., residual generation and isolation. In addition, regarding
how diagnosis information is utilized, e.g., for Fault Tolerant Control, it is unnecessary
to distinguish between different fault modes if they are associated with the same action
or consequence. Indeed, this applies to all sensor faults in the wind turbine, since the
system should be reconfigured regardless of the type of sensor fault, i.e., fixed value or
gain factor, see Table 2 in Fogh Odgaard et al. (2009). Last, but not least, an additional
important motivator is simplicity, since extending the nominal model with additive fault
signals in this way is straightforward and easy.
\[ \beta_{i,r} = \beta_r + \beta_i - \frac{\beta_{i,m1} + \beta_{i,m2}}{2}, \quad i = 1, 2, 3 \qquad (3) \]
where β i is given by (1), and β i ,m1 and β i ,m2 are sensor measurements. To incorporate
this information in the design of the FDI system, the original wind turbine model is
extended with the relations between β i ,r and β r given by (3).
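e_{13}: \dot{\omega}_r = -\left( \frac{B_{dt} - B_r}{J_r} \right) \omega_r + \left( \frac{B_{dt}}{N_g J_r} \right) \omega_g - \left( \frac{K_{dt}}{J_r} \right) \theta_\Delta + \left( \frac{1}{J_r} \right) \tau_r
e_{14}: \dot{\theta}_\Delta = \omega_r - \left( \frac{1}{N_g} \right) \omega_g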
[Figure: blocks Residual Generation, Fault Detection, and Fault Isolation.]
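\begin{align*}
e_{15}:&\quad \dot{x}_{\tau_g} = -\alpha_{gc}\, x_{\tau_g} + \alpha_{gc}\, \tau_{g,r} \\
e_{16}:&\quad \tau_g = x_{\tau_g} + \Delta\tau_g \\
e_{17}:&\quad P_g = \eta_{gc}\, \omega_g \tau_g \\
e_{18}, e_{20}, e_{22}:&\quad \beta_{i,m1} = \beta_i + \Delta\beta_{i,m1}, \quad i = 1, 2, 3 \\
e_{19}, e_{21}, e_{23}:&\quad \beta_{i,m2} = \beta_i + \Delta\beta_{i,m2}, \quad i = 1, 2, 3 \\
e_{24}, e_{25}:&\quad \omega_{r,mj} = \omega_r + \Delta\omega_{r,mj}, \quad j = 1, 2 \\
e_{26}, e_{27}:&\quad \omega_{g,mj} = \omega_g + \Delta\omega_{g,mj}, \quad j = 1, 2 \\
e_{28}:&\quad v_{w,m} = v_w \\
e_{29}:&\quad \tau_{g,m} = \tau_g \\
e_{30}:&\quad P_{g,m} = P_g \\
e_{31}, e_{32}, e_{33}:&\quad \beta_{i,r} = \beta_r + \beta_i - \frac{\beta_{i,m1} + \beta_{i,m2}}{2}, \quad i = 1, 2, 3
\end{align*}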
see Figure 3. In the first step, a large set of candidate residual generators is generated.
In the second step, the residual generators most suitable for inclusion in the final FDI-system
are selected and realized. In the third and final step, we design diagnostic tests
based on the residuals obtained as output from the selected set of residual generators.
In the subsequent sections, we describe in detail the different steps of the design
method used to create the proposed FDI-system for the wind turbine benchmark system.
As input to the design method, or prerequisites, we assume a model of the system and
no-fault training data. The data is assumed to be expressed as measurements, either
real or simulated, of the inputs and outputs of the model in realistic and representative
no-fault operating conditions.
4 Residual Generation
The set of residual generators used in the FDI-system is based upon the ideas originally
described in Staroswiecki and Declerck (1989), where unknown variables in a model
are computed by solving equation sets one at a time in a sequence and a residual is
obtained by evaluating a redundant equation. Similar approaches are described and
exploited in, for example, Cassar and Staroswiecki (1997); Staroswiecki (2002); Pulido and
Alonso-González (2004); Ploix et al. (2005); Travé-Massuyès et al. (2006); Blanke et al.
(2006); Svärd and Nyberg (2010). This class of residual generation methods, referred
to as sequential residual generation, has been shown to be successful for real applications and
also has the potential to be automated to a high extent.
Computation Sequence
As said above, the main idea in sequential residual generation is to compute unknown
variables in the model by solving equation sets one at a time in a sequence, and then
for computation of a subset of the unknown variables in the wind turbine model presented
in Section 2.4. According to the computation sequence (6), the series of computations
begins with computation of variable τ g using equation e 29 , then variable ω r is computed
using equation e 24 , and so on, ending with computation of variable ω̇ g , or in effect ω g ,
from equation e 12 .
By construction, see Svärd and Nyberg (2010), it is guaranteed that no variable is
needed before it has been computed. Hence, the series of computations described by
the computation sequence exhibits an upper triangular structure. For the computation
sequence (6), this series of computations is given by
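\begin{align*}
\tau_g &= \tau_{g,m} \tag{7a} \\
\omega_r &= \omega_{r,m1} \tag{7b} \\
\dot{\theta}_\Delta &= \omega_r - \left(\frac{1}{N_g}\right)\omega_g \tag{7c} \\
\dot{\omega}_g &= \left(\frac{\eta_{dt} B_{dt}}{N_g J_g}\right)\omega_r + \left(\frac{-\frac{\eta_{dt} B_{dt}}{N_g^2} - B_g}{J_g}\right)\omega_g + \left(\frac{\eta_{dt} K_{dt}}{N_g J_g}\right)\theta_\Delta - \left(\frac{1}{J_g}\right)\tau_g \tag{7d}
\end{align*}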
Whether the specified variables can be computed from the corresponding equations
naturally depends on the properties of the equations. Equally important, however, are
the prerequisites in terms of causality assumptions, i.e., integral and/or
derivative causality, and the properties of the computational tools that are available
for use; for a detailed discussion see, e.g., Svärd and Nyberg (2010). The computation
sequence (6) uses solely integral causality when the variables θ ∆ and ω g are
computed using equations e 14 and e 12 , respectively.
var X (⋅) returns the unknown variables that are contained in an equation set. A residual
generator based on a computation sequence C and redundant equation e is referred to as
a sequential residual generator.
The computation sequence (6), together with equation e 26 , constitutes a sequential
residual generator for the wind turbine model. When all variables in the computation
sequence (6) have been computed according to (7), the residual is computed as
r = ω g,m1 − ω g .
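The computations (7a)-(7d) followed by evaluation of the redundant equation e 26 can be sketched in a few lines of code. The sketch below is an illustration only: it assumes forward Euler integration with a fixed step, and the parameter values, the class DriveTrainParams, and the function sequential_residual are hypothetical and are not taken from the benchmark or from the generated Matlab code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DriveTrainParams:
    """Drive-train parameters; the defaults are hypothetical, not benchmark values."""
    N_g: float = 2.0
    J_g: float = 1.0
    B_g: float = 0.5
    B_dt: float = 1.0
    K_dt: float = 4.0
    eta_dt: float = 1.0


def sequential_residual(tau_g_m: Sequence[float],
                        omega_r_m1: Sequence[float],
                        omega_g_m1: Sequence[float],
                        dt: float,
                        p: DriveTrainParams,
                        theta0: float = 0.0,
                        omega_g0: float = 0.0) -> List[float]:
    """Evaluate r = omega_g_m1 - omega_g along the computation sequence (7)."""
    theta, omega_g = theta0, omega_g0
    residuals = []
    for tau_meas, w_r_meas, w_g_meas in zip(tau_g_m, omega_r_m1, omega_g_m1):
        tau_g = tau_meas      # (7a): tau_g from the torque measurement
        omega_r = w_r_meas    # (7b): omega_r from the first rotor speed sensor
        # Redundant equation e26: compare measured and computed generator speed.
        residuals.append(w_g_meas - omega_g)
        # (7c) and (7d): state derivatives, used in integral causality.
        d_theta = omega_r - omega_g / p.N_g
        d_omega_g = ((p.eta_dt * p.B_dt / (p.N_g * p.J_g)) * omega_r
                     + ((-p.eta_dt * p.B_dt / p.N_g ** 2 - p.B_g) / p.J_g) * omega_g
                     + (p.eta_dt * p.K_dt / (p.N_g * p.J_g)) * theta
                     - tau_g / p.J_g)
        # Forward Euler step (an assumption of this sketch).
        theta += dt * d_theta
        omega_g += dt * d_omega_g
    return residuals
```

Since the generator speed measurement enters only through the redundant equation, a constant offset on ω g,m1 shows up directly as a constant offset in the residual of this sketch.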
X = {τ r , β 1 , λ, vw , β 2 , β 3 , ω r , x β 11 , x β 12 , β 1,r , x β 21 , x β 22 ,
β 2,r , x β 31 , x β 32 , β 3,r , ω g , θ ∆ , τ g , x τ g , Pg } ,
1. the set of residual generators should enable us to isolate all single faults from each
other;
instance be deduced that fault ∆ω g is structurally isolable from fault ∆β 1,m1 in M, since
e ∆ω g = e 12 , e ∆β 1,m1 = e 18 , and it holds that e 12 ∈ M and e 18 ∉ M, see Section 2.4.
By again utilizing the structure of the wind turbine model, the structural isolability
properties of the model were calculated. All considered faults, see Section 2.2, can be
(structurally) isolated from each other in the wind turbine model.
Characterization of a Solution
We will now characterize a complete solution to the selection problem for use in the
selection algorithm. First, we define the isolation class coverage of a set of MSO sets
$\mathcal{S} \subseteq \mathcal{M}$ as
\[
\sigma_{\mathcal{I}}(\mathcal{S}) = \left\{ I_{f_i f_j} \in \mathcal{I} : \exists S \in \mathcal{S},\ S \in I_{f_i f_j} \right\}, \tag{11}
\]
which states which of the isolation classes in $\mathcal{I}$ are covered by the MSO sets in $\mathcal{S}$.
Property 1 in Section 5.1, i.e., the isolation or hitting set property, can with the isolation
class coverage notion be formulated as $\sigma_{\mathcal{I}}(\mathcal{S}) = \mathcal{I}$. This characterizes a complete solution
of the selection problem.
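The isolation class coverage (11) and the completeness condition can be illustrated with plain set operations. In the sketch below, each isolation class is represented as the set of names of the MSO sets it contains, keyed by the fault pair (f i , f j ); this data layout, the function names, and the example classes are assumptions made for illustration and do not correspond to the wind turbine model.

```python
from typing import Dict, Set, Tuple

FaultPair = Tuple[str, str]


def isolation_class_coverage(selected: Set[str],
                             classes: Dict[FaultPair, Set[str]]) -> Set[FaultPair]:
    """Return the isolation classes that contain at least one selected MSO set."""
    return {pair for pair, members in classes.items() if members & selected}


def is_complete_solution(selected: Set[str],
                         classes: Dict[FaultPair, Set[str]]) -> bool:
    """A selection is complete when every isolation class is covered."""
    return isolation_class_coverage(selected, classes) == set(classes)


# Hypothetical isolation classes: the class for (f_i, f_j) holds the MSO sets
# that make it possible to isolate f_i from f_j.
classes = {
    ("f1", "f2"): {"S1", "S3"},
    ("f2", "f1"): {"S2"},
    ("f1", "f3"): {"S3"},
}

print(isolation_class_coverage({"S1"}, classes))   # only ("f1", "f2") is covered
print(is_complete_solution({"S2", "S3"}, classes)) # all three classes are covered
```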
Utility Function
To evaluate a specific MSO set, we want to take into account the properties 1, 2, and 3,
above. For a given MSO set S, we will use the utility function
where Ŝ is the MSO set in M with largest cardinality, and γ, 0 ≤ γ ≤ 1, is a weighting factor.
The term $|\sigma_{\mathcal{I}}(\{S\})| / |\mathcal{I}|$ in (12) gives the fraction of the isolation classes in I that are covered by
the MSO set S. Since we aim at covering all isolation classes with a minimum of MSO
sets, property 2, we want to pick an MSO set that maximizes this term. The term $1 - |S| / |\hat{S}|$
relates the cardinality of S to that of the largest MSO set in M, and thereby to the cardinality
of all other sets in M. Picking an MSO set
that maximizes this term in (12) hence corresponds to picking the MSO set with smallest
cardinality in M. This will help us satisfy property 3. The weighting factor γ is used to
trade off the two properties reflected by these two terms.
Note that an MSO set maximizing one term in (12) may minimize the other, since
an MSO set of larger cardinality is likely to cover more isolation classes than an MSO set of
smaller cardinality.
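The trade-off between the two terms can be made concrete with a small greedy selection sketch. Since the exact form of (12) is not reproduced here, the sketch assumes that the utility is the convex combination γ times the coverage term plus (1 − γ) times the cardinality term; the greedy loop, the function names, and the example data are likewise hypothetical and may differ from the selection algorithm used in the paper.

```python
from typing import Dict, List, Set, Tuple

FaultPair = Tuple[str, str]


def utility(name: str,
            mso_sets: Dict[str, Set[str]],
            classes: Dict[FaultPair, Set[str]],
            gamma: float) -> float:
    """Assumed utility: gamma * coverage term + (1 - gamma) * cardinality term."""
    largest = max(len(eqs) for eqs in mso_sets.values())
    covered = sum(1 for members in classes.values() if name in members)
    coverage_term = covered / len(classes)
    size_term = 1.0 - len(mso_sets[name]) / largest
    return gamma * coverage_term + (1.0 - gamma) * size_term


def greedy_select(mso_sets: Dict[str, Set[str]],
                  classes: Dict[FaultPair, Set[str]],
                  gamma: float = 0.5) -> List[str]:
    """Pick MSO sets one at a time until all isolation classes are covered."""
    remaining = dict(classes)
    selected: List[str] = []
    while remaining:
        # Only consider MSO sets that cover at least one uncovered class.
        candidates = [n for n in mso_sets
                      if n not in selected
                      and any(n in members for members in remaining.values())]
        if not candidates:
            break  # the remaining classes cannot be covered
        best = max(candidates, key=lambda n: utility(n, mso_sets, remaining, gamma))
        selected.append(best)
        remaining = {pair: m for pair, m in remaining.items() if best not in m}
    return selected


# Hypothetical MSO sets (as sets of equation names) and isolation classes.
mso_sets = {
    "S1": {"e1", "e2"},
    "S2": {"e1", "e2", "e3", "e4"},
    "S3": {"e2", "e3", "e5"},
}
classes = {
    ("f1", "f2"): {"S1", "S2"},
    ("f2", "f1"): {"S2"},
    ("f1", "f3"): {"S2", "S3"},
}

print(greedy_select(mso_sets, classes, gamma=0.5))
print(greedy_select(mso_sets, classes, gamma=0.0))
```

With γ = 0.5 the large set S2 is picked first because it covers every class, whereas γ = 0 favors small sets and therefore needs all three MSO sets in this example.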
F (S) = { f i ∈ F ∶ e f i ∈ S} .
Faults (columns): ∆β 1 , ∆β 2 , ∆β 3 , ∆τ g , ∆ω g , ∆β 1,m1 , ∆β 1,m2 , ∆β 2,m1 , ∆β 2,m2 , ∆β 3,m1 , ∆β 3,m2 , ∆ω r,m1 , ∆ω r,m2 , ∆ω g,m1 , ∆ω g,m2
R 1 (S 1 ) x x
R 2 (S 2 ) x x
R 3 (S 3 ) x x
R 4 (S 4 ) x x
R 5 (S 5 ) x x
R 6 (S 8 ) x
R 7 (S 11 ) x x x
R 8 (S 27 ) x x
R 9 (S 29 ) x x
R 10 (S 31 ) x x
R 11 (S 7 ) x
R 12 (S 6 ) x
R 13 (S 14 ) x x x
R 14 (S 28 ) x x
R 15 (S 30 ) x x
R 16 (S 32 ) x x
online using current data. A clear advantage with this approach is that changes in mean
and variance are handled in a unified way, since we consider the complete distribution
of the residual.
normalized histogram with n bins for the data from which the distribution should be
estimated.
On-line, we continuously estimate the distribution of the current residual r i using a
sliding window containing N samples of r i . Let $P_i^t$ denote the estimated distribution
of r i calculated at time t, i.e., $P_i^t \approx P(R_i \mid Z_t)$, where Z t denotes the batch of data in the
sliding window at time t. The diagnostic test is then designed as
\[
T_i(t) = \begin{cases} 1, & \text{if } D\left(P_i^t \,\|\, P_i^{NF}\right) \geq J_i, \\ 0, & \text{else,} \end{cases} \tag{14}
\]
where J i is the threshold for alarm. The K-L divergence $D(P_i^t \,\|\, P_i^{NF})$ is referred to as the
test quantity of the diagnostic test T i .
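The test (14) can be sketched with a normalized histogram and the discrete K-L divergence. The sketch below uses only the Python standard library; the fixed histogram range, the small constant eps that keeps empty bins away from zero, the function names, and the synthetic residual data are all assumptions made for illustration and are not details taken from the paper.

```python
import math
from typing import List, Sequence


def normalized_histogram(samples: Sequence[float], n_bins: int,
                         lo: float, hi: float, eps: float = 1e-6) -> List[float]:
    """Normalized histogram with n_bins bins on [lo, hi]; outliers go to the edge bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for s in samples:
        k = int((s - lo) / width)
        k = min(max(k, 0), n_bins - 1)
        counts[k] += 1
    total = len(samples) + eps * n_bins
    # eps avoids log(0) and division by zero for empty bins (an assumption).
    return [(c + eps) / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Discrete K-L divergence D(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def diagnostic_test(window: Sequence[float], p_nf: Sequence[float],
                    threshold: float, lo: float, hi: float) -> int:
    """Return 1 if the K-L test quantity reaches the alarm threshold, else 0."""
    p_t = normalized_histogram(window, len(p_nf), lo, hi)
    return 1 if kl_divergence(p_t, p_nf) >= threshold else 0


# Synthetic no-fault residual samples, spread evenly over [-0.5, 0.5).
no_fault = [((37 * i) % 100) / 100.0 - 0.5 for i in range(1000)]
p_nf = normalized_histogram(no_fault, n_bins=10, lo=-1.0, hi=1.0)

window_ok = no_fault[:100]                    # sliding window, no fault
window_faulty = [s + 0.6 for s in window_ok]  # same window with a bias added

print(diagnostic_test(window_ok, p_nf, threshold=0.1, lo=-1.0, hi=1.0))
print(diagnostic_test(window_faulty, p_nf, threshold=0.1, lo=-1.0, hi=1.0))
```

Because the whole histogram is compared, a shift in mean and a change in spread both increase the test quantity, which is the unified handling referred to above.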
7 Implementation Details
The final FDI-system was implemented in Simulink© according to the structure in
Figure 2. The 16 residual generators were implemented as Embedded Matlab Functions
(EMF), in which the code was automatically generated from the structures obtained
from the functions findComputationSequence and findResidualGenerators. The
initial conditions for the states in the dynamic residual generators were derived from
the corresponding sensor measurements, if available, and otherwise set to zero. For instance,
$\theta_\Delta(t_0) = 0$, $x_{\beta_{i1}}(t_0) = \left(\beta_{i,m1}(t_0) + \beta_{i,m2}(t_0)\right)/2$, and $\omega_g(t_0) = \left(\omega_{g,m1}(t_0) + \omega_{g,m2}(t_0)\right)/2$. This may cause
transients in the residuals, but this is not considered a problem.
Alarm Thresholds
The choice of alarm thresholds J i , i = 1, 2, . . . , 16, is a trade-off between detection time
and the number of false detections. The higher the thresholds, the longer the detection
time and the lower the rate of false alarms. The choice of alarm thresholds is related to the
choices of n and N since both affect how sensitive a K-L test quantity is to noise, which in
turn affects the rate of false detections. We aim at choosing the alarm thresholds so that
the number of false detections is minimized, implying that the choice of J i must match
the choices of n and N. For the wind turbine benchmark model, the alarm thresholds
were computed as a safety factor α = 1.1 times the maximum value of the corresponding
K-L test quantities from 100 simulations with no-fault data.
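The threshold rule described above amounts to a single maximum over all no-fault runs. A minimal sketch follows, in which the function name and the example test quantities are hypothetical; only the safety factor α = 1.1 is taken from the text.

```python
from typing import Sequence


def alarm_threshold(no_fault_test_quantities: Sequence[Sequence[float]],
                    alpha: float = 1.1) -> float:
    """Threshold J_i: safety factor times the largest no-fault K-L test quantity.

    Each inner sequence holds the test quantity of one no-fault simulation run.
    """
    peak = max(max(run) for run in no_fault_test_quantities)
    return alpha * peak


# Hypothetical K-L test quantities from two short no-fault runs.
runs = [[0.1, 0.4, 0.2], [0.3, 0.5, 0.1]]
J = alarm_threshold(runs)
print(J)
```

By construction, none of the no-fault runs used for tuning reaches the resulting threshold, so false detections on that data are avoided.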
the other hand increase the isolation time. For the wind turbine benchmark model, the
isolation validation time $t^{I}_{val}$ was set to 4 samples.
         ∆ω g,m2 ∧ ∆ω r,m2   ∆ω r,m1   ∆β 1,m1   ∆β 2,m2   ∆β 3,m1   ∆β 2     ∆β 3     ∆τ g
Req.     0.1                 0.1       0.1       0.1       0.1       0.08     6        0.05
TD       0.040               0.16      0.058     4.30      0.069     51.57    18.1     7.94
TDmax    0.04                0.27      0.07      6.10      0.07      51.88    19.05    7.98
TDmin    0.03                0.06      0.05      0.40      0.06      50.57    16.37    7.90
TI       -                   2.53      0.12      88.85     0.13      56.95    31.84    7.99
TImax    -                   3.13      0.12      114.26    0.13      120.73   111.96   8.03
TImin    -                   1.89      0.11      13.17     0.12      51.62    17.91    7.95
MD       0                   0         0         0         0         0        0        0
FD       0                   0         0         0         0         0        0        0
According to the row corresponding to TDmax in Table 5, all faults in the test sequence
could be detected. For faults ∆ω g,m2 ∧ ∆ω r,m2 , ∆β 1,m1 , and ∆β 3,m1 , the detection requirements
are met with respect to both T D and TDmax .
All faults except the double fault ∆ω g,m2 ∧ ∆ω r,m2 could also be isolated. However,
the mean time of isolation, T I , for some faults, e.g., ∆β 2,m2 , is substantially longer than
the corresponding mean time of detection. The main reason for this is that some tests
respond more slowly to faults than others. As said, fault ∆ω g,m2 ∧ ∆ω r,m2 could not be isolated.
In fact, this fault is not uniquely isolable with the isolation strategy described in Section 6.2,
since the test response of fault ∆ω g,m2 ∧ ∆ω r,m2 is a subset of the test response of fault
∆ω g,m2 ∧ ∆ω r,m1 , see Table 3. Both faults ∆ω g,m2 and ∆ω r,m2 are, however, contained in
the diagnosis statement computed after the faults have been detected.
Sensor faults, e.g., ∆β 3,m1 , appear to be easier to detect than actuator faults
such as ∆τ g and ∆β 2 . One possible explanation is that actuator faults in
general cause changes in dynamics, whose effects are attenuated by modeling errors,
noise, etc.
As can be seen in the last two rows of Table 5, there are no missed or false detections
in any of the 100 test runs.
Figure 4: Affected residuals $r_2$ (top-left) and $r_{13}$ (top-right), and the corresponding K-L
test quantities $D(P_2^t \,\|\, P_2^{NF})$ (bottom-left) and $D(P_{13}^t \,\|\, P_{13}^{NF})$ (bottom-right) at the time of
occurrence of fault ∆ω r,m1 . Horizontal axes show time [s].
and ∆ω r,m2 (bottom), and also the signal that indicates when the isolation procedure
is done (middle and bottom). As can be seen in Figure 5, the first test that reacts to the
fault is T2 . This occurs at t = 1500.23 s. Since T2 is sensitive to both faults ∆ω r,m1 and
∆ω r,m2 and no other test has alarmed, the diagnosis statement is that either ∆ω r,m1 or
∆ω r,m2 may be present, and no fault can be isolated. At t = 1502.55 s, test T13 alarms.
Test T13 is sensitive to faults ∆ω g , ∆ω r,m1 , and ∆ω r,m2 , and the updated total diagnosis
statement, given that both T2 and T13 have alarmed, thus becomes ∆ω r,m1 , see Table 3.
This occurs at time t = 1502.59 s.
Figure 5: Isolation procedure for fault ∆ω r,m1 . Top figure shows diagnostic tests T2 and
T13 . Middle and bottom figures show the isolation result corresponding to faults ∆ω r,m1
and ∆ω r,m2 , respectively, and when the isolation procedure is done. Horizontal axes show time [s].
9 Conclusions
We have proposed an FDI-system for the wind turbine benchmark designed by application
of a generic automated design method, in which the number of required human
decisions and assumptions is minimized. No specific adaptation of the method for
the wind turbine benchmark was needed. The method contains in essence three steps:
generation of candidate residual generators; residual generator selection; and diagnostic
test construction. The second step is done by means of greedy selection, and the third
step is based on a novel method utilizing the K-L divergence.
The performance of the proposed FDI-system has been evaluated using the pre-defined
test sequence for the wind turbine benchmark. The FDI-system performs well;
all faults in the test sequence were detected within feasible time and all faults, except a
double fault, could be isolated shortly thereafter. In addition, there are no false or missed
detections. A tailor-made, finely tuned FDI-system for the benchmark would probably
perform better. However, considering the required design effort, and given that no specific
adaptation or tuning of the method to the benchmark was done, the performance is
satisfactory.
Acknowledgment
This work was supported by Scania CV AB, Södertälje, Sweden.
• Append, takes an ordered set and an element as input and simply appends the
element to the end of the set.
• The operator ∣ ⋅ ∣, taking a set as input, is assumed to return the number of elements
in the set, and the notation A (i) is used to refer to the i-th element of the ordered
set A.
References
M. Blanke, M. Kinnaert, J. Lunze, and M. Staroswiecki. Diagnosis and Fault-Tolerant
Control. Springer, second edition, 2006.
J. P. Cassar and M. Staroswiecki. A structural approach for the design of failure detection
and identification systems. In Proceedings of IFAC Control Ind. Syst., pages 841–846,
Belfort, France, 1997.
A. L. Dulmage and N. S. Mendelsohn. Coverings of bi-partite graphs. Canadian Journal
of Mathematics, 10:517–534, 1958.
P. Fogh Odgaard. Wind turbine benchmark model, 2011.
http://www.kk-electronic.com/Default.aspx?ID=9385.
P. Fogh Odgaard, J. Stoustrup, and M. Kinnaert. Fault tolerant control of wind turbines
– a benchmark model. In Proceedings of the 7th IFAC Symposium on Fault Detection,
Supervision and Safety of Technical Processes, pages 155–160, Barcelona, Spain, 2009.
M. R. Garey and D. S. Johnson. Computers and Intractability – A Guide to the Theory of
NP-Completeness. W.H. Freeman and Company, 1979.
F. Gustafsson. Adaptive Filtering and Change Detection. Wiley, 2000.
M. Krysander and E. Frisk. Sensor placement for fault diagnosis. IEEE Transactions on
Systems, Man and Cybernetics, Part A: Systems and Humans, 38(6):1398–1410, 2008.
M. Krysander, J. Åslund, and M. Nyberg. An efficient algorithm for finding minimal
over-constrained sub-systems for model-based diagnosis. IEEE Trans. on Systems, Man,
and Cybernetics – Part A: Systems and Humans, 38(1):197–206, 2008.
S. Kullback and R. A. Leibler. On information and sufficiency. Annals of Mathematical
Statistics, 22(1):79–86, 1951.
M. Nyberg. Automatic design of diagnosis systems with application to an automotive
engine. Control Engineering Practice, 7(8):993–1005, 1999.
S. Ploix, M. Desinde, and S. Touaf. Automatic design of detection tests in complex
dynamic systems. In Proceedings of 16th IFAC World Congress, Prague, Czech Republic,
2005.
B. Pulido and C. Alonso-González. Possible conflicts: a compilation technique for
consistency-based diagnosis. IEEE Trans. on Systems, Man, and Cybernetics. Part B:
Cybernetics, Special Issue on Diagnosis of Complex Systems, 34(5):2192–2206, 2004.
W. J. Rugh. Linear System Theory, chapter 13. Prentice Hall Information and System
Sciences, 1996.
M. Staroswiecki. Fault Diagnosis and Fault Tolerant Control, chapter Structural Analysis
for Fault Detection and Isolation and for Fault Tolerant Control. Encyclopedia of Life
Support Systems, Eolss Publishers, Oxford, UK, 2002.
C. Svärd, M. Nyberg, and E. Frisk. A greedy approach for selection of residual generators.
In Proceedings of the 22nd International Workshop on Principles of Diagnosis (DX-11),
Murnau, Germany, 2011.
L. Travé-Massuyès, T. Escobet, and X. Olive. Diagnosability analysis based on
component-supported analytical redundancy. IEEE Trans. on Systems, Man, and
Cybernetics – Part A: Systems and Humans, 36(6):1146–1160, 2006.
Linköping studies in science and technology, Dissertations
Division of Vehicular Systems
Department of Electrical Engineering
Linköping University
No 3 Mattias Nyberg, Model Based Fault Diagnosis: Methods, Theory, and Automotive
Engine Applications, 1999.
No 9 Markus Klein, Single-Zone Cylinder Pressure Modeling and Estimation for Heat
Release Analysis of SI Engines, 2007.
No 10 Anders Fröberg, Efficient Simulation and Optimal Control for Vehicle Propulsion,
2008.
No 11 Per Öberg, A DAE Formulation for Multi-Zone Thermodynamic Models and its
Application to CVCP Engines, 2009.
No 12 Johan Wahlström, Control of EGR and VGT for Emission Control and Pumping
Work Minimization in Diesel Engines, 2009.
No 15 Erik Höckerdal, Model Error Compensation in ODE and DAE Estimators with
Automotive Engine Applications, 2011.
No 16 Carl Svärd, Methods for Automated Design of Fault Detection and Isolation
Systems with Automotive Applications, 2012.