Biomedical Signal Processing
Volume I
Time and Frequency Domains Analysis
Author: Arnon Cohen
Bibliography: p.
Includes index.
Contents: v. 1. Time and frequency domains analysis --
v. 2. Compression and automatic recognition.
1. Signal processing. 2. Biomedical engineering.
I. Title
R857.S47C64 1986   610'.28   85-9626
ISBN 0-8493-5933-3 (v. 1)
ISBN 0-8493-5934-1 (v. 2)
This book represents information obtained from authentic and highly regarded sources. Reprinted material is
quoted with permission, and sources are indicated. A wide variety of references are listed. Every reasonable effort
has been made to give reliable data and information, but the author and the publisher cannot assume responsibility
for the validity of all materials or for the consequences of their use.
All rights reserved. This book, or any parts thereof, may not be reproduced in any form without written consent
from the publisher.
Direct all inquiries to CRC Press, Inc., 2000 Corporate Blvd., N.W., Boca Raton, Florida, 33431.
Biomedical signal processing is of prime importance not only to the physiological
researcher but also to the clinician, the engineer, and the computer scientist who are required
to interpret the signal and to design systems and algorithms for its manipulation.
The biomedical signal is, first of all, a signal. As such, its processing and analysis are
covered by the numerous books and journals on general signal processing. Biomedical
signals, however, possess many special properties and pose unique problems that call
for special treatment.
Most of the material dealing with biomedical signal processing methods has been widely
scattered in various scientific, technological, and physiological journals and in conference
proceedings. Consequently, it is a rather difficult and time-consuming task, particularly for
a newcomer to this field, to extract the subject matter from the scattered information.
This book was not meant to be a text or reference on general signal processing. It
is intended to provide material of interest to engineers and scientists who wish to apply
modern signal processing techniques to the analysis of biomedical signals. It is assumed the
reader is familiar with the fundamentals of signals and systems analysis as well as the
fundamentals of biological systems. Two chapters on basic digital and random signal processing
have been included. These serve only as a summary of the material required as
background for other material covered in the book.
The presentation of the material in the book follows the flow of events of the general
signal processing system. After the signal has been acquired, some manipulations are applied
in order to enhance the relevant information present in the signal. Simple, optimal, and
adaptive filtering are examples of such manipulations. The detection of wavelets is of
importance in biomedical signals; they can be detected from the enhanced signal by several
methods. The signal very often contains redundancies. When effective storing, transmission,
or automatic classification is required, these redundancies have to be removed. The signal
is then subjected to data reduction algorithms that allow its effective representation in terms
of features. Methods for data reduction and feature extraction are discussed. Finally, the
topic of automatic classification is dealt with, in both the decision theoretic and the syntactic
approaches.
The emphasis in this book has been placed on modern processing methods, some of which
have so far seen only limited application to biomedical data. The material is organized such that a
method is presented and discussed, and examples of its application to biomedical signals
are given. Rapid developments in digital hardware and in signal processing algorithms open
new possibilities for the application of sophisticated signal processing methods to biomedicine.
Solutions that were previously cost prohibitive, or impractical because of the lack
of appropriate algorithms, have become available. In such a dynamic environment, the biomedical
signal processing practitioner requires a book such as this one.
The author wishes to acknowledge the help received from many students and colleagues
during the preparation of this book.
Arnon Cohen

THE AUTHOR
Volume I
Chapter 1
Introduction
I. General Measurement and Diagnostic System.............................................................. 1
II. Classification of Signals .......... 3
III. Fundamentals of Signal Processing .......... 4
IV. Biomedical Signal Acquisition and Processing .......... 5
V. The Book .......... 6
References .......... 7
Chapter 2
The Origin of the Bioelectric Signal
I. Introduction.........................................................................................................................9
II. The Nerve Cell .......... 9
A. Introduction .......... 9
B. The Excitable Membrane .......... 10
C. Action Potential Initiation and Propagation .......... 11
D. The Synapse .......... 11
III. The Muscle .......... 12
A. Muscle Structure .......... 12
B. Muscle Contraction .......... 12
IV. Volume Conductors.......................................................................................................... 12
References.......................................................................................................................................13
Chapter 3
Random Processes
I. Introduction.......................................................................................................................15
II. Elements of Probability Theory .......... 15
A. Introduction .......... 15
B. Joint Probabilities .......... 16
C. Statistically Independent Events .......... 17
D. Random Variables .......... 17
E. Probability Distribution Functions .......... 18
F. Probability Density Functions .......... 19
III. Random Signals Characterization .......... 21
A. Random Processes .......... 21
B. Statistical Averages (Expectations) .......... 22
IV. Correlation Analysis .......... 23
A. The Correlation Coefficient .......... 23
B. The Correlation Function .......... 25
C. Ergodicity .......... 26
V. The Gaussian Process .......... 26
A. The Central Limit Theorem .......... 26
B. Multivariate Gaussian Process .......... 27
References .......... 28
Chapter 4
Digital Signal Processing
I. Introduction....................................................................................................................... 29
II. Sampling .......... 29
A. Introduction .......... 29
B. Uniform Sampling .......... 30
C. Nonuniform Sampling .......... 31
1. Zero, First, and Second Order Adaptive Sampling .......... 32
2. Nonuniform Sampling with Run Length Encoding .......... 34
III. Quantization .......... 36
A. Introduction .......... 36
B. Zero Memory Quantization .......... 36
C. Analysis of Quantization Noise .......... 39
D. Rough Quantization .......... 40
IV. Discrete Methods .......... 42
A. The Z Transform .......... 42
B. Difference Equations .......... 43
References........................................................................................................................................ 44
Chapter 5
Finite Time Averaging
I. Introduction........................................................................................................................45
II. Finite Time Estimation of the Mean Value .................................................................45
A. The Continuous Case .......... 45
1. Short Observation Time .......... 47
2. Long Observation Time .......... 48
B. The Discrete Case .......... 51
III. Estimation of the Variance and Correlation .......... 53
A. Variance Estimation — The Continuous Case .......... 53
B. Variance Estimation — The Discrete Case .......... 54
C. Correlation Estimation .......... 56
IV. Synchronous Averaging (CAT-Computed Averaged Transients) .......... 56
A. Introduction .......... 56
B. Statistically Independent Responses .......... 58
C. Totally Dependent Responses .......... 59
D. The General Case .......... 60
E. Records Alignment, Estimation of Latencies .......... 61
References........................................................................................................................................ 64
Chapter 6
Frequency Domain Analysis
I. Introduction........................................................................................................................ 65
A. Frequency Domain Representation .................................................................. 65
B. Some Properties of the Fourier Transform ...................................................... 65
1. The Convolution Theorem .......... 66
2. Parseval's Theorem .......... 66
3. Fourier Transform of Periodic Signals .......... 67
C. Discrete and Fast Fourier Transforms (DFT, FFT) .......... 68
II. Spectral Analysis .......... 71
A. The Power Spectral Density Function .......... 71
B. Cross-Spectral Density and Coherence Functions .......... 72
III. Linear Filtering .......... 73
A. Introduction .......... 73
B. Digital Filters .......... 74
C. The Wiener Filter .......... 74
IV. Cepstral Analysis and Homomorphic Filtering .......... 76
A. Introduction .......... 76
B. The Cepstra .......... 76
C. Homomorphic Filtering.......................................................................................77
References........................................................................................................................................80
Chapter 7
Time Series Analysis-Linear Prediction
I. Introduction........................................................................................................................ 81
II. Autoregressive (AR) Models .......... 85
A. Introduction .......... 85
B. Estimation of AR Parameters — Least Squares Method .......... 85
III. Moving Average (MA) Models .......... 89
A. Autocorrelation Function of MA Process .......... 89
B. Iterative Estimate of the MA Parameters .......... 89
IV. Mixed Autoregressive Moving Average (ARMA) Models .......... 90
A. Introduction .......... 90
B. Parameter Estimation of ARMA Models — Direct Method .......... 90
C. Parameter Estimation of ARMA Models — Maximum Likelihood Method .......... 93
V. Process Order Estimation .......... 95
A. Introduction .......... 95
B. Residuals Flatness .......... 95
C. Final Prediction Error (FPE) .......... 96
D. Akaike Information Theoretic Criterion (AIC) .......... 97
E. Ill Conditioning of Correlation Matrix .......... 98
VI. Lattice Representation .......... 98
VII. Nonstationary Processes .......... 99
A. Trend Nonstationarity — ARIMA .......... 99
B. Seasonal Processes .......... 101
VIII. Adaptive Segmentation .......... 101
A. Introduction .......... 101
B. The Autocorrelation Measure (ACM) Method .......... 102
C. Spectral Error Measure (SEM) Method .......... 103
D. Other Segmentation Methods .......... 105
References .......... 106
Chapter 8
Spectral Estimation
I. Introduction .......... 109
II. Methods Based on the Fourier Transform .......... 110
A. Introduction .......... 110
B. The Blackman-Tukey Method .......... 111
C. The Periodogram .......... 112
1. Introduction .......... 112
2. The Expected Value of the Periodogram .......... 114
3. Variance of the Periodogram .......... 116
4. Weighted Overlapped Segment Averaging (WOSA) .......... 117
5. Smoothing the Periodogram .......... 119
III. Maximum Entropy Method (MEM) and the AR Method .......... 122
IV. The Moving Average (MA) Method .......... 125
V. Autoregressive Moving Average (ARMA) Methods .......... 126
A. The General Case .......... 126
B. Pisarenko's Harmonic Decomposition (PHD) .......... 127
C. Prony's Method .......... 130
VI. Maximum Likelihood Method (MLM) — Capon's Spectral Estimation .......... 133
VII. Discussion and Comparison of Several Methods .......... 134
References...................................................................................................................................... 137
Chapter 9
Adaptive Filtering
I. Introduction...................................................................................................................... 141
II. General Structure of Adaptive Filters......................................................................... 142
A. Introduction........................................................................................................ 142
B. Adaptive System Parameter Identification.................................................... 142
C. Adaptive Signal Estimation...............................................................................142
D. Adaptive Signal Correction...............................................................................143
III. Least Mean Squares (LMS) Adaptive Filter...............................................................143
A. Introduction........................................................................................................ 143
B. Adaptive Linear Combiner .......... 144
C. The LMS Adaptive Algorithm......................................................................... 145
D. The LMS Adaptive Filter..................................................................................147
IV. Adaptive Noise Cancelling............................................................................................ 147
A. Introduction........................................................................................................ 147
B. Noise Canceller with Reference In p u t............................................................148
C. Noise Canceller without Reference Input...................................................... 153
D. Adaptive Line Enhancer (ALE) .............................. .......................................154
V. Improved Adaptive Filtering.........................................................................................154
A. Multichannel Adaptive Signal Enhancement................................................. 154
B. Time-Sequenced Adaptive Filtering .......... 156
References...................................................................................................................................... 158
Index .......... 161
TABLE OF CONTENTS
Volume II
Chapter 1
Wavelet Detection
I. Introduction........................................................................................................................ 1
II. Detection by Structural Features.................................................................................... 2
A. Simple Structural Algorithms............................................................................ 2
B. Contour Limiting .......... 5
III. Matched F iltering.............................................................................................................. 6
IV. Adaptive Wavelet D etection........................................................................................... 9
A. Introduction...........................................................................................................9
B. Template A daptation......................................................................................... 10
C. Tracking a Slowly Changing Wavelet .......... 12
D. Correction of Initial Template .......... 12
V. Detection of Overlapping Wavelets .......... 14
A. Statement of the Problem .......... 14
B. Initial Detection and Composite Hypothesis Formulation .......... 15
C. Error Criterion and Minimization .......... 16
References .......... 17
Chapter 2
Point Processes
I. Introduction............................................. .........................................................................19
II. Statistical Preliminaries .......... 20
III. Spectral Analysis .......... 24
A. Introduction .......... 24
B. Interevent Intervals Spectral Analysis .......... 24
C. Counts Spectral Analysis .......... 25
IV. Some Commonly Used Models .......... 26
A. Introduction .......... 26
B. Renewal Processes .......... 26
1. Serial Correlogram .......... 27
2. Flatness of Spectrum .......... 27
3. A Nonparametric Trend Test .......... 28
C. Poisson Processes .......... 28
D. Other Distributions .......... 31
1. The Weibull Distribution .......... 31
2. The Erlang (Gamma) Distribution .......... 32
3. Exponential Autoregressive Moving Average (EARMA) .......... 32
4. Semi-Markov Processes .......... 32
V. Multivariate Point Processes .......... 33
A. Introduction .......... 33
B. Characterization of Multivariate Point Processes .......... 33
C. Marked Processes .......... 35
References .......... 35
Chapter 3
Signal Classification and Recognition
I. Introduction .......... 37
II. Statistical Signal Classification .......... 39
A. Introduction .......... 39
B. Bayes Decision Theory and Classification .......... 39
C. k-Nearest Neighbor (k-NN) Classification .......... 50
III. Linear Discriminant Functions .......... 53
A. Introduction .......... 53
B. Generalized Linear Discriminant Functions .......... 55
C. Minimum Squared Error Method .......... 56
D. Minimum Distance Classifiers .......... 58
E. Entropy Criteria Methods .......... 60
1. Introduction .......... 60
2. Minimization of Entropy .......... 60
3. Maximization of Entropy .......... 62
IV. Fisher's Linear Discriminant .......... 63
V. Karhunen-Loeve Expansions (KLE) .......... 66
A. Introduction .......... 66
B. Karhunen-Loeve Transformation (KLT) — Principal Components Analysis (PCA) .......... 67
C. Singular Value Decomposition (SVD) .......... 69
VI. Direct Feature Selection and Ordering .......... 75
A. Introduction .......... 75
B. The Divergence .......... 76
C. Dynamic Programming Methods .......... 77
VII. Time Warping .......... 79
References........................................................................................................................................ 84
Chapter 4
Syntactic Methods
I. Introduction........................................................................................................................ 87
II. Basic Definitions of Formal Languages....................................................................... 89
III. Syntactic Recognizers.......................................................................................................92
A. Introduction.......................................................................................................... 92
B. Finite State Automata .......... 92
C. Context-Free Push-Down Automata (PDA) .......... 95
D. Simple Syntax-Directed Translation .......... 100
E. Parsing .......... 100
IV. Stochastic Languages and Syntax Analysis .......... 101
A. Introduction .......... 101
B. Stochastic Recognizers .......... 102
V. Grammatical Inference .......... 104
VI. Examples .......... 104
A. Syntactic Analysis of Carotid Blood Pressure .......... 104
B. Syntactic Analysis of ECG .......... 106
C. Syntactic Analysis of EEG .......... 110
References .......... 111
Appendix A
Characteristics of Some Dynamic Biomedical Signals
I. Introduction...................................................................................................................... 113
II. Bioelectric Signals .......... 113
A. Action Potential .......... 113
B. Electroneurogram (ENG) .......... 113
C. Electroretinogram (ERG) .......... 113
D. Electro-Oculogram (EOG) .......... 114
E. Electroencephalogram (EEG) .......... 114
F. Evoked Potentials (EP) .......... 117
G. Electromyography (EMG) .......... 119
H. Electrocardiography (ECG, EKG) .......... 121
1. The Signal .......... 121
2. High-Frequency Electrocardiography .......... 124
3. Fetal Electrocardiography (FECG) .......... 124
4. His Bundle Electrocardiography (HBE) .......... 124
5. Vector Electrocardiography (VCG) .......... 124
I. Electrogastrography (EGG) .......... 124
J. Galvanic Skin Reflex (GSR), Electrodermal Response (EDR) .......... 125
III. Impedance .......... 125
A. Bioimpedance .......... 125
B. Impedance Plethysmography .......... 126
C. Rheoencephalography (REG) .......... 126
D. Impedance Pneumography .......... 126
E. Impedance Oculography (ZOG) .......... 126
F. Electroglottography .......... 126
IV. Acoustical Signals .......... 126
A. Phonocardiography .......... 126
1. The First Heart Sound .......... 126
2. The Second Heart Sound .......... 127
3. The Third Heart Sound .......... 127
4. The Fourth Heart Sound .......... 127
5. Abnormalities of the Heart Sound .......... 127
B. Auscultation .......... 127
C. Voice .......... 128
D. Korotkoff Sounds .......... 129
V. Mechanical Signals .......... 130
A. Pressure Signals .......... 130
B. Apexcardiography (ACG) .......... 130
C. Pneumotachography .......... 130
D. Dye and Thermal Dilution .......... 130
E. Fetal Movements .......... 131
VI. Biomagnetic Signals .......... 131
A. Magnetoencephalography (MEG) .......... 131
B. Magnetocardiography (MCG) .......... 131
C. Magnetopneumography (MPG) .......... 131
VII. Biochemical Signals .......... 131
VIII. Two-Dimensional Signals.............................................................................................. 132
References...................................................................................................................................... 134
Appendix B
Data Lag Windows
I. Introduction.......................................................................................................................139
II. Some Classical Windows .......... 139
A. Introduction .......... 139
B. Rectangular (Dirichlet) Window .......... 140
C. Triangle (Bartlett) Window .......... 140
D. Cosine Windows .......... 141
E. Hamming Window .......... 143
F. Dolph-Chebyshev Window .......... 145
References.............. ........................................................................................................................ 151
Appendix C
Computer Programs
I. Introduction..................................... ........................................ ...................................... 153
II. Main Programs .......... 154
• NUSAMP (Nonuniform Sampling) .......... 154
• SEGMNT (Adaptive Segmentation) .......... 158
• PERSPT (Periodogram Power Spectral Density Estimation) .......... 162
• WOSA (WOSA Power Spectral Density Estimation) .......... 162
• MEMSPT (Maximum Entropy [MEM] Power Spectral Density Estimation) .......... 165
• NOICAN (Adaptive Noise Cancelling) .......... 167
• CONLIM (Wavelet Detection by the Contour Limiting Method) .......... 169
• COMPRS (Reduction of Signal Dimensionality by Three Methods: Karhunen-Loeve [KL], Entropy [ENT], and Fisher Discriminant [FI]) .......... 171
III. Subroutines .......... 174
• LMS (Adaptive Linear Combiner, Widrow's Algorithm) .......... 174
• NACOR (Normalized Autocorrelation Sequence) .......... 175
• DLPC (LPC, PARCOR, and Prediction Error of AR Model of Order P) .......... 176
• DLPC20 (LPC, PARCOR, and Prediction Error of All AR Models of Order 2 to 20) .......... 177
• FT01A (Fast Fourier Transform [FFT]) .......... 178
• XTERM (Maximum and Minimum Values of a Vector) .......... 180
• ADD (Addition and Subtraction of Matrices) .......... 180
• MUL (Matrix Multiplication) .......... 181
• MEAN (Mean of a Set of Vectors) .......... 181
• COVA (Covariance Matrix of a Cluster of Vectors) .......... 182
• INVER (Inversion of a Real Symmetric Matrix) .......... 183
• SYMINV (Inversion of a Real Symmetric Matrix, Original Matrix Destroyed) .......... 183
• RFILE (Read Data Vector From Unformatted File) .......... 185
• WFILE (Write Data Vector on Unformatted File) .......... 186
• RFILEM (Read Data Matrix From Unformatted File) .......... 187
• WFILEM (Write Data Matrix on Unformatted File) .......... 188
Index................................................................................................................................................189
Chapter 1
INTRODUCTION
I. GENERAL MEASUREMENT AND DIAGNOSTIC SYSTEM
This book is concerned with the analysis and processing of biomedical signals. It is
pertinent to discuss first what a signal is in general, what a biomedical signal is, and why we
process it. A discussion of these general topics is presented in this chapter.
A signal is a means to convey information. It is sometimes generated directly by the
original information source. We may then want to learn about the structure or functioning
of the source from the extracted information (the signal). The signal available may not yield
directly the required information. We then apply some operations to the signal in order to
enhance the information needed. This may be the case, for example, when the visual
processing mechanism of the brain is of interest. We may present the eye with a flash and
monitor the activity of the brain by means of electrodes located on the scalp. We shall find
that the required information that is related to the visual activity of the brain is “ buried"
in the signal which is mainly due to other activities of the brain. Special processing procedures
must be applied to the signal, so as to enhance the relevant information. We may want to
transmit the signal from point of acquisition to a remote location for monitoring or processing.
This may be the case, for example, in the intensive care unit when information on patients
is required at the central monitoring station, or when the information concerning a patient
at home is required in the hospital or physician's office. In these cases, processing of
the signal is required in order to match it to the requirements of the transmission channel.
In other cases, the information is to be stored for later use. Effective storage is needed, such
that the signal requires a minimum amount of storage space (computer memory, magnetic tapes)
and can later be reconstructed at will.
The general measurement and diagnostic system is schematically shown in Figure 1.
Usually this system consists of a transducer that is coupled to the information source and
extracts the required information. The original information may be in a form that is not
suitable for processing, storing, or transmitting (pressure, temperature). The transducer
converts the information into (most often) an electrical signal. With current technology, this
type of signal is most convenient for the above tasks. The processing of the signal is often
required for diagnosis as well. In this case, the processing has to classify the signal into
one of many given classes which may be the normal and various abnormal classes. After
classification, corrective measures that change the source may be taken.
Several books are available on the topic of biomedical instrumentation and
measurements.1-5 They deal with the various transducers and the hardware associated with the
acquisition and basic preprocessing. In this book, the various topics regarding processing
are discussed. The basic preprocessing is sometimes called "signal conditioning" while
later processing is sometimes known as "signal manipulation and evaluation". Since no
distinct definitions exist, we prefer the general term "processing".
The topics discussed in this book are depicted in broken lines in Figure 1 and in more
detail in Figure 2.
The first step in the processing is usually that of segmentation. The signal may drastically
change its properties over time. We then observe and process the signal only in a finite
time window. The length of the time window depends on the signal source and the goal of
processing. We may use a single "window" of predetermined length, as we do, for
example, when electrocardiographic monitoring is performed, or we may require some scheme
for automatically dividing the signal into varying length segments, as is often done in
electroencephalography.
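As a minimal illustration of the fixed-window case, the following sketch splits a sampled record into consecutive windows of predetermined length (the sampling rate, window length, and test signal are arbitrary assumptions, not taken from the book); automatic, variable-length segmentation is treated in Chapter 7 of Volume I.

import numpy as np

def fixed_windows(x, window_len):
    # Split a sampled signal into consecutive, non-overlapping windows of
    # predetermined length; leftover samples at the end are discarded.
    n_windows = len(x) // window_len
    return x[:n_windows * window_len].reshape(n_windows, window_len)

# Example: 10 sec of a noisy 1.2 Hz signal sampled at 250 Hz, cut into 1 sec windows.
fs = 250
t = np.arange(0, 10, 1.0 / fs)
x = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
segments = fixed_windows(x, window_len=fs)
print(segments.shape)        # (10, 250): ten windows of 250 samples each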
[Figure 2: block diagram of the general processing system; block labels include feature extraction and signal reconstruction.]
A variety o f methods are available for the enhancement of the relevant information in the
signal. The signal is either corrupted with additive and multiplicative noise or the information
required constitutes only a part of the signal, such that irrelevant portions are considered
noise. We then apply noise-attenuating-and-cancelling techniques or signal enhancement
methods in order to increase the signal-to-noise ratio. To do this, some a priori knowledge
on the signal and the noise is required. The more a priori knowledge that is available, the
better the processing. Enhancement methods that are optimal in some sense are discussed
as well as adaptive methods that automatically adjust themselves to varying conditions.
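As one minimal numerical illustration of such enhancement, the sketch below applies synchronous averaging (the method treated in Chapter 5 of Volume I) to repeated, stimulus-locked responses; the response shape, noise level, and number of repetitions are arbitrary assumptions, used only to show that averaging N aligned records improves the signal-to-noise ratio roughly N-fold in power.

import numpy as np

rng = np.random.default_rng(0)
n_samples, n_repetitions = 200, 100

# Arbitrary "evoked response" template buried in additive white noise.
t = np.linspace(0.0, 1.0, n_samples)
template = np.exp(-((t - 0.3) / 0.05) ** 2)
records = template + rng.normal(0.0, 1.0, size=(n_repetitions, n_samples))

averaged = records.mean(axis=0)        # synchronous average of the aligned records

def snr_db(estimate):
    # Signal-to-noise ratio of an estimate with respect to the known template.
    noise = estimate - template
    return 10.0 * np.log10(np.sum(template ** 2) / np.sum(noise ** 2))

print(f"single record SNR: {snr_db(records[0]):6.1f} dB")
print(f"averaged SNR:      {snr_db(averaged):6.1f} dB")   # roughly 20 dB higher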
Very often the relevant information in the signal possesses some waveshape which is only
generally known. A good example is the electrocardiogram where the general shape of the
PQRST complex is known. It is required to extract the exact shape of the wavelet present
in the signal.
Not all the information conveyed by the signal is of interest. The signal may contain
redundancies. When effective storing and transmission are required, or when the signal is
to be automatically classified, these redundancies have to be eliminated. The signal can be
represented by a set of features that contain the required information. These features are
[Figure 3: classification of signals into deterministic and random types.]
then used for storage, transmission, and classification. Reconstruction of the signal from its
features is often needed. The types of features used and their number dictate, on one hand,
the data reduction rate for efficient storing and transmitting and, on the other hand, the error
of reconstruction.
The various functions depicted in the blocks of Figure 2 are discussed in the forthcoming
chapters. The processing system of Figure 2 is a general system for signal processing
independent of application. Geophysical signals,6 for example, as well as biomedical signals
face the same general steps of processing. In this book, however, the general topics are
discussed with emphasis on biomedical signal processing. Additional topics have been
introduced, which are specifically oriented to biomedical signals.
II. CLASSIFICATION OF SIGNALS
Signals, extracted from biological and physical systems, may possess various properties
and characteristics. It is important to identify the general characteristics of the signal, so
that the appropriate processing tools can be applied. The various types of signals are introduced
in this section.
Signals are classified into two main groups: deterministic and random signals (see Figure
3). Deterministic signals are those that can be described by explicit mathematical
relationships. Random signals cannot be exactly expressed; they can be described only in terms of
probabilities and statistical averages. A philosophical question may be raised as to whether
a random or deterministic signal exists. In reality, we are not able to find a signal that can
be accurately predicted by means of an exact mathematical formulation. Even a sine wave
from a signal generator is not deterministic in that sense, since no one can tell when a power
failure will cause the sine wave to completely disappear or some generator malfunction will
cause its shape to change. On the other hand, it can be argued that random signals, in reality,
do not exist. Any signal is the result of some physical or chemical phenomenon and is governed
by some laws. If these laws were completely known to us, we could have exactly expressed
the signal and predicted its value. Since our interest here is to apply processing methods for
the analysis of the signals, we do not have to enter into these philosophical arguments. It
is the goals and constraints of the problem at hand that will dictate our decision to consider
a given signal as random or deterministic.
When analyzing the electrocardiographic (ECG) signal, for example, we may be interested
in the general characteristics of the QRS complex and thus consider the signal deterministic,
or we may be interested in the changes of the R-R interval, thereby considering it a random
signal.
Deterministic signals are divided into two subgroups: periodic and nonperiodic signals.
Periodic signals are signals for which x(t) = x(t + T), where T is the period. Periodic
signals are convenient since one period is sufficient for complete description. In the frequency
domain, the description is given by means of the Fourier series, where only the fundamental
frequency and its harmonics take part. Nonperiodic signals consist of two classes. "Almost"
periodic signals are those that are not periodic in the mathematical sense but have a discrete
description in the frequency domain. This frequency description differs from the periodic
one in that the various frequencies participating are not harmonics of some fundamental
frequency. A combination of several unrelated periodic signals creates an "almost" periodic
signal.
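A small numerical sketch of this point (the frequencies and record length are arbitrary assumptions): the sum of two sinusoids whose frequencies are not harmonically related has a purely discrete line spectrum, yet there is no finite period T for which x(t) = x(t + T).

import numpy as np

fs = 1000                              # sampling rate in Hz
t = np.arange(0, 8, 1.0 / fs)

# 2 Hz and 2*sqrt(2) Hz share no common fundamental frequency, so their sum
# is "almost" periodic: a discrete spectrum, but no finite period.
x = np.sin(2 * np.pi * 2.0 * t) + np.sin(2 * np.pi * 2.0 * np.sqrt(2) * t)

spectrum = np.abs(np.fft.rfft(x)) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)

# The energy is concentrated in two spectral lines, located (within the
# frequency resolution of the 8 sec record) near 2.0 Hz and 2*sqrt(2) = 2.83 Hz.
top_two = np.sort(freqs[np.argsort(spectrum)[-2:]])
print(top_two)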
A transient signal is a deterministic signal not having the properties discussed previously.
Random signals are much more difficult to deal with. A random signal is a sample function
of a random process. One sample function of a random process differs from another in its
time description. They possess, however, the same statistical properties. The complete
(infinite) set of sample functions produced by the random process is called the ensemble.
The description of the random signal is given by the joint probability density function.
A stationary process is a process whose statistical properties are not a function of
time. For such a process we can calculate, for example, the expectation by averaging the
values, x(t), over the whole ensemble at any time, t. An important class of random signals is
the class of ergodic signals. For these signals, statistical averaging over the ensemble equals
time averaging over the time axis of any one sample function.
We shall see that stationarity and ergodicity are properties which allow the use of practical
processing methods. A process which is nonstationary (and thus nonergodic) is very difficult
to process. Very often we are forced to assume the process is ergodic even though it is
known a priori that the assumption is false.
When processing the electroencephalographic (EEG) signal, for example, we do not have
at our disposal the complete ensemble. We have only one sample function. We are thus
forced to assume ergodicity and estimate the required statistical properties from time (rather
than ensemble) averages. Since the tools for processing nonstationary signals are not very
effective, we often divide a nonstationary signal into segments, each assumed to be
stationary. The length of the segments depends on the properties of the nonstationarities. In
speech signals, segments are chosen with durations of about 10 msec, while in EEG analysis,
segments may be of the order of a few seconds.
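The following small simulation (a synthetic white-noise process with an arbitrarily chosen mean, not a physiological signal) illustrates what the ergodicity assumption buys: the time average along one sample function approaches the ensemble average taken across many sample functions at a fixed instant.

import numpy as np

rng = np.random.default_rng(1)
n_functions, n_samples = 2000, 2000

# Synthetic stationary, ergodic process: white Gaussian noise with mean 0.5.
ensemble = 0.5 + rng.normal(0.0, 1.0, size=(n_functions, n_samples))

# Ensemble average: across all sample functions at one fixed time instant.
ensemble_mean = ensemble[:, 100].mean()

# Time average: along the time axis of a single sample function, which is
# all that is available when only one record (e.g., one EEG trace) exists.
time_mean = ensemble[0, :].mean()

print(f"ensemble average: {ensemble_mean:.3f}")   # both values are close to 0.5
print(f"time average:     {time_mean:.3f}")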
Another basis for signal classification, which has great significance from the processing point
of view, is that of continuous vs. discrete signals. Continuous time signals are signals which,
in general, are defined at any point in time. The tools that are applied to the processing of
these signals are the Fourier and Laplace transforms and other "analog" methods. In terms
of hardware, these signals are treated by analog systems (filters, amplifiers, computers).
Discrete signals are signals that are defined only at given points in time. Usually these
signals are also "sampled" in amplitude. We usually think of discrete signals as the result
of continuous signals that have been time sampled and amplitude quantized, though there
may be signals which are discrete by nature. These signals are processed by means of discrete
signal processing methods such as the Z transform and the Discrete Fourier Transform
(DFT). In terms of hardware, these signals are treated by means of digital systems, including
digital computers. The advances in digital technology in recent years have created a situation
in which most of the signal processing activity is in discrete signals.
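The sketch below illustrates the two operations that turn a continuous-time signal into a discrete one: uniform time sampling and zero-memory amplitude quantization (both discussed in Chapter 4 of Volume I). The sampling rate, word length, and test signal are arbitrary assumptions.

import numpy as np

def uniform_sample(signal_func, duration, fs):
    # Evaluate a continuous-time signal at uniformly spaced sampling instants.
    t = np.arange(0.0, duration, 1.0 / fs)
    return t, signal_func(t)

def quantize(x, n_bits, full_scale=1.0):
    # Zero-memory uniform quantizer with 2**n_bits levels spanning +/- full_scale.
    step = 2.0 * full_scale / (2 ** n_bits)
    return step * np.round(x / step)

# A 5 Hz "continuous" signal sampled at 100 Hz and quantized to 4 bits.
t, x = uniform_sample(lambda u: np.sin(2 * np.pi * 5.0 * u), duration=1.0, fs=100.0)
xq = quantize(x, n_bits=4)

# The quantization error is bounded by half a quantization step (here 0.0625)
# and behaves roughly like uniform noise of variance step**2 / 12.
print(f"max quantization error: {np.max(np.abs(xq - x)):.4f}")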
III. FUNDAMENTALS OF SIGNAL PROCESSING
A very basic tool for signal processing is that of filtering. Its action is best described in the
frequency rather than in the time domain.
The Fourier transform12 is an operator that transfers a signal, x(t), in the time domain
into the frequency domain. In the frequency domain, the signal is represented in terms of
its amplitude and phase as a function of frequency. A more practical transformation is into
the complex frequency, S, plane. The Laplace transform is a transformation that can transfer
a signal x(t) into the complex frequency plane. The spectral properties of the signal can be
shaped by means of filters which are designed to attenuate or completely cut off portions
of the signals’ frequencies.
The same tools exist for discrete signals in which the complex Z domain is defined.
Discrete filters can be designed to shape the spectrum of the discrete signal as required.
The fundamentals of continuous and discrete signal processing as well as filter design
theory are not discussed in this book. It is assumed that the reader is at least familiar with
the basics of signal filtering.
Frequency filtering is a powerful tool for random as well as deterministic signals.
When processing random signals, we apply the Fourier transform to the autocorrelation
function rather than to the sample function itself. We then deal with the power spectral density
function. The same filter design techniques are applied here.
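A minimal sketch of that procedure follows (the 20 Hz test signal, noise level, and maximum lag are arbitrary assumptions): the autocorrelation function is estimated from the data, and its Fourier transform then serves as the power spectral density estimate, essentially the Blackman-Tukey approach of Chapter 8 of Volume I, here without a lag window.

import numpy as np

rng = np.random.default_rng(2)
fs, n = 200.0, 4000

# Random signal: a 20 Hz sinusoid with random phase buried in white noise.
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 20.0 * t + rng.uniform(0.0, 2 * np.pi)) + rng.normal(0.0, 1.0, n)
x = x - x.mean()

# Biased autocorrelation estimate for lags 0 .. max_lag.
max_lag = 200
r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])

# Power spectral density estimate: Fourier transform of the (symmetric)
# autocorrelation sequence over lags -max_lag .. +max_lag.
r_sym = np.concatenate([r[::-1], r[1:]])
psd = np.abs(np.fft.rfft(r_sym))
freqs = np.fft.rfftfreq(r_sym.size, 1.0 / fs)

print(f"spectral peak near {freqs[np.argmax(psd)]:.1f} Hz")   # close to 20 Hz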
IV. BIOMEDICAL SIGNAL ACQUISITION AND PROCESSING
When performing a measurement from a living biological substance, several unique problems
arise that deserve special discussion, in addition to the one presented in Section I of
this chapter.
The living biological system is a very complex system governed by biochemical, physical,
and chemical laws not well understood as of yet. In particular, many aspects of the complex
hierarchical control, the genetic control, the neural information transfer and processing, and
other systems are still under extensive investigation. Very often we use a priori information
concerning the system generating the signal of interest in order to help us in the analysis
and processing procedures. When the underlying mechanism is not well understood, the
processing may become less effective.
The complexity of the biological system often introduces difficulties in the measurement
and processing procedures. Unlike physical systems, the biological system (most often)
cannot be uncoupled in such a way that subsystems can be monitored and investigated
individually. Because of the complex hierarchical control linkages among subsystems and
due to many feedback paths not always well understood, the biological system under
investigation must remain in its natural environment during observation. The signals produced
by the system are thus influenced directly by the activity of the surrounding systems. The
signal is also inherently contaminated by noise produced by the neighboring systems.
When monitoring, for example, the activity of the visual processing mechanism of the
brain by means of the visual evoked potential, we cannot isolate the visual system and
perform the clinical test under controlled conditions. The eye, its position, and its sensitivity
are controlled by voluntary and nonvoluntary actions of the brain. The visual processing of
the brain itself is dependent on many other surrounding activities, most of them beyond our
control. The result is that very often we do not know exactly under what conditions the
signal was taken and how to interpret the analysis of the signal — whether to attribute it to
some significant phenomenon in the underlying mechanism (abnormality in system function)
or to some change in measurement conditions.
The large variations that exist in biomedical signals force us to rely heavily on statistical
methods. These variations exist in signals acquired from the same individual, and of course
among populations. Thus, the accuracies and confidence limits that come out of our biomedical
signal processing are usually not very high, at least in terms used in other engineering
disciplines.
Biomedical signals are usually extracted from living organisms and, in most applications,
from human beings. The measurement system must be designed so as not to damage the
system and not to cause pain whenever possible. Noninvasive techniques are always
preferable. This means that very often we cannot get the information required directly and we
have to infer it from signals that are noninvasively available. Fetal heart monitoring may
serve as an example. Rather than apply electrodes directly to the fetal skin, a procedure
requiring invasive methods, we place the electrodes on the mother's abdomen. The signal
thus acquired is heavily contaminated with the mother’s strong ECG and other muscle
activities. Inferring the fetal ECG from this signal requires considerable processing effort.
In many important biomedical applications, the signals of interest are two dimensional.13
Conventional X-ray analysis, computer tomography (CT), nuclear magnetic resonance (NMR)
imaging, and ultrasonic imaging are examples. In principle, the processing techniques of
one- and two-dimensional signals are similar. In practice, however, there are major
differences. Sophisticated algorithms for two-dimensional signal processing have been developed.
These are not discussed here.
Biomedical signals are mechanical, chemical, or electromagnetic in nature. With current
technology, almost all transducers used provide electrical output so that the signal to be
processed is presented as an electrical signal. One very important group of biomedical signals
is that of signals which are electromagnetic in nature — the bioelectric signals. Chapter 2 is
dedicated to the origin and characteristics of these signals.
V. THE BOOK
The material covered in this book follows approximately the main blocks depicted in
Figure 2. As mentioned previously, Chapter 2 discusses the origin and characteristics of the
bioelectric signal. It is not intended to replace or compete with the excellent books available
on the topic. It was felt, however, that because of the importance of these signals, one
chapter should be devoted and addressed especially to the engineer or computer scientist
who has not been extensively exposed to the biological basis of the signals. Chapters 3 and
4 discuss basic principles of random and digital signal processing. The experienced engineer,
and especially the one whose field is measurements and signal processing, will be familiar
with the material presented in these chapters. Others, whose expertise lies elsewhere, will
find Chapters 3 and 4 a basic reference for two very important topics. The problem of finite
time observation records presents itself in almost every signal processing problem. Either
the time available for measurement is finite or the signal is nonstationary and short segments
of it are used in order to apply stationary processing methods. In both cases, we are faced
with the problem of having a finite time observation record (a windowed record) from which
we have to estimate the characteristics of the process. Chapter 5 presents and discusses the
problems of finite time estimations in continuous and digital signals.
Frequency domain analysis techniques are presented in Chapter 6. This chapter also
includes a discussion on cepstral methods and homomorphic filtering which are probably
less familiar even to the experienced engineer.
Time series analysis originated in the statistical literature and has become an
important analysis approach in modern signal processing. The basic idea here is to consider
the signal as the output of a linear system, whose (inaccessible) input is white in spectrum.
Rather than deal with the signal itself, we want to deal with the system that generates it. It
turns out that the representation of the signal by means of the coefficients of the system’s
differential (or difference) equation, or the system’s poles and zeroes is a very efficient
representation. The features achieved by time series analysis provide effective data reduction
for signal storage and transmission as well as signal recognition. Chapter 7 presents the
theory and practical algorithms for the implementation of time series analysis.
A related topic, with great importance in random signal processing, is that of spectral
estimation. It is presented in Chapter 8, with algorithms that are designed for various types
of signals.
One of the problems associated with biomedical signal processing is the complexity of
the signal and the lack of a priori information that can be used to simplify the processing
methods. Adaptive filtering methods have thus found wide application in biomedical signal
processing. Here, minimal a priori information on the signal and noise involved is required.
The adaptive filter "learns" the required information from the running data and
automatically adjusts its parameters in some optimal fashion. Chapter 9 deals with adaptive
filtering. The discussion is primarily based on the work of Widrow, who has also suggested
many biomedical applications.
In many biomedical applications, the signal contains a wavelet which is of interest. This
may be the case in electrocardiographic analysis where the QRS complex is sought, or in
some cases of EEG analysis where K-complexes or spindles are of interest. Methods for
wavelet detection are discussed in Chapter 1, Volume II.
Chapter 2 (Volume II) deals with point processes analysis. Point processes are processes
which are characterized by the times of appearance of events. The parameter of interest is
the appearance time and not the shape of the event. The major application of this theory is in
the analysis of neurosignals. Other applications, such as R-R interval analysis in ECG or
pitch analysis in speech, are also presented.
Chapters 3 and 4 (Volume II) deal with the important problem of automatic signal
recognition and classification. The better known approach, the decision theoretic approach, is
presented in Chapter 3 (Volume II), while the more novel approach, that of syntactic analysis,
is presented in Chapter 4 (Volume II).
A discussion on some of the many biomedical signals appears in the appendices. The aim
of the appendices is not to give an exhaustive survey of all signals used in biomedicine, but
rather to list a few representative ones with their main characteristics.
Throughout the book, examples and references to various biomedical applications have
been provided. The emphasis, however, is on the presentation of various methods (some of
them novel which have yet to be widely applied to biomedicine). The framework of the
presentation is that depicted in Figure 2 for a general signal processing system.
REFERENCES
1. DeMarre, D. A. and Michaels, D., Bioelectronic Measurements, Prentice-Hall, Englewood Cliffs, N.J., 1983.
2. Geddes, L. A. and Baker, L. E., Principles of Applied Biomedical Instrumentation, John Wiley & Sons, New York, 1968.
3. Strong, P., Biophysical Measurements, Tektronix, Beaverton, Or., 1970.
4. Webster, J. G., Ed., Medical Instrumentation: Application and Design, Houghton Mifflin, Boston, 1978.
5. Cromwell, L., Weibell, F. J., Pfeiffer, E. A., and Usselmann, L. B., Biomedical Instrumentation and Measurements, Prentice-Hall, Englewood Cliffs, N.J., 1973.
6. Robinson, E. A. and Treitel, S., Geophysical Signal Analysis, Prentice-Hall, Englewood Cliffs, N.J., 1980.
7. Beauchamp, K. G. and Yuen, C. K., Digital Methods for Signal Analysis, George Allen and Unwin, Ltd., London, 1979.
8. Gold, B. and Rader, C. M., Digital Processing of Signals, McGraw-Hill, New York, 1969.
9. Oppenheim, A. V., Ed., Application of Digital Signal Processing, Prentice-Hall, Englewood Cliffs, N.J., 1978.
10. Chen, C. T., One-Dimensional Digital Signal Processing, Marcel Dekker, New York, 1979.
11. Tretter, S. A., Introduction to Discrete-Time Signal Processing, John Wiley & Sons, New York, 1976.
12. Bracewell, R. N., The Fourier Transform and Its Applications, McGraw-Hill, New York, 1978.
13. Reichenberger, H. and Pfeiler, M., Objectives and approaches in biomedical signal processing, in Signal Processing II: Theories and Applications, Schussler, H. W., Ed., Elsevier/North Holland, Amsterdam, 1983.
Chapter 2
THE ORIGIN OF THE BIOELECTRIC SIGNAL
I. INTRODUCTION
The most important information processing mechanism in the living biological system is
the neural network.1 The biological system has several means for information transfer.
Probably the most important is the neural information transfer. Neurophysiology, the study
of neural function, has been the key field for understanding internal communication and
control in the biological system. Basic and applied neurophysiological research heavily relies
on our ability to measure chemical and electrochemical activities taking place in the single
cell, or collectively by groups of cells.
Many functions of neural and muscular cells are chemical in nature. These functions,
however, produce changes in the electric field which can be monitored by electrodes. The
so-called bioelectric potentials help the neurophysiologist study cell function. Direct
measurements of the chemical phenomena, e.g., ion concentration changes,2 can be performed
by means of special transducers (ion selective electrodes, for example). However, these
measurements are more difficult to perform.
The source of the bioelectric signal is the single neural or muscular cell. These, however,
do not function alone but in large groups. The accumulated effects of all active cells in the
vicinity produce an electric field which propagates in the volume conductor3 consisting of
the various tissues of the body. The activity of a muscle, or some neural network, can thus
indirectly be measured by means of electrodes placed, say, on the skin. The acquisition of
this type of information is easy; electrodes can be conveniently placed on the skin. The
information, however, is difficult to analyze. It is the result of all neural and muscular
activity in unknown locations transmitted through an inhomogeneous medium. In spite of
these difficulties, electrical signals, monitored on the skin surface, are of enormous clinical
and physiological importance. Electroencephalographic (EEG), electrocardiographic (ECG),
electromyographic (EMG), and other such signals are routinely used for the diagnosis of
neural and muscular systems in the clinic. The interpretation of the information is based
mainly on the large statistical experience collected throughout the years.
This chapter explains the basic bioelectric phenomena on the cell level both in neural
cells and in muscle cells. A brief discussion on the volume conductor problem is then
presented to provide a link to the gross surface electric signals.
A. Introduction
The basic processing unit in the neurophysiological system is the nerve cell — the neuron.
Its task is information processing, transfer, and acquisition. Neurons which are used for
information transfer are usually long, and serve to transmit information to and from the
central processing body. Special nerve cells have evolved that serve as sensors.
A variety of mechanisms and sensor shapes exist to transduce many kinds of stimuli
(pressure, light, temperature, etc.) into electrical and chemical signals. The central nervous
system deals with the task of information processing and control. Though there are many
types of neurons, the basic structure of these cells can be generally discussed. Figure 1
schematically depicts the structure of a nerve cell.
The important parts of the neuron are the cell body (soma), the dendrites, and the axon.
The cell body consists of the intracellular fluid with the various bodies required for the
functioning of a cell. There is a great variance in the size of nerve cells. The diameter can
be as small as a few microns or as large as a few tens of microns. The cell body is surrounded by an
excitable membrane, the thickness of which is in the range of 50 to 150 Å. The cell membrane
is extended in various places to generate root-like structures called dendrites. These
extensions are used for interconnections with other nerve cells.
The axon serves as the output of the nerve unit. It is an extension of the cell with a length
that can be about 50 μm (in the cerebral cortex) or up to several meters (in peripheral nerves
of large mammals). The diameter of the axon can range from less than 0.5 μm to about 1
mm (in the squid giant nerve fibers). Some axons are covered with an interrupted myelin sheath
which increases the velocity of information transfer.
Information into the neuron from other neurons is introduced through a junction called a
synapse. Synapses are located on the dendrites or on the soma. The synapses can cause an
increase or decrease of the voltage across the membrane. The cell function is based on the
integrative (in time and space) effects of these potential changes.
The tips of the axon serve as inputs to other neurons through synapses, or to activate
muscles through special synapses — the neuromuscular junctions. Peripheral nerves are
bundled together into a nerve trunk. The electrical activities of the single nerve cell will be
discussed in the following sections. The signals that are picked up from the nerve trunk are
the result of the electric field generated by the various nerves in the trunk.
B. The Excitable Membrane
The cell membrane can be considered as a dividing medium between the extracellular
and intracellular fluids. These two fluids have different ionic concentrations. The membrane
has different permeability to the various ions in the solutions. As a result of ion transfer,
by means of diffusion and other mechanisms, a voltage is generated across the membrane.
If we consider the effects of only the three main ions, potassium [K+], sodium [Na+],
and chloride [Cl−], we get the membrane potential, E, from the Nernst equation.
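In the Goldman–Hodgkin–Katz form, written with the permeabilities and concentrations defined below, the membrane potential is:

E = \frac{RT}{F}\,\ln\frac{P_{K}[K^{+}]_{o} + P_{Na}[Na^{+}]_{o} + P_{Cl}[Cl^{-}]_{i}}{P_{K}[K^{+}]_{i} + P_{Na}[Na^{+}]_{i} + P_{Cl}[Cl^{-}]_{o}}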
where R, T, and F are the universal gas constant, the absolute temperature, and the Faraday
Volume I: Time and Frequency Domains Analysis 11
constant, respectively. P_X is the permeability of the resting membrane to the ion X, and
[X]_o and [X]_i are the concentrations of the ion X in the extracellular and intracellular fluids.
The calculated resting cross-membrane potential is approximately 80 mV (the inside of
the cell being negative with respect to its outside). This value agrees well with
neurophysiological measurements.
Some membranes have excitability characteristics. When the membrane is excited by
means of an electrical, mechanical, or chemical stimulus, the permeabilities of the membrane
to ionic transfer undergo some changes. These changes cause the resting potential of the
membrane to increase, become positive for a short period of time, and later, when the
membrane repolarizes, to return to its normal resting potential. The time course of the
potential change, the action potential, is depicted in Figure 1.
Nerve and muscle cells have excitable membranes. The shape and time duration of the
action potentials differ in the various cells. Muscle action potentials are usually much longer
in duration.
The excitation of the membrane is caused only if the stimulus exceeds a threshold level
(about 20 mV). Once the threshold has been crossed and an action potential elicited, the
threshold changes. Following the initiation of the action potential there is a certain period (of the
order of 1 to 2 msec) when the threshold level becomes infinite. This period is called the
total refractory period, in which no new action potential can be initiated. The threshold then
returns to its resting value according to some decaying function. The period in which the
threshold decays to its resting level is called the relative refractory period. In this period, a
new action potential can be elicited provided the stimulus is strong enough to cross the
relatively high threshold.
D. The Synapse
The axon of a neuron terminates with junctions to other neurons or to muscles. One axon
can be connected by means of such junctions to many neurons or muscle fibers.
The synapse is the junction between one of the axon endings of one neuron and the
dendrite or soma of another. The presynaptic region is the axon's ending. It does not actually
touch the dendrite (or soma); a spacing of about 200 Å, known as the synaptic cleft, exists.
The region in the dendrite (or soma) on the other side of the cleft is called the postsynaptic
region.
When an action potential arrives at the presynaptic region, it causes the membrane
characteristics to change. This change increases the ability of certain chemical substances
(transmitters) to diffuse from the presynaptic region into the cleft. The transmitters that cross the
cleft are captured by receptors in the postsynaptic region and cause membrane potential
change. The change may be excitatory (excitatory postsynaptic potential, EPSP) or inhibitory
(inhibitory postsynaptic potential, IPSP) depending on the type of transmitter released.
The complete process of transmitter release, cleft crossing, and postsynaptic receiving is
relatively slow and is of the order of 0.5 msec. The transmission of information through
the nervous system, though fast when compared with other biological mechanisms (hormones),
may be considered slow when compared with electronic or optical systems.
III. THE MUSCLE
A. Muscle Structure
The skeletal muscle consists of cells with excitable membranes. The membrane is similar
in principle to the neuron's membrane. Its function, though, is not to transfer or process
information but to generate tension. The muscle is constructed from many separate fibers.
The fibers contain two kinds of protein filaments, actin and myosin. These are arranged in
parallel interlacing layers which can slide one into the other causing shortening of the muscle
length. The sliding of the fibers is caused by chemical reactions that are not yet fully
understood.
The generation of motion or force by the muscle is activated when the fiber membrane
is excited. An action potential then propagates along the surface membrane of the fiber,
triggering chemical reactions that, in turn, cause fiber contraction.
When a muscle contracts, the action potentials generate an electric field that can be
monitored by means of surface (skin) electrodes. This field is a result of the contribution
of many fibers at different times and with different rates. The signal (EMG) monitored this
way will thus be a random signal with statistical properties that depend on the muscle
function.
B. Muscle Contraction
The neuron that activates the muscle is called the motor nerve. The axon endings of the motor
nerve are similar to synapses, but rather than activate another neuron, they are connected to
muscle fibers. The motor neuron-muscle connection is called the neuromuscular junction or end
plate.
The chemical substance that serves as a transmitter in the end plate is acetylcholine (ACh).5
It is released from the axon endings when an action potential has arrived, diffuses toward
the muscle membrane, and is absorbed there at the receptor sites, causing a muscle membrane
potential change. When the change is sufficiently high and the threshold level is crossed, an
action potential is generated and propagates along the muscle membrane.
The process of transmitter release, diffusion, and reception at the muscle lasts about 0.5
to 1.0 msec. Additional delay in contraction is due to the dynamic properties of the muscle
itself.
IV. VOLUME CONDUCTORS
The sources of the bioelectric signals are the action potentials generated by single neurons
and muscle fibers. The current densities generated by the membrane activity cause current
changes in the surrounding medium. The surrounding tissues, in which the induced current
changes occur, are called the volume conductor.
REFERENCES
Chapter 3
RANDOM PROCESSES
I. INTRODUCTION
Randomness appears in biomedical signals in two major ways: the source itself may be
stochastic (as are indeed all information conveying signals) or the measurement system
introduces external, additive or multiplicative, noise to the signal. Whether a signal is
considered stochastic or deterministic is a matter of definition. An ECG signal can be
considered deterministic, and even "almost" periodic, when some characteristics of the QRS
are of interest, or it can be considered stochastic, when R-R interval variations are of
interest.
Probability theory plays an important underlying role in the analysis of random signals.
Therefore, we provide a brief review of probability theory in the opening of this chapter.
The concepts of probability theory are then extended to the characterization and analysis of
random signals. The emphasis in this chapter is on definitions and basic presentation of
material directly required for the understanding of the topics discussed in later chapters. For
a more detailed and rigorous presentation of the material, the reader is referred to the many
textbooks available.1-3
Special attention is given, in this chapter, to the topic of correlation analysis, since it has
importance as a detection method often used in biomedical signal processing. The
multidimensional gaussian process is introduced at the end of the chapter. In several analysis
methods discussed in the course of this book, the assumption is made that the signal is
gaussian, and reference is made to its distribution and other characteristics.
II. ELEMENTS OF PROBABILITY THEORY
A. Introduction
Consider an experiment, the outcome of which can be one of several events. The outcome
of the experiment depends upon the combination of many factors which are unpredictable.
The events are called discrete random events. We cannot predict the exact result of such
an experiment; we can, however, comment about the average outcome of a large number
of experiments. A throw of a die serves as a popular example of such "experiments", where
the events are the numbers on the face of the thrown die.
Assume we have performed the experiment N times. Out of the N resulting events, the
event A_i has occurred n_i times. We define the relative frequency, f_i, as:

f_i = n_i / N

The probability of event A_i, P(A_i), is then given as the limit of the relative frequency:

P(A_i) = lim_{N→∞} f_i = lim_{N→∞} n_i / N     (3.2A)

with

0 ≤ P(A_i) ≤ 1     (3.2B)
Note that we have assumed that the limit in Equation 3.2A does exist.
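A small numerical illustration of probability as the limiting relative frequency, using a simulated fair die (the event, sample sizes, and seed are arbitrary illustrative choices), can be written in Python as:

import numpy as np

rng = np.random.default_rng(0)

# Relative frequency of the event "die shows 6" for increasing N;
# it should approach P = 1/6 as N grows.
for N in (100, 10_000, 1_000_000):
    throws = rng.integers(1, 7, size=N)     # fair die: values 1..6
    f = np.count_nonzero(throws == 6) / N   # relative frequency f_i = n_i / N
    print(N, f)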
Two events are called mutually exclusive events if the occurrence of one makes the
appearance of the second impossible. If A_i and A_j are mutually exclusive, then the probability
that A_i or A_j will occur is P(A_i or A_j), with

P(A_i or A_j) = P(A_i) + P(A_j)

and more generally, if the events A_i, i = 1, 2, ..., M, are mutually exclusive, then

P(A_1 or A_2 or ... or A_M) = Σ_{i=1}^{M} P(A_i)
B. Joint Probabilities
When an experiment has many (rather than single) outcomes, we speak about joint
probabilities. Consider, for example, the result of a blood test. The test outcome consists of
several parameters. We can talk about the probability that the outcome of the blood test will
be some given values for all the parameters; the probability of this happening is the joint
probability. We denote the joint probability of the events A, B, C, ..., J by
P(A, B, C, ..., J), with the meaning: the probability that A and B and C and ... and J will
occur.
Often the probability of one event is influenced by another event. We may want to consider
the probability of one event occurring, given that the other one has already occurred. This
is known as conditional probability. The probability of event A occurring, given event B
has occurred, is written as P(A|B).
As an example, consider the following experiment: two cards are successively drawn from
a deck (without returning the drawn card to the deck) and the probability of the first being
an ace and the second a king is sought. The problem can be posed as follows: what is the
probability of drawing a king given an ace was previously drawn?
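A short Python computation of this card example, using the conditional probability relations of Equation 3.6 (the fractions are exact, not estimated):

from fractions import Fraction

# Exact conditional probability for the card example:
# P(second card is a king | first card was an ace).
# After removing one ace, 4 of the remaining 51 cards are kings.
p_king_given_ace = Fraction(4, 51)

# Joint probability of "ace then king" via P(AB) = P(B|A) * P(A).
p_ace_first = Fraction(4, 52)
p_ace_then_king = p_king_given_ace * p_ace_first

print(p_king_given_ace, float(p_king_given_ace))
print(p_ace_then_king, float(p_ace_then_king))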
Consider now the relationship between the joint and conditional probabilities. Assume an
experiment, the result of which is given by two simultaneous events, performed N times. Let
n_A denote the number of times event A appeared in the outcome and n_AB the number of
times the events A and B appeared together. The probability of the joint event AB is

P(AB) = lim_{N→∞} n_AB / N     (3.6A)

Assuming that the number of experiments is sufficiently large such that n_A is also very large,
we can rewrite Equation 3.6A as:

P(AB) = lim_{N→∞} (n_AB/n_A)(n_A/N) = P(B|A) P(A)     (3.6B)

so that

P(B|A) = P(AB)/P(A)     (for P(A) ≠ 0)     (3.6C)
and

P(A|B) = P(AB)/P(B)     (for P(B) ≠ 0)     (3.6D)

also:

P(AB) = P(A|B) P(B) = P(B|A) P(A)     (3.6E)

and

P(B|A) = P(A|B) P(B) / P(A)     (3.6F)
If

P(B|A) = P(B)     (3.7)

then the information about the occurrence of event A has added nothing to the knowledge
about event B. The two events are said to be statistically independent. Introducing Equation
3.7 into Equation 3.6C, we get:

P(AB) = P(A) P(B)

namely, the joint probability of statistically independent events equals the product of
the individual probabilities. In general, for the case of n statistically independent events
A_i, i = 1, 2, ..., n, we have:

P(A_1 A_2 ... A_n) = P(A_1) P(A_2) ... P(A_n)
D. Random Variables
The outcome of an experiment can be a number or some other description of the event.
Let us assign a real number, or a set of numbers to each possible outcome of the experiment.
We may have some n discrete values for the description of the outcomes, in which case the
experiment will be described by a discrete set of numbers (vectors); we denote the set as a
discrete random variable. In other cases, continuous values are required to describe the
outcomes; this set is termed continuous random variable.
Consider the discrete random variable x. The values x_1, x_2, ..., x_n are n discrete values
constituting the random variable x. The probability of the event to which the number x_i was
assigned is P_x(x = x_i), which denotes the probability that the random variable x will have
the value x_i. If, for example, the n outcomes of the experiment are mutually exclusive,
then:

Σ_{i=1}^{n} P_x(x = x_i) = 1     (3.9)
since Equation 3.9 describes the probability of the certain event. When the outcome of the
experiment consists of two events, x and y, we define as before the joint probability P_xy(x
= x_i, y = y_j), the probability that the random variable x will get the value x_i and the
random variable y the value y_j. The certain event is the summation over all i and j,
hence:

Σ_i Σ_j P_xy(x = x_i, y = y_j) = 1

The conditional probabilities and the Bayes' rule are similar to Equation 3.6:

P(x_i | y_j) = P_xy(x = x_i, y = y_j) / P_y(y = y_j)     (3.10C)

P(y_j | x_i) = P_xy(x = x_i, y = y_j) / P_x(x = x_i)     (3.10D)

P(x_i | y_j) = P(y_j | x_i) P_x(x = x_i) / P_y(y = y_j)     (3.10E)

The probability distribution function of the random variable x is defined as

P(x ≤ X)     (3.11)

the probability that x takes a value smaller than or equal to X. Clearly,

P(x ≤ ∞) = 1     (3.12)
Also, the probability that the random variable x will get a value in the range X_2 < x ≤
X_1 is nonnegative and is given by:

P(X_2 < x ≤ X_1) = P(x ≤ X_1) − P(x ≤ X_2) ≥ 0

We conclude that for every X_1 > X_2 we get P(x ≤ X_1) ≥ P(x ≤ X_2); hence the probability
distribution function is a nonnegative, nondecreasing function bounded by zero and 1 (see
Figure 1).
Consider the case where two random variables, x and y, are defined over the range
(−∞ ≤ x ≤ ∞, −∞ ≤ y ≤ ∞). Define the joint probability distribution function, P(x ≤
X, y ≤ Y). The joint probability distribution function of x and y is the probability that the
random variable x will get a value x ≤ X and the variable y will get a value y ≤ Y. The
following relations are obvious:

P(x ≤ ∞, y ≤ ∞) = 1

P(x ≤ −∞, y ≤ Y) = P(x ≤ X, y ≤ −∞) = 0

P(x ≤ ∞, y ≤ Y) = P(y ≤ Y);   P(x ≤ X, y ≤ ∞) = P(x ≤ X)
FIGURE 1. The probability density function p(x); the area under p(x) between X_2 and X_1 equals the probability that x falls in that range.
The probability density function, p(X), is defined as the derivative of the probability
distribution function:

p(X) = lim_{ΔX→0} [P(x ≤ X) − P(x ≤ X − ΔX)] / ΔX     (3.15)

or

p(X) = d/dX P(x ≤ X)     (3.16A)
In Equation 3.16A, we have assumed that the derivative exists, or can be expressed in
terms of delta functions.2 The following relations are obvious:

p(X) ≥ 0

∫_{−∞}^{∞} p(x) dx = 1

P(x ≤ X) = ∫_{−∞}^{X} p(x) dx
Note also that for a continuous random variable, the probability of getting one particular
value, say X_0, is exactly zero:

P(x = X_0) = 0
When the experiment consists of more than one random variable, we shall use the joint
probability density function, defined from the joint distribution function in a similar manner to
Equation 3.16:

p(X, Y) = ∂²/(∂X ∂Y) P(x ≤ X, y ≤ Y)     (3.18A)

In Equation 3.18A, we have assumed that the partial derivatives exist, or can be expressed
in terms of delta functions.2
We shall be interested in situations where a random variable is to be investigated given
some condition on the other random variable. Consider the probability of the random variable
y being less than or equal to some Y, given that x is in the range X − ΔX < x ≤ X:

P(y ≤ Y | X − ΔX < x ≤ X)     (3.19)

We define Equation 3.19 as the conditional probability distribution function. If this function
has derivatives, we define the conditional probability density function, p(Y|X), by:

p(Y|X) = ∂/∂Y P(y ≤ Y | X − ΔX < x ≤ X)     (3.20A)
The following relationships are easily shown from the last definitions and previous relations:

p(Y|X) ≥ 0     (3.21A)

∫ p(y|X) dy = 1     (3.21C)

p(Y|X) = p(X, Y)/p(X)     (3.21D)
III. RANDOM SIGNALS CHARACTERIZATION
A. Random Processes
Up until now we have considered random variables that were "the outcome of an
experiment". We now consider a time function, x(t). The value of the function at any time t_1,
x(t_1), is a random variable. The variable t is chosen since most of the signals that will be
considered here are time-dependent signals. In general, of course, x may be a function of
distance or any other variable.
We assume that we have a source generating the random function x(t), which is denoted
as a sample function. The source generates many sample functions which together are known
as the ensemble. At any time, t_1, we can observe the values of all sample functions, to get
many "outcomes of the experiment". Figure 2 depicts n sample functions out of the ensemble
of the random process x.
For exam ple, consider the EEG signal (see Appendix A) taken by means of surface
electrodes located at a certain location on the scalp. We want to investigate the properties
of the EEG, recorded at the particular location, of a given population segment, say, healthy
children in a certain age group. We hypothesize that the EEG recorded is a sample function
of a common random process. After recording n sample functions, we have, for each time
t_i, n values of the random variable x(t_i). We can use these values to estimate the probability
distribution function of x(t_i).
Assume we have an ensemble of n sample functions of x(t). For large n we can use the
ensemble for estimating the probability, P(x(t) ≤ X). We can also estimate the joint
probability, P(x(t_1) ≤ X_1, x(t_2) ≤ X_2, ..., x(t_N) ≤ X_N); we shall denote this as P(x_1, x_2, ..., x_N) and
the joint probability density function of the random process x(t) by p(x_1, x_2, ..., x_N). It is easy
to show that the probability density function has the following properties:

p(x_1, x_2, ..., x_N) ≥ 0     (3.22A)
Suppose we have two random processes x(t) and y(t). These are said to be statistically
independent random processes if:

p(x_1, x_2, ..., x_N, y_1, y_2, ..., y_M) = p(x_1, x_2, ..., x_N) p(y_1, y_2, ..., y_M)
The joint probability density function of x(t) was estimated at the time t = t_1 (refer to Figure
2). Let us also estimate the joint probability density function at another time, t = t_1 + τ.
A process in which the joint probability density functions are identical at all times, and
for all N, is called a stationary process.
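A minimal Python sketch of estimating ensemble statistics at fixed times is given below; the smoothed white noise used as the ensemble and the two observation times are arbitrary illustrative choices:

import numpy as np

rng = np.random.default_rng(1)

# Ensemble of n sample functions of a (stationary) random process:
# here simply white noise smoothed by a short moving average.
n, length = 200, 1000
ensemble = rng.standard_normal((n, length))
kernel = np.ones(5) / 5.0
ensemble = np.apply_along_axis(lambda s: np.convolve(s, kernel, mode="same"),
                               1, ensemble)

# Estimate the first-order statistics of x(t) across the ensemble
# at two different times; for a stationary process they should agree.
for t_idx in (100, 700):
    values = ensemble[:, t_idx]
    print(t_idx, values.mean(), values.var())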
The expectation is also known as the statistical average or mean. If x and y are discrete,
such that x gets the values x_i, i = 1, 2, ..., I, and y gets the values y_j, j = 1, 2, ..., J, then the
expectation is

E{z} = E{f(x,y)} = Σ_{i=1}^{I} Σ_{j=1}^{J} f(x_i, y_j) P(x_i, y_j)     (3.25)
The first moment of x, m = E{x}, is called the mean. In a stationary random process it is
the "dc" component. The nth central moment, μ_n, is defined by

μ_n = E{(x − m)^n}

The second central moment has a special importance; it is called the variance and is denoted
by σ_x²:

σ_x² = μ_2 = E{(x − m)²}

The square root of the variance is called the standard deviation. The (n + m)th order joint
moments, E{x^n y^m}, and joint central moments, μ_nm, are similarly defined.
IV. CORRELATION ANALYSIS
y_p = ax + b     (3.32)

The random points (x, y) will not fall exactly on the line (points (x, y_p)) but will be scattered
in its vicinity. The mean square error, e, between the random points and the line is given
by:

e = E{(y − y_p)²} = E{(y − ax − b)²}     (3.33)

The best line that can be drawn through the scattered points, in such a way that e is minimized,
is known as the regression line. The parameters a and b of the regression line are given by
minimizing Equation 3.33:

a = E{(x − m_x)(y − m_y)} / σ_x²;   b = m_y − a m_x
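A short Python sketch of fitting the regression line by minimizing the mean square error is given below; the underlying line (a = 2, b = 1), the noise level, and the number of points are arbitrary:

import numpy as np

rng = np.random.default_rng(2)

# Scattered points around a line y = a*x + b (a = 2, b = 1 chosen for the example).
x = rng.uniform(-1, 1, 500)
y = 2.0 * x + 1.0 + 0.3 * rng.standard_normal(500)

# Regression (least mean square error) line: minimize E{(y - a*x - b)^2}.
a_hat = np.cov(x, y, bias=True)[0, 1] / x.var()   # a = cov(x, y)/var(x)
b_hat = y.mean() - a_hat * x.mean()               # b = E{y} - a*E{x}

print(a_hat, b_hat)   # should be close to 2 and 1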
A gaussian random process is a random process in which, for every set of time instants t_i,
the random variables x(t_i), i = 1, 2, ..., N, are jointly gaussian distributed.
Since the gaussian process is completely described by the first and second moments, it
follows that a wide sense stationary gaussian process is also stationary in the strict sense.
It can also be shown that a linear transformation of a gaussian random process yields a
gaussian random process. This property of the gaussian process is important in signal
processing since any linear amplification and filtering will not alter the gaussian nature of
the process.
REFERENCES
1. Papoulis, A., Probability, Random Variables and Stochastic Processes, McGraw-Hill, New York, 1965.
2. Davenport, W. B. and Root, W. L., An Introduction to the Theory of Random Signals and Noise, McGraw-Hill, New York, 1958.
3. Bendat, J. S. and Piersol, A. G., Random Data: Analysis and Measurement Procedures, Wiley-Interscience, New York, 1971.
Chapter 4
I. INTRODUCTION
Most of the signals of interest in biomedicine are continuous (analog); namely, they are
defined over a continuous range of the variable (usually time). It is important, however, to
analyze discrete signals, namely, signals that are defined only at discrete instants. Modern
digital technology, both in terms of hardware and software, makes discrete time processing
advantageous over analog processing. The advantages are such that usually it is worthwhile
to convert the analog signal into a discrete one so that discrete processing can be applied.
The conversion is done by analog to digital (A/D) conversion systems that sample and
quantize the signal at discrete times. Usually the sampling is performed uniformly; however,
sometimes nonuniform sampling is used.
Some of the main tasks of digital signal processing are to apply filtering, to estimate
various signal parameters, and to perform transformations of the signal (e.g., the Fourier
transform). When the results of the processing are not required immediately following the signal,
off-line processing methods are used. When results are required during and immediately
after the signal sample has been acquired, real-time or on-line processing methods are used.
Depending on the application, the processing time and the size of memory required are
of importance in digital signal processing. Off-line processing can be performed on general
purpose computers. Real-time processing usually requires special dedicated machines or
systems such as fast correlators, dedicated Fourier transform machines, array processors, or
hardware multiply-accumulate circuits.
This chapter briefly discusses the problem of sampling and quantization. The z transform
is introduced and applied to digital signal processing.
The material presented in this chapter is used as a background for later chapters. For a
comprehensive discussion of this material and other important topics such as the fast Fourier
transform (FFT) and digital filtering, the reader is referred to the literature on digital signal
processing.1
II. SAMPLING
A. Introduction
Consider a band-limited continuous signal, x(t), which is bounded in amplitude by |x(t)| ≤
A. A band-limited signal has a Fourier transform, X(w), with X(w) = 0 for all |w| ≥
w_max. It is desired to process the signal by digital processing means, in general, by a digital
computer. The digital machine requires the conversion of the continuous signal, x(t), into
a series of discrete numbers {x(k)} (refer to Figure 1). The first stage of the conversion is
sampling, or time discretization. Assume that we apply uniform sampling with interval T_s,
namely, with sampling frequency f_s = 1/T_s. We then generate out of the signal a sequence
of sampled data {x(nT_s)}, n = 0, 1, ... Note that each sample, x(iT_s), is continuous in
amplitude in the range −A ≤ x(iT_s) ≤ A. A discretization of amplitude is now required to
get the necessary sequence of discrete numbers. A quantizer device is used for this purpose.
The combined process of sampling and quantization is known as analog-to-digital conversion
(A/D). We consider the quantization as a transformation of the amplitude continuous quantity,
x(iT_s), into a discrete number, x_q(iT_s). A detailed discussion on the quantization process
is given in the next section.
In the general processing system, the signal is required in an analog continuous form after
processing. Consider (see Figure 1) the processed sequence of numbers, y_q(n). To get an
analog signal, we perform the digital-to-analog conversion (D/A). The D/A most often
consists of a zero order hold (ZOH) circuit, followed by a low pass filter (LPF).
B. Uniform Sampling
The ideal sampler, described in Figure 1, consists of a switch that is closed for an infinitely
short duration every T_s seconds. It can be considered as an ideal switch driven by a train
of impulse functions, δ_T(t), where:

δ_T(t) = Σ_{n=−∞}^{∞} δ(t − nT_s)     (4.1)

and where δ(t) is the delta (or Dirac) function. The sampled signal can be considered as the
multiplication of the continuous signal, x(t), with the train δ_T(t). The time discrete sampled
signal is written, thus, as a time function, denoted by x*(t):

x*(t) = x(t) δ_T(t) = Σ_{n=−∞}^{∞} x(nT_s) δ(t − nT_s)     (4.2)
It can be shown that if x(t), the band-limited signal, has a Fourier transform given by X(w),
then the Fourier transform of the sampled signal, X*(w), is given by:

X*(w) = (1/T_s) Σ_{n=−∞}^{∞} X(w − n w_s)     (4.3)

where w_s = 2πf_s = 2π/T_s. Figure 2 shows an example of the transform for two cases. In
Figure 2b, the sampling frequency obeys w_s > 2w_max, where w_max is the largest frequency
of the signal x(t). We note that the sampled signal in the frequency domain consists of
nonoverlapping functions. Consider the effect of a low pass filter that will pass all frequencies
in the range −w_max ≤ w ≤ w_max undistorted, while zeroing all frequencies outside this
range. The Fourier transform of the signal at the output of the filter equals that of x(t). Since
the Fourier transform is unique, we can restore the original signal from its samples by such
a low pass filtering operation, provided the sampling frequency obeys:

w_s ≥ 2w_max     (4.4)
This is known as the sampling theorem. Condition 4.4 is known as the Nyquist rate. Figure
2c shows the Fourier transform of the sampled signal when the sampling frequency is less
than the Nyquist rate. In this case, the functions in the frequency domain overlap and low
pass filtering cannot restore the signal without distortions. The phenomenon of overlapping
is called aliasing in frequency. Note that when sampling a continuous signal that is not band
limited, aliasing always occurs no matter how large w_s is.
In practical cases, when a signal has a large w_max, it is often preprocessed by an analog low
FIGURE 2. Sampled band-limited signal in the frequency domain. (a) Spectrum of the
band-limited signal; (b) spectrum of the sampled signal, w_s > 2w_max; (c) spectrum of the
sampled signal, w_s < 2w_max. Note the aliasing.
pass filter in such a way that the high frequencies are eliminated so as not to cause aliasing
problems. In theory, the signal can be sampled at the lowest Nyquist rate, w_s = 2w_max.
The reconstruction of a signal so sampled requires an ideal rectangular low pass filter,
which is impossible to implement. The need to use realizable filters for the reconstruction
of the signal makes it necessary to sample at frequencies higher than the Nyquist rate.
Sampling frequencies of 2.5 to 10 times w_max are often used.
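A small Python illustration of aliasing: a 60 Hz sinusoid sampled at 100 Hz (below its Nyquist rate of 120 Hz) appears at 40 Hz. The frequencies and record length are arbitrary choices:

import numpy as np

# A 60 Hz sinusoid sampled at 100 Hz (below the Nyquist rate of 120 Hz)
# appears at the alias frequency |60 - 100| = 40 Hz.
f_signal, fs = 60.0, 100.0
n = np.arange(512)
x = np.sin(2 * np.pi * f_signal * n / fs)

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(n.size, d=1.0 / fs)
print("apparent frequency:", freqs[np.argmax(spectrum)], "Hz")  # ~40 Hz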
C. Nonuniform Sampling
A uniform sampling rate is convenient since the information is contained in the value of
the sample only. No time information is required since it is known a priori that samples
are equally spaced by T_s seconds. Sometimes, however, the signal consists of intermittent
occurrences of fast changing and relatively quiescent intervals. One would then tend
to sample at a high rate during fast changing periods, while reducing the sampling rate during
the quiescent intervals. This calls for an adaptive, nonuniform sampling. The ECG signal
is an example where such a sampling scheme may be effective.
Two main reasons exist for using nonuniform, adaptive sampling. The first is when
effective storage is required. The problem is to store the signal using minimum storage size,
retaining the ability to reconstruct the signal within a given error. The second is when
effective transmission is required. The problem here is to reduce the transmission rate (bits
per second), retaining the ability to reconstruct the signal at the receiver side within a given
error.
Several data compression techniques to reduce transmission rate and storage requirements
have been developed for communications applications. The differential pulse code modulation
(DPCM) is one of the most popular schemes. An error signal is generated and nonuniformly
quantized. The error is the difference between the original signal and a signal estimated
from the output of the quantizer. Thus, only the error is quantized and transmitted, reducing
the amount of information. The output of the quantizer is uniformly sampled. An
improvement to the accuracy of the above scheme is the introduction of an adaptive quantizer that
automatically adapts the step size of the quantizer, q, according to the signal. Adaptive delta
modulation (ADM) is such a scheme, often used in synchronous communication systems.
A significant data compression can be achieved by using nonuniform sampling.7,8
Consider a scheme in which information is sent only when the source signal crosses
a threshold level. This will cause periods in the signal where fast changes exist to be
sampled at a higher rate than periods with slow variations. Note, however, that the
transmission is now asynchronous, since the receiver does not know a priori the exact location
of the sample on the time axis. In storage applications, information must be added to indicate
the time of the sample.
Since the signal is assumed to be band limited by w_max, there is no use in sampling it at a
rate higher than w_s = k·w_max (where k is a constant in the (empirical) range 2 ≤ k ≤ 10).
When Equation 4.5 yields a sampling interval T_i < 2π/(k·w_max), we replace it by T_s = 2π/
(k·w_max). Thus the maximum instantaneous sampling frequency of the adaptive scheme is
bounded by k·w_max. An example of the voltage-triggered nonuniform sampling of the ECG
is given in Figure 3.
The first order method is known also as the two points projection method. Here, the first
samples are used to estimate the slope of the signal. As long as subsequent samples fall
within some specified error of this slope, they are ignored. The first sample that falls outside
the error tolerance is stored (or transmitted) and used to estimate the next slope. Denote the
derivative of the signal at time t_i by ẋ(t_i). Assume that at t_i the sample x(t_i) has been
stored. The next sample to be stored is the sample at time t_i + τ_j, x(t_i + τ_j), for which the
absolute value of the slopes' difference first crosses the threshold R_1:

|ẋ(t_i + τ_j) − ẋ(t_i)| ≥ R_1

Note that here we compare the slope at time (t_i + τ_j) with the slope at the last
point to be stored. When R_1 is crossed, we store the sample x(t_i + τ_j) and use the new
slope ẋ(t_i + τ_j) as a new reference.
The slope of the signal has to be estimated. Consider the uniform sampling of the signal
at a maximum rate of w_s = k·w_max, yielding the samples {x(nT_s)}, n = 0, 1, ... The slope can
be estimated by:

ẋ(nT_s) ≈ [x(nT_s) − x((n − 1)T_s)] / T_s     (4.7A)
FIGURE 3. Nonuniform sampling of ECG. Synthesized ECG, sampling instances,
and reconstructed signal for the zero, first, and second order adaptive sampling methods.
If the signal contains additive noise, the estimate 4.7A can be modified by smoothing the
data over a window of samples (Equation 4.7B), where (2M − 1) is the number of samples
used to smooth the data. The slope is then estimated every (2M − 1)T_s seconds.
The application of the two points projection method to the ECG is demonstrated in Figure
3.
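A minimal Python sketch of the two points projection idea described above is given below; the synthetic pulse used in place of the ECG, the sampling interval, and the threshold value are illustrative assumptions, and the slope is estimated by the simple first difference of Equation 4.7A:

import numpy as np

def two_point_projection(x, ts, r1):
    """First-order (two points projection) nonuniform sampling sketch.

    Keeps a sample whenever the slope estimated at the candidate point
    differs from the slope at the last stored point by at least r1.
    """
    slope = np.diff(x) / ts                 # simple first-difference slope estimate
    kept = [0]                              # always keep the first sample
    ref_slope = slope[0]
    for i in range(1, slope.size):
        if abs(slope[i] - ref_slope) >= r1:
            kept.append(i)
            ref_slope = slope[i]            # new reference slope
    return np.array(kept)

# Synthetic "ECG-like" test signal: mostly flat with a sharp pulse.
ts = 1e-3
t = np.arange(0, 1.0, ts)
x = np.exp(-((t - 0.5) ** 2) / (2 * 0.01 ** 2))

kept = two_point_projection(x, ts, r1=5.0)
print("kept", kept.size, "of", x.size, "samples")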
The second order nonuniform sampling method is known as the second differences method.10
It examines the slope just before the current sample and just after it. If the absolute value
of the difference of these two adjacent slopes is larger than a given threshold, R_2, the sample
is stored. Hence, here we are considering the local change of slopes. The method is formulated
as follows: the sample x(t_i) is stored if:

|ẋ(t_i⁺) − ẋ(t_i⁻)| ≥ R_2     (4.8)
In practice, we have to estimate the time derivatives. This can be done, again, by uniformly
sampling the signal at a maximum rate. The examination of the sample x(nT_s) at time t_n
= nT_s by Equation 4.8 can be done by using the slope estimation of Equation 4.7B. If we
choose again a window of (2M − 1) samples for smoothing the data, we get (assuming M
is odd):
Equations 4.9A and 4.9B give the smoothed estimates of the two adjacent slopes (just before
and just after the sample x(nT_s)), each computed from averages taken over the (2M − 1)-sample
smoothing window.
The application of the second differences method to the ECG is shown in Figure 3. Blanchard
and Barr9 showed that the voltage-triggered and second differences methods had serious defects
in sampling their synthetic ECG. They concluded that the two points projection method best
suited the signal, yielding an average compression ratio of about 1:14.
Each interval τ between successive threshold crossings is expressed as an integer number of
time quanta, [τ/t_q + 0.5], with [·] being the largest integer smaller than or equal to the
argument (Equation 4.10). We shall encode each sample as follows: denote each time quantum,
t_q, in τ by a pair of zeroes
(00), an upward crossing by (01), and a downward crossing by (10). A digital word is now
generated by placing the pairs of zeroes (minus one pair) followed by the crossing code.
For example, a downward crossing that occurred 3t_q sec after the previous crossing will be
denoted by 00 00 10. We now code this word by letting the first digits be the number of
pairs of zeroes, followed by a single bit — one for a downward and zero for an upward crossing.
For the example above, we shall get the coded number 101. This coding is the run length coding
of the initial word.7 The length of the encoded word in the example is n = 3 bits. This
allows two digits for the registration of the interval. If no crossing occurs in a period
of n·t_q, a word of n pairs of "00" is generated (for n = 3 this word is coded into 110, denoting
3 groups of "00"). Long quiescent periods with no level crossings will be coded into a
succession of the words 110.
Consider the compression ratio provided by the method. Assume that the probability of a
downward crossing equals the probability of an upward crossing. We then get Equation 4.12,
where m = 2^(n−1) − 1 is the number of time quanta that can be coded with a block length of
n bits. The rate of information is given by the required bits per second. If the rate of the
coded information is denoted by r_c, then:

r_c = n / (m · t_q)     (4.13)
Consider now the case where the signal is sampled uniformly with sampling frequency
f_s = 1/T_s, and where each sample is described by a word length of N bits. The rate of
information, r, is

r = N · f_s     (4.14)
The last two terms on the right side of Equation 4.20A are due to the saturation parts of
the quantizer. In Equation 4.20A it was assumed that x_0 = −∞ and x_N = ∞. In general,
we shall adjust the input signal such that p(x) = 0 for x > x_{N−1} and x < x_1; hence, the
last two terms of Equation 4.20A are zero and:

σ_nq² = (1/12) Σ_i p(x_qi) q_i³     (4.20B)
The most common quantizer is the uniform quantizer, for which q_i = q, i = 2, ..., N −
1. For these quantizers we can further simplify the expression for the noise variance:

σ_nq² = (q²/12) Σ_i p(x_qi) q = q²/12     (4.21)

since:

Σ_i p(x_qi) q ≈ ∫ p(x) dx = 1
The approximate value for the quantization noise variance given by Equation 4.21 is the
one most often used. Uniform quantizers, with a small quantization step q and with saturation
levels that can be ignored, have uniformly distributed noise12 in the range −q/2 ≤ n_q ≤
q/2, with variance σ² = q²/12.
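A quick numerical check of Equation 4.21 in Python, with an arbitrary gaussian input and a step size chosen so that saturation can be ignored:

import numpy as np

rng = np.random.default_rng(3)

# Uniform quantizer with step q applied to a signal that stays well
# inside the quantizer range, so saturation can be ignored.
q = 0.05
x = rng.standard_normal(200_000)
x_q = q * np.round(x / q)          # uniform (rounding) quantizer

noise = x_q - x
print(noise.var(), q ** 2 / 12)    # empirical variance vs. q^2/12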
To justify the assumption that saturation effects can be neglected, we make sure the
extreme quantization levels, x_1 and x_{N−1}, are some multiple of the input signal standard deviation.
Define the loading factor, L_q (for symmetric uniform quantizers):

L_q = x_{N−1} / σ_x     (4.22)

where σ_x is the standard deviation (rms) of the input signal. A common choice for the
loading factor is L_q = 4 (four sigma loading). For such a quantizer, we have N − 2 levels
for −4σ_x ≤ x ≤ 4σ_x; hence:

q = 8σ_x / (N − 2)     (4.23)
The SNR of the symmetric uniform quantizer is given by introducing Equations 4.21 and 4.23
into Equation 4.24:

SNR = σ_x² / σ_nq² = 3(N − 2)²/16 ≈ (3/16)·2^(2n)     (4.25)

In Equation 4.25, we have used the relation N = 2^n and the assumption N ≫ 2. In practical
cases, we may use a 10-bit word, which results in a high SNR of more than 50 dB.
D. Rough Quantization
In the previous section, the quantization noise was analyzed under the assumption that
the number of quantization levels is high and quantization step is small. In many applications,
these assumptions are not valid; we then talk about rough quantization. A more detailed
analysis is required for the noise generated by rough quantization. Figure 7 shows two rough
quantizers. The first is a one bit quantizer (N = 2) in which the quantized sample is the
sign of the input, x_q = sgn(x). The second rough quantizer has N = 3, where the description
of the quantized output requires 2 bits. The importance of these two rough quantizers (also
known as clippers) is due to the fact that processing of the quantized data is extremely
simple. Digital correlation, for example, of signals quantized by these quantizers requires
no multiplication. Such correlators have been suggested13 and applied14 to biomedical signal
processing.
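A minimal Python sketch of correlating one bit (clipped) signals is shown below; the correlated gaussian pair and the arcsine-law correction used to recover the correlation coefficient are standard illustrations and are not taken from the correlators of References 13 and 14:

import numpy as np

rng = np.random.default_rng(4)

# Two correlated gaussian signals (true correlation coefficient rho = 0.6).
rho = 0.6
n = 200_000
g1 = rng.standard_normal(n)
g2 = rng.standard_normal(n)
x = g1
y = rho * g1 + np.sqrt(1 - rho ** 2) * g2

# One-bit ("clipped") versions: only the sign of each sample is kept,
# so correlating them needs no multiplications of full-precision data.
sx, sy = np.sign(x), np.sign(y)
polarity = np.mean(sx * sy)                      # polarity-coincidence correlation

# For gaussian signals the arcsine law relates the two estimates:
# E{sgn(x) sgn(y)} = (2/pi) * arcsin(rho).
rho_from_clipped = np.sin(np.pi / 2 * polarity)
print(polarity, rho_from_clipped, np.corrcoef(x, y)[0, 1])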
The statistical analysis of rough uniform quantization noise has been given by
Widrow.15,16 Widrow has proven the quantization theorem, which is, in some sense, analogous to
the Nyquist sampling theorem. The time samples of x(t) have a continuous amplitude
probability density function p(x). The quantized output, x_q, assumes only discrete amplitudes
and thus has a discrete probability density function p_q(x_q). This function consists of a series
of uniformly spaced impulses, each one centered in a quantization region. Figure 8
depicts the two density functions.
Widrow has considered the output density function as the sampled form of the input
density function. If the input density, p(x), is bounded in frequency (namely, its Fourier
transform P(u) has the property P(u) = 0 for all |u| ≥ u_max), then there exists a quantization
level, q_s, such that quantized signals, with quantization levels q ≤ q_s, contain all the
information on the original distribution, p(x). In other words, we can generate the original
probability density function of the input signal from the quantized one, provided the quantizer
obeys the quantization law, q ≤ q_s. The quantization law states that in order to retain all the
information on the probability density function, the quantization step must obey:

q ≤ q_s = π/u_max     (4.26)
Having developed the probability density function of the quantized signal, the noise
statistics (such as the variance and the correlation) can be calculated in general. The variance
of the quantized noise of Equation 4.21 is a special case of the general result given by
Widrow.
IV. DISCRETE METHODS
A. The Z Transform
Consider the sampled signal, x*(t), given by Equation 4.2. If its (one-sided) Laplace
transform is denoted by X*(S), then:17

X*(S) = Σ_{n=0}^{∞} x(nT) e^(−SnT)     (4.27)

Define the complex variable Z:

Z = exp(ST)     (4.28)

so that

X(Z) = Σ_{n=0}^{∞} x(nT) Z^(−n)     (4.29)

Equation 4.29 is known as the one-sided Z transform, in which we assume that the signal
x(t) = 0 for t < 0. It is easily shown that the Z transform is a linear operator. Several
important properties of the transform make this operator an important tool for the solution
of difference equations and the analysis of sampled data systems.
One of the important properties is the shift property. It can easily be shown (assuming zero
initial conditions) that:

Z{x((n + m)T)} = Z^m X(Z)     (4.30)

For example, for m = −1, Equation 4.30 yields the Z transform of the sampled signal,
x(nT), delayed by one interval, in terms of the Z transform of the original signal:

Z{x((n − 1)T)} = Z^(−1) X(Z)     (4.31)

The inverse transform, x(nT), can often be determined by inspection. The inverse transform can
also be determined analytically by the residue theorem, through an integration in the complex
plane.17
B. Difference Equations
A time invariant linear system, with input u(t) and output y(t) which are defined only at
discrete instants t = kT, can be described by a difference equation:

y(kT) + a_1 y((k − 1)T) + ... + a_n y((k − n)T) = b_0 u(kT) + b_1 u((k − 1)T) + ... + b_m u((k − m)T)     (4.33)

The difference equation 4.33 can be solved by means of the Z transform, in a similar
manner to that in which differential equations are solved via the Laplace transform. Denote Z{y(t)}
= Y(Z), Z{u(t)} = U(Z), and transfer both sides of Equation 4.33 into the Z domain using
the shift property. Assuming all initial conditions to be zero, we get:

(1 + a_1 Z^(−1) + ... + a_n Z^(−n)) Y(Z) = (b_0 + b_1 Z^(−1) + ... + b_m Z^(−m)) U(Z)     (4.34A)

or:

Y(Z) = H(Z) U(Z),   H(Z) = (b_0 + b_1 Z^(−1) + ... + b_m Z^(−m)) / (1 + a_1 Z^(−1) + ... + a_n Z^(−n))     (4.34B)
The output signal is given in the Z domain by the product of the ratio of the two polynomials,
H(Z) (the Z domain transfer function describing the system), and the input, U(Z). Applying the
inverse transform operation to Equation 4.34B will give the required output signal in the time
domain.
The transfer function, H(Z), can represent a digital filter operating on the signal, u(t), to
improve its quality in some sense. We shall also see in later chapters that we sometimes
use H(Z) as a means for effective description of the signal, y(t). In these cases, we assume
that y(t) is the output of a linear system driven by u(t), a white noise source. We identify
H(Z) and use the parameters a_i and b_i to represent the signal y(t).
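A minimal Python sketch of solving a difference equation of the form 4.33 by direct recursion is given below; the first order coefficients and the unit step input are arbitrary illustrative choices:

import numpy as np

def difference_equation(b, a, u):
    """Filter the input u through y(k) = -sum a_i*y(k-i) + sum b_j*u(k-j),
    i.e. the transfer function H(Z) = B(Z)/A(Z) with a[0] assumed to be 1."""
    y = np.zeros(len(u))
    for k in range(len(u)):
        acc = 0.0
        for j, bj in enumerate(b):
            if k - j >= 0:
                acc += bj * u[k - j]
        for i, ai in enumerate(a[1:], start=1):
            if k - i >= 0:
                acc -= ai * y[k - i]
        y[k] = acc
    return y

# Example: simple first-order low pass, H(Z) = 0.1 / (1 - 0.9 Z^-1),
# driven by a unit step (coefficients chosen only for illustration).
u = np.ones(50)
y = difference_equation(b=[0.1], a=[1.0, -0.9], u=u)
print(y[:5], y[-1])   # approaches the d.c. gain 0.1/(1 - 0.9) = 1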
REFERENCES
1. Gold, B. and Rader, C. M., Digital Processing of Signals, McGraw-Hill, New York, 1969.
2. Beauchamp, K. G. and Yuen, C. K., Digital Methods for Signal Analysis, George Allen and Unwin, Ltd., London, 1979.
3. Tretter, S. A., Introduction to Discrete Time Signal Processing, John Wiley & Sons, New York, 1976.
4. Chen, C.-T., One Dimensional Digital Signal Processing, Marcel Dekker, New York, 1979.
5. Oppenheim, A. V., Ed., Application of Digital Signal Processing, Prentice-Hall, Englewood Cliffs, N.J., 1978.
6. Ahmed, N. and Rao, K. R., Orthogonal Transforms for Digital Signal Processing, Springer-Verlag, Berlin, 1975.
7. Mark, J. W. and Todd, T. D., A nonuniform sampling approach to data compression, IEEE Trans. Commun., 29, 24, 1981.
8. Plotkin, E., Roytman, L., and Swamy, M. N. S., Nonuniform sampling of band limited modulated signals, Signal Process., 4, 295, 1982.
9. Blanchard, S. M. and Barr, R. C., Zero, first and second order adaptive sampling from ECG's, in Proc. of the 35th ACEMB, Philadelphia, 1982, 209.
10. Pahlm, O., Borjesson, P. O., and Werner, O., Compact digital storage of ECG's, Comput. Programs Biomed., 9, 293, 1979.
11. Gersho, A., Principles of quantization, IEEE Trans. Circuits Syst., 25, 427, 1978.
12. Sripad, A. B. and Snyder, D. L., A necessary and sufficient condition for quantization errors to be uniform and white, IEEE Trans. Acoust. Speech Signal Process., 25, 442, 1977.
13. Landsberg, D. and Cohen, A., Fast correlation estimation by a random reference correlator, IEEE Trans. Instrum. Meas., 32, 438, 1983.
14. Cohen, A. and Landsberg, D., Adaptive real-time wavelet detection, IEEE Trans. Biomed. Eng., 30, 332, 1983.
15. Widrow, B., A study of rough amplitude quantization by means of Nyquist sampling theory, IRE Trans. Circuit Theory, 3, 266, 1956.
16. Widrow, B., Statistical analysis of amplitude quantized sampled-data systems, AIEE Trans. (Applications and Industry), II, 555, 1961.
17. DeRusso, P. M., Roy, R. J., and Close, C. M., State Variables for Engineers, John Wiley & Sons, New York, 1965.
Chapter 5
I. INTRODUCTION
Availability of long records — Often only short time records are available for processing. This may be due to the fact that the phenomenon monitored existed only for a short time
or due to the fact that the acquisition system has allocated only a given time slot to the
signal at hand.
Stationarity — Most often the signal to be processed is nonstationary. It is convenient,
however, to assume stationarity so that powerful (stationary) signal processing techniques
can be employed. The signal, therefore, is divided into segments, such that each can be
considered stationary. Rather than estimating the statistics of a nonstationary signal, the
problem now is to estimate the statistics of several “ stationary” signals represented by finite
time segments.
This chapter deals with the problems associated with finite time estimation. The errors
involved with these types of estimators are discussed, as well as the improvement in signal-
to-noise ratio achieved by the estimation.14
An important case, when the signal to be processed is a repetitive one, is analyzed. In
this case, synchronous averaging5-9 (known also as coherent averaging) techniques are employed in order to estimate the averaged waveshape of the repetitive signal.
EEG evoked potentials (EP) are classical examples of a signal treated by means of
synchronous averaging.
Finite time averaging techniques are implemented by software on general purpose computers, on dedicated computers, and on special digital circuits.10-12 In practice, all signal
processing is time bounded, hence the importance of the knowledge of the estimation errors
involved.
II. FINITE TIME ESTIMATION OF THE MEAN VALUE
The mean of the stationary process {x(t)} is

μ_x = E{x(t)}    (5.2)

Given a finite record of length T, the mean is estimated by the time average

μ̂_x = (1/T) ∫_0^T x(t) dt    (5.3)

Clearly, the estimator is unbiased; its mean square error (variance) is

Var[μ̂_x] = E{μ̂_x²} − μ_x²    (5.5)
The first term on the right side of Equation 5.5 can be rewritten, using Equation 5.3:

E{μ̂_x²} = (1/T²) ∫_0^T ∫_0^T r_x(η − ξ) dη dξ    (5.6)

where τ = η − ξ. Since stationarity was assumed, r_x(τ) is independent of η and ξ, is an even function of τ, and has a maximum at τ = 0. Equation 5.6, written in terms of the new variable τ, becomes

E{μ̂_x²} = (1/T²) ∫∫ r_x(τ) dτ dξ    (5.8)

Note that the integration of Equation 5.8 is carried out on the variables τ and ξ over the region shown in Figure 1. Changing the order of integration leads to:

E{μ̂_x²} = (1/T) ∫_{−T}^{T} (1 − |τ|/T) r_x(τ) dτ    (5.10)
In order to evaluate the mean square error of the estimated quantity, μ̂_x, one has to have the autocorrelation function r_x(τ). This function is, however, usually unavailable. Note that when Equation 5.1 is used, the variance of the estimate vanishes as T increases; for gaussian processes and most physical random processes, Equation 5.3 is therefore a consistent estimator. We shall deal first with two general cases.
which states the trivial conclusion that for very short observation times, the variance of the estimated quantity equals that of the record itself; hence no improvement in signal-to-noise ratio is achieved. The second case is that of very long observation times, for which we assume the noise autocorrelation obeys:

lim_{T→∞} (1/T) ∫_{−T}^{T} |r_n(τ)| dτ = 0    (5.16)
The signal-to-noise ratio may be defined as the ratio between the expectation E{x(t)} = μ_x (the "signal") and the variance, Var[x(t)]. The signal-to-noise ratio of the observed signal is thus SNR_i = μ_x / Var[x(t)]. The ratio between the output and input SNRs can be considered a figure of merit for the estimator. Thus, the expected improvement achieved by the estimator 5.3 is

SNR_o / SNR_i = Var[n(t)] / [ (1/T) ∫_{−T}^{T} (1 − |τ|/T) r_n(τ) dτ ]

For large observation times, the improvement in SNR approaches infinity.
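The behaviour of the finite time mean estimator can be checked numerically. The sketch below (Python/NumPy assumed; the correlation parameter and record length are illustrative values, not from the text) averages an exponentially correlated, unit variance noise over a record of length T and compares the variance of the estimate with the record variance.

```python
import numpy as np

alpha, dt, T = 5.0, 1e-3, 2.0        # correlation parameter, sample step, record length
n_steps = int(T / dt)
rho = np.exp(-alpha * dt)            # first-order (Markov) step-to-step correlation

rng = np.random.default_rng(1)
trials = 500
means = np.empty(trials)
for k in range(trials):
    # generate exponentially correlated, unit-variance noise
    n = np.empty(n_steps)
    n[0] = rng.standard_normal()
    w = rng.standard_normal(n_steps) * np.sqrt(1.0 - rho**2)
    for i in range(1, n_steps):
        n[i] = rho * n[i - 1] + w[i]
    means[k] = n.mean()              # finite-time estimate of the (zero) mean

print("variance of the record  :", 1.0)
print("variance of the estimate:", means.var())   # roughly 2/(alpha*T) for large alpha*T
```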
Example 5.1
The resting potential of a cell membrane is to be estimated by means of the system depicted
in Figure 2. The cell membrane potential is measured by the two high impedance glass
electrodes and the amplifier. The signal s(t) is the amplified membrane potential and the
noise n(t) is the noise (referred to the output) of the electrodes, the amplifier, and the noise
picked up by the electrodes.
We assume that the additive noise is zero mean with autocorrelation function

r_n(τ) = σ_n² exp(−α|τ|)    (5.21)
SNR_o / SNR_i = αT / { 2 [ 1 − (1/αT)(1 − exp(−αT)) ] }    (5.23)

lim_{T→∞} SNR_o / SNR_i = αT / 2    (5.24)
Define a new filter impulse response, h(t,T), which is equal to the low-pass filter's impulse
response throughout the observation interval and is zero elsewhere; hence,
h(t,T) = h(t),  0 ≤ t ≤ T
h(t,T) = 0,  elsewhere    (5.26)
We would like to investigate the mean square error of the estimator. The expectation of the filter's output is μ_x ∫_0^T h(λ) dλ, so that the output, normalized by the area of the impulse response, is an unbiased estimator of the mean. Defining the finite time autocorrelation of the impulse response,

r_h(τ,T) = ∫_{−∞}^{∞} h(t,T) h(t + τ,T) dt    (5.31)

the variance of the estimator becomes

Var[μ̂_x] = ∫_{−∞}^{∞} r_n(τ) r_h(τ,T) dτ    (5.32)

Comparing Equations 5.32 and 5.13, we see that when an ideal integrator is used, the variance is calculated by integrating r_n(τ) with the weighting function (1/T)(1 − |τ|/T), while here, with a low-pass filter, the weighting function becomes r_h(τ,T).
Example 5.2
Repeat the problem discussed in Example 5.1 with a simple RC low-pass filter replacing
the integrator.
Denoting (RC)^{−1} = β, the impulse response of the filter is given by

h(t,T) = β exp(−βt),  0 ≤ t ≤ T;   h(t,T) = 0,  elsewhere    (5.34)

and, from Equation 5.31,

r_h(τ,T) = (β/2) {exp(−βτ) − exp[−β(2T − τ)]},   0 ≤ τ ≤ T
r_h(τ,T) = (β/2) {exp(βτ) − exp[−β(2T + τ)]},   −T ≤ τ ≤ 0
r_h(τ,T) = 0,   |τ| > T    (5.35)
Assume, again, that the noise is exponentially correlated (Equation 5.21); then, from Equation 5.32, with ψ = α/β and φ = βT,

SNR_o / SNR_i = (ψ² − 1) / { (ψ − 1)[1 − exp(−φ(ψ + 1))] − (ψ + 1) exp(−2φ)[1 − exp(−φ(ψ − 1))] }    (5.37)

and, for long observation times,

lim_{T→∞} SNR_o / SNR_i = ψ + 1 = α/β + 1    (5.38)
Comparing Equations 5.24 and 5.38, we note that while the signal-to-noise ratio can be
infinitely improved by the integrator, it is bounded when using the low-pass filter estimator.
Note also that in both cases the improvement is linearly proportional to the noise correlation exponential factor α.
where {x_k} and {n_k} are the signal and noise sample sequences.
The estimate of the mean μ_x at time kΔt is defined as the average over the current and M − 1 previous samples:

μ̂_x(k,M) = (1/M) Σ_{i=0}^{M−1} x_{k−i}    (5.40)
The estimator is clearly unbiased. The variance of the estimator (Equation 5.5) is given by

Var[μ̂_x(k,M)] = (1/M²) Σ_{i=0}^{M−1} Σ_{j=0}^{M−1} E{x_{k−i} x_{k−j}} − μ_x²    (5.41)

Assuming {x_k} is a stationary process, the last equation can be expressed in terms of the correlation function r_τ = E{x_k x_{k+τ}}.
Define the correlation matrix

R_x =
[ r_0       r_1       ...  r_{M−1} ]
[ r_1       r_0       ...  r_{M−2} ]
[ ...                             ]
[ r_{M−1}   r_{M−2}   ...  r_0     ]    (5.43)

the elements of which are the correlation coefficients used in the first term on the right side of Equation 5.42. This term, therefore, is the sum of the matrix elements. Due to the symmetry of the matrix, the variance becomes
Var[μ̂_x(k,M)] = (1/M²) [ M r_0 + 2 Σ_{τ=1}^{M−1} (M − τ) r_τ ] − μ_x²    (5.45)
The last equation is the equivalent of Equation 5.13 for the discrete case.
The improvement in SNR for the discrete case is thus given by
SNR_o / SNR_i = Var[x_k] / Var[μ̂_x(k,M)] = M / [ 1 + 2 Σ_{τ=1}^{M−1} (1 − τ/M)(r_τ/r_0) ]    (5.46)
Example 5.3
Consider again the problem given in Example 5.1. Assume now that the noisy signal x(t) is given in terms of its finite sample sequence {x_t}; t = k − M + 1, k − M + 2, ..., k − 1, k.
Assume also that the noise process has an exponential correlation function as in Example 5.1 (Equation 5.21). From Equations 5.18, 5.19, and 5.46, we get the improvement in signal-to-noise ratio due to estimator 5.40:

SNR_o / SNR_i = M / [ 1 + 2 Σ_{τ=1}^{M−1} (1 − τ/M) exp(−ατΔt) ]    (5.47)
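A small sketch (Python/NumPy assumed; α and the sampling interval are illustrative values) evaluates the improvement of Equation 5.47 for several window lengths M:

```python
import numpy as np

def snr_improvement(M, alpha, dt):
    """Discrete-time SNR improvement of an M-sample average over
    exponentially correlated noise (Equation 5.47 form)."""
    tau = np.arange(1, M)
    return M / (1.0 + 2.0 * np.sum((1.0 - tau / M) * np.exp(-alpha * tau * dt)))

for M in (2, 16, 64):
    print(M, snr_improvement(M, alpha=5.0, dt=0.01))
```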
S_x(T) = (1/T) ∫_0^T (x(t) − μ̂_x)² dt    (5.50)
where μ̂_x is the estimator for the expectation of {x(t)} given by Equation 5.3. It can be shown, using Equations 5.8 and 5.12, that the expectation of Estimator 5.50 can be written as

E{S_x(T)} = σ_x² − Var[μ̂_x]    (5.51)

The variance estimator (Equation 5.50) is thus a biased estimator. In most practical cases, however, the noise process is such that Var[μ̂_x] vanishes with increasing T; for these cases, the bias term approaches zero for large observation times.
FIGURE 3. Finite time averaging of membrane voltage. (A) M = 2, SNR_o = 3.58; (B) M = 16, SNR_o = 28.15; (C) M = 64, SNR_o = 52.72.
The variance of the estimator (Equation 5.50) is a function of the fourth moment of the
process {x(t)} and thus its calculation is of a limited practical value.
Given the finite sample sequence {x_t}; t = k − M + 1, k − M + 2, ..., k − 1, k, the discrete variance estimator (Equation 5.53) is formed by replacing the time integral of Equation 5.50 with a sum over the M samples. Its expectation is

E{S_x} = σ_x² − [2/(M(M − 1))] Σ_{τ=1}^{M−1} (M − τ) c_τ    (5.54)

where c_τ is the autocovariance of the samples.
Equation 5.54 is the discrete equivalent of Equation 5.51. The estimator is a biased one.
FIGURE 3B.

FIGURE 3C.
For most practical noise processes, however, the bias term in Equation 5.54 approaches zero for long observation times. As in the continuous case, the variance of the estimator (Equation 5.53) depends on the fourth moment of the process. Its calculation is thus of very limited practical value.
Note that for an uncorrelated noise process, both the continuous estimator (Equation 5.50) and the discrete one (Equation 5.53) are unbiased.
C. Correlation Estimation1
The autocorrelation function r_x(τ) of the ergodic process {x(t)} and the function r_xy(τ) (the cross-correlation function between the two ergodic processes {x(t)} and {y(t)}) are of great importance in signal processing. It is often required to estimate the correlation functions given only finite sample functions of the processes.
When the sample functions x(t), y(t) are given in the finite time t ∈ (0, T + τ), the estimator for the cross-correlation can be defined by

r̂_xy(τ) = (1/T) ∫_0^T x(t) y(t + τ) dt    (5.55)
Hence, the estimator (Equation 5.55) is an unbiased estimator of the correlation function r_xy(τ). The mean square error is given by the variance of the estimator. The variance depends
on the fourth moment of the processes.
It can be shown that for jointly gaussian processes with zero means the variance becomes

Var[r̂_xy(τ)] ≈ (1/T) ∫_{−∞}^{∞} [ r_x(ξ) r_y(ξ) + r_xy(ξ + τ) r_yx(ξ − τ) ] dξ    (5.57)
If r_x(ξ), r_y(ξ), and r_xy(ξ) are absolutely integrable over (−∞, ∞), the variance of the estimator approaches zero as the observation time increases. Thus, for these cases, the estimator is a
consistent one.
The estimation of the autocorrelation function is performed in a similar manner with results
given by Equations 5.56 and 5.57 with {y} replaced by {x}. The correlation function can be
efficiently estimated indirectly: the power spectral density function is first estimated (Chapter 8) and then subjected to an inverse FFT to yield the correlation estimate.
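The indirect route can be sketched as follows (Python/NumPy assumed; a crude periodogram is used here in place of the spectral estimators of Chapter 8):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(1024)

X = np.fft.rfft(x - x.mean())
psd = (np.abs(X) ** 2) / len(x)        # crude periodogram estimate of the PSD
r_hat = np.fft.irfft(psd, n=len(x))    # inverse FFT gives a (biased, circular) autocorrelation estimate
print(r_hat[:5])                        # r_hat[0] approximates the variance
```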
IV. SYNCHRONOUS AVERAGING (CAT-COMPUTED AVERAGED TRANSIENTS)
A. Introduction
In many biomedical applications, the problem of detecting a repetitive random signal
heavily corrupted with noise arises. It is often possible to trigger the signal by controlling
the cause. This may be the case when analyzing evoked potentials (EP) in the EEG. The
stimulus can be repeated at known times, and the response can be analyzed by methods
discussed in this chapter.
The ith response is given by

s_i(t − t_i) = s_i(τ),  0 ≤ τ ≤ T_i;   s_i(t − t_i) = 0,  otherwise    (5.58)
where t_i is the time of initiation of the ith stimulus and T_i is the length of the ith response. The observed signal to be analyzed, z(t), can be expressed as a series of the responses s_i(t) corrupted with additive noise:

z(t) = Σ_i s_i(t − t_i) + n(t)    (5.59)
where n(t) is a zero mean process, statistically independent of s_i(t). Let the observation z_i(t) be defined as

z_i(t − t_i) = s_i(t − t_i) + n_i(t − t_i);   0 ≤ t − t_i ≤ T    (5.60)

where n_i(t) = n(t); t ∈ [t_i, t_i + T], and T is larger than or equal to the length of the longest response. The average response is defined as

s̄(N,t) = (1/N) Σ_{i=1}^{N} s_i(t − t_i)    (5.62)

and is estimated by the synchronous average of the observations:

ŝ(N,t) = (1/N) Σ_{i=1}^{N} z_i(t − t_i)    (5.63)
Assuming a stationary zero mean noise process, with variance σ_n², the expectation of the estimator is

E{ŝ(N,t)} = (1/N) Σ_{i=1}^{N} E{s_i(t − t_i)}    (5.64)

and the variance of the estimator is

Var[ŝ(N,t)] = (1/N²) E{ Σ_i z_i²(t − t_i) + Σ_i Σ_{j≠i} z_i(t − t_i) z_j(t − t_j) } − (E{ŝ(N,t)})²    (5.65)
In general, the responses s_i(t) are dependent on one another. This may be (in the case of
EP analysis) due to phenomena like learning or fatigue.
Define the nonstationary spatial cross-correlation function r_s(k,t) by

r_s(k,t) = E{s_i(t − t_i) s_j(t − t_j)};   k = j − i    (5.66)
Noting the independence of the noise and the responses, and substituting Equations 5.60 and 5.66 into Equation 5.65, the variance can be expressed in terms of r_s(k,t) and σ_n² (Equation 5.72).
Equation 5.72 gives the variance of the estimator in the general case, with the assumption
of statistical independence between noise and response.
Consider now two important cases.
First assume that the responses are statistically independent; then

r_s(k,t) = E{s²(t − t_i)},   k = 0
r_s(k,t) = m²(t),   otherwise    (5.73)

where m(t) is the mean of the response.
Using Equation 5.73 in Equation 5.72, the variance becomes

Var[ŝ(N,t)] = (1/N) [ σ_s² + σ_n² ]    (5.74)

which states that the variance of the estimator can be made as small as required by increasing the number of responses participating in the averaging window. In most practical cases, the variance cannot be made zero due to the limit on N.
The signal-to-noise ratio of the observation z_i(t) is defined by the ratio of the mean response (the "signal") to the variance of the observation:

SNR_i = m(t) / Var[z_i(t)]    (5.75)

while at the output of the estimator

SNR_o = m(t) / Var[ŝ(N,t)]    (5.76)

Using Equations 5.74, 5.75, and 5.76, the improvement in signal-to-noise ratios can be written as

SNR_o / SNR_i = N    (5.77)

Note that the improvement is independent of the signal-to-noise ratio of the observation z(t).
In the second case, the responses are assumed to be statistically dependent from trial to trial; the variance of the estimator then becomes

Var[ŝ(N,t)] = σ_s² + (1/N) σ_n²    (5.79)
Note that in this case the estimator is not a consistent estimator. Its variance can only
approach the variance of the signal: as N approaches infinity, it will not approach zero.
The improvement in signal-to-noise ratios in this case can be written as

SNR_o / SNR_i = N (ρ + 1) / (Nρ + 1)    (5.80)

where ρ = σ_s²/σ_n² is the ratio of the variances of signal and noise in z_i(t). Note that in this case the improvement does depend on the signal-to-noise ratio of the observation. The improvement is always greater than 1. For very noisy observations (ρ approaching zero), the improvement approaches N.
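A minimal synchronous averaging sketch (Python/NumPy assumed; the response template and noise level are invented for illustration) shows the variance of the averaged response falling roughly as 1/N for independent, zero mean noise:

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 200)
s = np.exp(-((t - 0.3) / 0.05) ** 2)        # illustrative evoked-response template

N = 200
trials = s + rng.normal(scale=1.0, size=(N, t.size))   # N noisy observations z_i(t)
s_hat = trials.mean(axis=0)                             # synchronous (coherent) average

# residual noise variance is about 1/N of the raw noise variance
print("residual noise variance:", np.var(s_hat - s))
```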
In many cases (EP analysis is a classical example), the response does not begin exactly at the stimulus time but after an unknown, trial-varying latency λ_i, so that the observation is

z_i(t − t_i) = s_i(t − t_i − λ_i) + n_i(t − t_i)    (5.81)

where λ_i is the latency of the ith response. Synchronizing the averaging process in Equation 5.63 to the known stimulus times t_i means that the averaged responses will not be properly aligned with one another. The estimator will thus yield a "smeared" template of the response. To overcome this problem, one must estimate the latency and synchronize the observations z_i to the times (t_i + λ_i) rather than t_i. One way to overcome this problem is discussed in the following section.
In some applications, the knowledge of the average response in Equation 5.62 is not
sufficient. In these cases, one is required to analyze the single EP. Techniques such as
sophisticated adaptive filtering (Chapter 9) and waveform detection (Chapter 1, Volume II)
have to be applied.
Example 5.4
The estimator 5.63 is a random variable with a mean that equals the desired quantity s̄(N,t). The probability distribution of the estimator is unknown; however, with the use of well-known bounds, confidence limits can be set for the design of the synchronized averaging.
Consider the Chebyshev inequality:

P{ |ŝ(N,t) − m(t)| ≥ kσ_ŝ } ≤ 1/k²    (5.82)

which states that the probability that the estimate falls outside the range of ±kσ_ŝ from the mean m(t) is less than or equal to 1/k². Hence, the probability of an estimate outside the range ±3σ_ŝ is less than or equal to 0.11. With a probability (confidence) of 0.889 (about 0.9, or 90%), the error in the estimate will be within the range of ±3σ_ŝ. The experimental requirement can be phrased as follows: determine the number of trials (N) required such that, with a certainty of 90%, the error in the estimate will be no larger than the required bound. For statistically independent responses, we have (Equation 5.74)
SNR_o / SNR_i = N    (5.84)
For the second case, where the responses are statistically dependent, Equation 5.79 is used to get
FIGURE 4. Synchronous averaging. (A) The signal (SNR = ∞); (B) raw data, signal with additive noise, M = 1; (C) averaging with M = 200; (D) averaging with M = 800; (E) signal-to-noise ratio vs. M. (See pages 62 and 63.)
SNR_o / SNR_i = N (ρ + 1) / (Nρ + 1)    (5.85)
The simulated responses were defined over 0 ≤ (t − t_i) ≤ T (Equation 5.86). The observation signal z(t) was generated by adding uniformly distributed pseudorandom noise. Figure 4 shows the signal s(t) (SNR = ∞) and the estimator output for various N. The signals were sampled and the estimator (Equation 5.63) was implemented on the computer.
We shall use ŝ_0(N,t) as the first estimate of the average response. To improve on this
estimate, each of the N responses will be cross-correlated with the estimate. Consider the correlation of the ith response:
R_i(λ) = (1/T) ∫_0^T ŝ_0(N,t) z_i(t − λ) dt    (5.88)

We shall look for the time λ_i for which the cross-correlation is maximum:

R_i(λ_i) = max_λ R_i(λ)    (5.89)

The time λ_i is the time shift required to best align the observation z_i with the estimate of
the average response. Therefore, we shall shift the observation z_i(t) by that amount. The next estimate of the average is given by:

ŝ_1(N,t) = (1/N) Σ_{i=1}^{N} [ s_i(t − t_i − λ_i) + n(t − t_i − λ_i) ]    (5.90)
We shall now repeat the correlation process of Equation 5.88, replacing ŝ_0(N,t) by the new estimate ŝ_1(N,t). The correlation of all N responses with the new estimate will provide new estimates λ_i for realignment. This procedure is repeated until some halting criterion is met. One such criterion can be the difference in the areas under two successive correlations, stopping when this difference falls below a preset threshold (Equation 5.91).
It has been shown that for a correlation function and noise which are well behaved, the
above procedure results in an asymptotically stable estimate.14 In other cases, the procedure
must be stopped after some arbitrary number of iterations; visual examination of the data
may be recommended.
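The iterative realignment described above can be sketched as follows (Python/NumPy assumed; sampled records are used, and the number of iterations is fixed rather than tested against a halting criterion):

```python
import numpy as np

def realign_average(z, n_iter=5):
    """z: (N, L) array of sampled observations; returns a latency-corrected average."""
    s_hat = z.mean(axis=0)                        # initial synchronous average
    for _ in range(n_iter):
        aligned = np.empty_like(z)
        for i, zi in enumerate(z):
            # cross-correlate the ith response with the current average estimate
            c = np.correlate(zi - zi.mean(), s_hat - s_hat.mean(), mode="full")
            lag = c.argmax() - (len(s_hat) - 1)   # lambda_i: best-alignment shift (samples)
            aligned[i] = np.roll(zi, -lag)        # shift the observation into alignment
        s_hat = aligned.mean(axis=0)              # new estimate of the average response
    return s_hat
```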
REFERENCES
1. Bendat, J. S. and Piersol, A. G., Random Data: Analysis and Measurement Procedures, Wiley-Interscience, New York, 1971, chap. 6.
2. Ernst, R. R., Sensitivity enhancement in magnetic resonance. I. Analysis of the method of time averaging, Rev. Sci. Instrum., 32(12), 1689, 1965.
3. Bendat, J. S., Interpretation and application of statistical analysis for random physical phenomena, IRE Trans. Bio-Med. Electron., 9, 31, 1962.
4. Davenport, W. B., Johnson, R. A., and Middleton, D., Statistical errors in measurements on random time functions, J. Appl. Phys., 23(4), 377, 1952.
5. Ruchkin, D. S., An analysis of average response computations based upon aperiodic stimuli, IEEE Trans. Biomed. Eng., 12, 87, 1965.
6. Bendat, J. S., Mathematical analysis of average response values for nonstationary data, IEEE Trans. Biomed. Eng., 11, 72, 1964.
7. Kovacs, Z. L., On the enhancement of the SNR of repetitive signals by digital averaging, IEEE Trans. Instrum. Meas., 28(2), 152, 1979.
8. Shiavi, R. and Green, N., Ensemble averaging of locomotor EMG patterns using interpolations, Med. Biol. Eng. Comput., 21, 573, 1983.
9. Volkers, A. C. W., Van der Schee, E. J., and Grashuis, J. L., Electrogastrography in the dog: waveform analysis by a coherent averaging technique, Med. Biol. Eng. Comput., 21, 56, 1983.
10. Shvartsman, V., Barnes, G. R., Shvartsman, L., and Flowers, N. C., Multichannel signal processing based on logic averaging, IEEE Trans. Biomed. Eng., 29, 531, 1982.
11. Gethner, J. S., Woodin, R. L., Rabinowitz, P., and Kaldor, A., Multiparameter matrix signal averaging, Rev. Sci. Instrum., 53(9), 1398, 1982.
12. Thomas, C. W., Rzeszotarski, M. S., and Isenstein, B. S., Signal averaging by parallel digital filters, IEEE Trans. Acoust. Speech Signal Process., 30, 338, 1982.
13. Woody, C. D., Characterization of an adaptive filter for the analysis of variable latency neuroelectric signals, Med. Biol. Eng. Comput., 5, 539, 1967.
14. Senmoto, S. and Childers, D. G., Adaptive decomposition of a composite signal of identical unknown wavelets in noise, IEEE Trans. Syst. Man Cybern., 2, 59, 1972.
Chapter 6
I. INTRODUCTION
The Fourier transform (FT) of the signal x(t) is defined by

X(w) = F{x(t)} = ∫_{−∞}^{∞} x(t) exp(−jwt) dt = |X(w)| exp(jθ(w))    (6.1)

where |X(w)| is the amplitude and θ(w) is the phase of the frequency domain representation.
Not all real functions can be transformed into the frequency domain by Equation 6.1. Sufficient conditions for the existence of X(w) are given by the Dirichlet conditions:
1. x(t) is absolutely integrable, ∫_{−∞}^{∞} |x(t)| dt < ∞.
2. x(t) has a finite number of discontinuities and a finite number of extrema in any finite interval.
There are some very useful functions, such as the impulse (delta) function, step function, or even sine and cosine functions, which do not obey the Dirichlet conditions. These functions do not have a Fourier transform; they do, however, have a transform in the limit. We shall talk about the FT of these functions where, in fact, we shall mean the limit that does exist.
The FT possesses many interesting properties that help in the calculation of direct and inverse transformations. The reader is referred to references and textbooks dealing in detail with the subject.1,2 We shall deal here only with some important properties that bear a direct relation to the material presented in later sections.
x(t) = x_1(t) * x_2(t) = ∫_{−∞}^{∞} x_1(λ) x_2(t − λ) dλ    (6.5A)

The importance of Equation 6.5 in systems analysis stems from the fact that the output of a linear system is given by the convolution of the input with the impulse response of the system. It can easily be shown that the FT of x(t) is

X(w) = X_1(w) X_2(w)

where X_i(w) = F{x_i(t)}. Hence the convolution operation in the time domain, which is a relatively complex operation, becomes a simple multiplication operation in the frequency domain.
Consider now the convolution of two functions, X_1(w) and X_2(w), in the frequency domain:

X_1(w) * X_2(w) = 2Π F{x_1(t) x_2(t)}

Hence, convolution of two functions in the frequency domain is given by 2Π times the FT of the product of the two functions in the time domain.
2. Parseval's Theorem
Consider the energy, E, of the time function x(t):

E = ∫_{−∞}^{∞} x²(t) dt    (6.9)

Expressing x(t) by the inverse FT and changing the order of integration yields

E = (1/2Π) ∫_{−∞}^{∞} |X(w)|² dw    (6.10)

Equation 6.10 is known as Parseval's theorem, which states that the total energy of the signal can be calculated by the integral in the time domain or in the frequency domain. Note that if the energy in a band of frequencies w_1 ≤ w ≤ w_2 is required, we have to integrate over this band and over the band of negative frequencies −w_2 ≤ w ≤ −w_1. Since |X(w)|² is an even function in w, we get:

E(w_1, w_2) = (1/Π) ∫_{w_1}^{w_2} |X(w)|² dw    (6.11)
A periodic signal x(t) can be expanded into the (complex) Fourier series

x(t) = Σ_{n=−∞}^{∞} a_n exp(jnw_0 t)    (6.15)

where T = 2Π/w_0 is the period of the signal, and a_n are the coefficients of the expansion. Applying the FT to Equation 6.15 yields:

X(w) = 2Π Σ_{n=−∞}^{∞} a_n δ(w − nw_0)    (6.16)

Hence, the FT of a periodic signal is a train of impulse functions located at the harmonics of the fundamental frequency, w_0, each multiplied by 2Π times the corresponding coefficient of the Fourier series.
The sequence {X(k w_s/N)} is, in general, a complex sequence. The transformation that maps the complex sequence {X(k w_s/N)} back into the sequence {x(nT_s)} is called the inverse DFT. Note that T_s and w_s/N are constants; hence we denote x_n = x(nT_s) and X_k = X(k w_s/N).
Let us now consider the sampled sequence {x(nT_s)} as a time function generated by multiplying the signal x(t) with the ideal sampler operator, δ_T(t) (see Equation 4.2). We denote the sampled signal by x*(t) and get:

x*(t) = Σ_{n=0}^{N−1} x(t) δ(t − nT_s)    (6.20)
The FT of the signal x*(t), namely X*(w), is easily calculated, noting that the FT of δ(t − nT_s) is exp(−jwnT_s):

X*(w) = Σ_{n=0}^{N−1} x(nT_s) exp(−jwnT_s)    (6.22)

Equation 6.22 describes the FT of the sampled signal. This FT, X*(w), is a continuous function of w. Let us sample, in the frequency domain, this continuous function at the frequencies w = k w_s/N, k = ..., −1, 0, 1, ...:

X*(k w_s/N) = Σ_{n=0}^{N−1} x(nT_s) exp(−j2Πkn/N),   k = ..., −2, −1, 0, 1, 2, ...    (6.23)
FIGURE 1. The relations between the Fourier transform (FT) of x(t), the FT of x*(t), and the DFT. (A) The FT of x(t); (B) the FT of x*(t); (C) the DFT.
Note that the set of N members k = 0, 1, ..., N − 1 of the infinite sequence (Equation 6.23) equals the DFT of Equation 6.17.
We recall also from the discussion in Chapter 4 (Equation 4.4 and Figure 4.2) that the FT, X*(w), of the sampled signal is the repetition of the FT of the continuous signal, X(w), centered at ℓw_s. When we sample the FT, the samples −N/2, ..., −1, 0, 1, ..., (N/2 − 1) are samples of the FT centered at w = 0. The rest of the samples of the sequence convey no new information, since they represent the same samples shifted to (w + ℓw_s), ℓ = ±1, ±2, .... This can also easily be seen from Equation 6.23. The functions exp(−j2Πkn/N) are periodic functions with period N. Hence X*(k w_s/N) = X*((k + ℓN) w_s/N) for any integer ℓ. Since the FT of real signals is symmetric, we can represent the samples of the FT by the sequence X*(k w_s/N), k = 0, 1, ..., N − 1.
The samples k = N/2, ..., N − 1 are samples of the negative frequencies of the FT centered at w_s. Since, in our case, the FT is symmetric, these samples contribute no additional information. From the sequence of N samples of the DFT, we require only the first N/2 + 1 (or the last N/2 + 1); the rest are redundant. In the example given in Figure 1, w_s = 2w_max, and hence for k = N/2 we get the sample at w = w_max.
The convolution of two signals was defined in Equation 6.5. A similar operation can be defined for two sequences. Let us define the cyclic convolution of the two sequences {x_1(n)} and {x_2(n)} by:

x(n) = x_1(n) ⊛ x_2(n) = Σ_{m=0}^{N−1} x_1(m) x_2(n − m)    (6.25)

where the indices are taken modulo N. The energy of a sequence is given by

E = Σ_{n=0}^{N−1} x²(n)    (6.27)
The DFT is an important tool for discrete signal processing for the same reasons the FT was important for continuous signal processing. The direct computation of the DFT requires approximately N² complex multiplication and addition operations. In 1965, Cooley and Tukey, in their famous paper, presented an efficient method for calculating the DFT. Their method, known as the fast Fourier transform (FFT), requires only N log_2 N operations (where N is a power of 2). For N = 1024, the number of operations required by the FFT is roughly one hundred times smaller than the number required for direct computation.
Many different FFT algorithms have been derived for software and hardware implementations. Two commonly used algorithms are known as the decimation in time and decimation in frequency algorithms. The interested reader is referred to the vast literature on this subject.3-5
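The equivalence of the direct DFT and the FFT, and the purely computational nature of the FFT's advantage, can be verified with a short check (Python/NumPy assumed):

```python
import numpy as np

N = 1024
x = np.random.default_rng(4).standard_normal(N)

n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # N x N DFT matrix: ~N^2 operations
X_direct = W @ x
X_fft = np.fft.fft(x)                          # ~N log2 N operations

print(np.allclose(X_direct, X_fft))            # True: same transform, far fewer operations
```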
The power spectral density (PSD) function of a stationary random signal x(t) is defined by

S_x(w) = lim_{T→∞} (1/T) E{ |X_T(w)|² }    (6.29)

where X_T(w) is the FT of the signal observed over the finite interval T. In Equation 6.29, we use the expected value of the random variable |X_T(w)|², which is the energy. By introducing the FT we get:
S_x(w) = ∫_{−∞}^{∞} r_x(τ) exp(−jwτ) dτ    (6.31A)

and

r_x(τ) = (1/2Π) ∫_{−∞}^{∞} S_x(w) exp(jwτ) dw    (6.31B)
Equations 6.31A and B state that the PSD and the autocorrelation are a Fourier pair; they are known as the Wiener-Khinchin relations. Note that since the autocorrelation
function is an even function, the PSD is real.
Consider a stationary random signal x(t) which has an autocorrelation function:

r_x(τ) = a δ(τ)    (6.33A)

Namely, the values of the signal x(t) at the time t and at the time t + τ (for all τ not equal to zero) are uncorrelated. The PSD function of such a process is

S(w) = a    (6.33B)
The power is equally distributed along the frequency axis, hence the process is called white
noise. A random process with power unequally distributed is called colored noise.
One principal application of the PSD function is related to the analysis of linear systems. Consider a linear system with an impulse response, h(t), driven at its input by a random sample function, u(t). The output of the system, x(t), is given by the convolution:

x(t) = h(t) * u(t) = ∫_{−∞}^{∞} h(λ) u(t − λ) dλ
Consider an input signal that is stationary in the wide sense. We then calculate the autocorrelation of x(t) and take its FT; the result is:7

S_x(w) = |H(w)|² S_u(w)

Hence, when the input, u(t), is white, S_u(w) is constant. The output signal x(t) is then nonwhite noise, colored by the frequency response of the system.
The cross-correlation function is not necessarily even; hence the cross-spectral density is, in general, a complex function:

S_xy(w) = ∫_{−∞}^{∞} r_xy(τ) exp(−jwτ) dτ

For the linear system above, the cross spectrum between the input and the output is S_ux(w) = H(w) S_u(w). Hence, the frequency response of the system can be calculated from the cross spectrum and the input spectrum.
A convenient real-valued, bounded quantity is defined, named the coherence function:

γ²_xy(w) = |S_xy(w)|² / [ S_x(w) S_y(w) ],    0 ≤ γ²_xy(w) ≤ 1    (6.40)
When γ²_xy(w) = 1 for all frequencies, x(t) and y(t) are said to be fully coherent; when for some w = w_0, γ²_xy(w_0) = 0, x(t) and y(t) are said to be incoherent at w_0. When x(t) and y(t) are statistically independent, then γ²_xy(w) = 0 for all w.
The coherence function is useful in the investigation of signals which are only slightly correlated.
Hence, it is the low coherence values that are of interest. In practice, the exact values of
the various spectra are not known and must be estimated (see Chapter 8); hence, the coherence function is always given in terms of its estimates. Common and novel10,11 estimation methods
exist for the coherence function. Estimation may cause large inaccuracies in the coherence
function, and its application must be carefully considered.9
The coherence function has been applied to EEG analysis9,12,13 for the investigation of
brain asymmetry, localizing epileptic focus, the study of relations between cortical and
thalamic activity, and more.
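In practice the coherence is always computed from spectral estimates; a minimal sketch (Python/SciPy assumed; the two channels are synthetic, and the sampling rate is an arbitrary choice) is:

```python
import numpy as np
from scipy.signal import coherence

fs = 250.0                                   # assumed sampling rate (Hz)
rng = np.random.default_rng(5)
common = rng.standard_normal(10_000)         # shared component between the two channels
x = common + 0.5 * rng.standard_normal(10_000)
y = common + 0.5 * rng.standard_normal(10_000)

# Welch-type estimate of the magnitude-squared coherence, bounded between 0 and 1
f, Cxy = coherence(x, y, fs=fs, nperseg=256)
```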
A. Introduction
In the design of signal acquisition and processing systems we must often alter a given signal such that some parts of it are enhanced or attenuated, its phase is changed, or parts of it are delayed, smoothed, or "predicted". The signal may be deterministic, random, continuous, or discrete. Many of the desired alterations can be achieved by linear transformation.
We then design a linear system, or filter, that operates on the signal with the required
transformation.
The basic filter is the time invariant filter, or fixed parameters filter. It is usually designed
to meet the required specifications, given some a priori information concerning the signals
and noise involved. Filters can be designed to meet the required specifications while optimizing some performance criterion; these are called optimal filters. One example, the Wiener filter, is discussed in this chapter. Filters in which the values of the parameters are functions of time are called time varying filters. An important class of time varying filters is the class of adaptive filters, which is discussed in Chapter 9.
Consider the signal u(t) which is to be processed. It is desirable to apply a linear transformation such that its outcome will be x(t). We can consider the linear system depicted in Figure 2 to be the filter driven by the input signal u(t), with the output signal x(t).
The relations between the two signals are generally given in terms of the differential equation:

a_n d^n x(t)/dt^n + ... + a_1 dx(t)/dt + a_0 x(t) = b_m d^m u(t)/dt^m + ... + b_1 du(t)/dt + b_0 u(t)    (6.41)
A general solution to Equation 6.41 is given in terms of the impulse response, h(t). Since the system is linear, its output is composed of the linear combination of the responses to the impulse function.14 It can be shown that the output, x(t), is given by the convolution of the impulse response with the input:

x(t) = h(t) * u(t)    (6.42A)

or, in the frequency domain,

X(w) = H(w) U(w)    (6.42B)
We see that U(w) can be shaped into a desired X(w) by designing the right filter H(w). The
advantages of the design in the frequency domain become obvious from Equation 6.42B.
Only the basics of digital and optimal filter design will be discussed here. For a detailed discussion of the material, the reader is referred to the literature on these topics.15 The topic
FIGURE 2. A linear filter: input u(t), U(w); impulse response h(t), H(w); output x(t), X(w).
of cepstral analysis and homomorphic filtering, with applications to biomedical signal processing, is discussed in detail at the end of this section.
B. Digital Filters
The availability of low cost and efficient digital computers and dedicated processing circuits has made the implementation of filtering by digital means very attractive. Even when working in analog environments, where both input and output signals are continuous, it is very often worthwhile to apply analog-to-digital conversion, perform the required filtering digitally, and convert the discrete filtered output back into a continuous signal.
Digital filters are linear discrete systems governed by difference equations (see Chapter 4). Two classes of digital filters are used: finite impulse response (FIR) and infinite impulse response (IIR).
FIR filters are characterized by a finite duration impulse response which, in the Z domain, means:

H(Z) = X(Z)/U(Z) = b_0 + b_1 Z^{−1} + ... + b_m Z^{−m}    (6.43)

where X(Z) and U(Z) are the Z transforms of the output and input sequences. Equation 6.43 states that the FIR filter is a moving average (MA) filter (see Chapter 7), or an all-zero filter. FIR filters are always stable.
IIR filters have, in general, an infinite duration impulse response; they possess zeroes and poles (ARMA filters, see Chapter 7), and their transfer function in the Z domain is

H(Z) = (b_0 + b_1 Z^{−1} + ... + b_m Z^{−m}) / (1 + a_1 Z^{−1} + ... + a_n Z^{−n})    (6.44)

IIR filters are stable if all the poles of H(Z) are within the unit circle in the Z domain.
IIR and FIR filters can be synthesized recursively via the difference equations, or by
means of the FFT. Since continuous filter design is well established, one of the approaches
for designing digital filters is to find a difference equation, with the associated H(Z), that
yields an output sequence close to the samples of the analog output signal. This approach
is termed the impulse invariant method. Another approach is to transform the analog filter,
by means of the bilinear transformation, into the Z domain yielding a digital filter, H(Z).
The resultant filter will not possess the same impulse response since the transformation
introduces frequency scale distortions. This method is known as the bilinear transformation
method. A third approach for digital filter design is the frequency sampling method. This
method is based on the approximation of a function by a sum of sine functions. Detailed
discussion of the design steps can be found for example in Gold and Rader 4 and Chen.K
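As an illustration of the bilinear transformation route (Python/SciPy assumed; the order, cutoff frequency, and sampling rate are arbitrary choices, not values from the text), an analog Butterworth prototype is mapped to a digital IIR filter H(Z) and applied to a sequence:

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 500.0                                        # assumed sampling rate (Hz)
b, a = butter(N=4, Wn=40.0, btype="low", fs=fs)   # digital IIR low-pass (bilinear mapping internally)

rng = np.random.default_rng(6)
u = rng.standard_normal(2048)                     # input sequence u(nT)
x = lfilter(b, a, u)                              # filtered output x(nT)
```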
C. The Wiener Filter
Consider now the problem of optimal filter design. Assume, for example, that a signal s(t) is corrupted with additive noise n(t); it is required to estimate, by linear operations, the value of the signal s(t + η), η ≥ 0, from the observation signal x(t):

x(t) = s(t) + n(t)    (6.45)
We assume that both s(t) and n(t) are stationary in the wide sense. Note that for η = 0 the problem is that of smoothing, namely, extracting the current value of s(t) from current and past values of the observation signal. For η > 0 the problem is that of prediction, namely, extracting the future value of s(t + η) from current and past values of the observation signal.
Assume we have a linear filter, h(t). We apply the signal x(t) to its input. Let us denote the output of the filter by ŝ(t + η). We shall look for the optimal filter in the sense of minimization of the mean square error between the output of the filter and the actual desired quantity:

ε² = E{ [s(t + η) − ŝ(t + η)]² }    (6.46)

It is required to minimize ε² over all possible, realizable h(t). Performing the minimization6 yields the condition:
r_sx(τ + η) = ∫_0^∞ h(σ) r_x(τ − σ) dσ,   τ ≥ 0    (6.47)

where r_sx and r_x are the cross-correlation of the observation signal with the desired signal s(t) and the autocorrelation of x(t), respectively. Condition 6.47 is known as the Wiener-Hopf condition (see also Equation 9.10).
When the optimal filter, given by Condition 6.47, is used, the minimum squared error is6

ε²_min = r_s(0) − ∫_0^∞ h(σ) r_sx(σ + η) dσ    (6.48)

and

E{e(t) x(t − τ)} = 0,   τ ≥ 0    (6.49)

The last result states that under optimal conditions, the error and the observations are uncorrelated; since E{e(t)} = 0, the two are also orthogonal. If we remove the realizability constraint, the solution of Equation 6.46 will be similar to Equation 6.47, but with a lower integration boundary including all negative values; namely, the integration boundaries will be minus to plus infinity. Note that the right-hand side of Equation 6.47 is the convolution of r_x(τ) with h(τ). Taking the FT of the equation yields:
S_sx(w) exp(jwη) = H(w) S_x(w)    (6.50)

The exponent on the left side of the last equation is due to the time shift present in r_sx of Equation 6.47. The required optimal filter is thus given in the frequency domain by:

H(w) = S_sx(w) exp(jwη) / S_x(w)    (6.51)
In general, the filter of Equation 6.51 is not realizable: a system with poles in the right half plane must have a nonzero impulse response, h(t), at t < 0, which is not causal. The optimal realizable Wiener filter can be calculated6 from Equation 6.47. Its error will be larger than or equal to that of the optimal (unconstrained) filter. Similar arguments can be applied to digital filters. Optimal Wiener FIR and IIR filters can be designed.
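The unconstrained (noncausal) Wiener filter of Equation 6.51 can be sketched directly in the frequency domain (Python/NumPy assumed; the signal and noise spectra are treated as known, which is an idealization, and the test signal is invented):

```python
import numpy as np

rng = np.random.default_rng(7)
n_pts = 4096
t = np.arange(n_pts) / 1000.0
s = np.sin(2 * np.pi * 5 * t)                    # illustrative narrow-band signal s(t)
x = s + rng.normal(scale=1.0, size=n_pts)        # observation x(t) = s(t) + n(t)

S_s = np.abs(np.fft.rfft(s)) ** 2                # "known" signal spectrum (assumed available)
S_n = np.full_like(S_s, float(n_pts))            # white-noise spectrum (unit variance, periodogram scale)
H = S_s / (S_s + S_n)                            # Wiener smoothing filter, eta = 0

s_hat = np.fft.irfft(H * np.fft.rfft(x), n=n_pts)   # smoothed estimate of s(t)
```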
VI. CEPSTRAL ANALYSIS AND HOMOMORPHIC FILTERING
A. Introduction
The concept of the cepstrum was first introduced in the early sixties in an attempt to analyze a signal containing echoes. The power cepstrum was defined as "the power spectrum of the logarithm of the power spectrum". Later the definition was changed to make its connection with the correlation function clearer and to provide it with units of time. The new definition became "the inverse transform of the log of the power spectrum". The term "cepstrum" was derived from "spectrum" by reversing the order of the first four letters. The domain of the cepstrum was termed quefrency, a term derived from frequency. Additional terms have been defined, such as "lifter" (derived from "filter"), but these were not accepted well in the literature.
Cepstral analysis is applied mainly in cases where the signal contains echoes of some
fundamental wavelet. By means of the power cepstrum, the times of the wavelet and the
echoes can be determined. The complex cepstrum is used to determine the shape of the
wavelet. These techniques15-21 have been discussed in the literature with various applications. They have been applied to the analysis of EEG signals,17,21 to ECG signals,20 and to the speech signal.19
B. The Cepstra
The complex cepstrum, x̂(t), of the real signal x(t) is given by:

x̂(t) = F^{−1}{ log[X(w)] } = (1/2Π) ∫_{−∞}^{∞} log[X(w)] exp(jwt) dw    (6.52)
Since the argument of the logarithm in Equation 6.52 is complex and may be negative, we shall introduce the complex logarithm of a complex function V:

log V = log|V| + j arg(V)    (6.53)

We shall also need to perform the inverse operation, namely exponentiation; therefore, let us define the complex exponentiation of V:

exp(V) = exp(Re{V}) [ cos(Im{V}) + j sin(Im{V}) ]    (6.54)
In the discrete case, when the data are presented in terms of the sequence {x(nT)}, the cepstra are defined16 by means of the Z transform.
The power cepstrum of the sequence {x(nT)} is the square of the inverse Z transform of the logarithm of the magnitude squared of X(Z). Thus, we write the power cepstrum x_p(nT):

x_p(nT) = [ Z^{−1}{ log|X(Z)|² } ]²    (6.55)
The final squaring in Equation 6.55 adds no information, and the power cepstrum does not retain phase information. The complex cepstrum of the sequence {x(nT)} is the inverse Z transform of the complex logarithm of X(Z):

x̂(nT) = Z^{−1}{ log[X(Z)] }    (6.57)
If the sequence x(nT) is the convolution of two sequences u(nT) and h(nT), namely, x(nT) = u(nT) * h(nT), then:

X(Z) = U(Z) H(Z)    (6.58A)

log[X(Z)] = log[U(Z)] + log[H(Z)]    (6.58B)

and since the inverse transform is a linear operation, the complex cepstrum is

x̂(nT) = û(nT) + ĥ(nT)    (6.58C)
Hence, the complex cepstrum of the convolution of two sequences equals the sum of their cepstra. The complex cepstrum is thus an operator converting convolution into summation. Its application to deconvolution problems becomes apparent. Assume that x(nT), u(nT), and h(nT) are the output, input, and impulse response sequences of a discrete linear system, respectively. If û(nT) and ĥ(nT) occupy different quefrency ranges, then the complex cepstrum can be liftered (filtered) to remove one. In the complex cepstrum, phase information is retained; therefore, it can be inverted to yield the deconvolved h(nT) or u(nT).
The computation of the complex cepstrum in Equation 6.57 has to be carefully considered, since the complex logarithm is not single valued. The imaginary part of the complex logarithm (Equation 6.53) is the phase. If it is presented in modulo 2Π form (principal value), then discontinuities will appear in the phase term. This will occur due to the jump from 2Π to zero when the phase is increased over 2Π. Phase unwrapping algorithms must be employed to overcome this problem. A simple solution is to compute the relative phase between adjacent samples and add these differences together in order to get a cumulative, unwrapped phase.
The complex cepstrum can be implemented22 by means of the DFT replacing the Z transform. This is true since the sequences are of finite length. The region of convergence for the Z transform includes the unit circle, allowing the Z transform and its inverse to be evaluated for Z = exp(jw); therefore:

x̂(nT) = IDFT{ log[ DFT{x(nT)} ] }    (6.59)

Equation 6.59 is of great computational importance since the DFT and IDFT can be very effectively calculated by the FFT algorithm.
The upper part of Figure 3 depicts schematically the operations involved in the complex cepstrum computation.
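Equation 6.59 translates into a few lines of code (Python/NumPy assumed); the phase is unwrapped before the inverse DFT, as discussed above:

```python
import numpy as np

def complex_cepstrum(x):
    """Complex cepstrum via the DFT: IDFT of the complex logarithm of the DFT."""
    X = np.fft.fft(x)
    log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))   # complex logarithm, unwrapped phase
    return np.fft.ifft(log_X).real

rng = np.random.default_rng(8)
x = rng.standard_normal(512)
x_hat = complex_cepstrum(x)
```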
C. Homomorphic Filtering
Let us consider again the example given by Equations 6.58A through C. Here the sequence {x(nT)} can be the samples of a speech signal, the sequence {h(nT)} the weighting sequence of the vocal tract, and {u(nT)} the samples of the pressure wave exciting the vocal tract during a voiced utterance, when the vocal cords are vibrating. The pressure {u(nT)} can be
modeled as a train of very narrow pulses appearing at a frequency known as the fundamental frequency or the pitch. We are interested both in the sequence {h(nT)}, in order to learn about the vocal tract characteristics, and in the sequence {u(nT)}, in order to estimate the pitch.
Equation 6.58C gives the complex cepstrum as the sum of the cepstra of the input and the vocal tract responses. Assume that in the quefrency range we have:

ĥ(nT) = 0 for n ≥ n_0    (6.60A)

and

û(nT) = 0 for n < n_0    (6.60B)
Therefore, these are separable in the quefrency domain. Consider two lifters, a short-pass lifter, Y_1(nT), given by:

Y_1(nT) = 1,  n < n_0;   Y_1(nT) = 0,  otherwise    (6.61A)

and a long-pass lifter

Y_2(nT) = 1 − Y_1(nT)    (6.61B)
When x̂(nT) is fed into the input of these two lifters, the output of Y_1 will be ĥ(nT) and that of Y_2 will be û(nT). We now want to transfer û(nT) and ĥ(nT) from the quefrency domain back into the time domain. We have to subject the sequences to the inverse operation. This involves first the DFT, followed by complex exponentiation (Equation 6.54) and the IDFT. The complete operation of the homomorphic filtering is depicted in Figure 3.
Homomorphic filtering has been applied20 to the automatic classification of the ECG. Normal, inverted T-wave, and two types of premature ventricular contraction (PVC) beats have been considered. It has been found that feature selection for diagnostic purposes could be more efficient using homomorphic filtering than by conventional methods. It has also been demonstrated that the basic wavelet of the normal ECG signal evaluated by the homomorphic filtering closely approximates the action potential spike in the cardiac muscle fibers.
Senmoto and Childers21 have used homomorphic filtering to decompose visual evoked response (VER) potentials. It has been suggested that the recorded VER signals can be expressed as an aggregate of overlapping signals generated by multiple disparate sources whose basic signal waveforms are unknown and have to be estimated. The assumption, therefore, is that the wavelets are identical in waveshape. We shall consider here the decomposition of two wavelets. The extension to the multiple case can be easily done.
Let x(t) be the composite signal and s(t) the wavelet; then:

x(nT) = s(nT) + a s(nT − n_0T)    (6.62)

where the shape of s(t), the delay n_0, and the echo amplitude a < 1 are unknown. x(nT) can be written in terms of the convolution:

x(nT) = s(nT) * d(nT)    (6.63)

where

d(nT) = δ(nT) + a δ(nT − n_0T)    (6.64)

Taking the Z transform and the complex logarithm:

log[X(Z)] = log[S(Z)] + log[1 + a Z^{−n_0}]    (6.65)

with a < 1.
The second term on the right side of Equation 6.65 can be expanded in a power series, yielding

log[X(Z)] = log[S(Z)] + a Z^{−n_0} − (a²/2) Z^{−2n_0} + (a³/3) Z^{−3n_0} − ...    (6.66)

so that

x̂(nT) = Z^{−1}{log[X(Z)]} = ŝ(nT) + a δ(nT − n_0T) − (a²/2) δ(nT − 2n_0T) + (a³/3) δ(nT − 3n_0T) − ...    (6.67)
Thus, the complex cepstrum of the composite signal consists of the complex cepstrum of the wavelet plus a train of δ functions located at positive quefrencies at the echo delay and its multiples. A comb notch lifter can be used to remove the train of delta functions. After smoothing, the wavelet is reconstructed by inverting the operations used for the computation of the complex cepstrum, as shown in Figure 3.
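The echo removal idea can be sketched as follows (Python/NumPy assumed; the wavelet, delay, and echo amplitude are invented, and the cepstral inversion is only approximate):

```python
import numpy as np

def complex_cepstrum(x):
    X = np.fft.fft(x)
    return np.fft.ifft(np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))).real

def inverse_cepstrum(c):
    # DFT, complex exponentiation, IDFT: the inverse of the cepstrum computation
    return np.fft.ifft(np.exp(np.fft.fft(c))).real

n, n0, a = 512, 40, 0.5                          # illustrative length, echo delay, echo amplitude
t = np.arange(n)
s = np.exp(-((t - 30) / 6.0) ** 2)               # hypothetical wavelet
x = s + a * np.roll(s, n0)                       # composite signal with one echo

c = complex_cepstrum(x)
c[n0 : n // 2 : n0] = 0.0                        # comb "lifter": notch the delta train at the echo delay
s_hat = inverse_cepstrum(c)                      # approximate reconstruction of the wavelet
```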
A similar procedure can be used for the processing of dye dilution curves (see Appendix
A, Volume II).
REFERENCES
1. Bracewell, R. N., The Fourier Transform and Its Applications, McGraw-Hill Kogakusha, Tokyo, 1978.
2. Papoulis, A., Signal Analysis, McGraw-Hill Int., Auckland, 1977.
3. Tretter, S. A., Introduction to Discrete Time Signal Processing, John Wiley & Sons, New York, 1976.
4. Gold, B. and Rader, C. M., Digital Processing of Signals, McGraw-Hill, New York, 1969.
5. Brigham, E. O., The Fast Fourier Transform, Prentice-Hall, Englewood Cliffs, N.J., 1974.
6. Lathi, B. P., An Introduction to Random Signals and Communication Theory, Int. Textbook Co., Scranton, Pa., 1968.
7. Davenport, W. B. and Root, W. L., An Introduction to the Theory of Random Signals and Noise, McGraw-Hill, New York, 1958.
8. Chen, C. T., One Dimensional Digital Signal Processing, Marcel Dekker, New York, 1979.
9. Glaser, E. M. and Ruchkin, D. S., Principles of Neurobiological Signal Analysis, Academic Press, New York, 1976.
10. Nuttall, A. H., Direct coherence estimation via a constrained least-squares linear predictive fast algorithm, Proc. of ICASSP, IEEE, Paris, 1982, 1104.
11. Youn, D. H., Ahmed, N., and Carter, G. C., Magnitude squared coherence function estimation: an adaptive approach, IEEE Trans. Acoust. Speech Signal Process., 31, 137, 1983.
12. Shaw, J. C., Brooks, S., Colter, N., and O'Connor, K. P., A comparison of schizophrenic and neurotic patients using EEG power and coherence spectra, in Hemisphere Asymmetries of Function in Psychopathology, Gruzelier, J. and Flor-Henry, P., Eds., Elsevier-North Holland, Amsterdam, 1979.
13. Beaumont, J. G., Mayes, A. R., and Rugg, M. D., Asymmetry in EEG alpha coherence and power: effect of task and sex, Electroencephalogr. Clin. Neurophysiol., 45, 393, 1978.
14. Derusso, P. M., Roy, R. J., and Close, C. M., State Variables for Engineers, John Wiley & Sons, New York, 1967.
15. Oppenheim, A. V., Generalized linear filtering, in Digital Processing of Signals, Gold, B. and Rader, C. M., Eds., McGraw-Hill, New York, 1969.
16. Childers, D. G., Skinner, D. P., and Kemerait, R. C., The cepstrum: a guide to processing, Proc. IEEE, 65, 1428, 1977.
17. Kemerait, R. C. and Childers, D. G., Signal detection and extraction by cepstrum techniques, IEEE Trans. Inf. Theory, 18, 745, 1972.
18. Oppenheim, A. V., Kopec, G. E., and Tribolet, J. M., Signal analysis by homomorphic prediction, IEEE Trans. Acoust. Speech Signal Process., 24, 327, 1976.
19. Kopec, G. E., Oppenheim, A. V., and Tribolet, J. M., Speech analysis by homomorphic prediction, IEEE Trans. Acoust. Speech Signal Process., 25, 40, 1977.
20. Murthy, I. S. N., Rangaraj, M. R., Udupa, K. J., and Goyal, A. K., Homomorphic analysis and modeling of ECG signals, IEEE Trans. Biomed. Eng., 26, 330, 1979.
21. Senmoto, S. and Childers, D. G., Adaptive decomposition of a composite signal of identical unknown wavelets in noise, IEEE Trans. Syst. Man Cybern., 2, 59, 1972.
22. Oppenheim, A. V. and Schafer, R. W., Digital Signal Processing, Prentice-Hall, Englewood Cliffs, N.J., 1975.
Chapter 7
I. INTRODUCTION
Modern signal processing techniques are applied to a variety of fields such as econometrics, speech, seismology, communications, and biomedicine. A major problem in these applications is the need to analyze and process finite time samples of random processes. In general, the processes are nonstationary2 and nonlinear. The theoretical basis for modern time series analysis has been developed by mathematicians and statisticians such as Mann and Wald.1
Recent developments in both the theory and the computational algorithms of linear stationary signal analysis provide powerful tools for signal processing. Some of the techniques are well established, with computer program packages available (Reference 11, for example). When a nonstationary signal is to be processed, it is usually segmented in such a way that each segment can be considered stationary. Stationary signal processing methods can then be applied.
A favorable approach for stationary signal processing is the parametric modeling procedure. The process is modeled by some causal rational parametric model. The signal is then
represented by means of the model parameters. Such a procedure is attractive from the point
of view of data compression. Rather than handling (for processing, storing, or transmitting)
the complete time sample, or sequence, only a reduced number of parameters is used.
Consider, for example, the problem of analyzing and storing EEG data12 in neurological
clinics. It would be of great help if these data could be reduced and compressed for storage
purposes in such a way that the signal can be regenerated at will. Another example may be
the storing of compressed ECG data (or a complete medical file) on a personal credit card
carried by the patient in such a way that it can be reproduced at will anywhere.
Signal compression is also attractive from the point of view of classification (diagnosis).
Effective algorithms for the automatic classification of signals typically representing various pathological states are available.
Since most modern signal processing is implemented by digital computers, we consider the sampled signal S*(t), sampled at the frequency f_s = 1/T (Equation 7.1). The finite time windowed sampled signal is given by the sequence

{S(kT)};   k = 0, 1, ..., N − 1    (7.2)
For the sake of brevity, we shall denote the sequence by {S_k} without loss of generality. The sequence in Equation 7.2 is to be modeled by a parametric model.
A very effective parametric model is that of the transfer function (TF). The sampled signal (the sequence) {S_k} is assumed to be the output of a linear system driven by an (inaccessible) input sequence {U_k} and corrupted by an additive noise. The sequence {S_k} is thus given as the solution to the difference equation

S_k = − Σ_{i=1}^{p} a_i S_{k−i} + Σ_{i=0}^{q} b_i U_{k−i} + ξ_k    (7.3)
where T is the sampling interval (Equation 7.1) and {ξ_k} is the additive noise sequence. It is usually convenient to work with noise sequences which are white, with zero mean. Consider, therefore, the sequence {ξ_k} to be the output of a noise filter driven by white noise n_k (Equation 7.4). Defining the polynomials

A(z^{−1}) = 1 + Σ_{i=1}^{p} a_i z^{−i},   B(z^{−1}) = Σ_{i=0}^{q} b_i z^{−i}

and, similarly, the noise polynomials C(z^{−1}) and D(z^{−1}),
and transforming Equations 7.3 and 7.4 into the z domain, we get

S(z) = [B(z^{−1})/A(z^{−1})] U(z) + [D(z^{−1})/C(z^{−1})] N(z);   (TF)    (7.6)
In Equation 7.6 the sequence {S_k} is modeled by means of the system parameter vector, β_s, and the noise parameter vector, β_n, which contain the coefficients of A(z^{−1}), B(z^{−1}) and of C(z^{−1}), D(z^{−1}), respectively (Equation 7.7).
The problem of identifying the above parameters when the input is available is well covered in the literature on system identification.13 In signal processing modeling, the input sequence {U_k} is assumed to be a white, inaccessible sequence. The parameter vector β_s thus describes a linear transformation transferring the white sequence into the (colored) signal sequence. The transfer function model can be decoupled into the deterministic and noise models
(see, for example, Reference 11). The ARMAX model can thus be written as

A(z^{−1}) S(z) = B(z^{−1}) U(z) + D(z^{−1}) N(z);   (ARMAX)    (7.10)
The autoregressive moving average (ARMA) model is derived from Equation 7.6 by assuming there is no external noise; hence

S(z) = [B(z^{−1})/A(z^{−1})] U(z);   (ARMA)    (7.11)

or, in the time domain,

S_k = − Σ_{i=1}^{p} a_i S_{k−i} + Σ_{i=0}^{q} b_i U_{k−i}    (7.12)
The autoregressive (AR), all-pole, model is derived by assuming B(z^{−1}) = G, a simple gain; hence

S(z) = [G/A(z^{−1})] U(z);   (AR)    (7.13)

S_k = − Σ_{i=1}^{p} a_i S_{k−i} + G U_k    (7.14)

while the MA, all-zero, model is derived from the ARMA model by assuming A(z^{−1}) = 1:

S_k = Σ_{i=0}^{q} b_i U_{k−i}    (7.16)
FIGURE 3. The AR model and the MA model (block diagrams).
Figure 3 shows, schematically, the AR and MA models in the frequency and time domains.
Other models such as the IMA or ARIMA3 (autoregressive integrated moving average)
are used when homogeneous nonstationary signals are to be modeled (see Section VIII).
The basic idea behind linear predictive modeling is that by assuming the sequence {S_k} to be the output of a linear system, one can express the sequence (in a reduced parametric manner) by means of the system parameters. Given a sequence {S_k}, one has to identify the
parameters of the system. Since the input is inaccessible, well-known algorithms for system
identification13 cannot be directly applied. The common approach has been to model the sequence by means of a system driven by an input with a white spectrum (i.e., white noise, impulse). Algorithms for the estimation of the system parameters (without the need to have access to the input) are available. These will be discussed in the following sections.
access to the input) are available. These will be discussed in the following sections.
The AR and ARMA models are the most commonly used. These will be discussed in
more detail in following sections.
A. Introduction
The AR model17 is very often used because of its simplicity and because of the fact that effective algorithms for the estimation of the AR parameters are available. Note that an ARMA model can be approximated by means of an AR model. Assume an ARMA model given by the polynomials B(z^{−1}) and A(z^{−1}), of order q and p. By long division, we get
B(z^{−1})/A(z^{−1}) = 1/Â(z^{−1})    (7.17)
hence, the ARMA model can be expressed by means of an AR model. In general, however, the polynomial Â(z^{−1}) will be of infinite dimension. It is possible to approximate the ARMA model by an AR model having the first p coefficients of the Â(z^{−1}) polynomial.
The choice of the order p of the AR model depends on the accuracy required.
When modeling a stationary process, one must make sure that the model is stationary. In order to ensure weak stationarity, the autocovariance and autocorrelations of the modeled sequence must satisfy a set of conditions.3 For a linear process, these are ensured if the complex roots of the characteristic equation

A(z^{−1}) = 1 + a_1 z^{−1} + a_2 z^{−2} + ... + a_p z^{−p} = 0    (7.18)

are all inside the unit circle in the z plane or, equivalently, outside the unit circle in the z^{−1} plane.
The inverse filter

H^{−1}(z^{−1}) = A(z^{−1})/G    (7.19)

is known as the AR whitening filter. When the sequence {S_k} serves as the input to the AR whitening filter, the resultant output will have a white spectrum. The simplest AR process is that of the first order. It is known as the Markov process, given by the difference equation

S_k = −a_1 S_{k−1} + G U_k    (7.20)

Given the sequence {S_k}, the linear prediction of the current sample from the previous p samples is

Ŝ_k = − Σ_{i=1}^{p} â_i S_{k−i}    (7.21)
where a circumflex (ˆ) denotes an estimated value. For the time being, we shall assume that p is given (for example, by guessing). At time t = kT we can calculate the error e_k (known as the "residual") between the actual sequence sample and the predicted one:

e_k = S_k − Ŝ_k = S_k + Σ_{i=1}^{p} â_i S_{k−i}    (7.22)
Note that the residuals {e_k} are the estimates of the inaccessible input {GU_k}. The least squares method determines the estimated parameters by minimizing the expectation of the squared error, E{e_k²} (Equation 7.23), by setting

∂E{e_k²}/∂â_i = 0;   i = 1, 2, ..., p    (7.24)
This yields the set of normal equations

Σ_{j=1}^{p} â_j r_{|i−j|} = −r_i;   i = 1, 2, ..., p    (7.25)

with the minimal squared error

E_p = r_0 + Σ_{i=1}^{p} â_i r_i    (7.26)
1- 1
The correlation coefficients are not given; hence, they have to be estimated from the given
finite seqnence {S(k)}. Assume the sequence [ S j is given. For k = 0, 1, 2, ..., (N - 1)
we can estimate the correlation coefficients by
In Equation 7.27, we have assumed all samples of {SJ to be zero outside the given range.
These estimations (Equation 7.27), known as the autocorrelation method, will be used instead
o f the correlation coefficients of Equation 7.25. For sake of convenience, we shall continue
to use the symbol rs where indeed f; must be used. Equation 7.23 can be written in a matrix
form
[ r_0       r_1       ...  r_{p−1} ] [ â_1 ]     [ r_1 ]
[ r_1       r_0       ...  r_{p−2} ] [ â_2 ]  = −[ r_2 ]
[ ...                             ] [ ... ]     [ ... ]
[ r_{p−1}   r_{p−2}   ...  r_0     ] [ â_p ]     [ r_p ]

or

R â = −r    (7.28)
where the correlation matrix R, vector r, and the AR coefficients vector a are defined in
Equation 7.28. It can be shown17 that, for the deterministic case, a similar equation exists
for the estimation of the AR parameters vector.
The direct solution of Equation 7.28 is given by inversion of the correlation matrix:

â = −R^{−1} r    (7.29)
The correlation matrix is symmetric and, in general, positive semidefinite. Efficient algorithms for the solution of Equation 7.28 exist. Note that the correlation matrix is a Toeplitz matrix (the elements along any diagonal are identical). Durbin18 has developed an efficient recursive procedure:

E_0 = r_0    (7.30A)

K_i = −[ r_i + Σ_{j=1}^{i−1} â_j^{(i−1)} r_{i−j} ] / E_{i−1}    (7.30B)

â_i^{(i)} = K_i    (7.30C)

â_j^{(i)} = â_j^{(i−1)} + K_i â_{i−j}^{(i−1)};   j = 1, 2, ..., i − 1    (7.30D)

E_i = (1 − K_i²) E_{i−1}    (7.30E)

The recursion is carried out for i = 1, 2, ..., p, and the solution is â_j = â_j^{(p)}, j = 1, 2, ..., p. Hence, the Durbin procedure for a model of order p also yields all models of order less than p. A flow chart for the calculation of Equations 7.30A through E is given in Figure 4.
An additional byproduct of the Durbin’s algorithm is the minimal average error of the ith
order model E,. It can easily be shown that
0 E; (7.32)
E0 - r(0) (7.33)
One way of determining the model's order is to evaluate Equations 7.30 for some large order p̄ and then choose the model of order p < p̄ for which the minimal average error is small enough.
The coefficients k_j, j = 1, 2, ..., p calculated by Equation 7.30B are known as the reflection coefficients or the partial correlation coefficients (PARCOR).3,8 Sufficient conditions for the stability of the model are -1 < k_j < 1, j = 1, 2, ..., p. Since the Durbin procedure yields the PARCOR at no extra calculational cost, stability is easily determined without the need to solve the pth order Equation 7.18.
It can be shown17 that the estimated gain Ĝ is related to the correlation coefficients by

Ĝ² = E_p = r̂_0 + Σ_{i=1}^{p} â_i r̂_i    (7.35)

Several methods have been suggested for the estimation of the model's order, p. One of the well-known methods is the one suggested by Akaike,19-22 which will be discussed later in this chapter. An important application of AR analysis is that of spectral estimation; this will be discussed in detail in Chapter 8.
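As a numerical illustration of the autocorrelation method (Equation 7.27) and the Durbin recursion (Equations 7.30A through E), a minimal Python sketch is given below. The function names, the synthetic test sequence, and the model order are illustrative assumptions, not part of the text.

```python
import numpy as np

def biased_autocorr(s, max_lag):
    # Biased autocorrelation estimates (Equation 7.27): r_j = (1/N) sum_k s_k s_{k+j}
    N = len(s)
    return np.array([np.dot(s[:N - j], s[j:]) / N for j in range(max_lag + 1)])

def durbin(r, p):
    # Levinson-Durbin recursion (Equations 7.30A through E).
    a = np.zeros(p + 1); a[0] = 1.0           # a[0] is kept at 1
    k = np.zeros(p + 1)
    E = np.zeros(p + 1)
    E[0] = r[0]                               # Equation 7.30A
    for i in range(1, p + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k[i] = -acc / E[i - 1]                # reflection (PARCOR) coefficient, Eq. 7.30B
        a_new = a.copy()
        a_new[i] = k[i]                       # Equation 7.30C
        a_new[1:i] = a[1:i] + k[i] * a[i - 1:0:-1]   # Equation 7.30D
        a = a_new
        E[i] = (1.0 - k[i] ** 2) * E[i - 1]   # Equation 7.30E
    return a[1:], k[1:], E

# Example: fit an AR(4) model to a synthetic (hypothetical) sequence.
rng = np.random.default_rng(0)
s = np.convolve(rng.standard_normal(2000), [1.0, 0.5, 0.2], mode="full")[:2000]
r = biased_autocorr(s, 4)
a_hat, parcor, E = durbin(r, 4)
gain_sq = E[-1]                               # Ghat^2 = E_p (Equation 7.35)
print(a_hat, parcor, gain_sq)
```

The recursion also returns the PARCOR sequence, so the stability condition |k_j| < 1 can be checked at no extra cost, as noted above.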
Consider the MA(q) model of Equation 7.15. The autocorrelation of the MA process is

r_i = E{S_k S_{k+i}} = G² Σ_{j=0}^{q} Σ_{l=0}^{q} b_j b_l r^u_{i+j-l}    (7.36)

where r^u_i is the autocorrelation of the white input,

r^u_i = δ(i)    (7.37)

so that

r_i = 0;  i > q    (7.38A)

r_i = G² Σ_{j=0}^{q-i} b_j b_{j+i};  0 ≤ i ≤ q    (7.38B)

The correlation coefficients are estimated from the N available samples by

r̂_i = (1/N) Σ_{k=0}^{N-1-i} S_k S_{k+i}    (7.39)

Absorbing the gain into the coefficients, Equation 7.38B becomes

r_i = Σ_{j=0}^{q-i} b_j b_{j+i};  0 ≤ i ≤ q    (7.40)

which can be solved for the MA parameters by the iterations

b_i = b_0^{-1} ( r̂_i - Σ_{j=1}^{q-i} b_j b_{j+i} );  i = 1, 2, ..., q    (7.41)
b_0 = ( r̂_0 - Σ_{j=1}^{q} b_j² )^{1/2}

or, written explicitly in terms of the iteration index m,

b_i^{(m)} = (b_0^{(m-1)})^{-1} ( r̂_i - Σ_{j=1}^{q-i} b_j^{(m-1)} b_{j+i}^{(m-1)} );  i = 1, 2, ..., q    (7.42)
b_0^{(m)} = ( r̂_0 - Σ_{j=1}^{q} (b_j^{(m-1)})² )^{1/2}
m = 0, 1, 2, ...
where (·)^{(m)} is the value of the mth iteration of (·) and b_j^{(m)} = 0 for m < 0.
Equations 7.42 are iteratively solved until some convergence criterion is satisfied. Such a criterion may be

Σ_{i=0}^{q} ( b_i^{(m)} - b_i^{(m-1)} )² ≤ ε    (7.43)
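A rough Python sketch of this fixed-point iteration is given below. The update order, the initial guess, and the function name are assumptions on my part (the equations above are themselves a reconstruction), so the fragment should be read as an illustration of the idea rather than the book's exact procedure.

```python
import numpy as np

def estimate_ma(r_hat, q, eps=1e-8, max_iter=200):
    """Iteratively solve r_i = sum_j b_j b_{j+i} for b_0..b_q
    (cf. Equations 7.41/7.42), given estimated autocorrelations r_hat[0..q]."""
    b = np.zeros(q + 1)
    b[0] = np.sqrt(max(r_hat[0], 0.0))        # crude initial guess
    for _ in range(max_iter):
        b_prev = b.copy()
        # b_0 update: b_0 = sqrt(r_0 - sum_{j=1}^{q} b_j^2)
        b[0] = np.sqrt(max(r_hat[0] - np.sum(b_prev[1:] ** 2), 0.0))
        # b_i update: b_i = (r_i - sum_{j=1}^{q-i} b_j b_{j+i}) / b_0
        for i in range(1, q + 1):
            cross = np.dot(b_prev[1:q - i + 1], b_prev[1 + i:q + 1])
            b[i] = (r_hat[i] - cross) / b[0]
        # convergence criterion (cf. Equation 7.43)
        if np.sum((b - b_prev) ** 2) <= eps:
            break
    return b

# Quick check with the exact autocorrelations of the MA(2) model b = [1, 0.6, 0.3]:
b_true = np.array([1.0, 0.6, 0.3])
r = np.array([np.dot(b_true[:3 - i], b_true[i:]) for i in range(3)])
print(estimate_ma(r, 2))
```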
IV. MIXED AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODELS

A. Introduction
In cases where the process does have a number of influential zeroes, the approximation of Equation 7.17 will require a very large dimension for the AR model. This may become undesirable for signal compression applications. An ARMA model can then be used.
Assume an ARMA(p,q) process given by Equation 7.12. The process is stationary if the roots of the characteristic Equation 7.18 are outside the unit circle in the z^{-1} plane. The inverse filter H^{-1}(z^{-1}) = A(z^{-1}) / [G B(z^{-1})] is called the ARMA whitening filter since, when driven with the given sequence {S_k}, it will yield white noise as its output.23
The ARMA parameters can be estimated in two stages. First, the process is approximated by a high-order AR model,

s_k = - Σ_{i=1}^{r} α_i s_{k-i} + G U_k    (7.46)

In Equation 7.46, r is the order of the Ā(z^{-1}) polynomial. It is chosen to be sufficiently high (see Section II). The AR parameters α_i, i = 1, 2, ..., r are identified using Durbin's algorithm (Equation 7.30). By cross multiplication we get from Equation 7.45

a_1 = α_1 + b_1
a_2 = α_2 + α_1 b_1 + b_2    (7.48)
a_3 = α_3 + α_2 b_1 + α_1 b_2 + b_3
...
0 = α_j + Σ_{i=1}^{q̂} α_{j-i} b_i;  j > p̂    (7.49)

In Equations 7.48 and 7.49, α_i, i = 1, 2, ..., r are known, and the ARMA parameters a_j, j = 1, 2, ..., p̂ and b_k, k = 1, 2, ..., q̂ are to be determined. The order (p,q) of the ARMA model is estimated by (p̂,q̂). Equation 7.49 provides a set of q̂ equations with the q̂ MA parameters, b_i, as unknowns:
Σ_{i=1}^{q̂} α_{p̂+j-i} b_i = -α_{p̂+j};  j = 1, 2, ..., q̂    (7.50)

or, in matrix notation,

A_α b = -α^{(p̂+1)}    (7.51)

where the q̂ × q̂ matrix A_α, with elements A_α(j,i) = α_{p̂+j-i}, the vector b = [b_1, ..., b_q̂]^T, and the vector α^{(p̂+1)} = [α_{p̂+1}, ..., α_{p̂+q̂}]^T are defined by Equation 7.50. The solution for the MA parameters b is thus given by

b = -A_α^{-1} α^{(p̂+1)}    (7.52)

The AR parameters of the ARMA model are then obtained from Equation 7.48. Define the length-p̂ vector b_Δ^T = [b_1, b_2, ..., b_q̂, 0, ..., 0] and the lower triangular matrix

A_Δ(j,i) = α_{j-i};  α_0 = 1, α_m = 0 for m < 0;  1 ≤ i, j ≤ p̂    (7.55)

(i.e., A_Δ has ones on its diagonal and the coefficients α_1, α_2, ... below it). Then

a = α^{(1)} + A_Δ b_Δ    (7.56)

where α^{(1)} = [α_1, α_2, ..., α_p̂]^T.
Equations 7.52 and 7.56 constitute the ARMA estimation. The estimation, however, is very sensitive to noise and to ill conditioning of the matrix A_α. An alternative is to work directly with the autocorrelation sequence of the ARMA process, which obeys

r_j + Σ_{i=1}^{p} a_i r_{j-i} = G² Σ_{i=j}^{q} b_i h(i - j);  0 ≤ j ≤ q
r_j + Σ_{i=1}^{p} a_i r_{j-i} = 0;  j > q    (7.57)

where the sequence h(k) is the weighting sequence (the response to a Kronecker delta) of the ARMA filter. Equations 7.57 show that the AR parameters, a_i, appear in a linear fashion for all j > q. Consider a set of t such equations; evaluated over the range q + 1 ≤ j ≤ q + t, these yield
[ r_{q+1}   r_q        ...  r_{q+1-p} ] [ 1   ]     [ 0 ]
[ r_{q+2}   r_{q+1}    ...  r_{q+2-p} ] [ a_1 ]  =  [ 0 ]    (7.58)
[ ...                                 ] [ ... ]     [...]
[ r_{q+t}   r_{q+t-1}  ...  r_{q+t-p} ] [ a_p ]     [ 0 ]

or, in matrix notation,

R_t ā = 0    (7.59)

ā^T = [1 : a^T]    (7.60)
If the order (p,q) is known and the correct (and not estimated) correlations are used to form R_t, Equation 7.59 will have a unique solution for all values of t ≥ p. This is true since the rank of R_t is equal to min(p,t). However, both (p,q) and the exact correlation coefficients are not given; thus, estimation methods must be applied.
The unbiased estimates of the autocorrelation coefficients are adopted here:

r̂_j = [1/(M - j)] Σ_{k=0}^{M-j-1} S_k S_{k+j}    (7.61)
where M is the number of samples in the finite sequence over which the estimation is performed. Estimating the ARMA order (p,q) by (p̂,q̂), the elements of the estimated correlation matrix R̂_t are given by

R̂_t(i,j) = r̂_{q̂+i+1-j}    (7.62)
The left side of Equation 7.59 with the estimate of the correlation matrix will no longer equal 0. We can define an error vector e such that

R̂_t ā = e    (7.63)

Since the error is the result of the sum of many random variable products, its joint density function can be assumed gaussian with zero mean and covariance matrix W. The estimation of the AR parameter vector a can be performed by maximizing the joint density function. The maximum likelihood3 estimate of a is given by the equation

R̂_t^T W^{-1} R̂_t ā = ρ    (7.66)
ρ is a constant vector selected such that the first component of ā is one. Cadzow16 has suggested taking the (p × p) covariance matrix as the identity matrix, W = I; hence the estimate of the AR parameters becomes R̂_t^T R̂_t ā = ρ.
Once the AR part has been estimated, the given sequence {S_k} is filtered by the estimated polynomial Â(z^{-1}) to yield a residual sequence {Ŝ(k)}. Assuming Â(z^{-1}) ≈ A(z^{-1}), the MA model of Equation 7.69 is close to the MA part of the ARMA model to be identified. The sequence {Ŝ(k)} is therefore assumed to be an MA(q̂) process. Its parameters b̂ are identified using the techniques discussed in Section III.
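The two-stage flow of Equations 7.46 through 7.56 can be sketched in a few lines of Python. The high-order AR estimator, the indexing conventions, and the test model below are illustrative assumptions that follow the equations as reconstructed here, not the book's own program.

```python
import numpy as np

def fit_high_order_ar(s, r_order):
    """High-order AR fit (cf. Equation 7.46) via the Yule-Walker normal equations."""
    N = len(s)
    r = np.array([np.dot(s[:N - j], s[j:]) / N for j in range(r_order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(r_order)] for i in range(r_order)])
    return np.linalg.solve(R, -r[1:r_order + 1])      # alpha_1 .. alpha_r

def arma_from_ar(alpha, p, q):
    """Recover ARMA(p,q) parameters from high-order AR coefficients
    (cf. Equations 7.50, 7.51, 7.55, 7.56)."""
    al = np.concatenate(([1.0], alpha))               # al[m] = alpha_m, alpha_0 = 1
    # A_alpha(j, i) = alpha_{p + j - i}  (Equation 7.50), zero for negative index
    A = np.array([[al[p + i - j] if 0 <= p + i - j < len(al) else 0.0
                   for j in range(q)] for i in range(q)])
    b = np.linalg.solve(A, -al[p + 1:p + 1 + q])      # Equation 7.52
    b_delta = np.zeros(p); b_delta[:q] = b            # padded MA vector
    A_delta = np.array([[al[i - j] if 0 <= i - j < len(al) else 0.0
                         for j in range(p)] for i in range(p)])   # Equation 7.55
    a = al[1:p + 1] + A_delta @ b_delta               # Equation 7.56
    return a, b

# Example (hypothetical): estimate an ARMA(2,1) model, true a = [-0.75, 0.5], b = [0.4].
rng = np.random.default_rng(1)
u = rng.standard_normal(5000)
s = np.zeros(5000)
for k in range(2, 5000):
    s[k] = 0.75 * s[k - 1] - 0.5 * s[k - 2] + u[k] + 0.4 * u[k - 1]
alpha = fit_high_order_ar(s, r_order=20)
print(arma_from_ar(alpha, p=2, q=1))
```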
V. MODEL ORDER ESTIMATION

A. Introduction
One o f the important decisions that has to be made when modeling data with parametric
models is that of estimating the order. In ARMA models, this means the estimation of (p,q),
the AR and MA orders.
Order estimation is usually done by means of some optimization technique. It is desirable
to have the minimum values for p and q to reduce the amount of calculations, storage, and
ill conditioning problems. It is required, however, to have large enough values of p and q
to adequately represent the process.
Currently, many methods are available for choosing the order of AR or ARMA models
(see, for example, References 3, 17, 19 through 22, and 28 through 37). Several basic ideas
are used for order determination. Some check the correlation or spectral flatness of the
residuals; others use decision rules based on Bayesian approach, maximum likelihood ap
proach, and amount of information measures.
B. Residuals Flatness
For the estimation of the order of an AR process, a measure of residuals correlation has generally been used.17 As noted by Equation 7.22, the residuals are the estimates of the inaccessible input sequence {GU_k}, which is an uncorrelated (white) sequence. Hence, for an adequate model, the residuals must be uncorrelated or, in other words, possess a flat power spectrum.
Consider an AR model with order p, and define the normalized error V_p,

V_p = E_p / r_0    (7.70)

where E_p is the minimum average error (Equation 7.34) and r_0 is the total energy in the sequence (Equation 7.27). From Equations 7.32 and 7.70, it is clear that V_p is a monotonically decreasing function of p. It has been shown17 that V_p is bounded by

V_0 = 1 ≥ V_p ≥ V_min    (7.71)

where V_min, the minimum attainable normalized error, is determined by the spectral flatness of the process (Equation 7.73).
FIGURE 5. Estimation of AR model’s order. Data was synthesized from an AR difference equation of order ten.
The test is based on hypothesis testing methods, rather than on some procedure where multiple decisions are used. This has been a major reason for criticizing the method20 and for developing new methods for model order estimation.
In Figure 5, the method for order estimation is demonstrated by means of synthesized data. A sequence {S_k} was synthesized from an AR model of order 10. The curve of V_p vs. p is plotted.
It can be shown19 that the expected value of the minimum error, when estimated from N samples of an AR(p) process, is

E{E_p} = (1 - (p + 1)/N) G²    (7.75)
Consider now another sequence {y_k}, infinitely long, with the same statistical properties as {S_k}. The predictor for this sequence will be

ŷ_k = - Σ_{j=1}^{p} â_j y_{k-j}    (7.76)

where â_j, j = 1, 2, ..., p are functions of {S_k}. The variance of the residuals tends asymptotically to (1 + (p + 1)N^{-1}) G² as N approaches infinity. Hence, it is logical to estimate the final prediction error (FPE) of the predictor (Equation 7.76) by
FPE(p) = [ (1 + (p + 1)/N) / (1 - (p + 1)/N) ] Ê_p    (7.77)

The estimated order p̂ is the one minimizing the final prediction error,

FPE(p̂) = min_p FPE(p)    (7.79)
An alternate method, which very often gives the same results, has been suggested by Parzen.6 The method is known as the criterion autoregressive transfer function (CAT). It is based on minimizing the average normalized spectrum error between a model of infinite order and one of order p.
Akaike20 has suggested an information criterion (AIC) of the form AIC = -2 ln(maximum likelihood) + 2k, where k is the number of independently adjusted parameters within the model. The AIC is defined as an estimate of twice the negentropy (known also as the Kullback information) of the true structure with respect to the fitted model. The estimates of a and E_p by the Yule-Walker equations are approximately maximum likelihood estimates; thus the AIC for an AR model is given by Equation 7.81.
In Equation 7.81, ρ_w is a factor that compensates for windowing effects. Its value is taken as the ratio of the energy under the window function used in the data handling to that of a rectangular window. For a Hamming window, ρ_w = 0.4.
It has been suggested7 that the search for the absolute minimum of Equation 7.81 be carried out over the range 1 ≤ p ≤ 3√N.
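The order-selection logic can be summarized in a few lines of Python. The sketch below is a simplified illustration, not the book's exact procedure: it uses the common AIC variant N·ln(E_p) + 2p (ignoring the windowing correction ρ_w mentioned above) together with Akaike's FPE, and the test signal and order range are assumed.

```python
import numpy as np

def prediction_errors(s, p_max):
    """Minimum average errors E_p for p = 0..p_max via the Durbin recursion."""
    N = len(s)
    r = np.array([np.dot(s[:N - j], s[j:]) / N for j in range(p_max + 1)])
    a, E = np.array([1.0]), [r[0]]
    for i in range(1, p_max + 1):
        k = -(r[i] + np.dot(a[1:], r[i - 1:0:-1])) / E[-1]
        a = np.concatenate((a + k * np.r_[0.0, a[:0:-1]], [k]))
        E.append((1 - k * k) * E[-1])
    return np.array(E)

def select_order(s, p_max):
    N = len(s)
    E = prediction_errors(s, p_max)
    p_axis = np.arange(p_max + 1)
    fpe = E * (N + p_axis + 1) / (N - p_axis - 1)    # Akaike's FPE (cf. Eq. 7.77)
    aic = N * np.log(E) + 2 * p_axis                 # simplified AIC
    return int(np.argmin(fpe)), int(np.argmin(aic))

# Example: data synthesized from a stable AR(4) model (two damped resonators).
rng = np.random.default_rng(2)
x = rng.standard_normal(4000)
s = np.zeros(4000)
for k in range(4, 4000):
    s[k] = (2.0 * s[k-1] - 2.09 * s[k-2] + 1.068 * s[k-3]
            - 0.2952 * s[k-4] + x[k])
print(select_order(s, p_max=int(3 * np.sqrt(4000))))
```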
The AIC has also been successfully used in determining the order of ARMA models. Here the AIC is defined in terms of the estimated residual variance of the ARMA(p,q) model, with k = p + q adjusted parameters.
The matrix A becomes ill conditioned when its normalized determinant becomes much smaller than one. The optimal order, p, can be defined as the model order for which the normalized determinant |A|_n(p + 1) exhibits a significant falloff in magnitude.
In the case where the correlations are calculated with no error, the rank of R_t equals the true AR order, p (for t ≥ p, p̂ ≥ p). In the practical case, the matrices will have full rank for all estimated orders, p̂, larger than the correct one. However, it is found that (p̂ - p) eigenvalues have values "close" to zero. A procedure for order estimation can thus be formulated as follows: the optimal order, p, is the value for which the estimated correlation matrix R̂_t has (p̂ - p) of its eigenvalues sufficiently close to zero (for all p̂ > p). A method for implementing the above procedure, based on singular value decomposition (SVD), has been suggested by Cadzow.27 Efficient algorithms for calculating the SVD are available.38
The PARCOR coefficients can be calculated from the LPC coefficients by the backward recursion

k_i = a_i^{(i)}

a_j^{(i-1)} = [ a_j^{(i)} - k_i a_{i-j}^{(i)} ] / (1 - k_i²);  1 ≤ j ≤ (i - 1)    (7.87)

Equation 7.87 is calculated recursively for i = p, (p - 1), ..., 1 (in that order), where initially a_j^{(p)} = â_j, 1 ≤ j ≤ p.
The LPC coefficients can be calculated from the PARCORs by the recursive equations

a_i^{(i)} = k_i

a_j^{(i)} = a_j^{(i-1)} + k_i a_{i-j}^{(i-1)};  1 ≤ j ≤ (i - 1)

evaluated for i = 1, 2, ..., p.
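The two recursions are easy to check numerically. The short Python sketch below (an illustration under the indexing conventions used here, with hypothetical function names) converts LPC coefficients to PARCORs and back; the round trip should reproduce the original coefficients.

```python
def lpc_to_parcor(a):
    """Backward (step-down) recursion, cf. Equation 7.87: a_1..a_p -> k_1..k_p."""
    a = list(a)
    p = len(a)
    k = [0.0] * p
    for i in range(p, 0, -1):
        k[i - 1] = a[i - 1]                          # k_i = a_i^{(i)}
        if i > 1:
            ki = k[i - 1]
            a = [(a[j] - ki * a[i - 2 - j]) / (1.0 - ki * ki) for j in range(i - 1)]
    return k

def parcor_to_lpc(k):
    """Forward (step-up) recursion, cf. Equation 7.88: k_1..k_p -> a_1..a_p."""
    a = []
    for i, ki in enumerate(k, start=1):
        a = [a[j] + ki * a[i - 2 - j] for j in range(i - 1)] + [ki]
    return a

# Round-trip check on an arbitrary coefficient set:
a0 = [-0.9, 0.64, -0.1]
print(lpc_to_parcor(a0))
print(parcor_to_lpc(lpc_to_parcor(a0)))              # should reproduce a0
```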
Consider the signal

y(t) = x(t) + m(t)    (7.89)

where x(t) is a zero mean stationary time series and m(t) is a continuous trend represented by a polynomial of degree (d - 1).
Signals like the one described in Equation 7.89 frequently occur in biomedical applications. The trend, m(t), in these cases may be, for example, a baseline shift due to undesirable electrode movement in bioelectric signals.
The trend can be removed from the signal y(t) by the repeated difference filter ∇^d. To illustrate this, consider for example a trend polynomial of degree 2, m(t) = m_2 t² + m_1 t + m_0. Applying the difference operator once,

∇y(t) = m_2 t² + m_1 t + m_0 - m_2(t - 1)² - m_1(t - 1) - m_0 + ∇x(t)
      = 2 m_2 t - m_2 + m_1 + ∇x(t)

applying the difference operator again,

∇²y(t) = 2 m_2 + ∇²x(t)

and once more,

∇³y(t) = ∇³x(t)
The differenced signal S(t) = ∇^d y(t) is therefore stationary, and stationary modeling techniques (such as ARMA modeling) can be applied to its processing. Consider now the discrete sequence {S_k}, consisting of the samples of S(t), modeled by an ARMA model (Equation 7.91).
Note that the original process can be retrieved from Equation 7.91 by d successive summations,

Y_k = Σ_{j_d=-∞}^{k} Σ_{j_{d-1}=-∞}^{j_d} ... Σ_{j_1=-∞}^{j_2} S_{j_1}    (7.97)
The nonstationary sequence {Yk} can be retrieved from the stationary sequence {Sk} by d
summations (or “ integrations” ); the process in Equation 7.94 is therefore called an auto
regressive integrated moving average or ARIMA (p,d,q) process. The indexes p,d,q denote
the order of the AR process, the order of the difference operator, and the order of the MA
process, respectively.
The ARIMA(p,d,q) model of Equation 7.94 has p (stable) poles outside the unit circle in the z^{-1} plane, due to the polynomial A(z^{-1}), and d poles on the unit circle at z = 1. When solving Equation 7.94 for {Y_k} by the inverse transformation, the dth order pole at z = 1 is responsible for the trend form present in {Y_k}, while the stable poles are responsible for the stationary part of {Y_k}.
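The differencing argument above is easy to verify numerically. The short Python sketch below is illustrative (numpy's diff plays the role of the ∇ operator; the trend coefficients are arbitrary assumptions): a quadratic trend is annihilated by three differences, and cumulative summation ("integration") rebuilds a trend-bearing signal up to its unknown initial conditions.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(500, dtype=float)
x = rng.standard_normal(500)                  # stationary part x(t)
m = 0.002 * t**2 - 0.3 * t + 5.0              # quadratic trend (degree d - 1 = 2)
y = x + m                                     # cf. Equation 7.89

d3_y = np.diff(y, n=3)                        # grad^3 y
d3_x = np.diff(x, n=3)                        # grad^3 x
print(np.allclose(d3_y, d3_x))                # True: the trend has been removed

# "Integration": d = 3 summations recover a trend-bearing sequence
# (equal to y only up to the lost initial conditions).
y_rebuilt = d3_y.copy()
for _ in range(3):
    y_rebuilt = np.cumsum(y_rebuilt)
```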
B. Seasonal Processes3
Many biological signals exhibit periodic behavior. Some show a period of 12 months,
some (like menstrual secretions) have a period of about a month, others have a period of
about a day (Circadian rhythms), and others exhibit shorter periods of 30 min, 7 min, and
shorter.
Assume a nonstationary time series {Y_k} with seasonal behavior of period λ. The analysis of the sequence Y_k, Y_{k-λ}, Y_{k-2λ}, ... is of interest. This sequence may possess a trend of order (d - 1). The stationary sequence {S_k} is thus given by

S_k = ∇_λ^d Y_k

with

S(z) = ∇_λ^d Y(z)

where

∇_λ^d = (1 - z^{-λ})^d    (7.100)
VIII. ADAPTIVE SEGMENTATION

A. Introduction
The assumption usually made that the signal under test is stationary does not generally
hold for real signals. A more practical assumption is that of taking the signal to be stationary
within a certain time window. The length of the window is determined taking into account
the nonstationary dynamics. For the speech signal, stationarity can be assumed for windows
of about 10 to 20 msec. The short time segments are due to the relatively fast changes in
the spectral characteristics of the speech signal. Other, slower changing signals, like the
EEG, can be considered stationary for much longer durations.
Thus, it is often convenient to treat the nonstationary signal as piecewise-stationary. That
is to say, a signal which is stationary within every given segment. The nonstationarity
manifests itself by the changes in spectral characteristics between the various segments.
Thus, the continuous changes in the statistical characteristics of the process are modeled by a series of jump changes.
A simple method of segmentation is to divide the signal into a sequence of constant length
segments. The length of the segments is determined a priori such that it will be short enough
to be considered stationary, yet long enough for lowest frequencies of its spectrum to be
estimated. Such segmentation is usually used in speech processing.
A better segmentation procedure is the one in which the segments are determined adap
tively. Here, the segment length depends upon the dynamics of nonstationarity of the process.
Adaptive segmentation procedures are especially important when classification and effective
storing and transmission are required.
Adaptive segmentation procedures have been applied to the detection of failures in linear systems.41-43 There they have been applied mainly to the detection of abrupt, step-wise changes in the signal statistics due to failure of some system components. In signal analysis, however, the usual case is the presence of "trend-like" nonstationarity, where the statistical characteristics of the process are slowly changing.
Adaptive segmentation procedures have been applied to biomedical signal processing44 and, in particular, to the analysis of EEG signals.45-52 In these applications, it is important that the procedure can be implemented on-line and be immune to isolated short-time nonstationarities caused by noise.
The autocorrelation measure (ACM) method48 compares a sliding test window with a fixed reference window by means of two distances: an amplitude distance, D_A, and a spectral distance, D_S. The amplitude distance is a function of r_t(0) and r_0(0), the zero-lag correlation coefficients of the sliding and reference windows, respectively. Since the zero-lag correlation is the energy of the signal, D_A is a normalized distance measure of amplitudes. For the spectral distance measure, we shall consider only the correlation coefficients from lag zero up to the lag for which the correlation first becomes negative. Define the modified normalized correlation, r*(m),

r*(m) = r(m)/r(0);  m = 0, 1, ..., m° - 1
r*(m) = 0;          m = m°, m° + 1, ...    (7.103)
where m° is the lag for which the correlation first becomes negative. The spectral distance is

D_S = Σ_{m=1}^{q} |r*_t(m) - r*_0(m)| / (0.5 + min{r*_t(m), r*_0(m)})    (7.104)
where q is chosen such that the correlations for lag larger than q can be neglected.
The rationale behind Equation 7.104 is that for pure sine waves of different frequencies
in the reference and sliding window, the measure yields the difference between the cosine
of the frequencies divided by their minimum.
The autocorrelation measure is now defined by a linear combination of the two distances,

ACM = R_A D_A + R_S D_S
FIGURE 7. Adaptive segmentation. Upper trace: piecewise stationary simulated EEG. First 2.5 sec were
simulated from an AR process and last 2.5 sec from a different AR process. Fixed and growing reference
windows are demonstrated. Lower trace: the SEM. A new segment has been detected at t = 2.5 sec.
where RA and Rs are constants. To determine a new segment, ACM is compared with a
threshold. In Reference 48, the constants were determined such that the threshold was unity.
The boundary of the new segment is located within the sliding window (for which ACM
is greater than the threshold) proportionally to the steepness of the autocorrelation measure.
The ACM method is based on somewhat heuristic measures. Its calculation load is, however, relatively low, and it is general in the sense that it does not assume a model for the signal under test.
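A small Python sketch of the distance computations is given below. The amplitude distance D_A is written here as a simple normalized energy difference, which is an assumption on my part since its defining equation is not reproduced above; D_S follows Equation 7.104, and the constants and window contents are illustrative.

```python
import numpy as np

def autocorr(x, max_lag):
    x = x - np.mean(x)
    return np.array([np.dot(x[:len(x) - m], x[m:]) for m in range(max_lag + 1)])

def modified_corr(r):
    """Modified normalized correlation r*(m) of Equation 7.103."""
    rn = r / r[0]
    neg = np.where(rn < 0)[0]
    m0 = neg[0] if len(neg) else len(rn)
    out = np.zeros_like(rn)
    out[:m0] = rn[:m0]
    return out

def acm(ref, sld, q=20, R_A=1.0, R_S=1.0):
    r0, rt = autocorr(ref, q), autocorr(sld, q)
    # D_A: normalized amplitude (energy) distance -- assumed form, not from the text
    D_A = abs(rt[0] - r0[0]) / min(rt[0], r0[0])
    r0s, rts = modified_corr(r0), modified_corr(rt)
    # D_S: spectral distance, Equation 7.104
    D_S = np.sum(np.abs(rts[1:] - r0s[1:]) /
                 (0.5 + np.minimum(rts[1:], r0s[1:])))
    return R_A * D_A + R_S * D_S              # linear combination (ACM)

rng = np.random.default_rng(4)
ref = np.convolve(rng.standard_normal(512), [1, 0.9], mode="same")
sld = np.convolve(rng.standard_normal(512), [1, -0.9], mode="same")
print(acm(ref, sld))
```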
The spectral error measure (SEM) method45 employs reference and sliding windows (Figure 7). The basic idea is to estimate an AR model for the signal in the reference window and to observe the variations of the spectrum in the sliding window. Variations in spectrum are measured by means of the SEM criterion.
Consider the spectrum of the signal estimated through the reference window by means of AR spectral estimation (see Chapter 8). An AR model is fitted to the signal using the reference window data; denote it by Â_0^{-1}(z). The estimated signal samples are given by

Ŝ_0(z) = G U(z) / Â_0(z)    (7.106)

The power spectrum of the signal at the reference, Ŝ_0(ω), is given by

Ŝ_0(ω) = G² / |Â_0(e^{jω})|²    (7.107)

since the input signal {U_k} is assumed to be white noise with unit variance.
Consider now the inverse filter, Â_0(z), applied to the samples, {s_t(k)}, of the sliding window. The output of the filter, the residuals {e_t(k)}, transformed into the z domain is

E_t(z) = Â_0(z) S_t(z)    (7.108)

where E_t(z) and S_t(z) are the Z transforms of the residuals and sliding window samples, respectively. The power spectral density function of the residuals can be expressed in terms of the spectrum of the reference, Ŝ_0(ω), by

S_{e_t}(ω) = |Â_0(e^{jω})|² S_t(ω) = G² S_t(ω) / Ŝ_0(ω)    (7.109)

Note that here all functions of ω denote power spectral density functions. The power spectral density of the residuals is the Fourier transform of the residuals autocorrelation {r_t(k)},

S_{e_t}(ω) = Σ_k r_t(k) exp(-jωk)    (7.110)
Equations 7.109 and 7.110 give the relation between the residuals autocorrelations and the ratio of spectra.
Since we are interested in the relative spectrum error, it is logical to define an average error criterion (Equation 7.111). Introducing Equations 7.109 and 7.110 into Equation 7.111 and noting the orthogonality of the cosine functions, we obtain Equation 7.112.
When comparing the spectra in the reference and the sliding windows, G is constant. Therefore, Equation 7.112 can be multiplied by G⁴ without loss of meaning. Bodenstein and Praetorius45 also claim that for the EEG application it was shown that normalization to r_t²(0) was appropriate. Therefore, the spectral error measure (SEM) is defined as Equation 7.112 scaled by G⁴/r_t²(0) (Equation 7.113).
Since G is not given and only M autocorrelation coefficients are estimated, a more practical measure is used (Equation 7.114), in which Ĝ, the estimate of the reference AR filter gain given by Equation 7.35, replaces G.
signal contains short-term, isolated nonstationarities caused by disturbances, the system may
introduce short, meaningless segments. It has been shown45 that clipping the residual signal
to a predetermined level before calculating the correlation, rt(k), removes most of the effect
of these nonstationarities.
Segmentation is determined by placing a threshold, T_SEM, on the SEM. As long as SEM ≤ T_SEM, we continue to slide the window and include the data in the current segment. For the first window for which SEM > T_SEM, we declare a new segment. The boundary between segments is placed at the middle of the window. The complete process is then repeated with a reference window initiated at the boundary, serving for the new AR coefficients estimation.
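The sliding-window loop can be sketched in Python as below. Because the exact SEM expression (Equations 7.113 and 7.114) is not reproduced here, the sketch uses a whiteness measure of the reference-filter residuals in the sliding window as a stand-in for the SEM; the window lengths, the threshold, and all function names are illustrative assumptions rather than the book's procedure.

```python
import numpy as np

def ar_fit(x, p):
    """Yule-Walker AR fit of the reference window; returns [1, a_1..a_p]."""
    N = len(x)
    r = np.array([np.dot(x[:N - j], x[j:]) / N for j in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:])))

def residual_whiteness(x, a, m_lags=10):
    """Stand-in for the SEM: normalized residual autocorrelation energy."""
    e = np.convolve(x, a, mode="valid")            # whitening filter A_0(z)
    r = np.array([np.dot(e[:len(e) - m], e[m:]) for m in range(m_lags + 1)])
    return np.sum((r[1:] / r[0]) ** 2)

def segment(signal, ref_len=250, win_len=250, step=50, p=6, threshold=0.2):
    bounds = [0]
    a = ar_fit(signal[:ref_len], p)
    pos = ref_len
    while pos + win_len <= len(signal):
        window = signal[pos:pos + win_len]
        if residual_whiteness(window, a) > threshold:
            boundary = pos + win_len // 2          # boundary placed mid-window
            bounds.append(boundary)
            if boundary + ref_len > len(signal):
                break
            a = ar_fit(signal[boundary:boundary + ref_len], p)  # new reference
            pos = boundary + ref_len
        else:
            pos += step
    return bounds

rng = np.random.default_rng(5)
sig = np.concatenate([np.convolve(rng.standard_normal(1000), [1, 1.2, 0.8], "same"),
                      np.convolve(rng.standard_normal(1000), [1, -1.2, 0.8], "same")])
print(segment(sig))
```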
The complete segmentation is given by the following steps:
an advantage when abrupt, step-wise changes are expected since the data up to the change
does indeed belong to a stationary process. When the nonstationarity is “ trend” wise,
namely, gradual changes occur, the advantage of this arrangement over the constant length
reference is doubtful. The GLR method puts emphasis on the detection of the boundary
within the window in which a change has been detected. This is again important in the case
of abrupt changes. Its significance is lost when the change is gradual and boundaries of
segments are arbitrarily defined by means of some average error allowed.
Basseville and Benveniste55 have suggested a procedure based on two AR models with
distance measure between them such as the log likelihood ratio and Kullback’s divergence
between conditional probability laws. They have suggested a continuously growing reference
window but with "forgetting" weights that will decrease the influence of past samples. This may cause even more severe problems with trend-like nonstationarities than the previous growing reference window.
A method based on parametric families of distributions was suggested by Sclove.53 The
assumption here is that each segment is the sample function of a random process with
different probability distribution.
A comparison between three segmentation methods — the ACM, SEM, and GLR — has
been conducted56 on simulated data with parameter jumps and real EEG data. The results
show that for the simulated signal (with "jump" nonstationarities), the GLR method is
superior. For the EEG data, however, the SEM method may be recommended due to its
lower calculations load and satisfactory segmentation properties.
Segmentation approaches based on syntactic methods (see Chapter 13) also have been
suggested and applied, for example, to the analysis of speech signals.57
REFERENCES
1. Mann, H. B. and Wald, A., On the statistical treatment of linear stochastic difference equations, Econometrica, 11, 173, 1943.
2. Kitagawa, G., A nonstationary time series model and its fitting by a recursive filter, J. Time Series Anal., 2, 103, 1981.
3. Box, G. E. P. and Jenkins, G. M., Time Series Analysis: Forecasting and Control, Holden-Day, San Francisco, 1970.
4. Hannan, E. J., Time series analysis, IEEE Trans. Autom. Control, 19, 706, 1974.
5. Robinson, E., Physical Application of Stationary Time Series, Macmillan, New York, 1980.
6. Parzen, E., Some recent advances in time series analysis, IEEE Trans. Autom. Control, 19, 723, 1974.
7. Sawaragi, Y., Soeda, T., and Nakamizo, T., "Classical" methods and time series analysis, in Trends and Progress in System Identification, Eykhoff, P., Ed., Pergamon Press, Oxford, 1981, chap. 3.
8. Anderson, T. W., The Statistical Analysis of Time Series, John Wiley & Sons, New York, 1971.
9. Parzen, E., Time series model identification and prediction variance horizon, in Applied Time Series Analysis, Vol. 2, Findley, D., Ed., Academic Press, New York, 1981, 415.
10. Granger, C. W. J., Acronyms in time series analysis, J. Time Series Anal., 3, 103, 1982.
11. Wang, D. C. C. and Vagnucci, A. H., TSAN: a package for time series analysis, Comput. Programs Biomed., 11, 132, 1980.
12. Isaksson, A., Wennberg, A., and Zetterberg, L. H., Computer analysis of EEG signals with parametric models, Proc. IEEE, 69(4), 451, 1981.
13. Eykhoff, P., System Identification: Parameter and State Estimation, John Wiley & Sons, London, 1974.
14. Jakeman, A. J. and Young, P. C., Advanced methods of recursive time series analysis, Int. J. Control, 37, 1291, 1983.
15. Cadzow, J. A., ARMA time series modeling: an effective method, IEEE Trans. Aerosp. Electron. Syst., 19, 49, 1983.
16. Cadzow, J. A., ARMA modeling of time series, IEEE Trans. Pattern Anal. Mach. Intelligence, 4, 124, 1982.
17. Makhoul, J., Linear prediction: a tutorial review, Proc. IEEE, 63, 561, 1975.
18. Durbin, J., The fitting of time series models, Rev. Inst. Int. Stat., 28(3), 233, 1961.
19. Akaike, H., Statistical prediction identification, Ann. Inst. Stat. Math., 22, 203, 1974.
20. Akaike, H., A new look at the statistical model identification, IEEE Trans. Autom. Control, AC-19, 716, 1974.
21. Akaike, H., Fitting autoregressive models for prediction, Ann. Inst. Stat. Math., 21, 243, 1969.
22. Akaike, H., Likelihood of a model and information criteria, J. Econ., 16, 3, 1981.
23. Bittani, S., Is the prediction of regression model white?, J. Franklin Inst., 315, 239, 1983.
24. Trelter, S. A. and Steiglitz, K., Power spectrum identification in terms of rational models, IEEE Trans. Autom. Control, AC-12, 185, 1967.
25. Mayne, D. Q. and Firoozan, F., Linear identification of ARMA processes, Automatica, 18, 461, 1982.
26. Graupe, D., Krause, D. J., and Moore, J. B., Identification of autoregressive moving average parameters of time series, IEEE Trans. Autom. Control, AC-20, 104, 1975.
27. Cadzow, J. A., Spectral estimation: an overdetermined rational model equation approach, Proc. IEEE, 70(9), 907, 1982.
28. Pao, Y. and Lee, D. T., Performance characteristics of the Cadzow modified direct ARMA method for spectrum estimation, Proc. 1st Acoust. Speech and Signal Process. Workshop on Spectral Estimation, McMaster University, Hamilton, Ontario, Canada, Aug. 1981, 2.5.1-2.5.10.
29. Akaike, H., Modern development of statistical methods, in Trends and Progress in System Identification, Eykhoff, P., Ed., Pergamon Press, Oxford, 1981, chap. 3.
30. Suzumura, N. and Ishii, N., Estimation of the order of autoregressive process, Int. J. Syst. Sci., 8, 905, 1977.
31. Kashyap, R. L., Optimal choice of AR and MA parts in ARMA models, IEEE Trans. Pattern Anal. Mach. Intelligence, 4, 99, 1982.
32. Chaure, C. and Benveniste, A., AR and ARMA identification algorithms of Levinson type: an innovation approach, IEEE Trans. Autom. Control, 26, 1243, 1981.
33. Hannan, E. J., The estimation of the order of an ARMA process, Ann. Stat., 8, 1071, 1980.
34. Schwarz, G., Estimating the dimension of a model, Ann. Stat., 6, 461, 1978.
35. Rissanen, J., Modelling by shortest data description, Automatica, 14, 465, 1978.
36. Ishii, N., Iwata, A., and Suzumura, N., Evaluation of an autoregressive process by information measure, Int. J. Syst. Sci., 9, 743, 1978.
37. Moden, Y., Yamada, M., and Arimoto, S., Fast algorithm for identification of an ARX model and its order determination, IEEE Trans. Acoust. Speech Signal Process., 30, 390, 1982.
38. Haimi-Cohen, R. and Cohen, A., A rapid gradient search algorithm for computing partial SVD and principal component decomposition, IEEE Trans. Pattern Anal. Mach. Intelligence, in press.
39. Makhoul, J., Stable and efficient lattice methods for linear prediction, IEEE Trans. Acoust. Speech Signal Process., 25, 423, 1977.
40. Friedlander, B., Instrumental variable methods for ARMA spectral estimation, IEEE Trans. Acoust. Speech Signal Process., 31, 404, 1983.
41. Willsky, A. S., A survey of design methods for failure detection in dynamic systems, Automatica, 12, 601, 1976.
42. Willsky, A. S. and Jones, H. L., A generalized likelihood ratio approach to the detection and estimation of jumps in linear systems, IEEE Trans. Autom. Control, 21, 108, 1976.
43. Rao, T. S., A cumulative sum test for detecting change in time series, Int. J. Control, 34, 285, 1981.
44. Vasseur, C. P. A., Rajagopalan, C. V., Couvreur, M., Toulottes, J. M., and Dubois, O., A microprocessor oriented segmentation technique: an efficient tool for electrophysiological signal analysis, IEEE Trans. Instrum. Meas., 28, 259, 1979.
45. Bodenstein, G. and Praetorius, H. M., Feature extraction from electroencephalogram by adaptive segmentation, Proc. IEEE, 65, 642, 1977.
46. Sanderson, A. C., Segen, J., and Richey, E., Hierarchical modeling of EEG signals, IEEE Trans. Pattern Anal. Mach. Intelligence, 2, 405, 1980.
47. Jansen, B. H., Hasman, A., and Lenten, R., Piecewise analysis of EEG using AR modeling and clustering, Comput. Biomed. Res., 14, 168, 1981.
48. Michael, D. and Houchin, J., Automatic EEG analysis: a segmentation procedure based on autocorrelation function, Electroencephalogr. Clin. Neurophysiol., 46, 232, 1979.
49. Barlow, J. S., Creutzfeldt, O. D., Michael, D., Houchin, J., and Epellbaum, H., Automatic adaptive segmentation of clinical EEGs, Electroencephalogr. Clin. Neurophysiol., 51, 512, 1981.
50. Bodenstein, G. and Schneider, W., Pattern recognition of clinical electroencephalograms, Proc. Int. Conf. Dig. Signal Process., Florence, Italy, 1981, 206.
51. Praetorius, H. M., Bodenstein, G., and Creutzfeldt, O. D., Adaptive segmentation of EEG records: a new approach to automatic EEG analysis, Electroencephalogr. Clin. Neurophysiol., 42, 84, 1977.
52. Appel, U. and Brandt, A. V., Adaptive sequential segmentation of piecewise stationary series, Inf. Sci. N.Y., 29, 27, 1983.
53. Sclove, S. L., Time series segmentation: a model and a method, Inf. Sci. N.Y., 29, 7, 1983.
54. Segen, J. and Sanderson, A. C., Detecting changes in time-series, IEEE Trans. Inf. Theory, 26, 249, 1980.
55. Basseville, M. and Benveniste, A., Sequential segmentation of nonstationary digital signals using spectral analysis, Inf. Sci. N.Y., 29, 57, 1983.
56. Appel, U. and Brandt, A. V., A comparative study of three sequential time series segmentation algorithms, Signal Process., 6, 45, 1984.
57. De Mori, R., Computer Models of Speech Using Fuzzy Algorithms, Plenum Press, New York, 1983.
Chapter 8
I. INTRODUCTION
Biomedical signals most often are the result of processes that take place in the time
domain. The analysis of such signals, however, may be more convenient and effective in
the frequency domain. This may be the case both in the deterministic and in the stochastic
cases. In most cases, power spectral density function (PSD), known also as “ the spectrum ” ,
is of interest (see Chapter 6).
Spectral analysis of the EEG1 has been used both for clinical and research purposes. Detailed spectral analysis can be applied to the automatic classification of sleep states2 and of the depth of anaesthesia3 as well as to the classification of a variety of neurological disorders. Spectral analysis of EMG signals serves also in the clinic and in research work. It has been shown, for example, that muscle fatigue4 can be characterized and, to some extent, predicted by processing the EMG spectrum. Power spectral analysis of speech signals has been used to assist in the diagnosis of laryngeal disorders.5 The analysis of hand tremors,
pressure and flow waveforms are but a few examples of the application of PSD processing
in biomedical signals.
The exact PSD function cannot, in general, be calculated. The given signal is time limited,
nonstationary, and corrupted by noise. It is necessary, therefore, to estimate the PSD from the given, short data record. Earlier methods of PSD estimation were based on the estimation of the Fourier transform. An important step in modern spectral analysis was Wiener's work6 which established the theoretical framework for the treatment of stochastic processes. Wiener and, independently, Khinchin7 have shown the Fourier transform relationship between the
autocorrelation function (of a stationary process) and its PSD. This is often referred to as
the Wiener-Khinchin relationship.
Prior to the introduction of the FFT algorithm in 1965, the accepted method for estimating the PSD was based on the implementation of the Wiener-Khinchin relationship, suggested by Blackman and Tukey.8 According to this method, the discrete autocorrelation coefficients
are first estimated using the sequence of windowed data. The windowed correlation is then
Fourier transformed to provide the estimated PSD. The Blackman-Tukey estimation pro
cedure suffers from poor resolution, high cost of computation, and poor accuracy mainly
due to sidelobes of the window which may even produce negative estimates for the positive
PSD.
Since the introduction of the FFT algorithm, a lot of progress has been made in spectral
estimation. Methods based on the FFT and on time series analysis have been extensively
researched and reported in books9-14 and articles.15-25 Attempts to provide a uniform approach to the various estimation methods or to categorize them have been reported.22,23,31
This chapter describes several methods of PSD estimation. The Blackman-Tukey procedure is briefly discussed, followed by the more modern approach of the periodograms. Time series analysis techniques based on AR (and the maximum entropy method), MA, and ARMA models are presented in more detail. More specialized cases of the Pisarenko harmonic decomposition (PHD) and Prony's methods are discussed, as well as the maximum likelihood method (MLM).
Finally, a comparison of the various techniques is made with the goal of providing some
help in choosing the right method for a given application.
II. FOURIER TRANSFORM METHODS

A. Introduction
We shall consider two approaches for PSD estimation based on the Fourier transform.
The first is the direct method suggested by Tukey and Blackman8 which uses the discrete
form o f the Wiener-Khinchin relation. This method requires first the estimation of correlation
coefficients and second the application of DFT (to the correlation sequence) to get the desired
PSD estimation. The second approach is the indirect approach known as the periodogram.
Here the estimation is achieved by applying the DFT operator directly to the (windowed)
data and then smoothing or averaging the absolute values of the DFT.
In general, these two methods do not yield identical results. However, if a certain biased
estimator is used for the correlation estimation, and as many correlation coefficients as data
samples are used, then the two methods do yield identical results.26
One major problem with Fourier transform PSD estimation methods is due to the finite time data sequence used. The PSD estimated is not that of the process, but that of a sample function multiplied by a window. In the frequency domain, this yields the PSD of the process (the desired function) convolved with the Fourier transform of the window. When the power of the signal is concentrated in a narrow bandwidth, the convolution operation will "spread" the power into adjacent frequency regions, a phenomenon known as leakage. The leakage causes both lack of resolution and inaccuracies. The power due to weak sinusoidal components in the signal can be completely masked by the sidelobes of adjacent stronger sinusoids. For a rectangular window, for example, the resolution (the 3 dB main lobe width) is approximately15 the inverse of the observation time (NΔt). Better weighting windows are available27,28 (see also Appendix B, Volume II).
One way to reduce the finite observation time problem is to use extrapolation techniques
to estimate the data outside the observation window.85 Another problem arises when the
number of data samples is small. A common procedure, known as zero padding, is often used. Zeroes are added to the given data sequence (the autocorrelation in the case of the Blackman-Tukey algorithm or the actual signal in the case of the periodogram) before applying the DFT. The result is the estimation of the PSD with additional interpolated values within the given frequency range. The basic resolution of the estimation is not improved by the zero-padding technique; the results are, however, much smoother and in some cases show fewer ambiguities in spectral peak determination.
Consider a data sequence {x_k}, k = 0, 1, ..., P - 1. Its DFT is given by

X_m = Δt Σ_{k=0}^{P-1} x_k exp(-j2π m k / P);  m = 0, 1, ..., P - 1    (8.1)

where Δt is the sampling interval. The estimation range in the frequency domain is

0 ≤ f_m = m/(PΔt) < 1/Δt    (8.2)
If we now pad the data with L zeroes, we get a "new" data sequence {x̃_k}, k = 0, 1, ..., P - 1 + L, with

x̃_k = x_k;  k = 0, 1, ..., P - 1
x̃_k = 0;  k = P, P + 1, ..., P - 1 + L    (8.3)

Its DFT is

X̃_m = Δt Σ_{k=0}^{P-1+L} x̃_k exp(-j2π m k / (P + L));  m = 0, 1, ..., P - 1 + L    (8.4)
The DFT (Equation 8.4) of the padded sequence has the same frequency region (Equation 8.2) as that of the nonpadded signal (Equation 8.1), but with P + L spectral points rather than P spectral points. If we choose L = qP, where q is an integer, then Equations 8.1 and 8.4 are equal at the frequency samples m̃ = (1 + q)m. An efficient FFT algorithm for zero-padded sequences is available.
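The effect of zero padding is easy to demonstrate with an FFT routine. The short Python fragment below is a minimal illustration (the sampling interval and test tone are assumed, and it is not tied to any particular figure in the text): the padded spectrum is evaluated on a denser grid over the same frequency range, while the underlying resolution is unchanged.

```python
import numpy as np

dt = 0.01                                    # sampling interval (s), assumed
t = np.arange(128) * dt
x = np.sin(2 * np.pi * 12.3 * t)             # tone between two DFT bin centers

X_raw = dt * np.fft.rfft(x)                        # P points, frequency step 1/(P*dt)
X_pad = dt * np.fft.rfft(x, n=4 * len(x))          # same range, 4x denser grid

f_raw = np.fft.rfftfreq(len(x), dt)
f_pad = np.fft.rfftfreq(4 * len(x), dt)
# The padded spectrum interpolates the same underlying estimate;
# the basic resolution (about 1/(P*dt)) is unchanged.
print(f_raw[np.argmax(np.abs(X_raw))], f_pad[np.argmax(np.abs(X_pad))])
```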
Ŝ(w) = Δt Σ_{m=-M}^{M} r̂_x(m) exp(-jwmΔt)    (8.6)

where Δt is the sampling interval and r̂_x(m), m = -M, ..., M are the discrete estimates of the correlation function.
The correlation coefficients are estimated from the sampled data sequence {x_k}, k = 0, 1, ..., N - 1. Rather than using the unbiased estimator

r̂_x(m) = [1/(N - m)] Σ_{n=0}^{N-m-1} x_{n+m} x_n    (8.7)

the biased estimator

r̂_x(m) = (1/N) Σ_{n=0}^{N-m-1} x_{n+m} x_n    (8.8)

is used, whose expected value is the true autocorrelation function weighted by a triangular weighting window (Bartlett window). Estimator 8.8 tends to have a lower mean square error than the unbiased one for many finite data sequences. Note that this type of estimator has been used for the estimation of the AR coefficients (see Equation 7.27).
The Blackman-Tukey8 PSD estimation is given by using the estimates of Equation 8.8 in Equation 8.6. Equation 8.6 can be solved by means of the FFT. The sequence r̂_x(m) used in Equation 8.6 can be zero-padded as discussed in the previous section.
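A compact Python sketch of the Blackman-Tukey estimator (Equations 8.6 and 8.8) follows. The number of lags M, the lag window, and the three-tone test signal are illustrative assumptions and do not reproduce the figures in the text.

```python
import numpy as np

def blackman_tukey_psd(x, M, dt=1.0, nf=512):
    """PSD estimate from the first M biased autocorrelation lags (Eqs. 8.6, 8.8)."""
    N = len(x)
    r = np.array([np.dot(x[:N - m], x[m:]) / N for m in range(M + 1)])   # Eq. 8.8
    r = r * np.hamming(2 * M + 1)[M:]            # optional lag-window taper
    f = np.linspace(0.0, 0.5 / dt, nf)
    w = 2 * np.pi * f
    m = np.arange(1, M + 1)
    # Eq. 8.6 for a real signal: S(w) = dt * [r(0) + 2 * sum_m r(m) cos(w m dt)]
    S = dt * (r[0] + 2.0 * np.cos(np.outer(w * dt, m)) @ r[1:])
    return f, S

# Example: three sinusoids in white noise (a synthetic stand-in, cf. Figure 1).
rng = np.random.default_rng(6)
dt = 1.0 / 200.0
t = np.arange(2048) * dt
x = (np.sin(2*np.pi*20*t) + 0.5*np.sin(2*np.pi*45*t)
     + 0.25*np.sin(2*np.pi*60*t) + rng.standard_normal(t.size))
f, S = blackman_tukey_psd(x, M=128, dt=dt)
print(f[np.argmax(S)])                           # should be near the strongest tone
```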
Figure 1 shows the Blackman-Tukey estimated PSD of synthesized sine waves corrupted with PRBN. Three sine waves were used and the estimation was performed with various numbers of data points, observation times, and zero padding. Figure 2 shows the estimation of an EMG signal by the Blackman-Tukey method.

FIGURE 1. Power spectral density function estimation by the Blackman-Tukey method. Synthesized signal consisting of three sinusoidals with additive white noise. Upper trace: 32 correlation coefficients and 480 padding zeroes. Middle trace: 256 correlation coefficients and 416 padding zeroes. Lower trace: 256 correlation coefficients and 256 padding zeroes.
C. The Periodogram

1. Introduction
The periodogram is a method for PSD estimation using the data sequence without the
need to first estimate the correlation coefficient.
It can be shown9 that the Wiener-Khinchin relation (Equation 8.5) can be rewritten as

S(w) = lim_{T→∞} E{ (1/2T) | ∫_{-T}^{T} x(t) exp(-jwt) dt |² }    (8.10)

Implementation of Equation 8.10 is impractical since it requires both infinite time integration and statistical expectation. The periodogram is thus an estimator of Equation 8.10 which can be practically implemented.
FIGURE 2. Power spectral density function estimation by the Blackman-Tukey method. Surface EMG recorded over the respiratory diaphragmatic muscle, sampled at 400 Hz. Traces are the same as in Figure 1.

Consider the wide sense stationary process x(t) with the windowed data sequence {x_k} such that

x_k = w(k) x(kΔt);  k = 0, 1, ..., N - 1
x_k = 0;  otherwise    (8.11)

where w(k) is the window function and N is the number of samples in the sequence. Use the following estimator for the autocorrelation:

r̂_x(m) = (1/N) Σ_{k=-∞}^{∞} x_k x_{k+m}    (8.12)

Note that since we have defined an infinite sequence in Equation 8.11, we can talk about infinite dimensions for the correlation function estimates. Introducing the estimates of Equation 8.12 into the PSD estimation Equation 8.6 yields
Ŝ(w) = [1/(NΔt)] X(w) X*(w) = |X(w)|² / (NΔt)    (8.15)

where X(w) is the DFT of x_k given by Equation 8.1 and an asterisk (*) denotes the conjugate. Note that |X(w)|² is the energy distribution function. The division by Δt was required in order to get the PSD.
The major advantage of the periodogram method is the fact that one can use the efficient FFT algorithm to compute the DFT. When calculating the square absolute value of the DFT of the sequence {x_k} by means of the FFT we get

X^F_m = Σ_{k=0}^{N-1} x_k exp(-j2π m k / N)    (8.16)

where X^F_m is the FFT result, which is scaled differently than the quantity calculated from Equation 8.1. When using the FFT, a scaling factor must be used such that

Ŝ(w_m) = (Δt / N) |X^F_m|²    (8.17)
The periodogram (Equation 8.17), though an efficient estimator from the calculations point of view, has been shown to have a large variance. Examples of periodograms of a white noise process, in which the variance does not decrease even when longer sequences are used, have been demonstrated.26 The large variance comes as no surprise since in the estimator of Equation 8.15 the expectation operator present in Equation 8.10 has been ignored.
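The FFT scaling of Equation 8.17 and the non-decreasing variance are easy to reproduce numerically. The Python fragment below is a minimal illustration with an assumed sampling interval and white-noise test signal.

```python
import numpy as np

def periodogram(x, dt=1.0):
    """Periodogram via the FFT with the scaling of Equation 8.17."""
    N = len(x)
    Xf = np.fft.rfft(x)                       # unscaled FFT, Equation 8.16
    S = (dt / N) * np.abs(Xf) ** 2            # Equation 8.17
    f = np.fft.rfftfreq(N, dt)
    return f, S

# White-noise check: the mean level approximates the true PSD (variance * dt),
# but the spread of the estimate does not shrink as N grows.
rng = np.random.default_rng(7)
for N in (256, 4096):
    f, S = periodogram(rng.standard_normal(N), dt=0.005)
    print(N, S.mean(), S.std())
```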
The expectation and variance of the periodogram will be discussed followed by several
methods of smoothing and averaging.
The expected value of the correlation depends on the correlation estimation, namely, the window used in Equation 8.11. The expectation of the periodogram is

E{Ŝ(w)} = Δt Σ_{m=-∞}^{∞} E{r̂_x(m)} exp(-jwmΔt)    (8.18)

and the expectation of the correlation estimate is

E{r̂_x(m)} = r_x(m) r_w(m)    (8.19)

where r_x(m) are the true process correlation coefficients and r_w(m) are the correlation coefficients of the window. Introducing Equation 8.19 into Equation 8.18 we get

E{Ŝ(w)} = Δt Σ_{m=-∞}^{∞} r_x(m) r_w(m) exp(-jwmΔt)    (8.20)

The periodogram is thus a biased estimator, with bias resulting from the window r_w(m). In the windows used, the bias vanishes as N approaches infinity.
When a rectangular window is used,

w(k) = 1;  0 ≤ k ≤ N - 1
w(k) = 0;  otherwise    (8.21)

the correlation window is

r_w(m) = 1 - |m|/N;  |m| ≤ N
r_w(m) = 0;  otherwise    (8.22)
Note that Equation 8.20 gives the expected value of the PSD estimator in terms of the DFT of the product of the two sequences r_x(m) and r_w(m). Applying the complex convolution theorem we get

E{Ŝ(w)} = (Δt / 2π) ∫_{-π/Δt}^{π/Δt} S(η) R_w(w - η) dη    (8.24)
where S(w) is the true PSD and R_w(w) is the DFT of r_w(m). Equation 8.24 states that the expected value of the periodogram is the true PSD "viewed" through the filter R_w(w).
Consider, for example, the use of a rectangular data window (Equation 8.21). The correlation window is then given by Equation 8.22, and R_w(w) is

R_w(w) = (1/N) [ sin(wNΔt/2) / sin(wΔt/2) ]²    (8.25)
Figure 3 shows the data window, the correlation (lag) window in the time and in the frequency domains, and a schematic description of the frequency convolution (Equation 8.24). Here the process under investigation, whose true PSD is S(w), was chosen as a random process with a strong spectral peak at w = w_1. Figure 3D depicts the layout for the estimation of S(w_0) by Equation 8.24. The expected value of the estimation at this frequency is given by the area under the product of the two curves. In this particular example, the estimation of the PSD at w = w_0 may be largely due to the effect of the first sidelobe of R_w, which coincides with the peak of S(w). Hence, the leakage due to the peak and first sidelobe will cause Ŝ(w_0) to be large even though S(w_0) is very small.
FIGURE 3. Power spectral density estimation with a rectangular data window. (A) Data window; (B) correlation (lag) window in time domain; (C) correlation window in frequency domain; (D) frequency convolution.
The main lobe of this window has a width of 4π/(NΔt). As N increases, the window
becomes narrower reducing the leakage. At the limit, as N approaches infinity, the window
becomes a delta function having no leakage at all. A detailed discussion on the various
windows used for signal analysis is presented in the appendix.
FIGURE 4. Power spectral density function estimation by means of the periodogram. Synthesized noisy sinusoidals as in Figure 1. Upper trace: 512 samples and 512 padding zeroes. Lower trace: 1024 samples, no padding zeroes.
If we define the standard deviation of the periodogram as the measure for noise, then the
signal-to-noise ratio for the periodogram of a gaussian process with an infinitely large data
window is only 1. Nongaussian processes yield approximately the same results.
The main reason for the large variance of the periodogram is the fact that the estimator
(Equation 8.15) does not take into account averaging as indicated by the expectation operation
in Equation 8.10.
A detailed discussion on the variance of the periodogram and on confidence limits of the
PSD estimation is given by Koopmans.9
An estimation o f the PSD with periodograms is demonstrated in Figures 4 and 5. The
same signals used for the Blackman-Tukey PSD estimation (Figures 1 and 2) are used here.
In the weighted overlapped segment averaging (WOSA) method, the data sequence {x_k}, k = 0, 1, ..., N - 1, is divided into I overlapping segments of length L each,

x_k^{(i)} = x_{k+(i-1)D};  k = 0, 1, ..., L - 1;  i = 1, 2, ..., I    (8.27)
FIGURE 5. Power spectral density function estimation by means of the periodogram. Surface
EMG as in Figure 2. Traces as in Figure 4.
FIGURE 6. EMG signal segmented with 25% overlapping for WOSA spectral estimation. Upper trace: EMG. Lower trace: Hamming windowed overlapping segments.
Each two adjacent segments overlap with D samples. The I segments cover the given data sequence {x_k} such that the last sample of the Ith segment obeys L + (I - 1)D = N. An EMG signal, so segmented, is shown in Figure 6.
We shall now calculate the periodograms of the I overlapped segments, each multiplied by a data window w(k). Denote the normalized periodogram of the ith segment by Ŝ^i(w); hence

Ŝ^i(w) = | Δt Σ_{k=0}^{L-1} w(k) x_k^{(i)} exp(-jwkΔt) |² / ( Δt Σ_{k=0}^{L-1} w²(k) );  i = 1, 2, ..., I    (8.28)
The segment periodograms are averaged to give the WOSA spectral estimate,

Ŝ(w) = (1/I) Σ_{i=1}^{I} Ŝ^i(w)    (8.29)

The expectation of the estimator is similar in nature to the one calculated for the periodogram (Equation 8.24). The variance, however, is improved.
Define the normalized covariance, ρ(j), between two normalized periodograms (Equation 8.31); the variance of the averaged estimator can then be expressed in terms of ρ(j) (Equation 8.32). Note that for D > L, such that the covariance (Equation 8.31) approaches zero, we get ρ(i) = 0 for all i > 0. Hence the improvement in the variance (over a periodogram with L samples) is I times. Nonoverlapping segmentation, therefore, should be employed if N is large enough. If the total number of samples, N, is not large, it is recommended30 to overlap the segments by one half of their length (D = L/2).
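The averaging described above is essentially what library routines such as scipy.signal.welch implement; the hand-rolled Python sketch below makes the segment bookkeeping explicit (segment length L, shift D, window-power normalization, and averaging as in Equations 8.28 and 8.29). The normalization convention and the test signal are assumptions for illustration.

```python
import numpy as np

def wosa_psd(x, L=256, D=128, dt=1.0):
    """Windowed overlapped segment averaging (WOSA/Welch) PSD estimate."""
    w = np.hamming(L)
    U = np.sum(w ** 2)                             # window power (normalization)
    segments = []
    start = 0
    while start + L <= len(x):
        seg = x[start:start + L] * w
        segments.append((dt / U) * np.abs(np.fft.rfft(seg)) ** 2)  # cf. Eq. 8.28
        start += D
    S = np.mean(segments, axis=0)                  # Eq. 8.29: average of periodograms
    f = np.fft.rfftfreq(L, dt)
    return f, S, len(segments)

# Example: more (weakly correlated) segments reduce the estimate's spread,
# at the cost of the coarser resolution set by the segment length L.
rng = np.random.default_rng(8)
x = rng.standard_normal(8192)
for D in (256, 128):                               # no overlap vs. 50% overlap
    f, S, I = wosa_psd(x, L=256, D=D, dt=0.01)
    print(D, I, S.std())
```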
Several attempts have been made recently22,23,31 to provide a general framework for PSD
estimation. It has been argued that both the Blackman-Tukey and the WOSA estimators are
special cases o f a general estimator. Figures 7 and 8 show an example of the PSD estimation
by means o f WOSA. The reader is referred to Figures 1 ,2 ,4 , and 5 for comparison.
An alternative way of reducing the variance is to smooth the periodogram with a spectral window H; for example, the rectangular smoothing window

H(w - η) = 1/(2B);  |w - η| < B
H(w - η) = 0;  otherwise    (8.33)
FIGURE 7A. Power spectral density function estimation by means of WOSA with 25% overlapping. Synthesized sinusoidal signal as in Figure 1. Averaging over one segment of 1024 samples.
The smoothed estimate at frequency w_j is then

⟨Ŝ(w_j)⟩ = (1/2B) ∫_{w_j-B}^{w_j+B} Ŝ(η) dη    (8.34)

FIGURE 8A. Power spectral density function estimation by means of WOSA with 25% overlapping. EMG signal as in Figure 2. Averaging over one segment of 1024 samples.
E{⟨Ŝ(w)⟩} = ∫_{-π/Δt}^{π/Δt} E{Ŝ(η)} H(w - η) dη    (8.37)
where the expectation term in the integrand is given by Equation 8.20. For very large N we get

E{⟨Ŝ(w)⟩} ≈ ∫_{-π/Δt}^{π/Δt} S(η) H(w - η) dη    (8.38)
Hence, the smoothing operation has introduced a bias, even for large N.
A triangular smoothing window has often been used. This window results in less bias than the rectangular one, since it gives less weight to remote periodogram samples. However, the variance obtained with the triangular window is larger than that achieved with the rectangular one.
III. MAXIMUM ENTROPY METHOD (MEM) AND THE AR METHOD

The MEM is also known as the Maximum Entropy Spectral Estimation (MESE)15 and the Maximum Entropy Spectrum Analysis (MESA).33 The method was originally proposed by Burg34 for geophysical applications and has since been dealt with by many researchers. Several special problems have been investigated.38-40 More recently, attention has been placed on the application of the method to multidimensional PSD estimation.41-43 Modifications and extensions to the original method have also been suggested.44,45
In the one-dimensional case, when consecutive correlation values are available, the MEM method is identical to the AR PSD estimation method.46 These two methods will be discussed jointly in this section.
The MEM PSD estimation can be posed as follows: given p + 1 consecutive estimates of the correlation coefficients of the process {x(t)}, r_x(i), i = 0, 1, ..., p, estimate the PSD of the process. Clearly, what is needed for the estimation are the unknown correlation coefficients r_x(i), i > p. Burg34 has suggested that these autocorrelation coefficients be extrapolated so that the time series characterized by the correlation has maximum entropy. Out of all time series having the given first p + 1 autocorrelation coefficients, the time series that yields maximum entropy will be the most random one or, in other words, the estimated PSD will be the flattest among all PSDs having the given p + 1 correlation coefficients.
The entropy is a measure of the amount of information (or "ignorance") we have on a process. Consider a discrete process with J states, each having the probability of occurrence p_j, j = 1, 2, ..., J. Assume that initially no knowledge is available on the probabilities p_j. The information we possess on the system is limited. If we are now given the value of a certain p_i, we have gained a certain amount of information on the system, or our state of ignorance has been reduced. The amount of information added, ΔI, is defined (in "bits") by

ΔI = log₂(1/p_i)    (8.39)

The average information per time interval, H, is known as the entropy and is given from Equation 8.39 by

H = Σ_{j=1}^{J} p_j log₂(1/p_j)    (8.40)

In the deterministic case, where we know that event i has happened with probability one,
all other p_j's are zero and the entropy is zero. In the random case, the entropy is always positive. The entropy is thus a measure of the randomness of the system.
In the continuous case the relative entropy is given by

H = - ∫ p(s) ln p(s) ds    (8.41)

where p(s) is the amplitude distribution of the signal s(t). We would like to write down a relation between the estimated spectrum of the signal, S(w), and the entropy. Assume the signal s(t) was generated by passing a white signal through a linear filter having a transfer function S(w). It can be shown47 that the difference in entropies, ΔH, between that of the signal s(t) and that of the input is given by Equation 8.42.
Since there are an infinite number of signals with white spectrum, the exact input is unknown. However, we know that we want to maximize the entropy (Equation 8.42) subject to the constraints that

r(n) = ∫_{-1/(2Δt)}^{1/(2Δt)} S(w) exp(j2πwΔt n) dw;  n = 0, 1, ..., p    (8.43)

This constrained maximization will ensure that the estimated spectrum is the spectrum of the process having the flattest spectrum of all processes with the given p + 1 correlation coefficients.
The maximization of Equation 8.42 with the constraints (Equation 8.43) can be solved using the Lagrange multiplier technique. The result is found to be

Ŝ(w) = σ²Δt / |1 + Σ_{i=1}^{p} a_i exp(-jwiΔt)|²    (8.44)

where σ² and a_i, i = 1, ..., p are determined from the data. It has been shown48,49 that the MEM PSD estimation (Equation 8.44) is identical with the estimation of the PSD of an AR model.
Recall that the AR model (Equation 7.13) represents a random sampled signal x_k as the output of a filter H(z) = A^{-1}(z) with a white sequence as the input. The estimated PSD of x_k is thus

Ŝ(w) = Ĝ²Δt / |1 + Σ_{i=1}^{p} â_i exp(-jwiΔt)|²    (8.45)

where Ĝ² equals the mean square error, E_p, and is given in terms of the correlation coefficients by Equation 7.35. The estimated AR coefficients â_i are given by Equation 7.30 and the order, p, by one of the methods discussed in Chapter 7. Equation 8.45 can be rewritten as
Ŝ(w) = [ r(0) + Σ_{i=1}^{p} â_i r(i) ] / [ ρ(0) + 2 Σ_{i=1}^{p} ρ(i) cos(iwΔt) ]    (8.46)

with

ρ(i) = Σ_{k=0}^{p-i} â_k â_{k+i},  â_0 = 1;  i = 0, 1, ..., p    (8.47)
Note that in the Blackman-Tukey PSD estimator (Equation 8.6) the power spectrum is estimated by means of the first p + 1 correlation coefficients, calculated from the data, while the correlation coefficients r(i), i > p are assumed to be zero. In the MEM and AR PSD estimation, the first p + 1 correlation coefficients are calculated from the data and are identical to the ones used in the Blackman-Tukey algorithm. The coefficients r(i), i > p are not assumed zero but are calculated by the maximization of Equation 8.42 with the constraint (Equation 8.43). This yields the same result as the AR model, where

r(i) = - Σ_{k=1}^{p} â_k r(i - k);  i > p    (8.48)
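Given the AR coefficients and gain from the Durbin recursion of Chapter 7, the MEM/AR spectrum of Equation 8.45 is straightforward to evaluate. The self-contained Python sketch below is an illustration; the model order, frequency grid, and two-tone test signal are assumptions.

```python
import numpy as np

def ar_psd(x, p, dt=1.0, nf=512):
    """AR (maximum entropy) PSD estimate, Equation 8.45."""
    N = len(x)
    r = np.array([np.dot(x[:N - m], x[m:]) / N for m in range(p + 1)])
    # Levinson-Durbin recursion (Equations 7.30A through E)
    a, E = np.array([1.0]), r[0]
    for i in range(1, p + 1):
        k = -(r[i] + np.dot(a[1:], r[i - 1:0:-1])) / E
        a = np.concatenate((a + k * np.r_[0.0, a[:0:-1]], [k]))
        E = (1 - k * k) * E
    G2 = E                                        # gain squared, Equation 7.35
    f = np.linspace(0.0, 0.5 / dt, nf)
    z = np.exp(-1j * 2 * np.pi * np.outer(f, np.arange(p + 1)) * dt)
    A = z @ a                                     # A(exp(-jw dt)) on the grid
    return f, G2 * dt / np.abs(A) ** 2            # Equation 8.45

# Example: two close sinusoids in noise; a modest AR order resolves the peaks.
rng = np.random.default_rng(9)
dt = 1.0 / 100.0
t = np.arange(512) * dt
x = np.sin(2*np.pi*20*t) + np.sin(2*np.pi*23*t) + 0.5 * rng.standard_normal(t.size)
f, S = ar_psd(x, p=20, dt=dt)
print(f[np.argmax(S)])
```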
The statistical properties of the MEM and AR spectral estimators have been investigated
by several authors.50-53 It has been demonstrated that the variance of the estimators is inversely
proportional to the data length.
The MEM estimator has been experimentally compared54-55 with the WOSA estimator
(Equation 8.30). The MEM estimator has been found to have superior frequency resolution
capability, especially for short records and to have larger dynamic range. Hung and Herring,55
however, report that when considering the detection of sinusoid in additive white noise, the
DFT based detector consistently provided higher signal detection probability and a more
accurate estimate of signal frequency than the MEM. Dyson and Rao33 have concluded that
the MEM methods show promise of achieving the detection performance of long observation
interval DFT analysis, at a reduced observation time, for useful SNR range.
The MEM PSD estimator is known to yield incorrect results for sinusoidal signals in additive white noise, a phenomenon sometimes called line splitting.54,56 Line splitting is the occurrence of two or more closely spaced peaks in the estimated PSD where only one should be present. Line splitting is most likely to occur when the SNR is high, the initial phase is some odd multiple of 45°, the time duration is such that the sine components have an odd number of quarter cycles, and the number of AR coefficients (the order of the model) is a large percentage of the number of data samples.
The correct model (with white noise as an input) for describing an N-pole complex sinusoidal signal in additive white noise is an N-pole and N-zero system with equal gain weights for its pole and zero parts. When an AR model is forced to describe such a signal, an infinite number of poles is required. The use of a finite number of poles is the source of the line splitting and line shifting inaccuracies. Several methods have been suggested to overcome the line splitting problem.56-58
A second problem associated with the MEM method is that of the bias in the positioning of the spectral peaks with respect to the true frequency of the peaks. This shift is sometimes known as the frequency estimation bias. Swingler60 has shown that this bias can be of the order of 16% of the frequency resolution (1/NΔt). Methods for overcoming this problem were also suggested.58
FIGURE 9. Power spectral density function estimation by the MEM method. Synthesized sinusoidals as in Figure 1. Upper trace: AR model order is 10. Lower trace: AR model order is 40.
Yet another problem exists when estimating the PSD of sinusoids in noise with MEM. It has been shown59 that the peak amplitudes in the MEM are not linearly proportional to
the power. In high SNR the peak is proportional to the square of the power.
Recently, many modifications and improvements for the MEM have been suggested for
AR and multivariate AR spectral estimation (see, for example, References 61 through 63).
Experimental results with MEM PSD estimation are shown in Figures 9 and 10.
The process whose PSD is to be estimated can be modeled by an MA(q) model (Equation 7.15). Its estimated PSD is given by

Ŝ(w) = |B(exp(-jwΔt))|² S_n(w)    (8.49)

where S_n(w) is the PSD of the input white noise. The coefficients b_j, j = 1, 2, ..., q of the MA model can be estimated as discussed in Chapter 7, Section III. In terms of these coefficients, the estimator becomes

Ŝ(w) = | Σ_{i=0}^{q} b_i exp(-jwiΔt) |² = Σ_{n=0}^{q} Σ_{m=0}^{q} b_m b_n exp(-j(m - n)wΔt)    (8.50)
FIGURE 10. Power spectral density function estimation by the MEM method. EMG signal as in
Figure 2. Traces are as in Figure 9.
with

Ŝ(w) = Σ_{n=-q}^{q} r(n) exp(-jwnΔt),  r(n) = Σ_{k=0}^{q} b_k b_{k-n};  -q ≤ n ≤ q    (8.51)

It can easily be shown that the quantities r(n) in Equation 8.51 are the autocorrelation coefficients of the MA model.
Hence, the MA spectral estimator (Equation 8.51) is the same as the Blackman-Tukey
estimator (Equation 8.6). The Blackman-Tukey spectral estimator can thus be considered a
special case of the MA spectral estimator. The MA estimator will be effective when the
spectra to be estimated contain sharply defined notches and do not contain sharply defined
peaks.
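As a rough illustration of Equations 8.50 and 8.51 (a sketch of mine, not code from the text), the following Python fragment evaluates the MA spectral estimate from an already estimated coefficient vector b = [b_0, ..., b_q]; the frequency grid and the omission of the noise-variance scaling follow the form of Equation 8.50 and are illustrative choices.

import numpy as np

def ma_psd(b, dt, n_freq=512):
    # MA(q) spectral estimate via Equations 8.50/8.51: form the "autocorrelation"
    # r(n) of the MA coefficients, then sum the exponentials (Blackman-Tukey form).
    b = np.asarray(b, dtype=float)          # [b0, b1, ..., bq], b0 = 1
    q = len(b) - 1
    r = np.array([np.sum(b[n:] * b[:len(b) - n]) for n in range(q + 1)])
    w = np.linspace(0.0, np.pi / dt, n_freq)          # 0 ... Nyquist (rad/s)
    n = np.arange(-q, q + 1)
    r_full = np.concatenate((r[:0:-1], r))            # r(-q) ... r(q), r(-n) = r(n)
    S = np.real(np.exp(-1j * np.outer(w, n) * dt) @ r_full)
    return w, S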
Several methods have been suggested for ARMA(p,q) spectral estimation (e.g., References 64 to 70). The
statistical properties of the ARMA(p,q) estimator have been investigated71-73 and some
preprocessing techniques for the improvement of the estimation have been suggested.74 Cadzow,
in a comprehensive paper,75 has presented various methods for AR, MA, and ARMA power
spectral estimation.
Once the order p,q and the coefficients of the ARMA model have been estimated, the PSD
estimate is obtained by

S(w) = |H(exp(−jwΔt))|² S_n(w) = σ_n²Δt |1 + Σ_{i=1}^{q} b_i exp(−jwiΔt)|² / |1 + Σ_{i=1}^{p} a_i exp(−jwiΔt)|²   (8.52)

where H is given in Equation 7.44 and S_n(w) is the PSD function of the input white
noise having a variance σ_n². The order of the ARMA model, p,q, and its estimation are
discussed in Chapter 7, Section IV.
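A minimal sketch of how Equation 8.52 can be evaluated numerically, assuming the AR coefficients a, the MA coefficients b, and the noise variance have already been estimated (the function name and the frequency grid are my own; nothing here is prescribed by the text):

import numpy as np

def arma_psd(a, b, sigma2, dt, n_freq=512):
    # ARMA(p,q) spectral estimate, Equation 8.52.
    # a, b : AR coefficients [a1..ap] and MA coefficients [b1..bq]
    w = np.linspace(0.0, np.pi / dt, n_freq)
    def poly(c):
        k = np.arange(1, len(c) + 1)
        return 1.0 + np.exp(-1j * np.outer(w, k) * dt) @ np.asarray(c, float)
    num = np.abs(poly(b)) ** 2
    den = np.abs(poly(a)) ** 2
    return w, sigma2 * dt * num / den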
Cadzow75 has shown that when the ARMA coefficients are evaluated by the overdetermined
rational model equation approach, the resultant PSD estimate is less sensitive to errors in the
coefficient estimates. Cadzow's approach calls for the determination of the coefficients
from an overdetermined set of the Yule-Walker equations (see Chapter 7, Section II), for
example by means of the singular value decomposition (SVD) technique (for SVD analysis,
see Chapter 3, Volume II).
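The overdetermined idea can be illustrated roughly as follows. This is not Cadzow's algorithm, only a sketch of its central step for the AR part: more Yule-Walker equations than unknowns are formed from estimated autocorrelation lags and solved in the least squares sense (NumPy's solver uses the SVD internally).

import numpy as np

def overdetermined_yule_walker(r, p, M):
    # Estimate AR(p) coefficients from the lags r(0)..r(M), using M > p equations
    # r(k) = -sum_m a_m r(k-m), k = 1..M, solved in the least squares sense.
    rows, rhs = [], []
    for k in range(1, M + 1):
        rows.append([r[abs(k - m)] for m in range(1, p + 1)])
        rhs.append(-r[k])
    a, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return a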
Assume the signal consists of p/2 sinusoids in additive white noise:

x_n = Σ_{i=1}^{p/2} A_i sin(w_i nΔt + φ_i) + n_n   (8.53)

where A_i and φ_i, i = 1, ..., p/2, are the amplitudes and phases of the sinusoids and {n_n} is
a sequence from the white noise process having zero mean and a variance of σ_n². The noise
is uncorrelated with the sinusoids.
Using the trigonometric identity sin(wnΔt) = 2cos(wΔt) sin(w(n − 1)Δt) − sin(w(n − 2)Δt),
a noise-free sinusoid obeys x_n = 2cos(wΔt)x_{n−1} − x_{n−2}. Hence, the samples of a deterministic sinusoid can be described by means of a second order
AR(2) equation.
In general, for the deterministic summation of p/2 sinusoids,15 the resultant difference
equation is an AR(p) equation

x_n = −Σ_{m=1}^{p} a_m x_{n−m}   (8.56)

Transferring Equation 8.56 into the Z domain yields the characteristic equation

1 + Σ_{m=1}^{p} a_m z^{−m} = 0   (8.57)

There are p roots of the characteristic Equation 8.57, arranged in conjugate pairs
z_k = exp(±jw_kΔt), k = 1, 2, ..., p/2. The roots are all located on the unit circle, where the frequencies w_k are
the frequencies of the sinusoids present in the signal.
Returning now to the noisy case (Equation 8.53), we denote the noise-free signal by x̄_n, so that
x_n = x̄_n + n_n, and get

x_n − n_n = −Σ_{m=1}^{p} a_m (x_{n−m} − n_{n−m})   (8.58)

x_n = −Σ_{m=1}^{p} a_m x_{n−m} + Σ_{m=0}^{p} a_m n_{n−m},   a_0 = 1   (8.59)
Equation 8.59 states that the signal represented by Equation 8.53 is a special ARMA(p,p)
process. In this process, the AR(p) and MA(p) coefficients are identical. Due to this property,
the identification of this special ARMA(p,p) process is less complicated than in the general
case. Techniques simpler than the ones discussed in Chapter 7, Section IV, can be applied;
one method is presented here.
The ARMA(p,p) Equation 8.59 can be written in matrix form. Define

xᵀ = [x_n, x_{n−1}, ..., x_{n−p}],   nᵀ = [n_n, n_{n−1}, ..., n_{n−p}],   aᵀ = [1, a_1, ..., a_p]

so that Equation 8.59 becomes

xᵀa = nᵀa   (8.61)

Premultiplying both sides of Equation 8.61 by x and taking the expectation, we get

E{x xᵀ}a = R_x a = E{x nᵀ}a

noting that (because of the assumptions made on the noise) the cross correlation between
the noisy observation x and the noise n is

E{x nᵀ} = E{n nᵀ} = σ_n² I

where R_x is the (p + 1) × (p + 1) autocorrelation matrix of the observations, with elements
r_x(0), ..., r_x(p). Hence, we get

R_x a = σ_n² a   (8.65)

Equation 8.65 states that the coefficient vector a is an eigenvector of the correlation matrix
R_x and the noise variance is the corresponding eigenvalue. The eigenvector must be scaled
such that its first component equals one.
It can be shown15 that the eigenvalue σ_n² is the minimum eigenvalue of the correlation
matrix with the correct dimension (p + 1) × (p + 1). In the overdetermined case, where the
correlation matrix is generated by more than (p + 1) lags, the minimum eigenvalue is
repeated.
The autocorrelation of the signal x_n given by Equation 8.53 is

r_x(0) = Σ_{i=1}^{p/2} A_i²/2 + σ_n²   (8.66A)

r_x(k) = Σ_{i=1}^{p/2} (A_i²/2) cos(w_i kΔt),   k ≠ 0   (8.66B)

Assuming the frequencies w_i, i = 1, 2, ..., p/2, and the correlation coefficients r_x(k), k =
0, 1, ..., are known, the sinusoid powers A_i²/2 can be determined. Define the power vector
Pᵀ = [A_1²/2, ..., A_{p/2}²/2] (Equation 8.67), the correlation vector r_xᵀ = [r_x(1), ..., r_x(p/2)]
(Equation 8.68), and the matrix C whose (k,i) element is cos(w_i kΔt) (Equation 8.69). The power
coefficients are given by introducing Equations 8.67, 8.68, and 8.69 into Equation 8.66B:

P = C⁻¹ r_x   (8.70)
Pisarenko's method can thus be summarized by the following steps:

1. Estimate the order, p, of the model, which is twice the number of sinusoids present
in the signal;
2. Estimate, from the data, p + 1 terms of the autocorrelation function using the biased
estimator (Equation 7.27);
3. Solve the eigenvector equation (Equation 8.65);
4. Repeat steps 2 and 3 with increasing order until the minimal eigenvalue remains
unchanged;
5. The order, p, and the noise variance (the minimal eigenvalue) are thus determined. The vector a
is taken to be the eigenvector corresponding to the minimal eigenvalue;
6. Solve Equation 8.57 to get the roots and the frequencies, w_k;
7. Solve Equation 8.70 to get the power of the various sinusoids;
8. Solve Equation 8.66A to get the noise power.
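A compact sketch of steps 2 through 7 (illustrative only; the order p is assumed known, the biased autocorrelation estimator is used as in step 2, and practical issues such as the repeated minimum eigenvalue of the overdetermined case are ignored):

import numpy as np

def pisarenko(x, p, dt):
    N = len(x)
    # biased autocorrelation estimates r(0)..r(p)  (cf. Equation 7.27)
    r = np.array([np.dot(x[:N - k], x[k:]) / N for k in range(p + 1)])
    Rx = np.array([[r[abs(i - j)] for j in range(p + 1)] for i in range(p + 1)])
    # Equation 8.65: Rx a = sigma_n^2 a, with the minimum eigenvalue
    vals, vecs = np.linalg.eigh(Rx)
    sigma2 = vals[0]
    a = vecs[:, 0] / vecs[0, 0]                 # scale so that a[0] = 1
    # Equation 8.57: roots of 1 + a1 z^-1 + ... + ap z^-p = 0
    roots = np.roots(a)
    w = np.sort(np.abs(np.angle(roots))) / dt   # sinusoid frequencies (rad/s)
    w = w[::2]                                  # keep one of each conjugate pair (rough)
    # Equation 8.70: powers from r(k) = sum_i (A_i^2/2) cos(w_i k dt), k = 1..p/2
    k = np.arange(1, len(w) + 1)
    C = np.cos(np.outer(k, w * dt))
    P = np.linalg.solve(C, r[1:len(w) + 1])
    return w, P, sigma2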
An efficient method for solving the eigenvector equation is discussed in Chapter 3, Volume
II. When a priori knowledge exists stating that the signal consists of sinusoids in
additive noise, Pisarenko's method has the advantage of providing an estimate consisting of δ
functions. Other methods, such as the AR spectral estimator, will "smear" the spectrum.
The order, however, is not known exactly. If it is estimated too high, spurious
components may be introduced into the PSD estimate; if too low, the spectral components
will usually appear at incorrect frequencies. Another source of inaccuracy of the method
is the fact that the autocorrelation coefficients are estimated by means of the biased estimator.
This is done in order to ensure that the autocorrelation matrix is positive definite. The biased
estimation, however, causes inaccuracies in both frequency and power estimation.
The technique has been extended78 to include the case of colored additive noise.
C. Prony's Method
Prony's method15,79-82 is mainly applied to the analysis of transients. It has been extended,
however, to provide PSD function estimation.
Assume the sequence x̄_n consists of samples of a signal composed of damped (complex) sinusoids:

x̄_n = Σ_{m=1}^{p} A_m exp(α_m nΔt) exp(j(w_m nΔt + φ_m)),   n = 0, 1, ..., N − 1   (8.71)

Equation 8.71 describes the signal x̄_n as a sum of p sinusoids with frequencies w_m, phases
φ_m, and amplitudes A_m, m = 1, 2, ..., p, exponentially decaying with rates α_m (α_m < 0).
For x̄_n to be real, it is required that the roots of the characteristic equation occur in complex
conjugate pairs of the type exp(j(w_m nΔt + φ_m)) and exp(−j(w_m nΔt + φ_m)). The energy
spectral distribution function of Equation 8.71 is given by Equation 8.72.
To use Equation 8.72 as a "spectral" estimator, the parameters p, A_m, α_m, φ_m, and w_m, m
= 1, 2, ..., p, must be identified. In order to do that, rewrite Equation 8.71 as

x̄_n = Σ_{m=1}^{p} b_m z_m^n,   n = 0, 1, ..., N − 1

b_m = A_m exp(jφ_m)

z_m = exp((α_m + jw_m)Δt)   (8.73)
The last equation is the homogeneous solution of a constant coefficient linear difference
equation15

x̄_n = −Σ_{m=1}^{p} a_m x̄_{n−m}   (8.74)

Transferring Equation 8.74 into the Z domain yields

X̄(z)(1 + Σ_{m=1}^{p} a_m z^{−m}) = 0   (8.75)

with the characteristic equation

1 + Σ_{m=1}^{p} a_m z^{−m} = 0   (8.76)

The roots z_k given by the solution of Equation 8.76 are the exponents of Equation 8.71.
In the more practical case the signal is noisy, hence the observation x_n can be described
by the Prony model as

x_n = x̄_n + n_n,   n = 0, 1, ..., N − 1   (8.77)

where n_n is a sequence of white noise with zero mean and variance σ_n². Introducing Equation
8.77 into Equation 8.74 yields

x_n = −Σ_{m=1}^{p} a_m x_{n−m} + Σ_{m=0}^{p} a_m n_{n−m}   (8.78)

The signal is thus modeled as an ARMA(p,p) model with equal coefficients for the AR(p)
and MA(p) parts. Unlike the Pisarenko model, the roots of the characteristic equation of Equation 8.78
are not restricted to the unit circle. The model describes, in general, decaying sinusoids
rather than pure sinusoids.
The ARMA coefficients of Equation 8.78 can be solved for by the methods discussed in
Chapter 7, Section IV. Note that only the AR (or MA) part has to be identified.
Once the a_k have been identified, Equation 8.76 can be solved to provide the roots z_k.
The coefficients required for the estimator (Equation 8.71) are computed from the solution
of the set of linear equations (Equation 8.73). Define the model vector

x̄ᵀ = [x̄_0, x̄_1, ..., x̄_{N−1}]   (8.79)

with the observation vector x defined similarly. Define the complex coefficient vector

bᵀ = [b_1, b_2, ..., b_p]   (8.80)

and the N × p matrix

Z = [z_m^n],   n = 0, 1, ..., N − 1;  m = 1, 2, ..., p   (8.81)

The set of N equations (Equation 8.73) with the unknown vector b can be written in matrix form:

x̄ = Zb   (8.82)
Recall that x̄ is the model of the observations x. Our aim is to choose the model parameters
such that the model will best fit (in some sense) the observations. Assume we want a least
squares minimization of

e = (x − x̄)ᴴ(x − x̄)   (8.83)

Introducing Equation 8.82 into Equation 8.83 and performing the minimization yields the
well-known least squares estimate of b:

b̂ = (ZᴴZ)⁻¹Zᴴx

The parameters of Equation 8.71 are then obtained from

A_m = |b_m|
φ_m = tan⁻¹(Im(b_m)/Re(b_m))
α_m = (1/Δt) ln|z_m|
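Assuming the AR coefficients a_k of Equation 8.78 have already been estimated, the remaining steps (roots of Equation 8.76, the least squares solution of Equation 8.82, and the parameter recovery above) might be sketched as follows; this is an illustration of mine, not the author's implementation.

import numpy as np

def prony_parameters(x, a, dt):
    # x : observed samples x_0 ... x_{N-1}
    # a : estimated AR coefficients [a1, ..., ap] of Equation 8.78
    x = np.asarray(x, dtype=complex)
    N, p = len(x), len(a)
    z = np.roots(np.concatenate(([1.0], a)))             # roots of Equation 8.76
    Z = z[np.newaxis, :] ** np.arange(N)[:, np.newaxis]  # Equation 8.81 (N x p)
    b, *_ = np.linalg.lstsq(Z, x, rcond=None)            # least squares fit of x = Zb
    A     = np.abs(b)                                    # amplitudes
    phi   = np.arctan2(b.imag, b.real)                   # phases
    alpha = np.log(np.abs(z)) / dt                       # damping factors
    w     = np.angle(z) / dt                             # frequencies
    return A, phi, alpha, w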
In summary, Prony's method for energy spectral density estimation is given by the
following steps:
where {n_k} is the sequence of "noise" appearing due to the leakage. It is our desire to design
the MA(N) filter in such a way that the signal A exp(jwkΔt) will pass with no distortion,
while the second term in Equation 8.86 is minimized.
Following the arguments presented in Chapter 7, we "predict" the kth value of the output
of the MA filter (Equation 8.87), where R_n is the noise autocorrelation matrix and
bᵀ = [b_0, b_1, ..., b_{N−1}].
Note that we want the sinusoidal component to pass the filter undistorted; hence, for the
noise-free case we require the filter output to reproduce the sinusoid (Equation 8.90).
Dividing both sides of Equation 8.90 by the term on the left, we get the constraint

1 = (e*)ᵀb   (8.91A)
It can be shown that the minimization of Equation 8.89 subject to the constraint (Equation
8.91) yields the optimal filter

b_opt = R_x⁻¹e / ((e*)ᵀR_x⁻¹e)

so that the MLM spectral estimate is

S_MLM(w) = Δt / ((e*)ᵀR_x⁻¹e)   (8.93)

Note that R_x has to be inverted only once. The evaluation of the quadratic form (the
denominator of Equation 8.93) for each frequency can be done by means of the FFT
algorithm. It is easily shown that the quadratic form can be written as a weighted sum of
complex exponentials (Equation 8.94), so that the FFT applies directly.
The MLM estimator is related to the AR (MEM) estimator of order p by84

1/S_MLM(w) = (1/(p + 1)) Σ_{m=0}^{p} 1/S_AR^(m)(w)   (8.95)

where S_AR^(m)(w) is the mth order AR spectral estimate. In Equation 8.95 both estimators
were assumed to have the correct autocorrelation matrix of order p.
The MLM has lower resolution as compared with the AR estimator. This can be explained
intuitively by Equation 8.95, where it is seen that the high resolution of the pth order AR
estimator is reduced by the "inverse" averaging with lower order estimators. The MLM,
however, exhibits less variance than the AR estimator.50 This can also be explained intuitively
by Equation 8.95.
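A sketch of the MLM estimator in the form of Equation 8.93 (the direct matrix inversion and the Δt scaling are simplifications of mine; R_x is assumed to be an already estimated autocorrelation matrix):

import numpy as np

def mlm_psd(Rx, dt, n_freq=512):
    # Capon / maximum likelihood spectral estimate: S(w) = dt / (e^H Rx^{-1} e).
    N = Rx.shape[0]
    Rinv = np.linalg.inv(Rx)                 # Rx is inverted only once
    w = np.linspace(0.0, np.pi / dt, n_freq)
    n = np.arange(N)
    E = np.exp(1j * np.outer(w, n) * dt)     # rows are the vectors e(w)
    quad = np.einsum('fi,ij,fj->f', E.conj(), Rinv, E).real
    return w, dt / quad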
VII. DISCUSSION AND COMPARISON OF SEVERAL METHODS
Only the most commonly used techniques have been presented in this chapter. Some
methods, such as the Walsh spectral estimator89,90 and other specific methods, have not been
presented, either because of lack of space or due to the fact that they have not been used in
biomedical applications. For the same reasons, many interesting approaches85,86 for PSD
estimation have not been discussed.
An important topic not discussed in this chapter, but one that must be taken into account when
estimating the PSD, is that of preprocessing the signal. Care must be taken to low-pass filter
the data before sampling, so that an (almost) band-limited spectrum is processed, and the
sampling rate should be chosen such that no aliasing problems arise. Prewhitening filters
are often used (especially when employing an AR estimator). Prewhitening filters have been
reported to improve AR estimator results and to reduce the window bias in the Blackman-
Tukey method. A more general preprocessing filter was suggested by Lagunas-Hernandez
et al.74 They have reported that the use of their preprocessing method reduces the
estimator order and the computational complexity. Preprocessing filters, however, require
some a priori knowledge about the signal.
Several adaptive and recursive methods for PSD estimation have been suggested; most
adapt the coefficients of the appropriate filter (e.g., References 75, 87, and 88). Adaptive
estimation is important when the signal under test is nonstationary, as is most often the case
with biomedical signals. The most commonly used method to overcome the nonstationarity
problem is that of segmenting the signal. In this approach, the signal is segmented into
"almost" stationary segments. The PSD function of each segment is then estimated by
means of nonadaptive methods. Segmentation may be done a priori into segments of predetermined
length, as is commonly done in speech processing, or by an adaptive algorithm,
as is done, for example, in EEG processing (see Chapter 7).
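A schematic illustration of the segmentation approach, using fixed-length segments and a plain periodogram per segment (both are illustrative choices of mine, not recommendations from the text):

import numpy as np

def segmented_psd(x, seg_len, dt):
    # Split the signal into fixed-length, "almost" stationary segments and
    # estimate the PSD of each segment separately with a periodogram.
    n_seg = len(x) // seg_len
    psds = []
    for i in range(n_seg):
        seg = x[i * seg_len:(i + 1) * seg_len]
        seg = seg - np.mean(seg)                    # remove the segment mean
        X = np.fft.rfft(seg)
        psds.append(dt * np.abs(X) ** 2 / seg_len)  # periodogram of the segment
    freqs = np.fft.rfftfreq(seg_len, d=dt)
    return freqs, np.array(psds)                    # one PSD estimate per segment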
It is sometimes required to estimate the PSD function with unequal resolution, when
certain regions of the frequency axis are of specific interest. Methods based both on the FFT92
and on AR models15 have been developed.
The various methods for PSD function estimation discussed here, and others, differ from
one another in the assumptions made on the process under test, in their statistical characteristics,
and in their computational complexity. It is therefore not to be expected that one can define a
"universal" criterion by means of which the various methods can be graded. The best
method to use depends heavily on the application, the type of signal, the available computation
facility, and the accuracy and time constraints. It is obvious that the estimation method to choose when
the computation facility consists of a mini- or microcomputer may be different from the one
chosen when a "number crunching" machine (e.g., an array processor) is available.
Table 1 presents, in a concise manner, some of the main characteristics of the various
methods discussed in the chapter. This table may thus serve as a guideline for choosing an
appropriate estimation method for a given problem (see Kay and Marple15 for more details).
REFERENCES
1. Boston, J . R ., Spectra of auditory brainstem responses and spontaneous EEG, IEEE Trans. Biomed. Eng.,
28, 344, 1981. •
2. W illiam s, R . L ., Karacan, I., and Hursch, C. Y ., EEG o f Human Sleep: Clinical Applications, John
Wiley & Sons, New York, 1974.
3. Butter, L. A ., A real time software system on the PDP-11 for two channel EEG spectral analysis during
surgery', Comput. Programs Biomed., 6, I. 1976.
4. G ross, D ., G rassino, A ., Ross, W. R. D., and Macklem, P. T ., Electromyogram pattern o f diaphragmatic
fatigue, J. Appl. Physiol.. 46, 1, 1979.
5. M ezzalam a, M ., Prinetto, P., and Morra, B., Experiments in automatic classification o f laryngeal
pathology. M ed. Biol. Eng Comput., 21. 603. 1983.
6. W iener, N ., Generalized harmonic analysis. Acta M ath.. 55. 117, 1930.
7. Khinchin, A. Ya., Korrelationstheorie der stationären stochastischen Prozesse, Math. Annalen, 109, 604,
1934.
8. Tukey. J. W. and Blackman, R. B., The Measurement o f Power Spectra From the Point o f View o f
Communications Engineering, Dover, New York. 1959.
9. K oopm ans, L. H ., The Spectral Analysis o f Time Series, Academic Press, New York, 1974.
10. Schw artz, M. and Shaw, L ., Signal Processing Discrete Spectral Analysis Detection an d Estimation,
McGraw-Hill, New York. 1975.
11. C hilders, D. G ., Ed., Modern Spectrum Analysis, IEEE Press, New York, 1978.
12. Box, G . E. P. and Jenkins, G. M ., Time Series Analysis: Forecasting and Control, Holden-Day, San
Francisco. 1970.
13. H aykin. S. S ., E d ., Nonlinear Methods o f Spectral Analysis, Springer-Verlag, New York. 1979.
14. Jenkins, G. M. and Watts, D. G., Spectral Analysis and its Applications, Holden-Day, San Francisco,
1968.
15. Kay, S. M. and Marple, S. L., Spectrum analysis - a modern perspective, Proc. IEEE, 69, 1380, 1981.
16. Spectral estimation, special issue, Proc. IEEE, 70(9), 1982.
17. Robinson, E. A., A historical perspective of spectrum estimation, Proc. IEEE, 70(9), 885, 1982.
18. Friedlander, B., Lattice methods for spectral estimation, Proc. IEEE, 70(9), 990, 1982.
19. McClellan, J. H., Multidimensional spectral estimation, Proc. IEEE, 70(9), 1029, 1982.
20. Papoulis,' A ., Maximum entropy and spectral estimation: a review, IEEE Trans. Acoust. Speech Signal
Process.. 29, 1176. 1981.
21. Roberts, J. B. G., Moule, G. L., and Parry, G., Design and application of a real time spectrum analyser
system, IEE Proc., 127(2), 70, 1980.
22. Carter, G. C. and Nuttall, A. H., A generalized framework for spectral estimation, IEE Proc.,
130(3), 239, 1983.
23. Allen. J. B. and Rabiner, L. R., A unified approach to short time Fourier analysis and synthesis. Proc.
IEEE. 65(11). 1558. 1977.
24. Linkens, D. A., Short time series spectral analysis of biomedical data, IEE Proc., 129(9), 663, 1982.
25. Makhoul, J., Spectral linear prediction: properties and applications, IEEE Trans. Acoust. Speech Signal
Process., 23, 283, 1975.
26. Otnes, R. K. and Enochson, L., Digital Time Series Analysis, John Wiley & Sons, New York, 1972.
27. Harris, F. J., On the use of windows for harmonic analysis with the DFT, Proc. IEEE, 66, 51, 1978.
28. Webster, R. J., Leakage regulation in the DFT spectrum, Proc. IEEE, 68, 1339, 1980.
29. Markel, J. D., FFT pruning, IEEE Trans. Audio Electroacoust., 19, 305, 1971.
30. Welch, P. D., The use of the fast Fourier transform for the estimation of power spectra: a method based on
time averaging over short modified periodograms, IEEE Trans. Audio Electroacoust., 15, 70, 1967.
31. Carter. G. C. and Nuttall, A. H ., A brief summary of generalized framework for power spectral estimation.
Signal Process., 2. 387, 1980.
32. Morf, M., Vieira, A., Lee, D. T. L., and Kailath, T., Multichannel maximum entropy spectral estimation,
IEEE Trans. Geosci. Electron., 16, 85, 1978.
33. Dyson, T. and Rao, S. S., Equal observation interval comparison of maximum entropy and weighted
overlapped segment averaging spectrum estimation techniques, IEEE Trans. Acoust. Speech Signal Process.,
29, 919, 1981.
34. Burg, J. P., Maximum Entropy Spectral Analysis, Proc. 37th Annu. Int. Meeting Soc. Explor. Geophys.,
Oklahoma City, 1967.
35. Ulrych. T . Y. and Bishop, T. N ., Maximum entropy spectral analysis and autoregressive decomposition.
Rev. G eophys. Space Phys.. 13, 183, 1975.
36. Jaynes. E. T ., On the rationale of maximum entropy method . Proc. IEEE, 70, 939, 1982.
37. Lang, S. W. and McClellan, J. H ., Frequency estimation with maximum entropy spectra! estimators,
IEEE Trans. Acoust: Speech Signal Process.. 28. 716, 1980.
38. Theodoridis, S. and Cooper, D. C ., Application of the maximum entropy spectrum analysis technique
to signals with spectral peaks o f finite width, Signal Process., 3, 109, 1981.
39. Herring, R. W ., The cause o f line splitting in Burg maximum entropy spectral analysis, IEEE Trans.
Acoust. Speech Signal Process., 28, 692, 1980. j
40. W u, N .-L ., An explicit solution and data extension in the maximum entropy method, IEEE Trans. Acoust.
Speech Process., 31, 486, 1983. ;
41. Lang, S. W .’and M cClellan, J . H ., Multidimensional MEM spectral estimation, IEEE Trans. Acoust.
Speech Signal Process., 30, 280, 1982.
42. M alik, N. A. and Lim , J. S ., Properties of two dimensional maximum entropy power spectrum estimates,
IEEE Trans. Acoust. Speech Signal Process., 30, 788, 1982.
43. McClellan, J. H. and Lang, S. W., Duality for multidimensional MEM spectral analysis, IEE Proc., 130(F),
230, 1983.
44. Newman, W. I., Extension to the maximum entropy method, IEEE Trans. Inf. Theory, 23. 89. 1977.
45. Johnson, R. W. and Shore, J. E ., Minimum cross entropy spectral analysis of multiple signals. IEEE
Trans. Acoust. Speech Signal Process., 31, 574, 1983.
46. M akhoul, J ., Linear prediction: a tutorial review, Proc. IEEE, 63. 561, 1975.
47. Bartlett, M. S ., An Introduction to Stochastic Processes, 2nd ed., Cambridge University Press. New York,
1966.
48. Van den Bos, A., Alternative interpretation of maximum entropy spectral analysis, IEEE Trans. Inf. Theory,
17, 493, 1971.
49. Grande, J ., II, Hamrud, M ., and Toll, P., A remark on the correspondence between the maximum
entropy method and the AR model, IEEE Trans. Inf. Theory, 26, 750, 1980.
50. Baggeroer, A. B ., Confidence intervals for regression (MEM) spectral estimates. IEEE Trans. Inf. Theory,
22, 534. 1976.
51. Huzii, M ., On spectral estimate obtained by an AR Model fitting , Am. Inst. Statist. M ath., 29. 415. 1977.
52. Sakai, H ., Statistical properties o f AR spectral analysis, IEEE Trans. Acoust. Speech Signal Process., 21,
402, 1979.
53. Toomey, J. P., High resolution frequency measurement by linear prediction, IEEE Trans. Aerosp. Electron.
Syst., 16, 517, 1980.
54. Fougere, P. F., Zawalick, E. J., and Radoski, H. R., Spontaneous line splitting in maximum entropy
power spectrum analysis, Phys. Earth Planet. Inter., 12, 201, 1976.
55. Hung, E. K. L. and H erring, R. W ., Simulation experiments to compare the signal detection properties
of DFT and MEM spectra, IEEE Trans. Acoust. Speech Signal Process., 29. 1084, 1981.
56. Kay, S. M. and M arple, S. L ., J r., Sources of and remedies for spectral line splitting in AR spectrum
analysis. Proc. IEEE Int. Conf. Acoust. Speech Signal Process., 151. 1979.
57. Fougere, P. F., A solution to the problem of spontaneous line splitting in maximum entropy power spectrum
analysis. J. Geophys. Res.. 82. 1051, 1977.
58. M arple, L., A new AR spectrum analysis algorithm, IEEE Trans. Acoust. Speech Signal Process., 28.
441, 1980.
59. Lacoss, R. T., Data adaptive spectral analysis methods, Geophysics, 36, 661, 1971.
60. Swingler, D. N ., A comparison between Burg’s maximum entropy method and a nonrecursive technique
for the spectral analysis o f deterministic signals, J. Geophys. Res.. 84, 679, 1979.
61. Scott, P. D. and Nikias, C. L ., Energy-weighted linear predictive spectral estimation: a new method
combining robustness and high resolution, IEEE Trans. Acoust. Speech Signal Process., 30, 287, 1983.
62. Quirk, M. P. and Liu, B ., Improving resolution for AR spectral estimation by decimation, IEEE Trans.
Acoust. Speech Signal P ro cess., 31, 630, 1983.
63. Lee, T. S., Large sample identification and spectral estimation of noisy multivariate AR processes. IEEE
Trans. Acoust. Speech Signal Process., 31, 76, 1983.
64. Cadzow, J. A ., High performance spectral estimation — a new ARMA method, IEEE Trans. Acoust.
Speech Signal Process., 28, 524, 1980.
65. Cadzow, J. A ., ARMA spectral estimation: a model equation error procedure. IEEE Trans. Geosci. Rem.
Sens., 19, 24, 1981.
66. K ay, S. M ., A new ARMA spectral estimator, IEEE Trans. Acoust. Speech Signal Process., 28, 585,
1980.
67. Friedlander, B ., Efficient algorithm for ARMA spectral estimation. IEE Proc., 130 (F.3), 195, 1983.
68. Friedlander, B ., Instrumental variable methods for ARMA spectral estimation, IEEE Trans. Acoust. Speech
Signal Process., 31, 404, 1983.
69. Bruzzone, S. and Kaveh, M ., On some suboptimal ARMA spectral estimators, IEEE Trans. Acoust.
Speech Signal Process., 28, 753, 1980.
70. Porat, B ., ARMA spectral estimation based on partial autocorrelations, Circuits, Syst. Signal Process.,
2(3), 341, 1983.
71. Kaveh, M . and Bruzzone, S. P., Statistical efficiency of correlation based methods for ARMA spectral
estimation. IEE Proc.. 130 (F-3), 211, 1983.
72. Fischer, J. and Wilfert, H. H., Determination of the statistical errors in the estimation of the power
spectrum by means of univariate time series analysis, Proc. 5th IFAC Symp. Ident. Syst. Param.
Estimation, 1235, 1979.
73. Takeuchi, M., Decision of order in ARMA model on power spectrum estimation and accuracy of estimated
power spectrum - in fitting ARMA model to time series, Syst. Comput. Controls, 12(1), 18, 1981.
74. Lagunas-Hernandez, M. A., Figueiras-Vidal, A. R., Marino-Acebal, J. B., and Vilanova, A. C., A
linear transform for spectral estimation, IEEE Trans. Acoust. Speech Signal Process., 29, 989, 1981.
75. C adzow , J . A ., Spectral estimation: an overdetermined rational model equation approach. Proc. IEEE,
70. 907. 1982.
76. Pisarenko, V. F ., The retrieval of harmonics from a covariance function, Geophys. J. R. Astron. Soc.,
33.' 247. 1973.
77. Pisarenko, V. F., On the estimation of spectra by means of non linear functions of the covariance matrix,
Geophys. J. R. Astron. Soc., 28. 511. 1972.
78. Satorius, E. H. and Alexander, J. T., High resolution spectral analysis of sinusoids in correlated
noise, Rec. 1978 ICASSP, Tulsa, 1978.
79. Bucker, H. P ., Comparison of FFT and Prony algorithms for bearing estimation of narrow band signals
in a realistic ocean environment. J. Acoust. Soc. Am ., 61, 756. 1977.
80. Schaubert, D. H., Application of Prony's method to time domain reflectometer data and equivalent circuit
synthesis, IEEE Trans. Antennas Propag., 27, 180, 1979.
81. Weiss, L. and McDonough, R. N., Prony's method, Z-transforms, and Padé approximation, SIAM Rev.,
5, 145, 1963.
82. Kumaresan, R. and Tufts, D. W., Improved spectral resolution III: efficient realization, Proc. IEEE, 68,
1354, 1980.
83. Capon, J., High resolution frequency wavenumber spectrum analysis, Proc. IEEE, 57, 1408, 1969.
84. Burg, J. P., The relationship between maximum entropy spectra and maximum likelihood spectra, Geophysics,
37, 375, 1972.
85. Jain, A. K. and Ranganath, S., Extrapolation algorithms for discrete signals with applications in spectral
estimation, IEEE Trans. Acoust. Speech Signal Process., 29, 830, 1981.
86. Beex, A. A. and Scharf, L. L., Covariance sequence approximation for parametric spectrum modeling,
IEEE Trans. Acoust. Speech Signal Process., 29, 1042, 1981.
87. Andrews, M., An adaptive filter for spectrum analysis, Comput. Electr. Eng., 6, 1979.
88. Friedlander. B ., Recursive lattice forms for spectral estimation, IEEE Trans. Acoust. Speech'Signal
P rocess.. 30. 920, 1982.
89. Larsen, R. D., Crawford, E. F., and Howard, G. K., Walsh analysis of signals, Math. Biosci., 31,
237, 1976.
90. Smith, W. D., Walsh versus Fourier estimators of the EEG power spectrum, IEEE Trans. Biomed. Eng.,
28, 790, 1981.
91. Lai, D. C. and Larsen, H., Walsh spectral estimates with applications to the classification of EEG signals,
IEEE Trans. Biomed. Eng., 28, 790, 1981.
92. Oppenheim, A., Johnson, D., and Steiglitz, K., Computation of spectra with unequal resolution using the
FFT, Proc. IEEE, 64, 299, 1971.
Chapter 9
ADAPTIVE FILTERING
I. INTRODUCTION
Filtering is used to process a signal in such a way that the signal-to-noise ratio is enhanced,
noise of a certain type is eliminated, the signal is smoothed or "predicted", or classification
of the signal is achieved. When the signal and noise are stationary and their characteristics
are approximately known or can be assumed, an optimal filter can be designed a priori.
Such are the Wiener filter discussed in Chapter 6 and the matched filter presented in
Chapter 1, Volume II.
When no a priori information on the signal or noise is available, or when the signal or
noise is nonstationary, a priori optimal filter design is not possible. Adaptive optimal filters
are filters that can automatically adjust their own parameters, based on the incoming signal.
The adaptation process is conducted such that the filter uses incoming signal information in
order to adapt its own parameters so that a given performance index is optimized. Adaptive
filters thus require little or no a priori knowledge of the signal and noise characteristics.
The application of adaptive filtering to signal processing in general, and to biomedical
signal processing in particular, has been preceded by the development and use of adaptive
algorithms1-3 in control theory. Although adaptive filters and algorithms used in signal
processing are basically similar to those used in control systems, some differences do exist,
which demand new design approaches.4
Since no (or almost no) a priori information is available, the adaptive filter requires an
initial period for learning or adaptation. During this period, its performance is unsatisfactory.
The time of adaptation is clearly an important characteristic of the filter. Signals in which fast
changes are expected require filters that adapt rapidly. Care should be taken when
designing such filters, since the filter may track rapid artifacts. After initial adaptation, the
filter is supposed to act optimally, while tracking the nonstationary changes in signal and
noise. The imperfect ability of the filter to estimate signal and noise statistics prevents it
from being truly optimal. In practical design, however, this loss of performance can be
made quite small.5
The adaptive filter is required to perform calculations to satisfy the performance index
and must have provision for changing its own parameters. Digital techniques, with or without
a computing device, have clear advantages here over analog techniques. It is mainly for
this reason that most adaptive filter implementations are performed by discrete systems.
We shall only consider here discrete adaptive filters operating on sampled signals.
The next section in this chapter will present the general structure of an adaptive filter.
This will be followed by a detailed discussion of the least mean square (LMS) adaptive filter.5-8 The
use of the LMS adaptive filter for line enhancement9-14 and noise cancellation7,15 will be
discussed with some biomedical applications.7,15-17 Finally, the multichannel20 and the time-
sequenced21,22 adaptive filters will be introduced. The discussions are based mainly on Widrow's papers.
The LMS filter discussed here is by no means the only type of adaptive filter available.
Other types are discussed in the literature,23-28 in which various performance criteria or structures
are used (e.g., the lattice structure25-27), or in which the structure as well as the weights are
adaptable.28
A. Introduction
The adaptive filter consists of three main parts: the performance index which is to be
optimized, the algorithm that recomputes the parameters of the filter, and the structure of
the filter which actually performs the required operations on the signal. Claasen4 has suggested
a classification of adaptive filters according to their major parts and to their goals.
We follow this approach here.
The performance index is best determined by the application. When using adaptive filtering
for the elimination of the maternal ECG in automatic fetal ECG monitoring, the performance index
may be the minimization of false detections. This, however, may be a difficult criterion to
implement, since we do not know when a false detection has happened. Therefore, we look
for performance criteria that can easily be implemented. In most applications, the minimization
of the square of an output error is found to be a satisfactory criterion.
The algorithm is the mechanism by means of which the parameters optimizing the
criterion are calculated. Two basic types of algorithms are to be considered. The first is the
nonrecursive algorithm. It requires the collection of all the data in a given time window and
the solution of the necessary equations. The exact least squares method is such an algorithm. The
algorithm usually requires the solution of a set of linear equations by the inversion of a
matrix, and the results are not available in real time. The second type is the
recursive algorithm, which updates itself with every incoming signal sample or small group
of samples. This algorithm usually requires gradient methods, and convergence must be
checked. Results are available immediately and tracking of signal nonstationarities is possible.
The structure of the filter depends to some extent on the algorithm and the application.
Most often a transversal filter is used because of its straightforward hardware structure and
its robustness in combination with iterative algorithms. The lattice structure, though somewhat
more complex, has been found to possess better convergence and sensitivity.26
We shall proceed by considering various goals of adaptive filtering.
A. Introduction
We shall discuss here adaptive filters using the least mean square (LMS) algorithm
developed5 by Widrow and Hoff in 1960. The filter consists of reference inputs, variable
gain multipliers (weights), an adaptation algorithm, and an additional input signal denoted
the "primary input". We shall present the principal component of the adaptive filter, namely,
the adaptive linear combiner, and then discuss various structures of the adaptive filter for
adaptive noise cancelling and line enhancement.
Define the reference input vector at time j:

x_jᵀ = [x_0j, x_1j, ..., x_nj]   (9.1)

where x_0j is usually a constant set to the value 1. Its role is to take care of biases in the
inputs. We also define a vector of the variable gains (weights):

Wᵀ = [w_0, w_1, ..., w_n]   (9.2)

The output of the combiner is

ŝ_j = Σ_{i=0}^{n} w_i x_ij = Wᵀx_j = x_jᵀW   (9.3)

We consider ŝ_j to be the estimate of a signal. This signal will depend on the problem to
which the combiner is applied. We shall define the error signal ε_j as the difference between
the primary input d_j and the combiner output:

ε_j = d_j − ŝ_j = d_j − Wᵀx_j   (9.4)
C. The LMS Adaptive Algorithm
The performance index for our algorithm is the mean square error. The task of the LMS
adaptive algorithm is to adjust the weights, W, in such a way as to minimize the mean square
error. The mean square error is calculated by squaring Equation 9.4 and taking the expectation.
Assuming the reference and primary inputs to be stationary and the weights fixed, and
defining the cross correlation vector P = E{d_j x_j} and the correlation matrix of the reference
inputs

R = E{x_j x_jᵀ}   (9.7)

the mean square error can be expressed as the quadratic function of the weights:

E{ε_j²} = E{d_j²} − 2PᵀW + WᵀRW   (9.8)
In the stationary case, the minimization of Equation 9.8 means the adjustment of the
weights, descending along the surface (Equation 9.8) until the minimum is reached. In the
nonstationary case, the minimum is drifting and the algorithm has to adapt the weights such
that they track the minimum.
To find the minimum of Equation 9.8 we have to calculate the gradient of the squared
error:

∇_j = ∂E{ε_j²}/∂W = −2P + 2RW   (9.9)

The weighting vector, W_opt, is the vector that zeroes the gradient; hence:

W_opt = R⁻¹P   (9.10)

which is the matrix form of the Wiener-Hopf equation (see Chapter 6).
The LMS algorithm does not use Equation 9.10 directly for the optimal solution. Rather,
it uses the method of steepest descent. We calculate the optimal vector iteratively, where
in each step we change the vector proportionally to the negative of the gradient vector.
Hence:

W_{j+1} = W_j − μ∇_j   (9.11)

where μ is a scalar that controls the stability and rate of convergence of the algorithm. We
have added a subscript to the weighting vector to denote the iteration number. Note that
using Equation 9.11 requires neither the calculation of the correlations nor the inversion
of the correlation matrix. The gradient with subscript j in Equation 9.11 is given by Equation
9.9, where the derivatives are taken at W = W_j.
In practice, it is impossible to implement Equation 9.11, since the gradient involves
expectations. For practical implementation we have to replace the gradient, ∇_j, with some
kind of estimate, ∇̂_j. Widrow has suggested the crude estimate:

∇̂_j = [∂ε_j²/∂w_0, ∂ε_j²/∂w_1, ..., ∂ε_j²/∂w_n]ᵀ   (9.12)
namely, to estimate the expectation of ε_j² by the value of ε_j² itself. This means that we
estimate the mean over a very short (finite) time. The derivatives of Equation 9.12 become:

∇̂_j = −2ε_j x_j   (9.13)

The right side of Equation 9.13 is calculated by taking the derivative of Equation 9.4 with
respect to W. Introducing the estimate of the gradient of Equation 9.13 into Equation 9.11
yields:

W_{j+1} = W_j + 2με_j x_j   (9.14)

The last equation is known as the Widrow-Hoff LMS algorithm. It has been shown that the
expected value of the weight vector (Equation 9.14) converges to the Wiener weight vector
(Equation 9.10) if the reference inputs are uncorrelated over time.
A necessary and sufficient condition for convergence8 is

0 < μ < 1/λ_max   (9.15)

where λ_max is the largest eigenvalue of the correlation matrix R. The eigenvalues, however,
are usually not known. It has been suggested,8 therefore, to use a sufficient condition for
convergence:

0 < μ < 1/tr(R)   (9.16)
Since R is positive definite, tr(R) > λ_max. The trace is easy to estimate, since it is the total
power in the reference signals. Widrow has shown that the learning curve (the curve describing
the convergence of the weights W_j to the Wiener weights) can be approximated by
a decaying single exponential curve with time constant

τ ≅ (n + 1)/(4μ tr(R))
The LMS adaptive algorithm (Equation 9.14) is easy to implement and does not require
differentiation or matrix inversion. For each iteration, it requires n + 2 multiplications and
n additions.
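As a rough illustration (a sketch of mine, not the author's code), the Widrow-Hoff recursion of Equations 9.3, 9.4, and 9.14 can be written in a few lines of Python; the data layout is an assumption, and μ is taken to satisfy the sufficient condition of Equation 9.16.

import numpy as np

def lms(d, X, mu):
    # Widrow-Hoff LMS algorithm.
    # d  : primary (desired) input, length J
    # X  : reference input vectors, shape (J, n+1); column 0 may be the constant 1
    # mu : adaptation constant, assumed to satisfy 0 < mu < 1/tr(R) (Equation 9.16)
    J, n1 = X.shape
    W = np.zeros(n1)
    e = np.zeros(J)
    for j in range(J):
        s_hat = W @ X[j]               # combiner output (Equation 9.3)
        e[j] = d[j] - s_hat            # error (Equation 9.4)
        W = W + 2 * mu * e[j] * X[j]   # weight update (Equation 9.14)
    return W, e

In practice tr(R) can be approximated by the average power of the reference vectors, e.g., mu = 0.1 / np.mean(np.sum(X**2, axis=1)).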
and the output of the summer is the estimated signal ŝ_j (Equation 9.3), given in this configuration
by Equation 9.18. Note that ŝ_j is the autoregressive (AR) estimate (see Chapter 7). The LMS filter with
the reference of Equation 9.17 is an adaptive AR filter. The AR coefficients (LPC) are optimally
adapted in such a way that the output of the filter and the desired input have minimum
mean square error.
If we set d_j = x_j, w_0 = 0, and denote ŝ_j = x̂_j, we get from Equation 9.18:

x̂_j = −Σ_{i=1}^{n−1} w_i x_{j−i}   (9.19)

which is the AR equation. The filter, under these conditions, can be used to estimate and
track the LPC (AR) coefficients of a nonstationary signal.
Adaptive LMS filters have been successfully implemented on small machines. The errors
due to finite word length have been analyzed.40,41 Adaptive filtering can also be implemented
in the frequency domain,16,33 with some advantages over the time domain.16
A. Introduction
Consider the following problem (depicted in Figure 6). A signal s(t) is contaminated with
an additive noise n_0(t) and with another noise, η(t); we assume that s, n_0, and η are
uncorrelated. The noise n_0 is generated by a white noise process n(t) that has passed through an
unknown linear filter, H_1. The additive noise n_0(t) is therefore a colored noise. Assume also
that we have a reference signal x(t) consisting of a white noise ξ(t) and another noise n_r(t).
The second noise is the result of the same noise process n(t) that contributes to the primary noise,
but after another unknown linear filter, H_2. Note that here we have:

N_0(z⁻¹) = H_1(z⁻¹)N(z⁻¹),   N_r(z⁻¹) = H_2(z⁻¹)N(z⁻¹)
where N_0, N_r, H_1(z⁻¹), and H_2(z⁻¹) are the z transforms of n_0(t), n_r(t), h_1(t), and h_2(t),
respectively. We assume that the auxiliary noises η(t) and ξ(t) are white and uncorrelated
with one another, with n(t), and with the signal s(t). The concept of the adaptive noise
canceller is as follows. An adaptive estimate of n_0(j), denoted by n̂_0(j), is calculated by the
adaptive LMS filter. As shown before, this filter is an adaptive AR filter estimating the
unknown filter H_2⁻¹(z⁻¹)H_1(z⁻¹) by means of the reference input n_r(j) and the error. Note
also that the adaptive filter does not operate as Durbin's algorithm described in Chapter
5. There, the AR coefficients were optimized in such a way as to whiten the
output of the filter (the residuals); the estimated AR coefficients were the coefficients of
the filter H_2⁻¹(z⁻¹). Here the criterion is the minimization of E{ε_j²}, so that the estimated
AR coefficients are the coefficients of H_2⁻¹(z⁻¹)H_1(z⁻¹).
Adaptive noise cancelling filters7 have been extensively used in biomedical signal processing
and in many other applications.
The adaptive algorithm will change the weights so that Equation 9.22 is minimized.
However, changing the filter weights affects only n̂_0 and does not affect the term E{s²}.
Therefore, minimization of Equation 9.22 is equivalent to the minimization of
E{((n_0 + η) − n̂_0)²}. From Equation 9.21 we get

s − ŝ = n̂_0 − (n_0 + η)

hence the minimization of E{((n_0 + η) − n̂_0)²} also minimizes E{(s − ŝ)²}. The
adaptive noise canceller provides the estimate ŝ, which is the best least squares estimate
of s.
The reference input must be correlated with the primary noise, n_0. It is this correlation
that allows the LMS noise canceller to function effectively. To demonstrate this, assume
that n_r is not correlated with n_0 (Figure 6 does not hold for this example). The minimization
of Equation 9.22 then yields

E{ε_j²} = E{s²} + E{(n_0 + η)²} + E{n̂_0²}   (9.23)

The algorithm will minimize Equation 9.23 by adjusting all the weights W to zero, thus bringing
the last term to its minimum, namely zero. The adaptive noise canceller has been applied
to a variety of biomedical7,15 and many other applications, such as echo cancellation in
communication networks.34,35
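As a rough usage illustration (entirely synthetic; the filters H_1 and H_2 and all constants below are invented for the example, not taken from the text), the canceller of Figure 6 can be simulated as follows; the canceller output e approximates the signal s.

import numpy as np

rng = np.random.default_rng(0)
J, L = 4000, 8
s = np.sin(2 * np.pi * 0.01 * np.arange(J))            # wanted signal
n = rng.standard_normal(J)                              # common noise source n(t)
n0 = np.convolve(n, [1.0, 0.8, 0.3], mode='same')       # primary noise (through H1)
nr = np.convolve(n, [1.0, -0.5], mode='same')           # reference noise (through H2)
d = s + n0                                               # primary input

W = np.zeros(L)
e = np.zeros(J)
mu = 0.05 / (L * np.mean(nr ** 2))                       # well below 1/tr(R)
for j in range(J):
    x = np.array([nr[j - k] if j - k >= 0 else 0.0 for k in range(L)])
    n0_hat = W @ x                                       # estimate of the primary noise
    e[j] = d[j] - n0_hat                                 # canceller output, approximately s
    W = W + 2 * mu * e[j] * x                            # LMS update (Equation 9.14)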
where the amplitude, A, and the phase, φ, are unknown. The frequency of the line voltage,
w_0, can vary around its nominal value. Its exact value, at any moment, is unknown a priori.
This is a common problem in biomedical signal processing. A fixed band stop filter can be
designed with a notch at the nominal value of w_0 and with sufficient width to cover the
expected variations in the frequency. In cases where meaningful portions of the PSD function
of the signal are in the vicinity of w_0, this will cause distortions to the processed signal.
Typical examples are ECG, EMG, and EEG signals, all having meaningful information in
the region of 50 to 60 Hz.
We shall now see that the adaptive LMS noise canceller can operate as an adaptive narrow
notch filter with its central frequency automatically tracking the variations of w_0.
For the reference signal we take a signal which is directly proportional to the power line
voltage; we choose

x_1(j) = B cos(w_0 jΔT + ψ)   (9.25A)

Here B, w_0, and ψ are known. This can simply be the voltage taken directly from the wall
outlet. The second reference is derived from x_1 by shifting it 90°. Hence:

x_2(j) = B sin(w_0 jΔT + ψ)   (9.25B)
where ΔT is the sampling interval. Note that the output of the LMS combiner is a linear
combination of the two normal phasors (Equation 9.25C). It is clear that we can represent
the cosine primary noise as a linear combination of the phasors, given the right weights.
Any change in w_0 will appear both in the primary and the reference signals. The LMS will thus
track the variations in w_0. Widrow7 has shown that the adaptive noise canceller described
in Figure 7 is equivalent to a notch filter, with the notch frequency always at w_0, and Q (the ratio
of center frequency to bandwidth) given by:
Q = w_0ΔT/(2μB²)   (9.26)
This configuration was used7 to cancel 60-Hz interference in the ECG. Figure 8 shows the
cancellation effects. Note the adaptation of the filter. Other types of learning filters for
power line interference removal are available.17
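A sketch of the two-reference canceller acting as an adaptive notch (the frequency, amplitude, and step size would be application dependent; this is an illustration of the configuration described above, not Widrow's implementation):

import numpy as np

def adaptive_notch(d, f0, B, mu, fs):
    # Two quadrature references at the nominal line frequency f0 (Hz); the LMS
    # canceller then behaves as a notch that tracks the actual line frequency,
    # with a Q controlled by mu and B^2 (cf. Equation 9.26).
    J = len(d)
    t = np.arange(J) / fs
    x1 = B * np.cos(2 * np.pi * f0 * t)      # in-phase reference (wall outlet voltage)
    x2 = B * np.sin(2 * np.pi * f0 * t)      # reference shifted by 90 degrees
    w = np.zeros(2)
    out = np.zeros(J)
    for j in range(J):
        x = np.array([x1[j], x2[j]])
        n0_hat = w @ x                       # estimated interference
        out[j] = d[j] - n0_hat               # notch-filtered output
        w = w + 2 * mu * out[j] * x          # LMS update
    return out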
monitoring the ECG of a patient after such a heart transplantation, both "old" and "new"
QRS complexes are present. It is of interest to the physician to be able to separate the
two and to be able to analyze the "old" ECG without the interference of the "new"
ECG. Adaptive noise cancelling has been applied to this problem by Widrow and his co-workers.
The primary input was supplied by a catheter, inserted through the left brachial
vein and the vena cava into the atrium of the "old" heart. The reference input was supplied
by ordinary chest electrodes, which carried mainly the "new" heart's signal. Figure 8B
shows the improvement in the signal after the application of adaptive noise cancelling.
FIGURE 8. Noise cancellation in ECG. (A) Adaptive cancellation of power line
interference; (B) removal of "new" ECG in a heart transplant patient; (C) cancellation
of electrosurgical noise. (From Widrow, B., Glover, J. R., Jr., McCool, J. M.,
Kaunitz, J., Williams, C. S., Hearn, R. H., Zeidler, J. R., Dong, E., Jr., and Goodlin,
R. C., Proc. IEEE, 63, 1692, 1975. With permission.)
cancelling filter to the problem of noise reduction in pilot radio communications. The noise
in the cockpit is highly nonstationary due to variations in engine speed and load. A second
microphone was placed in a suitable location in the cockpit to produce the reference signal.
In Widrow's experiments, simulated cockpit noise made the unprocessed speech unintelligible.
After LMS processing, the output power of the interference was reduced by 20 to
25 dB, rendering the interference barely perceptible to the listener. No noticeable distortion
was introduced to the speech signal itself. Other algorithms, which use only one microphone,
have been suggested.38 These, however, were ineffective for low signal-to-noise ratios.
The LMS noise canceller has also been applied to the problem of improving speech communication
for the hearing impaired.37 Noise cancellation is important, for example, in cases wherein
a hearing impaired child must function in an educational setting. A special amplification
system can be used to amplify the teacher's voice for the child. The teacher's microphone,
however, picks up not only the teacher's voice but also the environmental noise of the classroom.
A reference microphone can be placed apart from the teacher to pick up the reference noise
(which will also, in this case, contain some of the primary signal).
Applying the LMS noise canceller in a controlled environment using very noisy speech
improved the intelligibility of the speech from near zero to about 30 to 50%.
d_j = s_j + n_j   (9.27)

Let the autocorrelation of the signal be R_s(τ). We choose τ_r such that R_s(τ_r) < ε, where ε
is some small number. The delayed signal s(j − τ_r) in the reference will then be (almost)
uncorrelated with the primary signal. The reference n(j − τ_r) remains correlated with the primary
noise, since the noise is periodic.
Adaptive noise cancellers can be implemented also in the frequency dom ain.16-33
A multichannel recording of evoked potentials, where each channel represents the voltage
monitored at a different location on the scalp, may serve as an example of such a problem.
Consider the ith channel's output. Here we describe the relationships between the channels in the z
domain by means of the unknown linear filters, H_i(z⁻¹). Refer to Figure 11 for the signal model
and the adaptive filter. Each signal x(i) is used as the desired input for an LMS adaptive filter
(see Figure 5). The left side of Figure 11 is just an imaginary model. In practice, the M given
channels are used directly as primary and reference inputs to the multichannel adaptive filter.
In the process of adaptation, the weights of each LMS filter are simultaneously adjusted to
minimize the power of ε_j. After convergence of all the LMS filters, the output of the combiner
is the best least squares estimate of the delayed primary signal.
The delay in the primary channel is required to account for possible delays in the filters
H_i(z) and in the LMS filters. The distortions in the estimated signal, as well as the noise
power spectra of the output of the multichannel adaptive filter, have been calculated.20 The
multichannel adaptive signal enhancer yields a substantial reduction in background noise,
but often at the expense of considerable signal distortion21 and computation load.
FIGURE 11. Signal and noise model; multichannel adaptive filter.
The signal d(k) consists of M noisy processes s_i(k), i = 1, 2, ..., M, each one appearing
in the signal at times k_i to k_{i+1}. The assumption is made that at each time only one process
is present in the signal. As in the conventional LMS filter, a reference signal, x(k), is given.
An additional input required here is called the sequence number, σ_k. This signal provides
information concerning the type of signal, s, currently present in the signal d. Namely,
when σ_k = i, we assume that x(k) = s_i(k) + n(k). Figure 12 shows the time-sequenced
adaptive filter. For each one of the M processes appearing in the signal we set an adaptive
filter. Each filter will adapt and find the minimum point appropriate for its own process.
The filters are switched by two synchronized switches controlled by the sequence number,
σ_k. Consider the case where σ_k = i, depicted in Figure 12. The output of the LMS algorithm
is connected to the ith LMS filter, adjusting its weight vector Wⁱ. This is done using the
constant μⁱ, the ith element of the vector μ.
The output of the ith LMS filter is connected to the summer to provide the error, ε_k. Note
that when the sequence number changes, another filter will be selected. The same LMS
algorithm will now adapt the weights of the new filter using the appropriate constant,
μ. Hence the adaptation process can be written as:

Wⁱ_{k+1} = Wⁱ_k + 2μⁱε_k x_k,   for σ_k = i
Wᵐ_{k+1} = Wᵐ_k,   for m ≠ i
The computational load of the time-sequenced adaptive filter is almost the same as that
required by the conventional LMS filter, since in both cases only one vector is being
adjusted at a given time. The time-sequenced filter requires some additional operations for the
switching and the selection of the proper constants and weights. The convergence time of the filter
is longer than that of the conventional LMS filter, since here each one of the filters is
adapted only in the periods when the appropriate signal is present. Also, the memory required
for the time-sequenced filter is larger than that required by the conventional one.

FIGURE 13. Enhancement of fetal ECG: abdominal lead with the maternal ECG cancelled. (From Ferrara, E. R. and Widrow, B., IEEE Trans. Biomed. Eng., 29, 458, 1982. © 1982 IEEE. With permission.)
When σ_k is given with no error, the time-sequenced filter converges to the correct optimal
time-varying filter. In practical applications, however, the sequence number, σ, is not perfect
but is subject to "jittering", which causes the filter to be less than optimal.
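A minimal sketch of the time-sequenced adaptation described above, with one weight vector per process and only the selected vector updated at each step (the data layout and names are my own, not the author's):

import numpy as np

def time_sequenced_lms(d, X, sigma, M, mu):
    # d     : primary input, length K
    # X     : reference vectors, shape (K, L)
    # sigma : sequence numbers sigma_k in {0, ..., M-1}, indicating which of the
    #         M processes is present at time k
    # mu    : per-process adaptation constants, length M
    K, L = X.shape
    W = np.zeros((M, L))                        # one weight vector per process
    e = np.zeros(K)
    for k in range(K):
        i = sigma[k]                            # select the filter for the current process
        e[k] = d[k] - W[i] @ X[k]
        W[i] = W[i] + 2 * mu[i] * e[k] * X[k]   # only the selected filter adapts
    return W, e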
REFERENCES
1. Davisson, L. D ., A theory of adaptive filtering, IEEE Trans. Inf. Theory. 12, 97, 1966.
2. M ehra, R. K ., Approaches to adaptive filtering, IEEE Trans. Autom. Control, 17. 693, 1972.
3. Johnson, C. R ., The common parameter estimation basis of adaptive filtering, identification and control,
IEEE Trans. Acoust. Speech Signal Process.. 30, 587, 1982.
35. Verhoeckx, N . A. M ., van den Elzen, H. C ., Snijders, F. A. M ., and van Gerwen, P. J ., Digital echo
cancellation for baseband data transmission, IEEE Trans. Acoust. Speech Signal Process., 27, 768, 1979.
36. Lower, R. R., Stofer, R. C., and Shumway, N. E., Homovital transplantation of the heart, J. Thoracic
Cardiovasc. Surg., 41, 196, 1961.
37. Chabries, D. M., Christiansen, R. W., Brey, R. H., and Robinette, M. S., Application of the LMS
adaptive filter to improve speech communication in the presence of noise, Proc. ICASSP, IEEE, New York,
1982, 148.
38. L im , J. S. and Oppenheim, A. V ., Enhancement and bandwidth compression o f noisy speech, Proc.
IEEE, 67, 1586, 1979.
39. Widrow, B., Mantey, P. E., Griffiths, L. J., and Goode, B. B., Adaptive antenna systems, Proc. IEEE,
55, 2143, 1967.
40. Caraiscos, C. and Liu, B., A roundoff error analysis of the LMS adaptive algorithm, IEEE Trans. Acoust.
Speech Signal Process., 32, 34, 1984.
41. Normile, J. O. and Boland, F. M., Adaptive filtering with finite wordlength constraints, IEE Proc.,
130(E), 42, 1983.
INDEX

A

Acetylcholine (ACh), 12
ACM, see Autocorrelation method
Acquisition, 5-6
Actin, 12
Action potential, 11-12
Adaptation, 141
Adaptive algorithm, 135, 142
Adaptive delta modulation (ADM), 32
Adaptive estimation, 135
Adaptive filtering, 7, 141-160
  adaptation time, 141
  adaptive signal correction, 143-144
  adaptive signal estimation, 142-143
  adaptive system parameter identification, 142-143
  algorithm, 142
  frequency domain, 147
  general structure of filters, 142-143
  improved, 154-158
  lattice structure, 141-142
  least mean square, 141, 143-147
  line enhancement, 141
  multichannel, 141
  multichannel adaptive signal enhancement, 154-156
  noise cancellation, 141, 147-154
  performance index, 142
  time-sequenced, 141, 156-158
Adaptive filters, 73
Adaptive linear combiner, 144-145
Adaptive line enhancer (ALE), 143
  noise cancellation, 154-155
Adaptive segmentation, 101-106
ADM, see Adaptive delta modulation
Akaike information theoretic criterion (AIC), 97
ALE, see Adaptive line enhancer
Algorithms, 81
Aliasing in frequency, 30
All zero model, 83
"Almost" periodic signals, 3-4, 15
Almost stationary segments, 135
Amount of information, 122
Anaesthesia, 109
Analog processing, 29
Analog systems, 4
Analog to digital (A/D) conversion systems, 29
A priori information, 5, 7, 73, 130, 135, 141
A priori optimal filter design, 141
AR, see Autoregressive model
ARIMA, see Autoregressive integrated moving average
ARMA, see Autoregressive moving average
ARMAX, see Autoregressive moving average exogenous variables
Autocorrelation, 26, 85, 89
Autocorrelation coefficients, 93
Autocorrelation function, 5, 25
Autocorrelation matrix, 93
Autocorrelation measure (ACM) method, 102-103
Autocorrelation method, 86, 106
Autocovariance, 85
Autoregressive integrated moving average (ARIMA), 84, 99-100
Autoregressive (AR) model, 83-88
  comparison with other methods, 136
  least squares model, 85-88
  order estimation, 95-96
  spectral estimation, 109, 122-125
Autoregressive moving average (ARMA), 83, 85, 100
  Akaike information theoretic criterion, 97
  comparison with other methods, 136
  filters, 74
  lattice filters, 99
  order estimation, 95
  spectral estimation, 109, 126-132
Autoregressive moving average exogenous variables (ARMAX), 82-83
Availability of long records, 45
AV node, 150
Axon, 9-11

B

Background broad band noise, 154
Band-limited continuous signal, 29, 31
Band-limited spectrum, 135
Bartlett window, 111
Bayes' rule, 17-18
Bias, 124
Bilinear transformation method, 74
Bioelectric potentials, 9
Bioelectric signals, 6
  muscle, 12
  nerve cell, see also Nerve cell, 9-12
  neuron, 9
  origin of, 9-13
  volume conductors, 12-13
Biomedical signals, see also specific topics, 1
Blackman-Tukey algorithm, 110, 124
Blackman-Tukey procedure, 109, 111-113, 117, 124, 126, 135
  comparison with other methods, 136
Block quantizer, 36
Brain activities, 13

C

Capon's spectral estimation, 133-134, 136
CAT-computed averaged transients, see Synchronous averaging
Cell body, 9-11
Central limit theorem, 26-27

X-ray analysis, 6
Volume II
Compression and Automatic
Recognition
Author
C R C P ress, Inc.
Boca R ato n, Florida
THE AUTHOR
Volume I

Chapter 1
Introduction
I.   General Measurement and Diagnostic System ... 1
II.  Classification of Signals ... 3
III. Fundamentals of Signal Processing
IV.  Biomedical Signal Acquisition and Processing ... 5
V.   The Book ... 6
References ... 7

Chapter 2
The Origin of the Bioelectric Signal
I.   Introduction ... 9
II.  The Nerve Cell ... 9
     A.  Introduction ... 9
     B.  The Excitable Membrane ... 10
     C.  Action Potential Initiation and Propagation ... 11
     D.  The Synapse ... 11
III. The Muscle ... 12
     A.  Muscle Structure ... 12
     B.  Muscle Contraction ... 12
IV.  Volume Conductors ... 12
References ... 13

Chapter 3
Random Processes
I.   Introduction ... 15
II.  Elements of Probability Theory ... 15
     A.  Introduction ... 15
     B.  Joint Probabilities ... 16
     C.  Statistically Independent Events ... 17
     D.  Random Variables ... 17
     E.  Probability Distribution Functions ... 18
     F.  Probability Density Functions ... 19
III. Random Signals Characterization ... 21
     A.  Random Processes ... 21
     B.  Statistical Averages (Expectations)
IV.  Correlation Analysis ... 23
     A.  The Correlation Coefficient ... 23
     B.  The Correlation Function ... 25
     C.  Ergodicity ... 26
V.   The Gaussian Process ... 26
     A.  The Central Limit Theorem ... 26
     B.  Multivariate Gaussian Process ... 27
References ... 28
Chapter 4
Digital Signal Processing
I.   Introduction ... 29
II.  Sampling ... 29
     A.  Introduction ... 29
     B.  Uniform Sampling ... 30
     C.  Nonuniform Sampling ... 31
         1.  Zero, First, and Second Order Adaptive Sampling ... 32
         2.  Nonuniform Sampling with Run Length Encoding ... 34
III. Quantization ... 36
     A.  Introduction ... 36
     B.  Zero Memory Quantization
     C.  Analysis of Quantization Noise ... 39
     D.  Rough Quantization ... 40
IV.  Discrete Methods ... 42
     A.  The Z Transform ... 42
     B.  Difference Equations ... 43
References ... 44

Chapter 5
Finite Time Averaging
I.   Introduction ... 45
II.  Finite Time Estimation of the Mean Value ... 45
     A.  The Continuous Case ... 46
         1.  Short Observation Time ... 47
         2.  Long Observation Time ... 48
     B.  The Discrete Case ... 51
III. Estimation of the Variance and Correlation ... 53
     A.  Variance Estimation — The Continuous Case ... 53
     B.  Variance Estimation — The Discrete Case ... 54
     C.  Correlation Estimation ... 56
IV.  Synchronous Averaging (CAT-Computed Averaged Transients) ... 56
     A.  Introduction ... 56
     B.  Statistically Independent Responses ... 58
     C.  Totally Dependent Responses ... 59
     D.  The General Case ... 60
     E.  Records Alignment, Estimation of Latencies ... 61
References ... 64
Chapter 6
Frequency Domain Analysis
I.   Introduction ... 65
     A.  Frequency Domain Representation ... 65
     B.  Some Properties of the Fourier Transform ... 65
         1.  The Convolution Theorem ... 66
         2.  Parseval's Theorem ... 66
         3.  Fourier Transform of Periodic Signals ... 67
     C.  Discrete and Fast Fourier Transforms (DFT, FFT) ... 68
II.  Spectral Analysis ... 71
     A.  The Power Spectral Density Function ... 71
     B.  Cross-Spectral Density and Coherence Functions
III. Linear Filtering ... 73
     A.  Introduction ... 73
     B.  Digital Filters ... 74
     C.  The Wiener Filter ... 74
IV.  Cepstral Analysis and Homomorphic Filtering ... 76
     A.  Introduction ... 76
     B.  The Cepstra ... 76
     C.  Homomorphic Filtering ... 77
References ... 80
Chapter 7
Time Series Analysis-Linear Prediction
I.    Introduction ... 81
II.   Autoregressive (AR) Models ... 85
      A.  Introduction ... 85
      B.  Estimation of AR Parameters — Least Squares Method ... 85
III.  Moving Average (MA) Models ... 89
      A.  Autocorrelation Function of MA Process ... 89
      B.  Iterative Estimate of the MA Parameters ... 89
IV.   Mixed Autoregressive Moving Average (ARMA) Models ... 90
      A.  Introduction ... 90
      B.  Parameter Estimation of ARMA Models — Direct Method ... 90
      C.  Parameter Estimation of ARMA Models — Maximum Likelihood Method ... 93
V.    Process Order Estimation ... 95
      A.  Introduction ... 95
      B.  Residuals Flatness ... 95
      C.  Final Prediction Error (FPE) ... 96
      D.  Akaike Information Theoretic Criterion (AIC) ... 97
      E.  Ill Conditioning of Correlation Matrix ... 98
VI.   Lattice Representation ... 98
VII.  Nonstationary Processes ... 99
      A.  Trend Nonstationarity — ARIMA ... 99
      B.  Seasonal Processes ... 101
VIII. Adaptive Segmentation ... 101
      A.  Introduction ... 101
      B.  The Autocorrelation Measure (ACM) Method ... 102
      C.  Spectral Error Measure (SEM) Method ... 103
      D.  Other Segmentation Methods ... 105
References ... 106
Chapter 8
Spectral Estimation
I.   Introduction ... 109
II.  Methods Based on the Fourier Transform ... 110
     A.  Introduction ... 110
     B.  The Blackman-Tukey Method ... 111
     C.  The Periodogram ... 112
         1.  Introduction ... 112
         2.  The Expected Value of the Periodogram ... 114
         3.  Variance of the Periodogram ... 116
         4.  Weighted Overlapped Segment Averaging (WOSA) ... 117
         5.  Smoothing the Periodogram ... 119
III. Maximum Entropy Method (MEM) and the AR Method ... 122
IV.  The Moving Average (MA) Method ... 125
V.   Autoregressive Moving Average (ARMA) Methods ... 126
     A.  The General Case ... 126
     B.  Pisarenko's Harmonic Decomposition (PHD) ... 127
     C.  Prony's Method ... 130
VI.  Maximum Likelihood Method (MLM) — Capon's Spectral Estimation ... 133
VII. Discussion and Comparison of Several Methods ... 134
References ... 137

Chapter 9
Adaptive Filtering
I.   Introduction ... 141
II.  General Structure of Adaptive Filters ... 142
     A.  Introduction ... 142
     B.  Adaptive System Parameter Identification ... 142
     C.  Adaptive Signal Estimation ... 142
     D.  Adaptive Signal Correction ... 143
III. Least Mean Squares (LMS) Adaptive Filter ... 143
     A.  Introduction ... 143
     B.  Adaptive Linear Combiner ... 144
     C.  The LMS Adaptive Algorithm ... 145
     D.  The LMS Adaptive Filter ... 147
IV.  Adaptive Noise Cancelling ... 147
     A.  Introduction ... 147
     B.  Noise Canceller with Reference Input ... 148
     C.  Noise Canceller without Reference Input ... 153
     D.  Adaptive Line Enhancer (ALE) ... 154
V.   Improved Adaptive Filtering ... 154
     A.  Multichannel Adaptive Signal Enhancement ... 154
     B.  Time-Sequenced Adaptive Filtering ... 156
References ... 158

Index ... 161
Volume II

Chapter 1
Wavelet Detection
I.   Introduction ... 1
II.  Detection by Structural Features ... 2
     A.  Simple Structural Algorithms ... 2
     B.  Contour Limiting ... 5
III. Matched Filtering ... 6
IV.  Adaptive Wavelet Detection ... 9
     A.  Introduction ... 9
     B.  Template Adaptation ... 10
     C.  Tracking a Slowly Changing Wavelet ... 12
     D.  Correction of Initial Template ... 12
V.   Detection of Overlapping Wavelets ... 14
     A.  Statement of the Problem ... 14
     B.  Initial Detection and Composite Hypothesis Formulation ... 15
     C.  Error Criterion and Minimization ... 16
References ... 17

Chapter 2
Point Processes
I.   Introduction ... 19
II.  Statistical Preliminaries ... 20
III. Spectral Analysis ... 24
     A.  Introduction ... 24
     B.  Interevent Intervals Spectral Analysis ... 24
     C.  Counts Spectral Analysis ... 25
IV.  Some Commonly Used Models ... 26
     A.  Introduction ... 26
     B.  Renewal Processes ... 26
         1.  Serial Correlogram ... 27
         2.  Flatness of Spectrum ... 27
         3.  A Nonparametric Trend Test ... 28
     C.  Poisson Processes ... 28
     D.  Other Distributions ... 31
         1.  The Weibull Distribution ... 31
         2.  The Erlang (Gamma) Distribution ... 32
         3.  Exponential Autoregressive Moving Average (EARMA) ... 32
         4.  Semi-Markov Processes ... 32
V.   Multivariate Point Processes ... 33
     A.  Introduction ... 33
     B.  Characterization of Multivariate Point Processes ... 33
     C.  Marked Processes ... 35
References ... 35

Chapter 3
Signal Classification and Recognition
I.   Introduction ... 37
II.  Statistical Signal Classification ... 39
     A.  Introduction ... 39
     B.  Bayes Decision Theory and Classification ... 39
     C.  k-Nearest Neighbor (k-NN) Classification ... 50
III. Linear Discriminant Functions ... 53
     A.  Introduction ... 53
     B.  Generalized Linear Discriminant Functions ... 55
     C.  Minimum Squared Error Method ... 56
     D.  Minimum Distance Classifiers ... 58
     E.  Entropy Criteria Methods ... 60
         1.  Introduction ... 60
         2.  Minimization of Entropy ... 60
         3.  Maximization of Entropy ... 62
IV.  Fisher's Linear Discriminant ... 63
V.   Karhunen-Loeve Expansions (KLE) ... 66
     A.  Introduction ... 66
     B.  Karhunen-Loeve Transformation (KLT) — Principal Components Analysis (PCA) ... 67
     C.  Singular Value Decomposition (SVD) ... 69
VI.  Direct Feature Selection and Ordering ... 75
     A.  Introduction ... 75
     B.  The Divergence ... 76
     C.  Dynamic Programming Methods ... 77
VII. Time Warping ... 79
References ... 84
Chapter 4
Syntactic Methods
I.   Introduction ... 87
II.  Basic Definitions of Formal Languages ... 89
III. Syntactic Recognizers ... 92
     A.  Introduction ... 92
     B.  Finite State Automata ... 92
     C.  Context-Free Push-Down Automata (PDA) ... 95
     D.  Simple Syntax-Directed Translation ... 100
     E.  Parsing ... 100
IV.  Stochastic Languages and Syntax Analysis ... 101
     A.  Introduction ... 101
     B.  Stochastic Recognizers ... 102
V.   Grammatical Inference ... 104
VI.  Examples ... 104
     A.  Syntactic Analysis of Carotid Blood Pressure ... 104
     B.  Syntactic Analysis of ECG ... 106
     C.  Syntactic Analysis of EEG ... 110
References ... 111
Appendix A
Characteristics of Some Dynamic Biomedical Signals
I.    Introduction ... 113
II.   Bioelectric Signals ... 113
      A.  Action Potential ... 113
      B.  Electroneurogram (ENG) ... 113
      C.  Electroretinogram (ERG) ... 113
      D.  Electro-Oculogram (EOG) ... 114
      E.  Electroencephalogram (EEG) ... 114
      F.  Evoked Potentials (EP) ... 117
      G.  Electromyography (EMG) ... 119
      H.  Electrocardiography (ECG, EKG) ... 121
          1.  The Signal ... 121
          2.  High-Frequency Electrocardiography ... 124
          3.  Fetal Electrocardiography (FECG) ... 124
          4.  His Bundle Electrocardiography (HBE) ... 124
          5.  Vector Electrocardiography (VCG) ... 124
      I.  Electrogastrography (EGG) ... 124
      J.  Galvanic Skin Reflex (GSR), Electrodermal Response (EDR) ... 125
III.  Impedance ... 125
      A.  Bioimpedance ... 125
      B.  Impedance Plethysmography ... 126
      C.  Rheoencephalography (REG) ... 126
      D.  Impedance Pneumography ... 126
      E.  Impedance Oculography (ZOG) ... 126
      F.  Electroglottography ... 126
IV.   Acoustical Signals ... 126
      A.  Phonocardiography ... 126
          1.  The First Heart Sound ... 126
          2.  The Second Heart Sound ... 127
          3.  The Third Heart Sound ... 127
          4.  The Fourth Heart Sound ... 127
          5.  Abnormalities of the Heart Sound ... 127
      B.  Auscultation ... 127
      C.  Voice ... 128
      D.  Korotkoff Sounds ... 129
V.    Mechanical Signals ... 130
      A.  Pressure Signals ... 130
      B.  Apexcardiography (ACG) ... 130
      C.  Pneumotachography ... 130
      D.  Dye and Thermal Dilution ... 130
      E.  Fetal Movements ... 131
VI.   Biomagnetic Signals ... 131
      A.  Magnetoencephalography (MEG) ... 131
      B.  Magnetocardiography (MCG) ... 131
      C.  Magnetopneumography (MPG) ... 131
VII.  Biochemical Signals ... 131
VIII. Two-Dimensional Signals ... 132
References ... 134
Appendix B
Data Lag Windows
I.   Introduction ... 139
II.  Some Classical Windows ... 139
     A.  Introduction ... 139
     B.  Rectangular (Dirichlet) Window ... 140
     C.  Triangle (Bartlett) Window ... 140
     D.  Cosine Windows ... 141
     E.  Hamming Window ... 143
     F.  Dolph-Chebyshev Window ... 145
References ... 151
Appendix C
Computer Programs
I.   Introduction ... 153
II.  Main Programs ... 154
     •  NUSAMP (Nonuniform Sampling) ... 154
     •  SEGMNT (Adaptive Segmentation) ... 158
     •  PERSPT (Periodogram Power Spectral Density Estimation) ... 162
     •  WOSA (WOSA Power Spectral Density Estimation) ... 162
     •  MEMSPT (Maximum Entropy [MEM] Power Spectral Density Estimation) ... 165
     •  NOICAN (Adaptive Noise Cancelling) ... 167
     •  CONLIM (Wavelet Detection by the Contour Limiting Method) ... 169
     •  COMPRS (Reduction of Signal Dimensionality by Three Methods: Karhunen-Loeve [KL], Entropy [ENT], and Fisher Discriminant [FI]) ... 171
III. Subroutines ... 174
     •  LMS (Adaptive Linear Combiner, Widrow's Algorithm) ... 174
     •  NACOR (Normalized Autocorrelation Sequence) ... 175
     •  DLPC (LPC, PARCOR, and Prediction Error of AR Model of Order P) ... 176
     •  DLPC20 (LPC, PARCOR, and Prediction Error of All AR Models of Order 2 to 20) ... 177
     •  FTOIA (Fast Fourier Transform [FFT]) ... 178
     •  XTERM (Maximum and Minimum Values of a Vector) ... 180
     •  ADD (Addition and Subtraction of Matrices) ... 180
     •  MUL (Matrix Multiplication) ... 181
     •  MEAN (Mean of a Set of Vectors) ... 181
     •  COVA (Covariance Matrix of a Cluster of Vectors) ... 182
     •  INVER (Inversion of a Real Symmetric Matrix) ... 183
     •  SYMINV (Inversion of a Real Symmetric Matrix, Original Matrix Destroyed) ... 183
     •  RFILE (Read Data Vector From Unformatted File) ... 185
     •  WFILE (Write Data Vector on Unformatted File) ... 186
     •  RFILEM (Read Data Matrix From Unformatted File) ... 187
     •  WFILEM (Write Data Matrix on Unformatted File) ... 188

Index ... 189
Volume II: Compression and Automatic Recognition
Chapter 1
WAVELET DETECTION
I. INTRODUCTION
where
S(t) = Σ_i G_i S_i(t - t_i)    (1.1B)
with
0 ≤ t_i ≤ T    (1.1C)
and
(1/T_s) ∫_{t_i}^{t_i + T_s} S_i²(t - t_i) dt = 1    for all i    (1.1D)
In general, the wavelets S_i(t) are stochastic and our goal is to determine the expectation E{S_i(t)}. In some cases the wavelets are deterministic and we are interested in the exact shape of each one (for example, consider the problem of single evoked potential estimation). Sometimes it will be sufficient to get the mean of several wavelets, say
S̄(τ) = (1/I) Σ_{i=1}^{I} S_i(τ)
where the gains, G_i's, are now functions of time and the noise process, n(t), is nonstationary.
The signal, S(t), consists of a series of wavelets S_i, each with gain G_i, appearing at times t_i. Each wavelet is finite in duration and its energy is normalized to 1. We shall also assume, for the time being, that the wavelets do not overlap, namely,
A priori knowledge of the wavelets, S_i(t), is given in terms of the estimate S̄(t), which is termed "the template". The template S̄(t) is zero outside the range 0 ≤ t ≤ T_s < T.
The problem of wavelet detection can now be formulated as follows. Given the initial template, S̄(t), estimate the occurrence times, t_i, the exact shapes, S_i(t), and the gains, G_i. This is, of course, the general problem. For some applications it may be sufficient just to determine whether a wavelet was present in a given time window.
The sampled version of Equation 1.2 is given by:
with
and
(1/M) Σ_{k=k_i}^{k_i+M-1} S_i²(k - k_i) = 1    (1.4C)
where the sampling interval was assumed to be one, without loss of generality. (If the sampling interval is important, simply replace k and M by Δt·k and Δt·M, etc.)
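To make the sampled model concrete, the short sketch below builds a synthetic record from one fixed wavelet shape placed at known occurrence times k_i with random gains G_i and additive noise, following the non-overlap and unit-energy assumptions above. The template shape, gain statistics, and noise level are arbitrary choices for illustration, not values taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 64                                       # wavelet duration in samples
template = np.hanning(M) * np.sin(2 * np.pi * np.arange(M) / M)
template /= np.sqrt(np.mean(template ** 2))  # normalize so that (1/M) * sum(S^2) = 1, as in Eq. 1.4C

N = 2000                                     # record length
k_i = [200, 700, 1250, 1700]                 # assumed occurrence times (non-overlapping)
G_i = rng.normal(1.0, 0.2, size=len(k_i))    # assumed random gains

signal = np.zeros(N)
for k, g in zip(k_i, G_i):
    signal[k:k + M] += g * template          # place each wavelet at its occurrence time

x = signal + rng.normal(0.0, 0.3, size=N)    # observed signal: wavelets plus noise
```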
Sophisticated algorithms are available to detect the presence of a wavelet by analyzing its structure. These syntactic methods are discussed in Chapter 4. Several methods exist for shape analysis of waveforms. General descriptors like the Fourier descriptors,20 polygonal approximations,21 and others are used mainly for the analysis of two-dimensional pictures, but also for one-dimensional signals. Most often, however, simpler methods are used, specifically designed for the application at hand. The advantage of these methods is their simplicity and the ability to implement them on relatively inexpensive, dedicated hardware. The main disadvantage, however, is the fact that each method is specific to a given wavelet and cannot be generally applied. These schemes are usually rigid and do not lend themselves to adaptation. They are applied mainly to QRS detection22 and to the detection of wavelets in the EEG.23 Since methods of the type discussed here depend on the wavelet, the best way to describe them is by an example.
large positive or negative bias in the signal. Second, line frequency noise may sometimes interfere. To overcome these problems, we shall use the sequence of first differences of the observed ECG signal (Equation 1.4). Assume the line frequency is 60 Hz and that we sample the ECG at a rate of 1800 samples per second.
Consider the first difference:
Note that the difference interval is 30/1800 = 1/60 sec, namely, one period of the line frequency. The first difference is thus synchronized to the line interference in such a way that the interference does not appear in x(k) (see the discussion of seasonal time series in Chapter 7, Volume I). Note also that baseline shifts, which are usually much slower than the power line interference, will also be eliminated by Equation 1.5.
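The cancellation can be checked numerically: a 60 Hz sinusoid sampled at 1800 samples/sec repeats exactly every 30 samples, so the 30-sample difference removes it, and a slow drift is reduced to a negligible constant step. The amplitudes below are arbitrary.

```python
import numpy as np

fs, f_line = 1800, 60
k = np.arange(4 * fs)                            # a few seconds of samples
interference = 0.5 * np.sin(2 * np.pi * f_line * k / fs)
drift = 0.2 * k / k[-1]                          # slow baseline wander
y = interference + drift

lag = fs // f_line                               # 30 samples = one full line period
x = y[lag:] - y[:-lag]                           # lag-30 first difference (assumed form of Eq. 1.5)
print(np.max(np.abs(x)))                         # ~1e-3: interference gone, only a tiny drift step remains
```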
The wavelet present in the signal, x(k), will now be transformed to

Equation 1.6 may increase the sensitivity of the algorithm to noise, since it is analogous to differentiation.24 Since the difference operator has removed most of the baseline shift, we can now apply threshold techniques. Consider the following threshold procedure.25 Let
X_M = min_k x(k)

The presence of the ith QRS wavelet is determined at k_i, the ith time that the threshold is crossed:

x(k) ≤ THR    (1.9)
This algorithm has not used all the structural characteristics of the QRS. The R wave consists of an upslope of typical slope and duration26 followed immediately by a downslope with characteristic slope and duration. Condition 1.9 can be considered a hypothesis for a QRS complex. The hypothesis can be accepted if the upslope and downslope in the neighborhood of k_i meet the R wave specifications.
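A minimal sketch of the detector just described: form the lag-30 difference, set a threshold as a fixed fraction of the extreme (most negative) value of the differenced signal, and declare a QRS each time the threshold is crossed, with a refractory period to avoid multiple detections per beat. The fraction, the refractory period, and the use of the global minimum are assumptions, since Equations 1.5 to 1.8 are not reproduced in this copy.

```python
import numpy as np

def detect_qrs(y, fs=1800, line_freq=60, thr_fraction=0.7, refractory_s=0.2):
    """Threshold detection on the lag difference of an ECG record y (1-D array)."""
    lag = fs // line_freq                     # 30 samples, one period of the line frequency
    x = y[lag:] - y[:-lag]                    # difference signal, free of 60 Hz and baseline drift
    thr = thr_fraction * x.min()              # assumed rule: fraction of the most negative excursion
    refractory = int(refractory_s * fs)

    detections, last = [], -refractory
    for k, v in enumerate(x):
        if v <= thr and k - last > refractory:    # threshold crossing, as in Eq. 1.9
            detections.append(k + lag)            # index referred back to the original record
            last = k
    return np.array(detections)
```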
The accuracy of the QRS detector is important, especially when high frequency ECG is of interest (see Appendix A). The inaccuracies in the QRS time detection, known as jitter, in simple threshold detection systems were discussed by Uijen and co-workers.30
Many other algorithms for QRS detection have been suggested.27-29 A well-known algorithm is the amplitude zone time epoch coding (AZTEC).38 This algorithm has been developed for real time ECG analysis and compression. AZTEC analyzes the ECG waveform and extracts two basic structural features: plateaus and slopes. A plateau is given in terms of its amplitude and duration, and a slope by its final elevation and duration. Another algorithm, the coordinate reduction time encoding system (CORTES),39 has been suggested as an improvement on AZTEC.
B. Contour Limiting
The methods discussed in the previous section suffer from the fact that they are "tailored" to a specific wavelet. The method of contour limiting31,32 is more flexible in the sense that it can easily be adapted to various wavelets. Contour limiting uses a template, which is some typical wavelet. The template can be constructed from a priori knowledge about the wavelet or by averaging given wavelets that were detected and aligned manually. Knowledge about the expected variation of each sample of the wavelet is also required. From this knowledge upper and lower limits are constructed. Consider the observation vector x(k)
of Equation 1.4. The upper limit S+(k) and lower limit S-(k) are given by

S+(k) = GS̄(k) + L+(k)    (1.10A)
S-(k) = GS̄(k) - L-(k),    k = 1, 2, ..., M    (1.10B)

where GS̄(k) is the template and L+(k) and L-(k) are functions derived from the variations of the template. These can be taken, for example, to be the estimated variance at each point.
Detection is performed as follows. At time k, an observation vector x(k) is formed from the observation signal x(k), such that

In the same manner, the upper and lower limit vectors S+ and S- are defined:
FIGURE 2. Detection of QRS complex by the contour limiting method: upper trace, noisy ECG signal; lower trace, PQRST complex detection.
At each time k, the observation vector is compared with the limits. A wavelet is assumed to be present in the observation window if

S- ≤ x(k) ≤ S+    (1.14)

If Equation 1.14 does not hold, the observation window is assumed not to contain a wavelet. The data in the observation vector is shifted by one sample and the new vector x(k + 1) is checked by Equation 1.14.
No general rules can be given concerning the exact construction of the limit functions L+(k) and L-(k). Making these limits large will sometimes allow noise to be recognized
as a wavelet (false positive). Making the limits small will cause some wavelets to be rejected
(false negative). The decision as to the required safety margin depends on the relative
importance of the above two errors to the particular application. Equation 1.14 can be relaxed
by requiring that only a certain fraction, say 90%, of the M elements of the observation
vector x(k) obey Equation 1.14. This will reduce the sensitivity to noise.
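The contour-limiting test of Equation 1.14, including the relaxed version that accepts a window when a chosen fraction of its M samples lie inside the limits, can be sketched as follows. Here the limit functions L+(k) and L-(k) are taken as a multiple of the per-sample standard deviation of aligned training wavelets, which is one of the options mentioned above; the multiplier and the fraction are assumptions.

```python
import numpy as np

def build_limits(training_wavelets, n_sigma=3.0):
    """training_wavelets: array of shape (n_wavelets, M), detected and aligned manually."""
    template = training_wavelets.mean(axis=0)            # the template GS(k)
    spread = n_sigma * training_wavelets.std(axis=0)     # L+(k) = L-(k), assumed symmetric
    return template + spread, template - spread          # S+(k), S-(k) of Eq. 1.10

def contains_wavelet(x_window, s_plus, s_minus, fraction=0.9):
    """Relaxed Equation 1.14: require `fraction` of the samples to fall between the limits."""
    inside = (x_window >= s_minus) & (x_window <= s_plus)
    return inside.mean() >= fraction
```

Sliding a length-M window over the signal one sample at a time and applying contains_wavelet reproduces the detection loop described above.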
and detect the QRS complex of the signal of Example 1.1 by means of Equation 1.14. Figure 2 shows the detection results.
III. MATCHED FILTERING
require that the filter will cause the wavelet (when present) to be amplified while relatively attenuating the noise, thus increasing the signal-to-noise ratio. The output of this filter will be subjected to a threshold to detect the presence of a wavelet. Assume that we want to consider an MA filter (Chapter 7, Volume I). The output of the filter is given by the observation vector y^T(k) = [y(k - 1), ..., y(k - M + 1)], where
y(m) = Σ_{j=0}^{M-1} b_j x(m - j)    (1.15)
We have chosen a filter of order M so that if a wavelet is present, the output of the filter will contain information about the complete wavelet. Consider now the following signal-to-noise ratio, SNR_0, at the output of the filter.

The numerator of Equation 1.16 is the power of the difference between the filter's output with and without a wavelet at the input. The variance of the output noise serves as a normalization factor. Let us arbitrarily choose the ith wavelet to represent the time frame including a wavelet; then m = k_i + M - 1 and
E{y(m) | x(m) = G_i S_i + n} = Σ_{j=0}^{M-1} b_j E{G_i S_i(M - 1 - j)}    (1.17)
where we have used the assumption that the noise has zero mean:

E{y(m) | x(m) = n} = 0    (1.18)

Var{y(m) | x(m) = n} = σ_n² Σ_{j=0}^{M-1} b_j²    (1.19)
(1.20)
The right-hand term in Equation 1.20 is due to the Schwarz inequality. The signal-to-noise ratio of Equation 1.20 is maximized when equality occurs. This takes place for

GS̄(M - 1 - j) = E{G_i S_i(M - 1 - j)},    j = 0, 1, ..., M - 1    (1.22)

b_j = K·GS̄(M - 1 - j),    j = 0, 1, ..., M - 1    (1.23)

where K is an arbitrary constant; we shall choose K = 1. This optimal filter is known as the matched filter.
We can rewrite the filter's coefficients in vector form:

b = GS̄    (1.26)
where x(m) is defined in Equation 1.11. The last equation states that the matched filter is equivalent to a cross correlator, cross correlating the observation window x(m) with the template. The maximum signal-to-noise ratio for the matched filter is obtained by introducing Equation 1.26 into Equation 1.20.
The matched filter procedure can be summarized as follows. We estimate the template GS̄ and store it in memory. For each new sample of the incoming signal, x(k), we form the observation vector x(k) (by opening a window of M samples). We cross correlate the template and the observation window to get the kth sample of the output. This we compare with the threshold. The observation window for which y(k) has crossed the threshold is considered to contain a wavelet. Correlation-based detection procedures have been applied to biomedical signals.34-37 Note that here, as in the previous discussion, we only determine the presence or absence of a wavelet in the observation window, but not its exact shape.
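Since the matched filter reduces to cross correlating the observation window with the stored template and thresholding the output, the whole procedure fits in a few lines. The threshold is left as a free parameter; choosing it (for example from a training record) is outside this sketch.

```python
import numpy as np

def matched_filter_detect(x, template, threshold):
    """Cross correlate x with the template (Eq. 1.26) and threshold the output."""
    y = np.correlate(x, template, mode="valid")   # y(k) = sum_j template[j] * x[k + j]
    hits = np.flatnonzero(y >= threshold)         # window positions whose output crosses the threshold
    return y, hits
```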
IV. ADAPTIVE WAVELET DETECTION
A. Introduction
We shall consider now the problem of wavelet detection while estimating and adapting the template.34 This is required when the a priori information is insufficient or when the wavelets are nonstationary and the template has to track the slow variations in the wavelets. We consider here a modification of the filter discussed in the previous section.
Consider the average squared error e²(k, M) between the signal x(k) and the estimated wavelet:

The best least squares estimate of the gain, G_0(k), is the one that minimizes the error of Equation 1.29. This optimal estimated gain is given by

G_0(k) = x^T(k)S̄ / (S̄^T S̄)    (1.30)
At time k_j, when the observation vector contains the jth wavelet, S_j, Equation 1.30 becomes
For a stationary noise process and large M, the last term of Equation 1.31B is almost constant. Since the noise and wavelets are independent, the third term becomes small. The first two terms of Equation 1.31B denote the error between the energy of the wavelet G_jS_j and its estimate G(k_j)S̄. This term will yield a minimum at times k_j. Detection of the time of occurrence k_j is thus achieved by finding the local minima of the error e².
The minima at times k_j are local minima. For example, in the noise-free case, the estimated gain and the error will be zero in segments with no wavelet, while the minimum error at times k_j will be some local minimum, probably above zero. To produce an improved error function we shall introduce a weighting function. This function will ensure higher error values for data associated with improbable gains. Suppose the gains G_j are random variables with known probability distribution P(G). Define a weighting function W(G):
w^ith known probability distribution P(G). Define a weighting function W(G):
P(E{G})
W(G) = f ( G ) P(G) (L32)
«
The weighted error e_w²(k, M) is

and is inversely proportional to the gain probability distribution. The function f(G) in Equation 1.32 is chosen such that

The weighting function assures high errors for very low gains and reduces the error for the more probable ones. Since for low gains the error approaches zero as G², the function f(G) must obey:
For the examples described in this section, the gains were assumed to be Gaussian distributed and the function f(G) was chosen as

f(G) = [exp(G/E{G} - 1) / (G/E{G})]^γ,    γ > 2    (1.36)
The parameter γ is heuristically determined; the other parameters of Equation 1.36 are determined from a priori knowledge of the wavelet statistics or by sample estimation during a training stage.
Assuming the correlation window is large enough that R_{S̄n}(k) ≈ 0, it can be shown that the expectation of the weighted error can be approximated by
E{e;.(k,M)} = ( G k | ) ( G k( R s - ^ ) )
@ k = kj (1.37)
The detection of the presence of a wavelet in the signal is performed by placing a threshold on the weighted error function. The value of the threshold level LIM is experimentally determined using a training signal. Assume an initial sample of the analyzed signal serves as a training signal. This record is analyzed, for example visually by a trained person, with L wavelets detected at times k_i, i = 1, 2, ..., L. The unweighted error is calculated and its mean is estimated by

Ê{e²(k, M)}_min = (1/L) Σ_{i=1}^{L} e²(k_i, M)    (1.38)
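Putting the pieces of this subsection together, a rough sketch of the weighted-error detector looks as follows: at each shift the gain is estimated by least squares against the template (Equation 1.30), the residual error e²(k, M) is formed, and it is multiplied by a weight that penalizes improbable gains in the spirit of Equations 1.32 and 1.36. Detections are the local minima of the weighted error below the trained threshold LIM. The weight used here is a stand-in with the right qualitative behavior, not the book's exact function.

```python
import numpy as np

def weighted_error_sequence(x, template, mean_gain=1.0, gamma=3.0):
    """Gain estimate and weighted squared error at every window position."""
    M = template.size
    s_energy = np.dot(template, template)
    err_w = np.empty(x.size - M + 1)
    for k in range(err_w.size):
        window = x[k:k + M]
        g = np.dot(window, template) / s_energy        # least squares gain estimate, Eq. 1.30
        e2 = np.mean((window - g * template) ** 2)     # average squared error e^2(k, M)
        r = max(abs(g) / mean_gain, 1e-6)              # gain relative to its expected value
        weight = (np.exp(r - 1.0) / r) ** gamma        # assumed stand-in for W(G): large for unlikely gains
        err_w[k] = weight * e2
    return err_w
```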
B. Template Adaptation
The need to adapt the template arises in two cases: when the initial information concerning the wavelet is insufficient, and thus S̄_0 has to be improved, and when the wavelets are slowly changing in time so that tracking action is required. Template adaptation is performed each time a wavelet is detected. Assume that at time m_k the kth wavelet was detected; then adaptation is achieved according to
where S̄_k is the adapted template, S̄_{k-1} is the previous template, and p(k) and ψ(k) are weights. The current template is thus a linear combination of the last template and the current observation signal. The kth template can be expressed in terms of the (k - N)th template:
S̄_k = P_{N-1}(k) S̄_{k-N} + S_k + n_k,    k ≥ N    (1.41A)

where

P_x(k) = ∏_{j=0}^{x} p(k - j)  for x ≥ 0;    P_x(k) = 0  for x < 0    (1.41B)

S_k = Σ_{j=0}^{N-1} P_{j-1}(k) ψ(k - j) G_{k-j} S_{k-j}    (1.41C)
The signal-to-noise ratio of the observation vector at time m_k is similarly given by:

SNR_x = E{G² S^T S} / E{n^T n} = R_S(m_k) E{G²} / σ_n²    (1.45)

The ratio SNR_S̄/SNR_x given by Equations 1.44 and 1.45 is a measure of the relative noisiness of the adapted template. We shall now deal separately with the two cases of adaptation.
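The two cases below differ mainly in how the weights p(k) and ψ(k) are scheduled; the update step itself, a weighted combination of the previous template and the newly detected wavelet window, can be sketched as a single function. The renormalization line is an assumption added to keep the template consistent with the unit-energy convention of Equation 1.4C.

```python
import numpy as np

def adapt_template(prev_template, wavelet_window, p=0.9, psi=0.1):
    """One adaptation step: new template = p * previous template + psi * detected wavelet window."""
    new_template = p * prev_template + psi * wavelet_window
    new_template /= np.sqrt(np.mean(new_template ** 2))   # assumed renormalization to unit average power
    return new_template
```

For slow tracking, p and psi are kept fixed; for correcting a poor initial template, p(k) can follow the exponential schedule given below so that adaptation effectively stops after about 5/α detections.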
C. Tracking a Slowly Changing Wavelet
Since only slow changes in the wavelets are allowed, one can assume "almost" stationary conditions with

Taking the expectation of Equation 1.40, with E{n} = 0, and the assumption of Equation 1.47 yields

The kth estimate of the template is given from Equations 1.41 and 1.46:
The relative noisiness of the estimate is given from Equations 1.44 to 1.46 by
Q ^ p C l (1.50)
In selecting the adaptation coefficients p and ψ, both the tracking rate and the noisiness of the template have to be considered. Define the tracking coefficient T_c to be the ratio between the template (GS̄) part and the initial estimate (S̄_{k-N}) part of Equation 1.49; hence

T_c = ψG [(1 - p^{N+1}) / (1 - p)] p^{-N},    0 ≤ p < 1    (1.51)
In order to optimally select the adaptation coefficients, we shall define a cost function, I_c:

I_c = η(T_c) (SNR_S̄ - SNR_x) / SNR_x    (1.52)
estimate of the template. Here, the adaptation coefficients are time dependent. An exponential decay has been chosen such that

p(k) = 1 - (1 - p_0) e^{-αk},    0 ≤ p_0 ≤ 1

The adapted template is thus a weighted average of about K = 5/α initial templates. After this period, ψ(k) → 0 and the adaptation process is terminated. A priori knowledge and estimation of the goodness of the initial template S̄_0 allow the determination of α.
In this case, the convergence rate coefficient C_c will be defined instead of the tracking coefficient (Equation 1.51):

C_c(k) = [ΔE(0) - ΔE(k)] / ΔE(0)    (1.54)

where
ΔE(k) is thus the mean square error between the current template and the wavelet. The relation between the adaptation coefficients α, p_0, ψ_0 is given by the maximization of the cost function

I_c = [SNR_S̄(k) - SNR_x] / SNR_x    (1.56)
Note that here, the assumption in Equation 1.47 can be used only with great care since
templates may be highly nonstationary.
FIGURE 3. Detection error for real ECG signal. (A) Template; (B), (C) signal and weighted error, SNR = infinity; (D), (E) signal and weighted error, SNR = 2.48; (F), (G) signal and weighted error, SNR = 1.24. (From Cohen, A. and Landsberg, D., IEEE Trans. Biomed. Eng., BME-30, 332, 1983. © 1983 IEEE. With permission.)
τ ∈ (t, t - T)    (1.57)
As before, the G_j's and τ_j's are unknown and n(t) is white noise. De Figueiredo's algorithm is implemented in two steps.
Z_j(τ) = ∫ |G_j S̄_j(t - τ) - x(t)| dt,    j = 1, 2, ..., n    (1.58)
The output of the filter is achieved by integrating (over the window) the absolute value of the difference between the observation and the template. This procedure is repeated while shifting the template with various delays, τ. Note that these filters require no multiplications. The point where Z_j(τ) attains its minimum is the most likely location of S_j in the composite wavelet and serves as the first estimate for τ_j and for the actual participation of S_j. The minima points of all Z_j(τ), j = 1, 2, ..., n, are compared with a threshold. All templates whose corresponding filters provide a minimum below the threshold are hypothesized to be present in the composite wavelet.
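The first step can be sketched in a few lines of Python (all names and the toy signal below are illustrative, not taken from the original work): each template is slid across the observation, the absolute difference is summed over the window, and every template whose minimum falls below a threshold is hypothesized to be present.

```python
import numpy as np

def sad_filter(x, template):
    """Sum of absolute differences between a sliding template and the observation.
    Returns Z(tau) for every admissible delay tau (no multiplications needed)."""
    n, m = len(x), len(template)
    return np.array([np.sum(np.abs(x[tau:tau + m] - template))
                     for tau in range(n - m + 1)])

def detect_templates(x, templates, threshold):
    """First step of the two-step separation: hypothesize which templates are
    present in the composite wavelet and estimate their delays."""
    hypotheses = []
    for j, s in enumerate(templates):
        z = sad_filter(x, s)
        tau_hat = int(np.argmin(z))          # most likely location of s_j
        if z[tau_hat] < threshold:           # compare the minimum with a threshold
            hypotheses.append((j, tau_hat, float(z[tau_hat])))
    return hypotheses

# toy composite wavelet: two shifted, scaled templates plus white noise
rng = np.random.default_rng(0)
t = np.arange(30)
s1 = np.exp(-0.5 * ((t - 15) / 2.0) ** 2)
s2 = np.sin(2 * np.pi * t / 30)
x = np.zeros(200)
x[20:50] += 1.2 * s1
x[90:120] += 0.9 * s2
x += 0.05 * rng.standard_normal(x.size)
print(detect_templates(x, [s1, s2], threshold=5.0))
```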
J(G, τ) is differentiated with respect to G and τ, and the result is set equal to zero. This leads to the set of 2n normal equations

Σ_(i=1)^(n) R_(S_iS_j)(τ_j − τ_i) G_i = R_(S_jx)(τ_j) ;   j = 1, 2, ..., n    (1.60A)
and the corresponding set of n equations obtained from the derivatives with respect to the delays τ_j (Equation 1.60B),

where
REFERENCES
21. Lewis, J. W. and Graham, A. H., High speed algorithms and damped spline regression and electrocardiogram feature extraction, paper presented at the IEEE Workshop on Pattern Recognition and Artificial Intelligence, Princeton, N.J., 1978.
22. Holsinger, W. P., Kempner, K. M., and Miller, M. H., A QRS processor based on digital differentiation, IEEE Trans. Biomed. Eng., 18, 212, 1971.
23. De Vries, J., Wisman, T., and Binnie, C. D., Evaluation of a simple spike wave recognition system, Electroencephalogr. Clin. Neurophysiol., 51, 328, 1981.
24. Goldberger, A. L. and Bhargava, V., Computerized measurement of the first derivative of the QRS complex: theoretical and practical considerations, Comput. Biomed. Res., 14, 464, 1981.
25. Haywood, L. J., Murthy, V. K., Harvey, G., and Saltzberg, S., On line real time computer algorithm for monitoring ECG waveforms, Comput. Biomed. Res., 3, 15, 1970.
26. Fischhof, T. J., Electrocardiographic diagnosis using digital differentiation, Int. J. Bio-Med. Comput., 13, 441, 1982.
27. Colman, J. D. and Bolton, M. P., Microprocessor detection of electrocardiogram R-waves, J. Med. Eng. Technol., 3, 235, 1979.
28. Talmon, J. L. and Kasman, A., A new approach to QRS detection and typification, IEEE Comput. Cardiol., 479, 1981.
29. Nygards, M. E. and Sornmo, L., Delineation of the QRS complex using the envelope of the ECG, Med. Biol. Eng. Comput., 21, 538, 1983.
30. Uijen, G. J. H., De Weerd, J. P. C., and Vendrik, A. J. H., Accuracy of QRS detection in relation to the analysis of high frequency components in the ECG, Med. Biol. Eng. Comput., 17, 492, 1979.
31. Van den Akker, T. J., Ros, H. H., Koelman, A. S. M., and Dekker, C., An on-line method for reliable detection of waveforms and subsequent estimation of events in physiological signals, Comput. Biomed. Res., 15, 405, 1982.
32. Goovaerts, H. G., Ros, H. H., Van den Akker, T. J., and Schneider, H. A., A digital QRS detector based on the principle of contour limiting, IEEE Trans. Biomed. Eng., 23, 154, 1976.
33. Papoulis, A., Signal Analysis, McGraw-Hill Kogakusha, Auckland, 1981.
34. Cohen, A. and Landsberg, D., Adaptive real time wavelet detection, IEEE Trans. Biomed. Eng., 30, 332, 1983.
35. Collins, S. M. and Arzbaecher, R. C., An efficient algorithm for waveform analysis using the correlation coefficient, Comput. Biomed. Res., 14, 381, 1981.
36. Fraden, J. and Neuman, M. R., QRS wave detection, Med. Biol. Eng. Comput., 18, 125, 1980.
37. De Figueiredo, R. J. P. and Gerber, A., Separation of superimposed signals by a cross-correlation method, IEEE Trans. Acoust. Speech Signal Process., 31, 1084, 1983.
38. Cox, J. R., Nolle, F. M., Fozzard, H. A., and Oliver, G. C., AZTEC: a preprocessing program for real time ECG rhythm analysis, IEEE Trans. Biomed. Eng., 15, 128, 1968.
39. Abenstein, J. P. and Tompkins, W. J., A new data reduction algorithm for real-time ECG analysis, IEEE Trans. Biomed. Eng., 29, 43, 1982.
Chapter 2
POINT PROCESSES
I. INTRODUCTION
Point processes1-3 are random processes which produce random collections of point occurrences, or series of events, usually (but not necessarily) along the time axis. In univariate point process analysis, the exact shape of the event is of no interest. The "time" of occurrence (or the intervals between occurrences) is the only information required. A more general case is the multivariate point process in which several classes of points are distinguished. In the multivariate case, the shape of the event serves only to classify it. The statistics of the process, however, are given in terms of the intervals only. Point processes can be viewed as a special case of general random processes4 and can be dealt with as a type of time series.
Point process analysis has been applied to a variety of applications ranging from the analysis of radioactive emission to road traffic studies and to queuing and inventory control problems. Point process theory has been applied to the analysis of various biomedical signals.4-6 The main application, however, has been in the field of neurophysiology.7-16
A neural spike train is the sequence of action potentials picked up by an electrode from several neighboring neurons. The neurophysiologist is interested in the underlying cellular mechanisms producing the spikes. He may investigate, for example, the effects of environmental conditions such as temperature, pressure, or various ion concentrations, or the effects of pharmacological agents. The analysis of the spike train may be used for the description, comparison, and classification of neural cells. Different interval patterns may result from the same cell under different conditions. Interneural connections may be investigated by analyzing the corresponding spike trains. Multivariate point process analysis is sometimes required10 when the spike train contains action potentials from more than one neuron. Classification of each spike into one of the classes to be considered in the analysis is needed. Classification of spikes is done by means of the methods discussed in Chapter 1, Volume II. Figure 1 shows a record of a neural spike train.
Analysis of myoelectric activities has also been performed by point process methods.17-19 Here the motor unit action potential train has been modeled as a point process. Characteristic deviations from normal motor unit firing patterns were suggested to serve as a diagnostic tool in neuromuscular diseases.20 It was found, for example, that both firing rate and SD of the interpotential intervals increase in patients with myopathy.
The ECG signal can be considered a point process21,22 when only the rhythm of the heartbeat is of interest and not the detailed time course of polarization and depolarization of the heart muscle. The occurrence of the R wave is defined as an event and the R-R interval statistics are of interest. Figure 2 shows a record of an ECG signal. The high signal-to-noise ratio allows the detection of R waves with a simple threshold device, thus generating a point process record.
The occurrence of glottal pulses during voiced segments of speech23 can be analyzed as a point process. The time interval between consecutive glottal pulses, known as the pitch period, is a function of the vocal cords' anatomy. Laryngeal disorders can be diagnosed24 by means of speech signal analysis. Here the detection of the event is not an easy task. Several algorithms have been suggested23 for pitch extraction. Figure 3 shows a sample of voiced speech where the pitch events are clearly seen.
Once the events of the process have been defined, the data are fitted25 into a point process model. The most often used models are the renewal process, Poisson distribution, Erlang (Gamma) distribution, Weibull distribution, and AR and MA processes. Analysis of the
FIGURE 1. Neural spike train. Spikes recorded from a photoreceptor stimulated by a light step. (From Alkon, P. and Grossman, Y., J. Neurol., 41, 1978. With permission.)
process includes statistical tests for stationarity, trends and periodicities, and correlation and
spectral analysis. These will be discussed in the following sections.
The point process is completely characterized by one or two of the canonical forms:1 the interval process and the counting process. These are schematically described in Figure 4.
The interval process describes the time behavior of the events. The random times, t_i, i = 1, 2, ..., M, at which the ith event occurs, are one way to describe the process. Here an arbitrary point in time is chosen as a reference. At this origin point, an event may or may not have occurred. The time intervals between two adjacent events, T_i, i = 1, 2, ..., M − 1, can also describe the process.
Of interest also are the higher-order intervals. The nth order interval is defined as the
FIGURE 4. The canonical forms of a point process: events along the time axis, event counts, and order intervals (schematic).
elapsed time between an event and the nth following event. Denote the nth order interval by T_i^(n); then:

T_i^(n) = Σ_(j=0)^(n−1) T_(i+j) ;   i = 1, 2, ...    (2.1)
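A minimal sketch of Equation 2.1 in Python (numpy assumed; variable names are hypothetical): the nth order intervals are simply moving sums of n consecutive first-order intervals.

```python
import numpy as np

def higher_order_intervals(intervals, n):
    """nth order intervals: elapsed time between an event and the nth following
    event, i.e., the sum of n consecutive first-order intervals (Equation 2.1)."""
    T = np.asarray(intervals, dtype=float)
    return np.convolve(T, np.ones(n), mode="valid")   # moving sum of n intervals

intervals = np.array([1.2, 0.8, 1.5, 0.9, 1.1])
print(higher_order_intervals(intervals, 2))            # [2.0, 2.3, 2.4, 2.0]
```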
We shall define the quantity N(t₁,t₂) to be the random number of events in the range (t₁,t₂). We shall require that there are essentially no multiple simultaneous occurrences; namely, the following condition exists:

Prob{N(t, t + Δt) > 1} = o(Δt)    (2.2)

A process for which Equation 2.2 holds is called an "orderly process". The random quantity N(0,t) yields the counting canonical form of the process. The various random variables defined above are drawn from the underlying probability distribution of the point process under test. Their statistics, usually first- and second-order statistics, are used to characterize the investigated process.
Note that for the case of higher-order intervals, as the order n becomes larger there is substantial overlapping of the original intervals. The central limit theorem suggests that for most distributions of the original interval, the nth order interval distribution will tend toward gaussian.
The random variable t_i (or T_i) is described by one of several equivalent functions. The probability density function, p(t), describing the random variable t_i, is defined such that p(t)Δt is the probability that an event occurs between t and t + Δt. The probability density function (PDF) may be expressed as:
with:

∫₀^∞ p_t(T) dT = 1    (2.4)
The interval histogram is often used as an estimator for the interval PDF. The cumulative distribution function, P_t(T), is the probability that the random variable T_i is not greater than T; hence,

P_t(T) = Prob{T_i ≤ T}    (2.5)

The probability that the random variable is indeed greater than T is termed the survivor function, R_t(T):

R_t(T) = 1 − P_t(T) = Prob{T_i > T}    (2.6)

The hazard function, φ(t), is the probability density of an event occurring at t given that no event has occurred since the previous event; it is the ratio of the PDF to the survivor function:

φ(t) = p(t) / R_t(t)    (2.7)
The hazard function is also known as the "postevent probability", "age specific failure rate", "conditional probability", or "conditional density function". The hazard function may be constant (as in the Poisson process) or may vary with t. Pacemaker neurons, for example, exhibit interspike interval distributions with a positive hazard function. Some neurons in the auditory system, for example, exhibit interval distributions with negative hazard functions. A similar function is the "intensity function". The complete intensity function,3 h₀(t), is defined as:
The conditional intensity function, h(τ), is defined such that h(τ)Δt is the probability that an event occurs at time (t + τ), given that an event has occurred at time t (Equation 2.9). The hazard function is conditioned upon having the previous event at t, namely, no event has occurred in the interval, while the intensity function is conditioned only on the occurrence of an event at t.
The point process can also be described by means of the counting process (Figure 4). The counting process, N(t), represents the cumulative number of events in the time interval (0,t) (Equation 2.10). The relationship between the two forms, the counting and the interval form, is as follows:1,4

N(t) < i   if and only if   t < t_i    (2.11)

Equation 2.11 states that at all times smaller than t_i, the cumulative event count must be smaller than i. This is true since no simultaneous events are allowed (Equation 2.2).
Equation 2.11 yields (using Equation 2.6) the relations between the counting and interval distributions given in Equations 2.12 to 2.14.
The last equations show that a direct relationship between the counting and interval forms exists. The two processes are equivalent only by way of their complete probability distributions.1 In usual practice the analysis is based only on the first- and second-order properties of the process. Such an analysis, based on the first and second order of a counting process, is not equivalent to the analysis based on the interval process, and information is gained by considering both forms.
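The duality between the two canonical forms can be illustrated with a short sketch (Python/numpy; the event times are invented): the counting form N(0,t) is obtained from the event times, and the relation N(t) < i ⇔ t < t_i of Equation 2.11 can be checked directly.

```python
import numpy as np

def counting_process(event_times, t):
    """N(0,t): cumulative number of events occurring up to (and including) time t."""
    event_times = np.sort(np.asarray(event_times, dtype=float))
    return int(np.searchsorted(event_times, t, side="right"))

# event times of an orderly process
t_events = np.array([0.7, 1.9, 2.4, 4.1, 5.0])
# equivalence of the two forms: N(t) < i exactly when t < t_i
i, t = 3, 2.0
print(counting_process(t_events, t) < i, t < t_events[i - 1])   # True True
```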
III. SPECTRAL ANALYSIS
A. Introduction
In general, the intervals (counts or event times) are statistically dependent. Hence the joint PDF, p(T₁,T₂,...,T_n), rather than Equation 2.3, has to be considered. The dependency is usually experimentally analyzed by means of joint interval histograms (or scattering diagrams) where two-dimensional plots describing the relations between p(T_i) and p(T_(i+j)) are given.
The second-order statistics are very often analyzed by means of the correlation and power spectral density functions. In the analysis of point processes, two different types of frequency domains have been introduced, that of the intervals and that of the event counts.
where μ_T = E{T} is the expectation of the stationary interval process. The expectation operator in Equation 2.15 means integration over the joint PDF. Let the variance of the interval process be σ_T² (Equation 2.16). The serial correlation coefficients are the autocovariances of Equation 2.15 normalized by the variance:

ρ_k = C_k / σ_T² ;   k = ..., −1, 0, 1, ...    (2.17)
The sequence {ρ_k} is known as the serial correlogram. It is easily shown that −1 ≤ ρ_k ≤ 1. The serial correlation coefficients have been used extensively to describe statistical properties of neural spike intervals. In practice the serial correlation coefficients have to be estimated from a finite sample with N intervals. A commonly used estimate15 for ρ_k is given by Equation 2.18A, with:

μ̂_T(k) = [1 / (N − k)] Σ_(j=1)^(N−k) T_(j+k)    (2.18B)
The interval power spectral density (PSD), S_I(ω), is given by the Fourier transform of the serial correlation (Equation 2.19). The local rate of the counting process is

λ(t) = lim_(Δt→0) E{N(t, t + Δt)} / Δt    (2.20)

λ(t) is thus the local number of events per unit time. In general, for a nonstationary process, the local rate is a function of time. The counts PSD function, S_c(ω), of a stationary process (λ(t) = λ) is given by:3
(2.21)
where h(τ) is the conditional intensity function given in Equation 2.9. S_c(ω) is the Fourier transform of the counts autocovariance. Methods for estimating the PSD function have been reported in the literature.9-11
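A rough sketch of a serial-correlogram estimate (Python/numpy; this uses a plain sample autocovariance normalized by the sample variance, which may differ in detail from the estimator of Equation 2.18):

```python
import numpy as np

def serial_correlogram(intervals, max_lag):
    """Sample serial correlation coefficients rho_k of an interval sequence."""
    T = np.asarray(intervals, dtype=float)
    N, mu = len(T), np.mean(intervals)
    var = np.mean((T - mu) ** 2)
    rho = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        rho[k] = np.mean((T[:N - k] - mu) * (T[k:] - mu)) / var
    return rho

rng = np.random.default_rng(1)
renewal_intervals = rng.exponential(scale=1.0, size=500)     # independent intervals
print(np.round(serial_correlogram(renewal_intervals, 5), 3))  # near zero for k >= 1
```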
A. Introduction
The event generating process is usually to be estimated, or modeled, with the aid of the finite time observed data. The various models are given in terms of the probability distribution functions. The motivation for modeling the point process is mainly to represent the event generating process in a concise parametric form. This allows the detection of changes in the process (due to pathology, for example) and comparison of samples from various processes.
In a stationary point process, the underlying probability distributions do not vary with time. Hence phenomena, common in biological signals, such as fatigue and adaptation, produce nonstationarities. Testing stationarity and detecting trends are important steps in the investigation of the point process; in fact, the initial step of analysis must be the testing of the validity of the stationarity hypothesis. In the remainder of this section, various distribution models will be discussed. These models have been used extensively for modeling neural spike trains, EMG, R-R intervals, and other biological signals.
B. Renewal Processes
An important class of point processes often used in modeling biological signals is the class of renewal processes. Renewal processes are processes in which the intervals between events are independently distributed with an identical probability distribution function, say g(t). In neural modeling it is commonly assumed8 that the spike occurrences are of a regenerative type, which means that the spike train is assumed to be a renewal process. This is used, however, only in cases of spontaneous activity. In the stimulated spike train, the neuron reacts and adapts to the stimuli so that the interval independency is violated.
Consider the intensity function, h(t) (Equation 2.9), of the renewal process. Recall that h(t)Δt is the probability of an event occurring in the interval (t, t + Δt) given that an event has occurred at t = 0. The event can be the first, second, third, etc. occurrence during the time interval (0,t).
It can be shown14 that when k events have occurred during the interval (0,t), the intensity function of the renewal process becomes:

h(t) = g(t) + g(t)*g(t) + ... + g(t)*g(t)*...*g(t)    (2.22)

where (*) denotes convolution and the last term contains (k − 1) convolutions. Equation 2.22 is better represented via the Laplace transformation. Define

G(s) = L[g(t)] ;   H(s) = L[h(t)]    (2.23)

H(s) = Σ_(i=1)^(k) (G(s))^i = G(s)[1 − G^k(s)] / [1 − G(s)]    (2.24)
1. Serial Correlogram
The assumption of interval independency (in the sense of weak stationarity) can be tested using the estimation of the serial correlation coefficients defined in Equation 2.17 and estimated by Equation 2.18. The exact distribution of ρ̂_k is, of course, unknown. However, under the assumption that the process is a renewal process and for sufficiently large N, the random variable ρ̂_k(N − 1)^(1/2) (k > 0) has approximately a normal distribution,15 with zero mean and unit variance. The null hypothesis H₀ is that the interval sequence {T₁, T₂, ..., T_N} is drawn from a renewal process. The alternative hypothesis, H₁, is that the intervals are identically distributed, but are not independent.
A test based on ρ̂_k will be to reject the renewal hypothesis H₀ if:

|ρ̂_k| (N − 1)^(1/2) > z_(α/2)    (2.25)

where α is a predetermined significance level and z_(α/2) is given by the integral over the normalized (0,1) gaussian distribution:

(1/√(2π)) ∫_(z_(α/2))^∞ exp(−x²/2) dx = α/2    (2.26)
(e.g., see Bendat and Piersol,27 Chapter 4). It has been argued that measurement errors (in the case of neural spike trains)14,28 may introduce trends and dependencies between intervals, thus rendering the serial correlogram test unreliable.
Perkel et al.13 have suggested subjecting the sequence of intervals to random shuffling and recomputing the correlation coefficients. Serial correlation due to the process (if it exists) will be destroyed by the random shuffling. Computational errors, however, exist in the estimation of both original and shuffled correlations. A test for independence can then be constructed from the comparison of the two correlograms (e.g., by means of the sum of squares of the difference between corresponding correlation coefficients). Other tests have been suggested.35
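The shuffling idea can be sketched as follows (Python/numpy; the statistic, the number of shuffles, and the data are illustrative choices, not the exact procedure of Perkel et al.):

```python
import numpy as np

def serial_corr(T, k):
    """Sample serial correlation coefficient of the interval sequence at lag k."""
    T = np.asarray(T, dtype=float)
    mu, var = T.mean(), T.var()
    return float(np.mean((T[:len(T) - k] - mu) * (T[k:] - mu)) / var)

def shuffle_independence_test(intervals, max_lag=5, n_shuffles=200, seed=0):
    """Serial correlation produced by the process is destroyed by random shuffling,
    so the correlogram of the data is compared with correlograms of shuffled copies.
    The sum of squared coefficients is used here as the comparison quantity."""
    rng = np.random.default_rng(seed)
    ssq = lambda x: sum(serial_corr(x, k) ** 2 for k in range(1, max_lag + 1))
    observed = ssq(intervals)
    shuffled = [ssq(rng.permutation(intervals)) for _ in range(n_shuffles)]
    p_value = float(np.mean(np.asarray(shuffled) >= observed))
    return observed, p_value

intervals = np.random.default_rng(2).exponential(size=300)   # renewal-like data
print(shuffle_independence_test(intervals))
```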
2. Flatness of Spectrum
A renewal process has a flat interval PSD function. Deviations from a flat spectrum can be used as a test for interval independence.1 When the spectrum is estimated by the periodogram (Chapter 8, Volume I), the flatness can be tested by the quantities C_i:

C_i = Σ_(j=1)^(i) c_j / Σ_(j=1)^(N/2−1) c_j    (2.27)

where c_i, i = 1, 2, ..., N/2 − 1, are the elements of the periodogram. Under the renewal hypothesis, the quantities C_i of Equation 2.27 represent the order statistics from a uniform distribution. The Kolmogorov-Smirnov statistics29 may be used to test the C_i's.
The presence of a trend in the intervals can be tested with the statistic D (Equations 2.28A and 2.28B). If the intervals between events tend to increase with time, the T_i's will increase with their subscripts, causing the statistic D to be large. It can be shown that for sufficiently large n, given H₀ is true, D is approximately normally distributed with:

Var{D|H₀} = n²(n + 1)²(n − 1) / 36    (2.29)

The test, therefore, calls for rejecting the H₀ assumption of no trend (hence the necessary requirement for a renewal process) for large values of D.
C. Poisson Processes
Poisson processes are a special case o f renewal processes, in which the identical interval
distribution is a Poisson distribution. In the theory of point processes, the Poisson process,
due to its simplicity, pla^ > a ^umewhat analogous role to that of normal distribution in the
study o f random variables.
The Poisson process, with rate λ, is defined by the requirement that for all t, the following holds as Δt → 0:

Prob{N(t, t + Δt) = 1} = λΔt ;   Prob{N(t, t + Δt) > 1} = o(Δt)    (2.30)

The constant rate, λ, denotes the average number of events per unit time. An important aspect of the definition (Equation 2.30) is that the probability does not depend on time. The probability of having an event in (t, t + Δt) does not depend on the past at all.
It is well known that for a random variable for which Equation 2.30 holds, the probability of r events occurring in t (starting from some arbitrary time origin) is

Prob{N(t) = r} = [(λt)^r / r!] exp(−λt)    (2.31)

which is the Poisson probability. The probability of having zero events in time T, followed by one event in the interval T + dT, is given by the joint probability of the two. However, the two probabilities are independent, due to the nature of the Poisson process. Also, the probability of having one event in the interval T + dT is, by Equation 2.30, λdT; hence,

p(T) dT = exp(−λT) λ dT    (2.32A)

or

p(T) = λ exp(−λT)    (2.32B)

The probability that in the following time interval of T + dT one and only one event will occur is λdT. Since the two are independent, the joint probability of their occurrence is given by the product of the two. Repeating this argument for the nth order interval yields

p_(T(n))(T) = [λ^n / Γ(n)] T^(n−1) exp(−λT)    (2.34B)

The PDF of the nth order interval given by Equation 2.34B is known as the Gamma distribution.
The survivor function (Equation 2.6) for the Poisson process is given by integrating Equation 2.32B:

R(T) = exp(−λT)    (2.35)

Consider now the autocovariance and the spectrum of the Poisson process. Since the interval, T_i, is independent of T_j for all i ≠ j, the autocovariance of the process (Equation 2.15) becomes a delta function. Its Fourier transform, the interval power spectral density function (Equation 2.18), is thus constant (flat):

S_I(ω) = σ_T² / 2π    (2.36)

It can also be shown that for the Poisson process the conditional intensity function h(τ) = λ. Hence the counts' power spectral density function (Equation 2.20) is also flat, with:

S_c(ω) = λ / 2π    (2.37)
Several statistics to test the hypothesis that a given sequence of intervals was drawn from a Poisson process have been suggested. For the Poisson process, the quantities

ρ_i = t_i / t_N    (2.38)

(Figure 4) represent the order statistics of a random sample of size N drawn from a uniform distribution.14 A modification to Equation 2.38 shows1 that when rearranging the interval sequence to generate a new ordered sequence {T_i*} in which T*_(i+1) ≥ T_i*, the resulting quantities ρ_i' (Equation 2.39) also represent a similar order statistics. The Kolmogorov-Smirnov1,29 statistics can then be used to test the Poisson hypothesis.
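A sketch of the order-statistics test (Python, with scipy assumed available; under the Poisson hypothesis the normalized event times t_i/t_N behave as uniform order statistics, so a standard Kolmogorov-Smirnov test against the uniform distribution can be applied):

```python
import numpy as np
from scipy import stats

def poisson_ks_test(event_times):
    """Test the Poisson hypothesis via the normalized event times t_i / t_N,
    which should be uniform order statistics on (0,1) under a homogeneous
    Poisson process.  This is a sketch; the book's variant also tests a
    rearranged (ordered) interval sequence."""
    t = np.sort(np.asarray(event_times, dtype=float))
    rho = t[:-1] / t[-1]                      # drop the last point (equal to 1)
    return stats.kstest(rho, "uniform")

rng = np.random.default_rng(3)
t_events = np.cumsum(rng.exponential(scale=0.5, size=400))   # Poisson process
print(poisson_ks_test(t_events))
```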
Other tests, based, for example, on the coefficient of variation,15 have been suggested. It is sometimes of interest to test whether the Poisson process under investigation is a homogeneous or nonhomogeneous Poisson process. A nonhomogeneous Poisson process is one in which the rate of occurrence, λ, is not constant but time dependent; in other words, a Poisson process with a trend in the rate of occurrence. The Wald-Wolfowitz run test27 may be used for this task. For this test we define a set of equal arbitrary time interval lengths (TIL). If the number of events in the TIL exceeds the expected number for this interval, a (+) sign is attached to the TIL. If the number of events is below the expected number, a (−) sign is attached. When the number of events equals the expected number, the TIL is discarded. A sequence of (+) and (−) signs is thus generated. The number of runs, r, is determined by counting each uninterrupted sequence of (+) or (−). The sequence (+ + − − − + − + +) yields r = 5.
Under H₀, with n₁ (+) signs and n₂ (−) signs, r is approximately normally distributed with

E{r|H₀} = 2n₁n₂ / (n₁ + n₂) + 1    (2.40A)

and

Var{r|H₀} = 2n₁n₂(2n₁n₂ − n₁ − n₂) / [(n₁ + n₂)²(n₁ + n₂ − 1)]    (2.40B)
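A sketch of the run test (Python/numpy; the bin width, the data, and the use of the standard Wald-Wolfowitz mean and variance for r are assumptions of this example):

```python
import numpy as np

def runs_test_signs(event_times, n_bins):
    """Divide the record into equal time interval lengths (TIL), attach (+) when
    the event count exceeds the expected count, (-) when it is below, and
    discard ties."""
    t = np.asarray(event_times, dtype=float)
    edges = np.linspace(0.0, t.max(), n_bins + 1)
    counts = np.histogram(t, bins=edges)[0]
    expected = len(t) / n_bins
    signs = np.sign(counts - expected)
    return signs[signs != 0]                 # discarded TILs removed

def number_of_runs(signs):
    """Number of uninterrupted sequences of equal signs, e.g. (+ + - - - + - + +) -> 5."""
    return int(1 + np.sum(signs[1:] != signs[:-1])) if len(signs) else 0

rng = np.random.default_rng(4)
homogeneous = np.cumsum(rng.exponential(scale=1.0, size=300))
signs = runs_test_signs(homogeneous, n_bins=30)
r = number_of_runs(signs)
n1, n2 = int(np.sum(signs > 0)), int(np.sum(signs < 0))
mean_r = 2 * n1 * n2 / (n1 + n2) + 1                                 # Eq. 2.40A
var_r = 2*n1*n2*(2*n1*n2 - n1 - n2) / ((n1 + n2)**2 * (n1 + n2 - 1)) # Eq. 2.40B
print(r, (r - mean_r) / np.sqrt(var_r))      # standardized run count under H0
```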
D. Other Distributions
In some cases the process under investigation does not fit the simple Poisson distribution. Other distributions have been found useful in describing biological point processes. The more commonly used ones are discussed here.
The Weibull density, with shape parameter k, scale parameter v, and location parameter ε, is

p(T; v, ε, k) = [k / (v − ε)] [(T − ε)/(v − ε)]^(k−1) exp{−[(T − ε)/(v − ε)]^k} ;   T > ε
p(T; v, ε, k) = 0 ;   T < ε
k > 0 ;   v > ε    (2.42)

and the corresponding cumulative distribution function is

P(T; v, ε, k) = 1 − exp{−[(T − ε)/(v − ε)]^k} ;   T > ε
P(T; v, ε, k) = 0 ;   T < ε
k > 1 ;   v ≥ ε    (2.43)
A random variable, T, with a Weibull distribution has the expectation and variance given by:29

E{T} = (v − ε) Γ(1 + k⁻¹)    (2.44A)

where Γ(·) is the gamma function. Note that for k = 1 the Weibull density reduces to the exponential density.
The Erlang (Gamma) density, with rate λ and order k, is

p(T; λ, k) = [λ^k T^(k−1) / Γ(k)] exp(−λT) ;   T ≥ 0
p(T; λ, k) = 0 ;   T < 0
k > 0    (2.45)

where Γ(·) is the gamma function. A random variable, T, with an Erlang distribution has the expectation and variance:

E{T} = k / λ ;   Var{T} = k / λ²    (2.46)
4. Semi-Markov Processes
A sequence of random variables, x_n, is called Markov if for any n we have the conditional probability:

Prob{x_n | x_(n−1), x_(n−2), ...} = Prob{x_n | x_(n−1)}    (2.47)

namely, the probability of the current event depends only on the event preceding it. Assume now that the random variable, x_n, is a discrete random variable taking the values a₁, a₂, ..., a_n. The sequence {x_n} is then called a Markov chain. A semi-Markov process is a process in which the intervals are randomly drawn from any one of a given set of distribution functions.3 The switching from one probability function to another is controlled by a Markov chain.
Consider the case with k "classes" or "types" and a set of k² distribution functions F_(i,j), i,j = 1, 2, ..., k. Assume that each interval of the point process is assigned a "class" type, 1, 2, ..., k. The assignment is determined by a Markov chain with transition matrix P = (P_(i,j)). An interval beginning with type i and ending with type j is drawn from the distribution F_(i,j). The transition matrix P is such that when an interval has been assigned a class i, the probability of the next interval to get the class j is P_(i,j).
A special case of the semi-Markov process is the two-state semi-Markov model (TSSM). Here the transition matrix P is

P = [ P₁       1 − P₁ ]
    [ 1 − P₂    P₂    ]    (2.48)

F_(1,1) = F_(2,1) = F₁ ;   F_(1,2) = F_(2,2) = F₂    (2.49)

Equation 2.49 states that the interval probability distribution depends only on the type of the interval and not on adjacent types. In a semi-Markov process for which Equation 2.49 holds, the number of consecutive intervals which have the same distribution is geometrically distributed.1
The TSSM model has been applied to spike train activity analysis. It was found, however,11 that it can be used only for a limited part of all experimental stationary data. The more general nongeometric semi-Markov model has been applied with more success.32 The nongeometric two-state semi-Markov process has also been applied to neural analysis by De Kwaadsteniet.11 The term semialternating renewal (SAR) is used there.
A. Introduction
In multivariate point processes, two or more types of points are observed. This may be the case, for example, when two or more univariate point processes are investigated and the relationship or dependence between them is sought. Another example may be where several different processes are recorded together. Multispike trains are common in neurophysiology10,33 when an electrode picks up the spikes of the neuron under investigation together with spikes from neighboring neurons. It is often possible to distinguish between the various neuron spikes based on the difference in pulse shapes.10 The record thus can be considered a multivariate point process. A similar process occurs when recording muscle activity. Several motor units form a multivariate point process.
In general, the various types of the multivariate process are dependent on one another; it is therefore necessary to have the conditional probability functions in order to characterize the process.
The relationship between the counting and interval forms of each type is, as in Equation 2.11,

¹N(t₁) < n₁   if and only if   Σ_(i=1)^(n₁) ¹T(i) > t₁    (2.50A)

²N(t₂) < n₂   if and only if   Σ_(i=1)^(n₂) ²T(i) > t₂    (2.50B)
Similar to the univariate process, we shall define the cross intensity function ₂h₁(τ) as a generalization of Equation 2.9. The cross intensity, ₂h₁(τ)Δt, yields the probability of having an event of type 1 at time τ, given that an event of type 2 has occurred at the origin. The cross intensity function ₁h₂(τ) is similarly defined. Note that ₁h₁(τ) is the univariate conditional intensity. The complete intensity function of the multivariate process is defined as in Equation 2.8, and the complete intensity function of simultaneous occurrence of the two types at time τ is defined similarly (Equations 2.51 to 2.53).
It is sometimes required that one ignore the different types of the multivariate process and consider it a univariate process. The conditional intensity function of such a process is given, in terms of the intensities of the multivariate process:3

h(τ) = [λ₁ / (λ₁ + λ₂)] [₁h₁(τ) + ₁h₂(τ)] + [λ₂ / (λ₁ + λ₂)] [₂h₁(τ) + ₂h₂(τ)]    (2.54)
The cross covariance density, ₂C₁(τ), is defined in terms of the cross intensity (Equation 2.55), with the cross covariance density ₁C₂(τ) defined similarly. The cross spectral density function, ₂S₁(ω), is the Fourier transform of Equation 2.55:

₂S₁(ω) = ∫ ₂C₁(τ) exp(−jωτ) dτ    (2.56)

with the cross spectral density ₁S₂(ω) defined similarly. A discussion concerning the application of the cross spectral density function to neural spike processing has been given by Glaser and Ruchkin.33
C. Marked Processes
A multivariate point process can be expressed as a univariate process with a marker attached to each point marking its type. Such is often the case in neurophysiological recordings of action potentials. A microelectrode may record the action potentials generated by several neurons in its vicinity. Action potentials (spikes) of the various neurons, as recorded by the microelectrode, differ from one another in shape and can be classified10 by means of wavelet detection methods (see also Chapter 1, Volume II). The recording of the microelectrode, known as a multispike train,10 can thus be analyzed as a marked point process.
We shall denote the mark of the ith point by M_i and the accumulated mark at time t by M(t); hence,

M(t) = Σ_i M_i    (2.57)
where the summation takes place over all points in the interval (0,t). M(t) is a random variable with statistics to be estimated from the given data.
In the general case we shall be interested in the joint probability distribution of (M(t),N(t)). From it we can derive the dependence (if any) of the different types on one another. Consider the simpler case where the point process has a rate λ, and the marks are independent (of one another and of the point process) and are identically distributed. Assume a record of length |A| with N(A) = n events and M(A), the sum of the n independent marks. It can be shown1 that:

E{M(A)} = λ|A| E{M}
Var{M(A)} = λ|A| Var{M} + (E{M})² Var{N(A)}    (2.58)
Cov{M(A),N(A)} = E{M} Var{N(A)}
Methods for the analysis of the marked point process with dependencies between marks and
process have been reported in the literature; these are, however, outside the scope of this
book.
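The moment relations of Equation 2.58 can be checked numerically. The sketch below (Python/numpy) simulates a Poisson process with independent, identically distributed marks; all parameter values are illustrative:

```python
import numpy as np

# Empirical check of Equation 2.58 for a Poisson process with i.i.d. marks.
rng = np.random.default_rng(5)
lam, A, n_rec = 2.0, 10.0, 4000          # rate, record length |A|, no. of records
M_A, N_A = np.empty(n_rec), np.empty(n_rec)
for i in range(n_rec):
    n = rng.poisson(lam * A)             # N(A): number of events in the record
    marks = rng.normal(loc=1.5, scale=0.4, size=n)   # independent marks
    N_A[i], M_A[i] = n, marks.sum()      # M(A): accumulated mark
print(M_A.mean(), lam * A * 1.5)                          # E{M(A)} = lam|A|E{M}
print(M_A.var(), lam * A * 0.4**2 + 1.5**2 * N_A.var())   # Var{M(A)}
print(np.cov(M_A, N_A)[0, 1], 1.5 * N_A.var())            # Cov{M(A),N(A)}
```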
REFERENCES
1. Cox, D. R. and Lewis, P. A. W., The Statistical Analysis of Series of Events, Methuen, London, 1966.
2. Lewis, P. A. W., Ed., Stochastic Point Processes: Statistical Theory and Applications, Wiley-Interscience, New York, 1972.
3. Cox, D. R. and Isham, V., Point Processes, Chapman and Hall, London, 1980.
4. Brillinger, D. R., Comparative aspects of the study of ordinary time series and of point processes, in Developments in Statistics, Krishnaiah, P. R., Ed., Academic Press, New York, 1978, 33.
5. Sayers, B. McA., Inferring significance from biological signals, in Biomedical Engineering Systems, Clynes, M. and Milsum, J. H., Eds., McGraw-Hill, New York, 1970, chap. 4.
6. Anderson, D. J. and Correia, M. J., The detection and analysis of point processes in biological signals, Proc. IEEE, 65(5), 773, 1977.
7. Ten Hoopen, M. and Reuver, H. A., Analysis of sequences of events with random displacements applied to biological systems, Math. Biosci., 1, 599, 1967.
8. Fienberg, S. E., Stochastic models for single neuron firing trains: a survey, Biometrics, 30, 399, 1974.
9. Lago, P. J. and Jones, N. B., A note on the spectral analysis of neural spike trains, Med. Biol. Eng. Comput., 20, 44, 1982.
10. Abeles, M. and Goldstein, M. H., Multispike train analysis, Proc. IEEE, 65(5), 762, 1977.
11. De Kwaadsteniet, J. W., Statistical analysis and stochastic modeling of neural spike train activity, Math. Biosci., 60, 17, 1982.
12. Ten Hoopen, M., The correlation operator and pulse trains, Med. Biol. Eng., 8, 187, 1970.
13. Perkel, D. H., Gerstein, G. L., and Moore, G. P., Neuronal spike trains and stochastic point processes. I. The single spike train, II. Simultaneous spike trains, Biophys. J., 7, 391, 419, 1967.
14. Landolt, J. P. and Correia, M. J., Neuromathematical concepts of point processes theory, IEEE Trans. Biomed. Eng., 25(1), 1, 1978.
15. Yang, G. L. and Chen, T., On statistical methods in neuronal spike train analysis, Math. Biosci., 38, 1, 1978.
16. Sampath, G. and Srinivasan, S. K., Stochastic Models for Spike Trains of Single Neurons, Lecture Notes in Biomathematics, Vol. 16, Springer-Verlag, Berlin, 1977.
17. Clamann, H. P., Statistical analysis of motor unit firing patterns in human skeletal muscle, Biophys. J., 9, 1233, 1969.
18. Parker, P. A. and Scott, R. N., Statistics of the myoelectric signal from monopolar and bipolar electrodes, Med. Biol. Eng., 11, 591, 1973.
19. Lago, P. J. A. and Jones, N. B., Turning points spectral analysis of the interference myoelectric activity, Med. Biol. Eng. Comput., 21, 333, 1983.
20. Andreassen, S., Computerized analysis of motor unit firing, in Progress in Clinical Neurophysiology, Vol. 10, Computer Aided Electromyography, Desmedt, J. E., Ed., S. Karger, Basel, 1983, 150.
21. Ten Hoopen, M., R-wave sequences treated as a point process, progress report 3, Inst. Med. Phys., TNO, Utrecht, Netherlands, 1972, 124.
22. Goldstein, R. E. and Barnett, G. O., A statistical study of the ventricular irregularity of atrial fibrillation, Comput. Biomed. Res., 1, 146, 1967.
23. Schafer, R. W. and Markel, J. D., Eds., Speech Analysis, IEEE Press, New York, 1979.
24. Kasuya, H., Kobayashi, Y., Kobayashi, T., and Ebihara, S., Characterization of pitch period and amplitude perturbation in pathological voice, in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., IEEE, New York, 1983, 1372.
25. Brassard, J. R., Correia, M. J., and Landolt, J. P., A computer program for graphical and iterative fitting of probability density functions to biological data, Comput. Prog. Biomed., 5, 11, 1975.
26. Lehmann, E., Nonparametrics: Statistical Methods Based on Ranks, Holden-Day, San Francisco, 1975.
27. Bendat, J. S. and Piersol, A. G., Random Data: Analysis and Measurement Procedures, Wiley-Interscience, New York, 1971.
28. Shiavi, R. and Negin, M., The effect of measurement errors on correlation estimates in spike interval sequences, IEEE Trans. Biomed. Eng., 20, 374, 1973.
29. Mood, A. M., Graybill, F. A., and Boes, D. C., Introduction to the Theory of Statistics, 3rd ed., McGraw-Hill Kogakusha, Tokyo, 1974.
30. Parzen, E., Stochastic Processes, Holden-Day, San Francisco, 1962.
31. Mann, N. R., Schafer, R. E., and Singpurwalla, N. S., Methods for Statistical Analysis of Reliability and Life Data, Wiley-Interscience, New York, 1974.
32. Ekholm, A., A generalization of the two state two interval semi-Markov model, in Stochastic Point Processes, Lewis, P. A. W., Ed., Wiley-Interscience, New York, 1972.
33. Glaser, E. M. and Ruchkin, D. S., Principles of Neurobiological Signal Analysis, Academic Press, New York, 1976.
34. Bartlett, M. S., The spectral analysis of point processes, J. R. Stat. Soc. Ser. B, 25, 264, 1963.
35. Li, H. F. and Chan, F. H. Y., Microprocessor based spike train analysis, Comput. Prog. Biomed., 13, 61, 1981.
Chapter 3
I. INTRODUCTION
Modern biomedical signal processing requires the handling of large quantities of data. In the neurological clinic, for example, routine examinations of electroencephalograms are usually performed with eight or more channels, each lasting several tens of seconds. In more elaborate examinations for sleep disorders analysis, hours-long records may be taken. Several hours of electrocardiographic recordings are sometimes required from patients recovering from heart surgery. Various screening programs are faced with the problem of handling a large number of short-term ECG and other signals.
Storing and analyzing such large quantities of information have become a severe problem. In some cases manual analysis is cost prohibitive; in others, it is completely impossible. The problem has therefore been recognized as an important part of any modern signal and pattern analysis system. Signals are in essence one-dimensional patterns. The methods and algorithms developed for pattern recognition are in general applicable to signal analysis. The topics discussed in this chapter are based on the decision-theoretic approach to pattern recognition. A different approach, the syntactic method, is discussed in Chapter 4.
The signal to be analyzed, stored, or transmitted quite often contains some redundancies. This may be due to some built-in redundancies, added noise, or the fact that the application at hand does not require all the information carried by the signal. The first step for sophisticated processing will be that of data compression. Irrelevant information is taken out such that the signal can be represented more effectively. One accepted method for data compression is by features extraction (refer to Figure 2, Chapter 1, Volume I). Based on some a priori knowledge about the signal, features are defined and extracted. The signal is then discarded, with the features being its representation. Features must thus be carefully selected. They must contain most of the relevant information while most of the redundancies are discarded. Optimal feature selection routines are available. For some applications compression is required for storage or transmission purposes, so that the signal must, at a later stage, be reconstructed. Features based on time series analysis (ARMA, AR; see Chapter 7, Volume I) can be used for such applications where the reconstructed signal has the same spectral characteristics as the original one. In other applications, automatic classification is required. Compression is then performed with features that do not necessarily allow reconstruction, but provide distinction between classes.
Any linear or nonlinear transformation of the original measurement can be considered as features, provided they allow reconstruction or give discriminant power. Various transformations, optimal in some sense, have been used to compress signal data. These transformations can be used without the need for a priori knowledge of the signal. Other features require some assumptions on signal properties; these may be, for example, the order of the ARMA model or the range of allowable peaks of a waveform.
In many cases, the features extracted are statistically dependent on one another; some methods, however, provide independent features. The computational cost and time required for the feature extraction process usually dictate the need to reduce the number of features as much as possible. A compromise has to be taken between that demand and the accuracy (in reconstruction or classification) requirement. Methods for (sub)optimal determination of the number of features are available; some are discussed in this chapter.
The material covered in this chapter is based on the vast literature on pattern and signal recognition: textbooks,1-5 reference books,6,7 and papers.8 Signal classification methods have
Consider the simple example depicted in Figure 2. Here the features vector is of dimension three, βᵀ = [β₁, β₂, β₃], and two classes, w₁, w₂, are given. The two clusters of features of w₁ and w₂ belong to signals known in advance to be in either w₁ or w₂. These are known as the training set. The clusters of features may be considered as an estimate for the probability distribution of the features. The projections of the clusters in the three-dimensional feature space are shown in Figure 2. It is clearly seen that classification of the two classes can be made with feature β₂ alone, since the projections of w₁ and w₂ (w₁⁽²⁾ and w₂⁽²⁾) do not overlap. The projections on the other feature axes show that overlapping exists between the two classes. A linear decision function can be drawn in the (β₁,β₂) or (β₂,β₃) planes to discriminate between the two classes.
An example for the procedure described above can be that of automatic classification of
ECG signals. Here samples of records of ECGs of normal and pathological states are given.
These have been diagnosed manually by the physician and constitute the training set. From
this given set, templates (for the normal state and each one of the pathological states) are
generated and the statistics of each class are estimated. It is clear that the more information
there is in the training set, the better is the training process and the probability of correct
classification.
In some cases training sets are not a priori classified. The system must then "train" itself by means of unsupervised learning. Cluster-seeking algorithms have to be used in order to automatically identify groups that can be considered classes. Unsupervised recognition systems require a great deal of intuition and experimentation. The interested reader is referred
to the pattern recognition literature.
Two important topics are discussed in this chapter: features selection and signal classification. Both were included in one chapter since in many cases there are similarities in the discussions of the two problems. It would probably be logical to open with the discussion on features selection and signal compression, since in most cases these are done prior to classification (Figure 2). It was found, however, that from the point of view of material presentation it is more convenient to discuss the topic of classification first.
A. Introduction
We may look at the signal classification problem in probabilistic terms. Assume that we have M classes and an unknown signal to be classified. We define the hypothesis, H_k, that the signal belongs to the w_k class. The problem then becomes a problem of hypothesis testing. In the case of two classes we have the null hypothesis (usually denoted by H₀, but for convenience denoted here by H₁) that the signal belongs to w₁ and the alternative hypothesis, H₂, that the signal belongs to w₂. It is the task of the classifier to accept or reject the null hypothesis. For this we need the probability density functions of the various classes, which are usually a priori unknown. The methods for statistical decision learning and classification are discussed in this section.
Assume a signal represented by the vector of features, β. The conditional probability that this signal belongs to the jth class, P(w_j|β), is given by Bayes rule:

P(w_j|β) = p(β|w_j) P(w_j) / p(β)    (3.1)

where p(β|w_j) is the conditional probability density of getting β given the class is w_j, p(β) is the probability density function of β, and P(w_j) is the probability of the jth class. Note also that (for the two classes case):

p(β) = Σ_(j=1)^(2) p(β|w_j) P(w_j)    (3.2)

Equation 3.1 gives the a posteriori probability P(w_j|β) in terms of the a priori probability P(w_j). It is logical to classify the signal β as follows: if P(w₁|β) > P(w₂|β) we decide β ∈ w₁, and if P(w₂|β) > P(w₁|β) we decide β ∈ w₂. If P(w₁|β) = P(w₂|β) we remain undecided.
Analyzing all possibilities, we see that a correct classification occurs when the decision agrees with the true class (Equation 3.3). In hypothesis testing language the errors are called the error of the "first kind" and the "second kind", or "false positive" and "false negative". The probability of an error is given by Equation 3.4. It can easily be shown that the intuitive decision rule we have chosen minimizes the average error probability. The Bayes decision rule can be written by means of the conditional probabilities:

p(β|w₁) / p(β|w₂)   ≷   P(w₂) / P(w₁)    (3.5)
Volume II: Compression and Automatic Recognition 41
which means that when the left side of the inequality (Equation 3.5) is larger than the right side, we classify β into w₁; when it is smaller, β is classified into w₂.
We want now to generalize the decision rule of Equation 3.5. Assume we have M classes. For this case we shall have the probability of β, p(β), given by Equation 3.2, but with the summation index running j = 1, ..., M. We also want to introduce a weight on the various errors. Suppose that when making a classification β ∈ w_i, we take a certain action, a_i. This may be, for example, the administration of certain medication after classifying the signal as belonging to some illness w_i. We want to attach a certain loss, or punishment, when we take an action a_i when indeed β ∈ w_j. Denote this loss by λ(a_i|w_j) = λ_ij.
Suppose that we observe a signal with features vector β and consider taking the action a_i. If indeed β ∈ w_j, we will incur the loss λ(a_i|w_j). The expected loss associated with taking the action a_i (also known as the conditional risk) is

R(a_i|β) = Σ_(j=1)^(M) λ(a_i|w_j) P(w_j|β) = Σ_(j=1)^(M) λ_ij P(w_j|β)    (3.6)
The classification can be formulated as follows. Given a feature vector β, compute all conditional risks R(a_i|β), i = 1, 2, ..., M, and choose the action a_i (classification into w_i) that minimizes the conditional risk (Equation 3.6).
Consider, for example, the two classes case. Equation 3.6 for this case is given by Equation 3.7. Note that λ₁₁ and λ₂₂ are the losses for correct classification; these are less than the losses for making an error (λ₁₂, λ₂₁). We classify β into w₁ if R(a₁|β) < R(a₂|β). From Equations 3.7 and 3.1 we get classification into w₁ when Equation 3.8 holds, or

p(β|w₁) / p(β|w₂)  >  (λ₁₂ − λ₂₂) P(w₂) / [(λ₂₁ − λ₁₁) P(w₁)]   →   β ∈ w₁
p(β|w₁) / p(β|w₂)  <  (λ₁₂ − λ₂₂) P(w₂) / [(λ₂₁ − λ₁₁) P(w₁)]   →   β ∈ w₂    (3.9)

The left side of the inequality (Equation 3.9) is called the likelihood ratio. The right side can be considered a decision threshold.
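The likelihood ratio test of Equation 3.9 is compact to implement. In the sketch below (Python, with scipy assumed available for the Gaussian densities), the class-conditional densities, priors, and loss values are illustrative assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal

def bayes_two_class(beta, pdf1, pdf2, P1, P2, loss):
    """Two-class Bayes test in the likelihood ratio form of Equation 3.9.
    loss[i][j] is the loss for taking action a_(i+1) when the true class is
    w_(j+1); pdf1/pdf2 are the class-conditional densities."""
    likelihood_ratio = pdf1(beta) / pdf2(beta)
    threshold = (loss[0][1] - loss[1][1]) * P2 / ((loss[1][0] - loss[0][0]) * P1)
    return 1 if likelihood_ratio > threshold else 2

pdf1 = multivariate_normal(mean=[0, 0], cov=np.eye(2)).pdf
pdf2 = multivariate_normal(mean=[2, 1], cov=np.eye(2)).pdf
loss = [[0.0, 1.0],      # lambda_11, lambda_12
        [5.0, 0.0]]      # lambda_21, lambda_22 (errors cost more than hits)
print(bayes_two_class(np.array([1.2, 0.3]), pdf1, pdf2, P1=0.5, P2=0.5, loss=loss))
```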
In general, classification is performed with discriminant functions. A classification machine computes M discriminant functions, one for each class, and chooses the class yielding the largest discriminant function. The Bayes rule calls for the minimization of Equation 3.6; we can then define the discriminant function of the ith class, d_i(β), by:

d_i(β) = −R(a_i|β)    (3.10)

and the classification rule becomes: assign the signal with feature vector β to class w_i if:

d_i(β) > d_j(β)   for all j ≠ i    (3.11)

Since the logarithm function is a monotonically increasing function, we can also take the logarithm of Equation 3.11 without changing the rule. Figure 3 shows the general classifier scheme.
Consider now a simple loss function:

λ(a_i|w_j) = 0 ,  i = j ;   λ(a_i|w_j) = 1 ,  i ≠ j ;   i, j = 1, 2, ..., M    (3.12)

To minimize the risk, we want to choose that w_i for which P(w_i|β) is maximum. Hence, for this case, known as the "minimum error rate", the classification rule becomes: classify β into w_i if P(w_i|β) > P(w_j|β) for all j ≠ i (Equation 3.13).
Note that the last term of the discriminant of Equation 3.15A depends only on β and not on w_i. This term will be present in all discriminants d_i(β), i = 1, 2, ..., M. Since we are looking for the largest d_i(β), any common term can be ignored. We shall therefore define the discriminant without the last term:
Consider the case where the features are normally distributed. The probability distribution of a signal belonging to the ith class, w_i, represented by β, is

p(β|w_i) = (2π)^(−n/2) |Σ_i|^(−1/2) exp{−(1/2)(β − μ_i)ᵀ Σ_i⁻¹ (β − μ_i)}    (3.16)

where μ_i = E{β} is the expectation of the ith class and Σ_i its n × n covariance matrix:

Σ_i = E{(β − μ_i)(β − μ_i)ᵀ}    (3.17)
The term −(n/2) ln(2π) was dropped for the same reasons discussed above. Equation 3.18A can be rewritten in the form of a quadratic equation (Equation 3.18B). The solution of d_i(β) = 0 yields the decision surfaces in the features space. In the general case, these are hyperquadratic surfaces. In the special case where

Σ_i = Σ   for all i

we are dealing with M classes equally distributed, but with different expectations μ_i. In this case the first term of Equation 3.18A can be ignored, as well as the first term of Equation 3.18B. The discriminant function becomes linear, with

d_i(β) = μ_iᵀ Σ⁻¹ β + b_i    (3.19A)

where

b_i = −(1/2) μ_iᵀ Σ⁻¹ μ_i + ln P(w_i)    (3.19B)

The first term of the right side of Equation 3.19B is the square of the Mahalanobis distance. If, in addition, the a priori probabilities P(w_i) are equal, they can be ignored, and classification is performed by choosing the class with the minimum Mahalanobis distance (between the signal β and the class mean μ_i). If the P(w_i) are not equal, the distance is biased in favor of the more probable class.
The simplest case is the case where not only all classes have the same covariance matrices, but also the features are statistically independent. For this case,

Σ_i = σ²I   for all i    (3.20)

d_i(β) = −(1/(2σ²)) (β − μ_i)ᵀ (β − μ_i) + ln P(w_i)    (3.21A)

Here we have the discriminant given in terms of the Euclidean distance. The discriminant can also be written as in Equation 3.21B, where terms which are common to all classes were ignored. Note that the only calculations required in Equation 3.21B are the vector multiplications in the first term. The rest are precalculated and stored as constants in the classifier.
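The Gaussian discriminants can be evaluated directly. The sketch below (Python/numpy) implements the quadratic discriminant with the common term −(n/2)ln(2π) dropped; the class statistics are invented for illustration:

```python
import numpy as np

def gaussian_discriminants(beta, means, covs, priors):
    """Quadratic discriminants d_i(beta) = -1/2 ln|Sigma_i|
    - 1/2 (beta - mu_i)^T Sigma_i^{-1} (beta - mu_i) + ln P(w_i)."""
    d = []
    for mu, S, P in zip(means, covs, priors):
        diff = beta - mu
        d.append(-0.5 * np.log(np.linalg.det(S))
                 - 0.5 * diff @ np.linalg.solve(S, diff)
                 + np.log(P))
    return np.array(d)

# two hypothetical classes with different covariance matrices
means = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
covs = [np.array([[1.0, 0.2], [0.2, 0.5]]), np.array([[0.6, 0.0], [0.0, 1.2]])]
priors = [0.5, 0.5]
beta = np.array([1.5, 0.2])
d = gaussian_discriminants(beta, means, covs, priors)
print(d, "-> class", int(np.argmax(d)) + 1)
```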
The training set consists of two-dimensional feature vectors, β_(j,1) (class w₁) and β_(j,2) (class w₂), among them [−0.5, 1], [−2, 0], [−1, 0], [0, 2], [−0.5, −0.5], [0, 0], [−1, −1.5], [2, −2], [0.5, 0.5], [0.5, −1], [2, 0], [1, −0.5], and [0, −2]. From the training set, the class means μ̂_i and covariance matrices Σ̂_i, i = 1, 2, are estimated.
The discriminant functions d₁(β) and d₂(β) are calculated from Equation 3.19A. The discriminant functions for the data set are given below:
d₁(β_(1,1)) = −0.869 ;   d₂(β_(1,1)) = −4.671
d₁(β_(4,1)) = 0.218 ;    d₂(β_(4,1)) = −6.236
...
d₁(β_(5,2)) = −6.754 ;   d₂(β_(5,2)) = 2.975
The discriminant functions have correctly classified all training data, since d₁(β_(j,1)) > d₂(β_(j,1)) and d₁(β_(j,2)) < d₂(β_(j,2)) for all j's. Consider now the four unknown signals β_(j,x):

d₁(β_(1,x)) = −4.335 ;   d₂(β_(1,x)) = −0.057   →   β_(1,x) ∈ w₂
d₁(β_(2,x)) = −2.079 ;   d₂(β_(2,x)) = 3.155    →   β_(2,x) ∈ w₂
d₁(β_(3,x)) = −2.243 ;   d₂(β_(3,x)) = −3.220   →   β_(3,x) ∈ w₁
Hunger and pain cry records from five infants were used as a training set. The mean feature vectors for the hunger cry, μ_H, and for the pain cry, μ_P, were estimated by:

μ̂_i = (1/N_i) Σ_(j=1)^(N_i) β_j ;   i = H, P    (3.23)

where β_j is the jth training vector of the i ∈ (H,P) class. The covariance matrices Σ_H and Σ_P were estimated by:
Σ̂_i = (1/N_i) Σ_(j=1)^(N_i) (β_j − μ̂_i)(β_j − μ̂_i)ᵀ ;   i = H, P    (3.24)
The Bayes rule (Equation 3.5) in its log form, with the probability of Equation 3.16, becomes for this case Equation 3.25, where w_H and w_P denote the hunger and pain classes, respectively. Define the quadratic distance, D_Mj (the Mahalanobis distance):

D_Mj = (β − μ_j)ᵀ Σ_j⁻¹ (β − μ_j) ;   j = H, P    (3.26)
and the threshold

THR = 2 ln[P(w_H) / P(w_P)] − ln[|Σ_H| / |Σ_P|]    (3.28)

The test then becomes

D_MH − D_MP ≤ THR   →   β ∈ w_H ;   D_MH − D_MP > THR   →   β ∈ w_P    (3.29)
Equation 3.29 is the quadratic Bayes test for minimum error. This classifier does not allow rejects. Each data record is forced to be classified even if it is a "bad" record in the sense that it includes artifacts or does not belong to either one of the two classes. Consider now the case where we introduce two threshold levels, R₁ and R₂, such that:

D = D_MH − D_MP :   D ≤ R₁ → β ∈ w_H ;   D ≥ R₂ → β ∈ w_P ;   otherwise → β ∈ Reject    (3.30)
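The rejection rule of Equation 3.30 can be sketched as follows (Python/numpy); the class statistics and the thresholds R₁ and R₂ are illustrative, not those of the cry-classification study:

```python
import numpy as np

def mahalanobis(beta, mu, cov):
    diff = beta - mu
    return float(diff @ np.linalg.solve(cov, diff))

def classify_with_reject(beta, mu_H, cov_H, mu_P, cov_P, R1, R2):
    """D = D_MH - D_MP is compared with two thresholds; records falling between
    R1 and R2 are rejected rather than forced into one of the two classes."""
    D = mahalanobis(beta, mu_H, cov_H) - mahalanobis(beta, mu_P, cov_P)
    if D <= R1:
        return "hunger"
    if D >= R2:
        return "pain"
    return "reject"

mu_H, cov_H = np.array([0.0, 0.0]), np.eye(2)
mu_P, cov_P = np.array([3.0, 1.0]), np.eye(2)
for beta in (np.array([0.2, -0.1]), np.array([2.9, 1.2]), np.array([1.5, 0.5])):
    print(beta, classify_with_reject(beta, mu_H, cov_H, mu_P, cov_P, R1=-2.0, R2=2.0))
```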
FIGURE 5. Typical hunger and pain cry. (A) Time records; (B) power spectral density estimated by FFT.
With no rejections, two types of classification errors were present: (1) errors due to the classification of pain cry as hunger cry (denote the probability of this error by ε_H|P):

ε_H|P = ∫_(−∞)^(THR) p(D|β ∈ w_P) dD    (3.31A)

and (2) errors due to the classification of hunger cry as pain cry (with the probability ε_P|H).
With rejection, using the decision rule of Equation 3.30, the errors are

ε'_H|P = ∫_(−∞)^(R₁) p(D|β ∈ w_P) dD ≤ ε_H|P    (3.31C)

and the corresponding expression for ε'_P|H. However, rejections will be present. Some of the rejections were correctly classified by Equation 3.29. The probability of a hunger cry record, correctly classified by Equation 3.29 and rejected by Equation 3.30, is
ε^r_H|H = ∫_(R₁)^(THR) p(D|β ∈ w_H) dD    (3.32)
The probability densities, p(D|·), are estimated from the training data. We shall choose the rejection threshold R₁ so as to minimize a linear combination of the error probabilities ε^r_H|H and ε'_H|P. Similar considerations will dictate the decision for R₂. Hence (Equation 3.33), where φ_i, i = 1, 2, 3, 4, are weights determined by the relative importance of each one of the errors. Figure 7 depicts the training and classification system. Automatic classification of infants' cries was performed by the classifier (Equation 3.30), with an error rate of less than 5%.38 The quadratic classifier (Equation 3.29), with no rejection, was also applied to the classification of single evoked potentials39 with similar results.
p̂(β) = k / (mV)    (3.34)

In Equation 3.34 one must decide the size of V to be used. Clearly V cannot be allowed to grow, since this will cause the estimation to be "smoothed". On the other hand, V cannot be too small, since then the variance of the estimate will increase. One method for choosing the volume V is to determine k as some function of m such that, for example, k_m ∝ √m. The volume chosen, denoted by V_m, is determined by increasing V until it contains k_m neighbors of β. This is known as the k_m-nearest neighbor estimation. Note that the volume chosen for the estimation of the probability functions becomes a function of the data. If the features are densely distributed around β, the volume containing k_m neighbors will be small.
Assume that the training set consists of a total of N = Σ_(i=1)^(M) N_i samples of feature vectors, where N_i is the number of samples belonging to the class w_i. An unknown signal, represented by the vector β, is to be classified. We find the volume V_m around β that includes k_m training samples. Out of the k_m samples, k_i samples belong to the class w_i. An estimate for the conditional probability becomes:

P̂(w_i|β) = k_i / k_m    (3.35)
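A sketch of the k_m-nearest-neighbor estimate of Equation 3.35 (Python/numpy; the training data and the choice k_m = √N are illustrative):

```python
import numpy as np

def knn_posterior(beta, training, labels, k_m):
    """k_m-nearest-neighbor estimate of the conditional probabilities: the
    neighborhood around beta is grown until it contains k_m training samples,
    and P(w_i|beta) is estimated by k_i / k_m."""
    training = np.asarray(training, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(training - beta, axis=1)
    neighbors = labels[np.argsort(dist)[:k_m]]
    return {int(c): float(np.mean(neighbors == c)) for c in np.unique(labels)}

rng = np.random.default_rng(7)
class1 = rng.normal([0, 0], 0.8, size=(40, 2))
class2 = rng.normal([2, 2], 0.8, size=(40, 2))
training = np.vstack([class1, class2])
labels = np.array([1] * 40 + [2] * 40)
post = knn_posterior(np.array([1.6, 1.8]), training, labels, k_m=int(np.sqrt(80)))
print(post, "-> class", max(post, key=post.get))
```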
d(β) = Σ_j a_j f_j(β) = aᵀ f(β)    (3.48)

Note that while the functions f_j(β) can be nonlinear, Equation 3.48 is linear with respect to the weights a_j.
Consider, for example, the case where the functions f_j(β) are quadratic functions. In this case, and for the two-dimensional feature space, the discriminant is

d(β) = a₁₁β₁² + a₂₂β₂² + a₁₂β₁β₂ + a₁β₁ + a₂β₂ + a₃    (3.49)

which can be written in matrix form (Equation 3.50); the matrix A, the vector b, and the scalar c of Equation 3.50 determine the decision surface. Equation 3.50 uses weights and functions for the discriminant. For a given problem, a set of functions has to be determined, and the proper weights found.
and β_j ∈ w₂ if the reversed inequality holds (Equation 3.51B). We can replace all β's belonging to w₂ with (−β); the classification rule for w₂ then becomes similar to Equation 3.51A.
Several methods are used to find the optimal weighting vector for Equation 3.51A. Among them are gradient descent procedures, the perceptron criterion function, and various relaxation procedures. We shall present here the method of minimum squared error. Instead of solving the linear inequalities (Equation 3.51A), we shall look for the solution of the set of linear equations:

β_Aᵀ β_(x,i) = b_i ;   i = 1, 2, ..., N    (3.52)

where b_i, i = 1, 2, ..., N, are some arbitrarily chosen positive constants known as the "margins". The set of N equations (Equation 3.52) is solved to determine the weighting vector β_A in such a way that the N known samples will be classified with minimum error.
Define the N × (n + 1) matrix F:

Fᵀ = [β_(x,1), β_(x,2), ..., β_(x,N)]    (3.53)

and define the constants vector bᵀ = (b₁, b₂, ..., b_N); then Equation 3.52 becomes:

Fβ_A = b    (3.54)

Note that the matrix F is not square; hence Equation 3.54 cannot be solved directly. The pseudoinverse of F must be used. Define the error vector:

e = Fβ_A − b    (3.55)

Minimization of eᵀe (Equation 3.56) yields

β_A = (FᵀF)⁻¹ Fᵀ b    (3.57A)

Note that the (n + 1) × (n + 1) matrix FᵀF is square. It may, however, be ill conditioned; in such cases the ridge regression method should be used, such that:

β_A = lim_(ε→0) (FᵀF + εI)⁻¹ Fᵀ b    (3.57B)

The solution (Equation 3.57) depends on the margin vector b. Different choices of b will lead to different decision surfaces.
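The minimum-squared-error solution can be written compactly (Python/numpy). The sketch below follows Equations 3.52 to 3.57: class-2 samples are sign-reversed, the augmented vectors form the rows of F, and β_A is obtained from the (optionally ridge-regularized) pseudoinverse; data and margins are illustrative:

```python
import numpy as np

def mse_weights(samples1, samples2, margins=None, ridge=0.0):
    """Minimum-squared-error linear discriminant: samples of w2 are sign-reversed,
    each augmented vector [beta^T, 1] forms a row of F, and the weight vector is
    the (ridge-regularized) pseudoinverse solution of F beta_A = b."""
    X = np.vstack([np.asarray(samples1, float), -np.asarray(samples2, float)])
    ones = np.concatenate([np.ones(len(samples1)), -np.ones(len(samples2))])
    F = np.hstack([X, ones[:, None]])        # N x (n+1) matrix of Equation 3.53
    b = np.ones(len(F)) if margins is None else np.asarray(margins, float)
    A = F.T @ F + ridge * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ b)       # Equations 3.57A / 3.57B

rng = np.random.default_rng(8)
w1 = rng.normal([0, 0], 0.6, size=(30, 2))
w2 = rng.normal([2, 2], 0.6, size=(30, 2))
beta_A = mse_weights(w1, w2, ridge=1e-6)
decide = lambda beta: 1 if beta_A @ np.append(beta, 1.0) > 0 else 2
print(beta_A, decide(np.array([0.2, 0.1])), decide(np.array([2.1, 1.8])))
```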
We can choose D_i² to be the distance measure, since all distances are positive. To classify β, we calculate Equation 3.58 for all i and choose the class j that yields the smallest distance (Equation 3.59).
A closer look4 at Equation 3.58 will show the relation between the minimum distance classifier and linear discrimination (Equation 3.60). The first term on the right side is independent of i and thus can be ignored in the minimization process. Minimization of D_i² is the same as maximization of the second term of the right side. Hence, we can define an equivalent decision function:

d_i(β) = μ_iᵀ β − (1/2) μ_iᵀ μ_i ;   i = 1, 2, ..., M    (3.61)

which can be written in the augmented form

β_(A,i) = [μ_iᵀ : −(1/2) μ_iᵀ μ_i]ᵀ    (3.62A)
β_x = [βᵀ : 1]ᵀ    (3.62B)

d_i(β) = β_(A,i)ᵀ β_x ;   i = 1, 2, ..., M    (3.63)
V o l u m e II: C o m p r e s s i o n a n d Au tomatic Recognition 59
m,
which is a linear discriminant function discussed in Section III. Figure 12 shows a simple
case of two classes in a two-dimensional features space. It can be shown that in this case
the decision surface. d(g) - 0. is a hyperplane normal to the line segment joining the two
templates, located at an equal distance from them.
Note that the Euclidean distance of Equation 3.58 gives equal importance to each one of
the elements of the features space. If. however, we have some a priori knowledge on the
statistics of the classes, w>e may want to place weights on the features. If, for example, it
is a priori known that some of the features have large variances, we may want to consider
them ‘Tes^ reliable'' when defining the proximity measure. This leads intuitive!) to a distance
measure where the weights are inversely proportional to the features covariance matrix,
namely.
Example 3.4
Consider the minimum distance classification for the data of Example 1 (Figure 4). The
Euclidean distance (Equation 3.58) for the four unknown signals is calculated below:
Df D;
g 3 v ~+ 0.8 1 .3 6 -* g ,.s € w,
3.2 7.76 —» g 4 , € w,
60 Bi omedical Signal Processing
Consider M classes. \v„ i ~ 1 , 2 , . . . ,M. We shall make the restrictive assumption that
the signals of all M classes are normally distributed, The signals belonging the \th class
have the expectation jx! and covariance matrix, We assume therefore that all classes
have the same covariance matrix. This may be the case, for example, where the classes are
deterministic vectors jx=, i — i,2 , . . . ,M, and the measured signals are noisy signals from
zero mean normally distributed noise process common to all measurements.
The conditional probabilities p(g|w;), i = 1,2, . . . ,M, are given by Equation 3.16. The
entropy o f \vt is
where the integration is performed over the features space. We would like now to find a
linear transformation:
y = TTg (3.67)
that transfers the n dimensional feature vector, g , into a reduced dimensional vector .y. The
transformation matrix T is thus of dimension n x d. This is a procedure commonly taken
when the goal is signal compression.
2. Minimization o f Entropy
Here we shall look for the transformation that not only reduces the dimensionality of the
problem, but mainly preserves or even enhances the discrimination properties between
classes. The entropy is a measure of uncertainty since features which reduce the uncertainty
of a given situation are considered more informative. Tou4 46 has suggested that minimization
of the entropy is equivalent to minimizing the dispersion of the various classes, while not
effecting much the interclass dispersion. It is thus reasonable to expect that the minimization
will have clustering properties.
The d x d covariance matrix, of the reduced vector y, whose expectation is jl, is
given by
Since the new vector, y, is the result of a linear transformation of a gaussian process, it is
also gaussian. The conditional probability density in the reduced space is thus:
The determinant o f the eovariance matrix in Equation 3.70A is equivalent to the product
o f its eigenvalues; hence.
In Equation 3.70C we have assumed that the eigenvalues of had been arranged in
decreasing order.
0.54 - X - 0 .3 5
d e K i - XI) = \1 ~ k\ = 0
- 0 .3 5 0 .6 8 - X
The solution yields th- two eigenvalues. X, = 0.96693 and X, = 0.25307. The corresponding
eigenvectors u ; are given by:
lu , - XjUj
i - 1,2
1 = [0.77334, 0.63399]£
All signals in the new one-dimensional features space are the projections of on the line
along the eigenvector u2. This projection is shown in Figure 13. This figure clearly dem
onstrates that the clustering in the reduced, one-dimensional space was preserved. Note that
projections on the eigenvector (corresponding to the largest eigenvalue) do not preserve
62 Biomedical Signal Processing
the discrimination between classes. For classification we need to find a decision surface (a
threshold number in the one-dimensional case). If, for example, we take y = 0 to be the
threshold, we see that all the training data are classified correctly. The unknown data is
classified as follows:
Comparing these results with Examples 3 .1 ,3 .3 , and 3.4, we note that the various classifiers
differ in the classification of signals g 2;x and x which are indeed borderline cases.
3. Maximization o f Entropy
We shall consider now a transformation similar to Equation 3 .67, but rather than requiring
the minimization of the dispersion, we shall require that maximum information be trans
formed. This means we want to maximize the entropy, H(y) (rather than minimize it as was
done in the previous section). Let us assume that the probabilities involved are normal, with
identical covariance matrix, for all classes. The transformation T that maximizes the
entropy will be the transformation, the columns of which are d eigenvectors of Xp, corre
sponding to the largest eigenvalues.
Example 3.6
Consider the data given in Example 3.1 . Find the transformation into the one-dimensional
space that will preserve maximum information in the sense of the entropy. Clearly the
Volume 11: Compression and Automatic Recognition 63
transformation is the vector u, of Example 3.5. The vector u, is shown in Figure 13. Note
that the projections o f the signals onto u, completely destroy the discrimination between the
classes.
Y, = fi'g , (3.71)
The n dimensional vector, £, can be considered a line in the n dimensional space: then y,
is the projection o f g* on this line (scaled by ||g||). Let ^ be the mean of the N, samples of
class w. in the n dimensional space:
* - s j , 6 o m
and the mean o f the projected points yj on the line g, p.;, is the projection of {!,:
It can be made as large as required by scaling £. This by itself is, of course, meaningless
since the separation o f the two classes must include the variance of the samples.
Define the n x n scatter matrix, W h of the ith class as:
% = 2 (g - jy < g -
i = 1,2 (3 .7 5 )
64 Biomedical Signal Processing
W. is the estimation o f the covariance o f the ir/i class in the n-dimensional feature space. It
represents a measure of the dispersion o fihe signals belonging to \vr The within-ciass scatter
matrix, W, is defined as: '
W W.j (3.76)
of = V (y - = £ ;o !|? - p ’ii.!-
Consider row the variance between the means of the two clashes. Denote the matrix. B,
the ' “-between class scatter m atrix", in the'original n-dimensionai features space:
B - (jx, ~ M - ■
“ jfc)T (3.79)
.T his matrix represents the dispersion between the means of the various classes. Also the
variance of the means in the one-dimensiona! projection is
Noxe that for every n-dimensionai vector v, we have from Equation 3.79:
Since (|Xt — jx2)Tu is a scalar denoting the projection o fv of (§Xt - j l 2), we conclude that
Bv is always a vector in the direction of (p., — &>).
A criterion of separation can now be formulated’in terms o f the new scatter matrices. For
good separation we require that the variance of the populations o f each class be smaii. Hence
a good separation measure, J(p), is
J(E> = H (382)
W sBp — Xp (3.83)
where the optimal weighting vectors., g, are the eigenvectors of W !3 . In this case, however,
we need not solve for the eigenvectors. Recall that Bp is a vector in the direction of
(£-! “ £ 2) k t length be X. We can do this without loss of generality since the length of
the required vector is of no importance; its direction is what we look for. Hence, we get
p = W - ‘(£, - £ 2) (3.84)
Volume II: Compression and Amomoik Rero^ihicp (IS
The line along the vector £ given by Equation 3.84 is the optimal line in the sense that
the projections o f w, and w-> on it will have the maximum ratio of “ between”’ 1 wiiUm”
class scatter. The classification problem has now been reduced io iha‘. findiiig a decision
surface (threshold number) on the line £ to discriminate between the projections of w, and
w?. I
Example 3.7
Consider again the data given in Example 3.1 (Figure 4). It is de>sr:J t**. reduce the
dimensions of the feature vector, (3, to one dimension using the Fisher \ 'i i 'u n i n a n t . We
thus have to calculate the transformation vector, p. of Equation 3.71. The direction of the
vector is given by Equation 3.84. For this example (see Example 3.1),
5-0(H) j 1
£ = - PlJ = 5.353 i ~ 5 353
1.009j 1
which yields a transformation very close to u: o f Example 3.5 (Figure 13). The classification
results will be similar to those of Example 3.5 with signal g 3 x undefined.
Consider now' the general case where M classes are present. The within class scatter
matrix (Equation 3.76) will become:
M
W = ^ W : (3.85)
I vA
= - 2 (3.85)
N i ~
M
B = X Ni<Pri “ _ Jfr>T <3 -87>
i- I
y. = t f i
i = 1,2,.,.,M - ! (3.88)
The M -- 1 equations can be written in a matrix form using the M - 1 dimensional vector
y “ [ y i • • • ,yM-i n x (M - I) matrix T whose columns are the weighting
vectors Pj.
i - T 'g (3.89)
66 Bi omedical Signal Processing
Equation 3.89 gives the transformation onto the M — 1 space. The optimal transformation
matrix T is to be calculated.
The within class seatter matrix in the reduced M — I space is denoted by Wy and the
between class scatter matrix in that space is denoted by By. Similar to the two-classes case,
we have i
Wy = T fWT
By = T TBT (3.90)
We heed now a criterion for separability. The ratio of scalar m easures used in the reduced,
one-dimensional case cannot be used here since ratio of matrices is not defined. We could
have used the criterion tr(Wy !By) using the same logic as before. Another criterion can be
the ratio o f determinants:
« r > - E a M l)
The matrix, T, that maximizes1 Equation 3.91 is the one whose columns are the solution
o f the equation:
Bp{ = XiWp,
i — 1,2,...,M - 1 (3.92A)
which can be solved either by inverting W and solving the eigenvalue problem:
W ^ B fc - Xjgi (3.92B)
|B - XjW| = 0
(B - XjWlpi = 0 (3.92C)
The transformation (Equation 3.89) that transforms the n-dimensional features vector {3 into
a reduced, M - 1, dimensional vector while maximizing Equation 3.91, is given by
Equation 3.92. The optimal transformation is thus the matrix, T, whose columns, p,, i =
1,2, . . . ,M — 1, are the eigenvectors of W ~ 'B . The Fisher's discriminant method is
therefore useful for signal compression when classification in the reduced space is required.
V. K AR H U N E N -L O EV E EX PA N SIO N S (KLE)
A. Introduction
The problem of dimensionality reduction is well known in statistics and in communication
theory. A variety of methods have been developed, employing linear transformation, that
transform the original feature space intn a lower order space while optimizing some given
performance index. Two classical methods in statistics are the principal components analy
sis3'5-40 (PCA), known in the communication theory literature as Karhunen-Loeve Expansion
Volume //: Co m p r e s s i o n a n d Automatic Recognition 67
(KLE), and factor analysis (FA). The PCA optimizes the variance of the features while FA
optimizes the correlations among the features.
The KLE has been used to extract important features for representing sample signals taken
from a given distriBution. To this end the method is well suited for signal compression. In
classification, however, we wish to represent the features which possess the maximum'
discriminatory information between given classes and not to faithfully represent each class
by itself. There may be indeed cases where two classes may share the same (or similar)
important features, but also have some different features (which may be less important in
terms of representing each class). If we reduce the dimensions of the classes by keeping
only the important features, we lose all discriminatory information. It has been shown40 that
if a certain transformation is applied to the data, prior to KLE. discrimination is preserved.
The KLE applied to a vector, representing the time samples, can be extended to include
several signals. W e arrange the vectors representing a group of signals into a matrix form
and tr y' u, ;c;.. _ .......Jata” matrix in lower dimension, namely., by means o f a lower
rank matrix. This extension to the KLE (PCA) is known as singular value decomposition
(SVD).
Principal components analysis (PCA, KLE) has been widely applied.to biomedical signal
processing. ,4-2u x<> SVD methods45-47 50 have also been applied to the biomedical signal
processing, in particular to ECG51 and EEG processing.47
(3.93)
T ’ = TT (3.94)
and
<j n
jj(d) = V y.<j). + V b <j, (3 % )
68 Biomedical Signal Processing
since b;, i = d -f i, . . . ,n, are preselected constants, the vector y describing the signal
is d dimensional. The reconstruction error, Aj3(d), is
The mean square reconstruction error, €<d), is given from Equations 3.97 and 3.93 by
We shall choose b, and 4>j that minimize the mean square error of Equation 3.98. To get
the optimal b /s , derive Equation 3.98 with respect to h,:
^ = -2 (E { y J - b.) =■ 0
OD: (3.99)
P '
b, = E{yj =
= S - E{p})ig - E{p})TH>,
i = d-+ i
= 2 (3.100)
l = d -rl
The optimal vectors, and thus the optimal transformation T, are given by the minimization
of Equation 3.100 with the constraint - 1. Using the Lagrange multipliers. k„ the
constraint minimization result is3
The solution of Equation 3.101 provides the optimal <+>( which are the eigenvectors of the
original covariance matrix corresponding to the eigenvalues X,. Substituting Equation
3.101 into Equation 3.100 yields the minimum square error:
e(d)mi„ = y X, (3.102)
Note that since e(d) is positive, the eigenvalues are nonnegative. For data compression we
shall choose d columns in the transformation, T, in such a way that the error (Equation
V o l u m e II: C o m p r e s s i o n a n d Automatic Recognition 69
3.102) is minimal. W e shall choose the d eigenvectors, corresponding to the largest eigen
values for the transformation and delete the rest (n - d) eigenvectors. The error (Equation
3.102) will then consist of the sum of <n - d) smallest eigenvalues.
We can arrange the eigenvalues such that: j
A; 5? X, 25 X, > 3* xn s* 0
Then the required transformation matrix. Tx\consists of the columns, <J>is i = 1,2. . . . ,d.
Note also that
since T is the modal matrix of The covariance matrix of v is diagonal which means that
the features y, are unconeiated.
The calculation o f the eigenvectors of the matrix, is not an easy task. Several algorithms
for this calculation have been suggested.141
Some results o f KLE as applied to biomedical signals are shown in Figure 14. The KLE
as presented in this section is very effective for signal compression when the application is
effective storing or reconstruction. This is obvious since it is optimal in the sense of minimum
square of reconstruction error. It is, however, less attractive when the goal is classification.
Fukunaga and Koontz40 have suggested a method whereby a modified KLE can be used
for two-class discrimination. Their method is not optimal and indeed a counter example has
been presented52 thai shows poor discriminant behavior of the method.
F = (3.104)
and expand the matrix. If the rank of the matrix B is k, it can be expressed as54
F - US,-VT (3.105)
where the n x n matrix U and L x L matrix V are unitary matrices (VT = V ~ !) and the
rectangular n x L matrix, Sr . is a diagonal matrix with real nonnegative diagonal elements,
s,, known as the singular values. It is obvious that k is less than or equal to the minimum
of n and L.
The singular values are conventionally ordered in decreasing order s, ^ s, 2 = ............ sk
s* 0. with the largest one, s,, in the upper left hand comer of Sf;. The singular values are
the nonnegative square roots of the eigenvalues of FFT and F^F. The n x 1 dimensional
vector Uj. i = 1,2. . . . ,n (the columns of U), and the L x 1 dimensional vectors, x>,. i
= 1.2. . . . ,L (the columns of V), are the orthonormal eigenvectors of F F and FHF.
respectively.
The representation given in Equation 3.105 states that any k rank matrix can be expressed
by means of a rectangular diagonal matrix, multiplied from right and left by unitary matrices.
Equation 3.105 can also be written as
k
F = 2 s,UjU7 (3.106)
i= I
70 Biomedical Signal Processing
o r ig in a l j A _ J . . ^ w n r
..400»sec
... -J
N O R M A L IZ E D -
-JL A Ii _
1ST e ig e n v
j i ;V i -
LJ 12 L 3
FIGURE 1 4 . Karhunen-Loeve analysis, most significant eigenvectors o f (A) ECG signal and ( B ) pain evoked
potentials.
in Equation 3.106 the matrix F is expressed in terms of the sum of k matrices with rank -
one. The matrices u ^ are called singular planes or eigenplanes.
Since the eigenvectors are orthonormal, it follows from Equation 3.106 that
1
Ui = - Fv; (3.107 A)
Volume II: Compression and Automatic Recognition 11
(3.I07B)
Recall that the L columns of the features matrix F are the signals fij, j = 1,2, . . . ,L. Each
vector then can be expressed from Equation 3.106 as
k
§ , = X SiD.jfci
i* I
j = 1 ,2 ,...,L (3.108)
vj = ...... v lL]
Q.j = w j
i = l,2 ,...,k
j = 1,2,...,L
In Equation 3.108 each signal j3j is expressed in terms of eigenvectors of F F . The SVD is
thus, in principle, equivalent to the PCA.
We want to use the SVD for compression. Consider now the expansion of Equation 3.106
with summation index running until d k. Note that since we have arranged the singular
values in decreasing order, the reduced expansion will include the d largest values. If we
thus denote the estimate of F by F:
a
F = X s,u{v j = USfVt
1=1
d ^k (3.110)
Here the norm is in the sense of the sum o f squares of all entries:
In Equation 3.110 the singular values are the d largest ones and SF is obtained from SFby
setting to zero all but the d largest singular values. Equation 3.111 states that the estimated
reduced matrix F is the best least squares estimates of F by rank d matrices.
Analogous .o PCA, it can be shown here that the estimation error (residual error), expressed
72 Biomedical Signal Processing
m terms of the norm of Equation 3. \ H , equals the sum of squares o f the discarded singular
The reason toi ^joosisu: d Largest, singular values fo? iiEc i ;- d e a r from Equation
3.113* ■
Methods for computing the singular values from the eigenvalue^ ot F F 1 require heavy
computation load and a*e time consuming. In ad o n is., conventional methods v*eid ai! k
singular values where indeed onh the d laigesl «m<- arc required. A w tb ix J for the com
putation of-the vngulaf values, -.'tie at a time, hc^irmn;! v tih the 'Urczhi one, h-;?> b-.-ai
suggested:51" it is known a;> the power method. The computation eaa be sk pped when tlic
already acquired singular values yield residual error (Equation 3.113 j below a certain thresh
old. The method is especially attractive in cases where the data matrix is large, but its iank
is low. The computation method h briefly presented here. Note also that this method operates
on the data matrix F directly and not on the correlation matrix PFT.
The compulation is based on the solution of the two equations (Equation 3.107):
su = Fu (3.114 A)
SH ~ * 14 B >
*•
Using an arbitrary starting vector u<0), we form' the following iterative solution:
( 3. 115A I
F ru<k: u
where e is some predetermined stopping vector. The first (largest) singular value, s,» is then
estimated by taking the nonn of Equation 3 .1 14B and recalling that the length of u i s unity:
u, = u(k+J)
V, = v (k;i) (3.U 7B )
for the last k. To obtain the next singular value and eigenvectors, the estimated singular
plane is removed from F by
and the same iterations (Equations 3.115 to 3.118) are repeated for F<n to get s2, ju2, v 2 and
V o l u m e 11: C o m p r e s s i o n a n d Automatic Recognition 73
so on untfj.^,.- u t, i\,. The last (dth) singular value so be calculated is determined by the
thresh-ofcf- e oo the residua* error:
!|F ~ £ S M II . (3 .U 9 )
i J
0.5 0 1
i 2 1
1.25 0.5
FF*
0.5 6
0.10356 -0 .9 9 4 6 2
U = [ujiuj
0.99462 0.10356
where u, and u2 are the orthornormai eigenvectors o f FFT corresponding to X, and X2.
The eigenvalues arid eigenvectors of F rF are
1.25 2 0.5
FF = 2 4 2
0.5 2 2
F - ^ SjU.v^ —
j-i
0.5976 - 0 .2 0 6 0.8861
-0.5 0 1
1 2 2
If we desire to reduce the dimensions, we shall take only the first term in the expansion,
namely, the projection on the first eigenplane:
-0.5976 - 0 .2 0 6 0.8861
l|F ~ F||2 - 1.19756 = s; = k 2
0.0622 0.02144 -0 .0 9 2 2
Let us repeat this example with Shlien’s power method. Choose the initial unit length vector
to be
p ( 0) _ £3 — i /2 3 - 1/2 3 - M2]
v <3> =
{0.3831, 0.8085, 0.4467]
Expressing the matrix F by its reduced, rank one matrix, with largest singular value yields:
which is very close to the reduced matrix calculated by the direct method.
V I. D IR E C T F E A T U R E S E L E C T IO N A N D O R D E R IN G
A. In tro d u ctio n
In classification problems with two or more classes, it is often required to choose a subset
of d best features out of the given n or to arrange the features in order of importance. To
do this we require a measure of class separability. The optimal measure of features effec
tiveness is the probability of error. In practice one can use the training data and use the
percentage of classification error as a measure. This approach is often used: it is, however,
experimental and requires a relatively large training data.
Scatter matrices can be used to form a separability criterion. Recall that the within class
scatter m atrix, W (Equations 3.75 and 3.85) is the covariance matrix of the features in a
given class. The between class scatter matrix, B (Equation 3.87), is the covariance of the
means of the classes.
A criterion for separability can be any criterion which is proportional to the between
scatter matrix and also proportional to the inverse of the within scatter matrix. Maximization
of such a criterion will ensure that while maximizing the “ distance’’ between classes, we
do not amplify (with the same rate) the scatter of the classes, thus causing no improvement
to the separability. A criterion like this was used in Equations 3.82 and 3.91.
Several such criteria were suggested (e.g., see Fukunaga):3
76 Biomedical Signal Processing
(c) J , = (r{B)/tr(W) (3 .1 2 0 0
T h e v enters are relatively simple to use. However, they do not have a direct relationship
u> the piobabi»i!> o f error. More complicated criteria that can be related to the error probability
such i-s the Chernoff bound and Bhattacharyva distance are know n.’
%<
Similarly , the average discriminating information of class w, with respect to w, is
Lj( = j
-a
pipiwJin dfi
p*g;-v.) ~
(3.1215!
' The- tola! average information in Equations 3.121 A and B is known as the divergence; let
us denote it by D- y
For normal distributions with \xk and £ k. k = i,j, the expectations and covariance matrices,
respectively, the divergence becomes
When the two classes possess the same covariance matrix, Sj = 'X-. = X, the divergence
becomes
D Ki = ~ ~ &) (3.124}
w hich is the Mahalanobis distance. Methods for selecting optimal features by the maximi
zation of the divergence have been suggested.4
V o l u m e 11: C o mp re ss io n a n d Automatic Recognition 11
n i- i = ( ft,
Here jrjj.... is the j//7, (i — 1) dimensional vector and {3jr q - 1,2............i - 1, are the set
of (i - 1) features selected from the given n features (g). In the \th step we shall increase
the dimension o f the n, tj).. , vectors by adding one feature (that is not already included in
the vector) to each one of them, such that
Pi * Hi «
j - 1,2...... n (3.126)
The added feature |3| will be selected from the available n — (i - 1) features such that the
criterion will be maximized
78 Biomedical Signal Processing
A ^) « Max ( 3 .127)
(M s!-,
denotes the maximum of the divergences. The algorithm proceeds up to the d step. At
that step the set with maximum divergence is selected:
th
where D,(if}) is the value of,the divergence evaluated with the ith (dimensional vector tq- X,
= M a x O ^ )) (3.128)
j
The number o f searches required by the algorithm53 is n(d - l)(n - d/2) which is sub
stantially less than the exhaustive search (for large n and intermediate values of d). For the
example chosen previously of n = 40 and d = 10, the dynamic programming algorithm
requires 13,600 searches.
NS
Dj = tr((W|) !Bj)
Superscript V denotes average features for voiced (see Appendix A) segments and U denotes
unvoiced segments. For a discussion o f the features see Chapter 7, Volume I.
Volume II: Compression and Automatic Recognition 79
As an exam ple, the results of the dynamic programming search for one speaker (AC) is
shown here:
d Features Divergence
: [P.LE'l 250.67
? 1P.LE1. pi] 528.23
- IP.LE1 .pi.pM 684.06
5 IP.LE1 .pK
v,p;„.E ! 783.72
6 (P.p'.Ep.LE1 .pi.k; J 941.52
IP.LE1, pi .pJ„.E:.pM0j 1096.6
8 IP.p7.Ep.kf.LE' .k'.pV.Cu,] 1325.9
y (P.pV.E(N
,,k''.LE- .k i.Pt,Ep,kVi 1563.9
10 (P.p'.Ep.kJ.LE ,khpV.E;.kl.asv] 2036.9
Note that the pitch feature (P) was chosen in all subsets. The pitch is indeed known as an
important feature tor speaker identification. Note also that for low orders of features vectors
an increase in dimensions changed the features (for example, note the suboptimal vectors
of order 5 and 6). For larger orders, the main features did not change (e.g., see orders 9
and 10).
For actual speaker verification the Mahalanobis distance was used. Figure 15 shows the
distances from segments of 15-sec speech of speaker (IM) to the templates of speakers IM,
AC. and MH. The suboptimal feature vector of dimension 10 evaluated by the dynamic
programming method was used.
V II. T IM E W A R P IN G
One of the fundamental problems that arises when comparing two patterns is that of time
scaling. Up to now we have assumed that both pattern and template (reference) to be compared
share the same tim e base. This is not always correct. The problem is especially severe in
speech analysis. It has been found that when a speaker utters the same word several
times, he does so. in general, with different time bases for each utterance. Each word is
spoken such that parts of it are uttered faster, and parts are uttered slower. The human brain,
it seems, can easily overcome these differences and recognize the word. Machine recognition,
however, finds this a severe difficulty. To overcome the problem, algorithms that map all
patterns onto a common time base, thus allowing comparison, have been developed. These
80 Biomedical Signal Processing
F iG L R r i6 T ln ;c w a rp m g p la n e w u h s e a rc h a r j
are known as "tim e warping" algorithms. The basic idea of lime warping is depicted in
Figure 16. Assume we are given two signals, x(t;), xu,}:
X(t,;) , t; € (3.129A)
i-
x(t,) . I: € (t . t,.j (3.1298)
each with its own time base, t s and tj. We assume that the beginning and end o f each signal
are known. These are denoted (t}s, xi() and (t^;tK), respectively. We shall consider the discrete
case^-hdre>-bOth'sigflalsrwere sampled at the samr rate. Assume al:o that the samples have
been shifted sttchthat both signals begin at sample i - j = 1. Without the loss of generality
we have now:
if the two time bases were linearly related, the mapping function relating them was just i
= j * I/J. In general, however, the relation is nonlinear and one has to find the nonlinear
time w arping function. We shall make several assumptions on the warping function before
actual calculations.
The warping function, W(k), is defined as a sequence of points:
c (l), c(2),...,c(K )
where c(k) = (i(k),j(k)) is the matching of the point i(k) on the first time base and the
point j(k) on the second time base. The warping, W(k), thus allows us to compare the
appropriate parts of x(t,) with these of x(tj).
We shall impose a set of monotonic and continuity conditions on the warping function:60
Clk-0= C(k)*(i»])
'i he left side inequality ensures increasing monotony-; the right side incqu.ali.ty is a continuity
condition ibai restricts fine jumps in the warping. This restriction is important since
discontinuities can cause the elimination of parts of the signal, ii has been suggested 60 to
choose pi = p, ~ i; we shall adapt this here. As a resuit of conditions (Equation 3 .i3 0 ).
vve have restricted the relations between two consecutive warping points c(k) and c(k — 1 )
to be
Ui(k) , j(k) - !)
c(k — 1) = < (i(k) ~ 1 . j(k) - i) (3.131)
(i(k) -- 1 . j<k)
Figure 17 depicts the meaning of the last equation. Due to the constraints, ihere are only
three ways to get to the point c(k) ~ (i.j). These are given in Equation 3.13! and in Figure
17.
We also require boundary conditions. These will be defined as:
By the boundary condition, we mean that we match the beginning and end'of the signals.
This is not always a good condition 10 impose since we may not have the endpoints of the
signals accurately.
The warping function will be estimated by some type of search algorithm. We would like
to limit the area over which the search is performed. We shall restrict the search window 62
to:
\\ - j • I/Jj ^ 7 (3.133)
where y is some given constant. The last condition limits the window to an area between
lines parallel to the line j = iJ/I (see area bounded by solid lines in Figure 16).
Constraints on the slope can also be imposed. If we impose such conditions that limit the
maximum allowable slope, and minimum slope. ° f the warping function, we
end up with a parallelogram search window (see area bounded by broken lines in Figure
16).
We shall now proceed with the dynamic programming search. We recall now that the
signals nre represented, at each point, by their feature vectors, J3j(k) and P /k ). Here f3j(k)
denotes the feature vector of the signal x(t,) at the time i(k) with similar denotation forj^(k).
Define a distance measure between the two feature vectors by
82 Biomedical Signai Processing
We will search for (he warping function that will minimize a performance index, D(x(t,).x(tj)).
We shall use the normalized average weighted distance as the performance index; hence.
where p(k) are the weights. We shall simplify the calculation by employing weights, the
sum o f which is independent of W. Sakoe and Chiba61 have suggested the use .of
which yields
K
2 ) P<k ' = I + J (3 .1 3 6 B )
The weights are shown in Figure 17 . The performance index (Equation 3.135) becomes:
1
D(x(ti),x(tj)) — M in f X d(c(k >)p<k
w '
The dynamic programming65 (DP) equation proves the g(i(m).j(m)) measure by means of
g(i(m - 1) j(m - 1)):
The point, c(m), on the warping function will be determined by considering all allowable
routes into the point c(m) The route that minimizes g(i(m).j(m)) is chosen. It is clear that
the more constraints we pose on the warping function the less routes we shall have to check.
The procedure starts with the initial conditions;
gK(c(K)) (3.139B)
Volume 11: Compression and Automatic Recognition 83
Hence, for the weights and constraints we have imposed, we get the algorithm initial
conditions:
the DP equation:
f g (ij - 1) + d(i,j)
g<i,j) = Min{g(i - l.j - 1) + 2d(i,j) (3.140B)
\jl {\ - l.j) + d(i,j)
1 I .
- (j “ 7) ^ i ^ ~j (J + 7) (3.140C)
Equations 3.140A to D are recurrent! \ calculated in ascending order, scanning the search
window in the order:
j = 1 .2 ... J
The search is conducted, row by row . beginning with j = 1 and i = 1,2, . . . .(1 + -y)!/
J, followed by j = 2, i = 1,2............(2 + 7 )I/J, thus scanning all o f the search window.
For each point scanned. g<i,j ) is calculated, until the endpoint (I,J) is reached. The algorithm
yields d i r e c t l y the distance measure between the time warped words.
Note that when calculating the measures of the ith row, previously calculated measures
from the same row or from the row <i - 1) only are needed. Hence, one has to store the
measures of current and previous rows only. At the most, this means the storage of 27I/J
numbers.
The procedure described above yields the distance measure between the two warped
signals. It does not yield the warping function itself. This is so because the optimal route
W(c(k)) can be evaluated only after the search window has been completely scanned. In the
procedure described above we have not stored the measures g(i,j). The optimal route can
thus not be retrived. Consider the following modification to the described algorithm. After
calculating the measure g(i*,j) by Equation 3 .140B, we record and store the choice made by
w(i,j) = q
where
84 Biomedical Signal Processing
the-value of w(?,j) fells us from what allowable point we have reached the current point
(i.j). Hence, each point in the search window has attached to it information about the optimal
route to reach it. After scanning has terminated, we can reconstruct the optimal route W(c(k) J
by going backward irom w{I,J) to w(l ,1). For example, if \v(c(K}) ~ vv(M) =.2* we know
that c(K — 1) i\ — i J ) . To proceed wc check w(i ~ i J ? and ->o on.
The storage requirement for this procedure is much higher, oi u»urse For each point in
the search window wc require a storage of 0.25 b>?e* (q requires onK two bits). The total
slumber of points in the search window h -yH2 - y j ) (for t!v ca>c 1 ^ y il) ami required
storage is thu> 0.25 yi(2 - y j ) bvtes.
Time warping by means oi or^icd giaph search. (OGS) technique has been developed/’3
It has been shown chat the OGS algoutnm c m solve the time warping problem with essentially
the same accuracy as the DP itgorimrn w»th computations reduced by a factor o f about
2.5. This reduction in computation, however, is attained at the expense of a more complicated
combinatoric effort, h has been ^rgueu6’ therefore that when special high-speed hardware
is used for the computation, the OGS may have no advantage over the DP.
REFERENCES
17. Lam, C. F .. /.im m erm ann, K ., Simpson, K. K ., Katz, S ., and Blackburn, J . G ., Clarification of
somatic evoked potentials through maximum enuopy spectra! analysis, Electroencephalogr Clin. Neuro-
nhy.su>>.. 53. 491, 1982.
18. Gersch. W ., M aritnelli. F., and Yonvnioto, J., A u t o m a t i c c l a s s i f i c a t i o n o f E L G , K u l l b a c k - L e i b k i n e a r e s t
n e i g h b o r r u l e s . S a e m r . 2 ( '5 { 4 4 0 2 ‘. 1 9 3 . 1 9 7 9 .
1 9 . R u t t i m a n i i . L . L . , C o m p r e s s i o n o f i h e l.;C G b y p r e d i c t i o n o r f n t e q 'o l a t i o n a n d e n u o p y e n c o d i n g , IEEE
Trans. Butt. Med. tiny., 2 6 ( ! l j . 6 1 3 . 1 9 7 9 ,
2 0 . M e t a x a i k s - K o s s i d r t i d c s . C . , A t i u m a i o s , S . S., a n d C a r o u b a l o s , C . A . . A m e t h o d f o r c o m p r e s s i o n r e
c o n s t r u c t i o n o f !:.C C i s i g n a l s . ./ iUomcd. Enx.. 3 . 2 1 4 , 1 9 S 1 .
2 1 . A b e n s t e i n , J . i \ a n d T o m p M n s , W . J . , A n e w d a t a r e d u c t i o n a l g o r i t h m f o r r e a l ti m e E C G a n a l y s i s ,
IEEE Trons. S ign ed. Eng.. 2 9 . a ' 1 9 8 2 .
2 2 . Jain, I 1., K autfharju, 1*. M . , a n ‘ Warren, ,1., S e l e c t i o n o j o p t i m a l f e a t u r e s l o r c l a s s i f i c a t i o n o f e l e c
t r o c a r d i o g r a m s . J. E iectroratdiorr.. 1 4 . 2 3 9 , 1 9 8 1 .
2 3 . K u k l i n s k i , \ \ . S . , F a s t W a l s h tr,u iN f o tu s d a t a — e o m p r e v M o n a l g o r i t h m : E C G a p p l i c a t i o n '. Med. Biol.
Eng. C am pin.. 2 S . 4 6 5 , { 9 > :i.
2 4 . N v g a r d s , M . 1 .. a n d H u l t i n g . J . . A n a u t o m a t e d s y s t e m t o r H O G m o n i t o r i n g , Comput. Bioinsd. Res . . 1 2 .
181, 1979.
2 5 . Shridar. M . and Steven*, M . F., A n a l y s i s o f L O G d a t a f o r d a t a c o m p r e s s i o n , ini. J. B 'u w d . Comput..
10. 113. 197*
2 b . Pahhn, ( ) . , Borjesson, I*. ( ) . , a n d W e r n e r , ( ) . , C o m p a c t d i g i t a l s t o r a g e o f E C G ’s , Comput. Prog.
Riomed.. 9 . 2 9 3 . 1 9 7 9 .
2 7 . Cashman. I*. A p a t t e r n r e c o g n i t i o n p r o g r a m f o r c o n t i n u o u s E C G p r o c e s s i n g in a c c e l e r a t e d t i m e ,
Comput. B itw id . fit v .. ! 1 , 3 1 ' , 1 9 7 8 .
2 8 . G u s t a f s o n , D . f - . . W i l l s k y , A . S.. W a n g . J . V .. L a n c e s t e r . M . C . „ a n d T r i e b > v a s s e r , J . H . . E C G / V C G
r h y t h m d i a g n o - K u s i n g s t a t i s t i c a l s i g n a l a n a l y s i s . 1. i d e n t i f i c a t i o n o f p e r s i s t e n t r h y t h m s . I I . i d e n t i f i c a t i o n
o f t r a n s i e n t r h ;» ih n ix . IEEE Irons. Biomcd. Eng.. 2 5 . 3 4 4 . 1 9 7 8 .
2 l». WombSc, M. E .. Halliday. J. S.. Mitter, S . K ., Lancester, M. and Triebwasser, J . H ., D a ta
c o m p r e s s i o n f • s t o r i n g a n d t r a n s m i t t i n g E C G 's V C G ’s . P n v . IEEE. 6 5 . 7 0 2 , 1 9 7 7 .
3 0 . A h m e d , N . . M i i n e . P . J . . a n d H a r r i s , S . G , . l : i e c { r t v a r d i o g - : a p h i c d a ta c o m p r e s s i o n \ h s o r t h o g o n a l
t m n s f o n n s . IEEE Trans. Burned. Eng.. 2 2 . 4 8 4 , 1 9 7 5 .
3 1 . Y o u n g . T . V . a n d H u g g i n s , W . H . . C o m p u t e r a n a l y s i s o f e l e c t r o c a r d i o g r a m s u s in g a h r e a r r e g r e s s i o n
t e c h n i q u e . IEEE Inins. Blamed. Eng., 2 1 . 6 0 . 1 9 6 -1.
3 2 . Marcus, M .. H am m erm an. H .. a n d Inbar, G, F., E C G c l a s s i f i c a t i o n b y s ig n a l e x p a n s i o n o n o r t h o g o n a l
K - L b a s e s . P , r - e r 9 . 2 5 . in P r o c . IE L F . SId e to n Conf.. T e l - A v i v , k r a e l . M a y 1 9 8 1 .
33. iwata. A ., Suzum ura, N . . a n d Ikegaja, K., P a t t e r n c l a s s i f i c a t i o n o f th e p h o n o c a r d i o g r a m s u s i n g lin e a r
p r e d i c t i o n a n a ! y > i s . Med. Biol. Eng. Comput.. 1 5 . 4 0 7 . 1 9 7 7 .
3 4 Urquhari, K. B ., McGhee, .}.. Macleod, J. F.. S . . Banbam, S . W ,, and Moran, F ., T b e d i a g n o s t i c
v a l u e o f p u l m o n a r y s o u n d s ; a p r e l i m i n a r y s t u d y b y c o m p u t e r a i d e d a n a l y s i s . Comput. Biol. Mt J .. 1 ! . 1 2 9 .
1981.
35. Cohen, A. and Landsberg, B ., Analysis and automatic classification of breath sounds. IEEE Turns.
Biomcd. Eng . 31. 585. 1984.
3 6 . inbar, G. F. and Noujaim, A. E . . O n s u r f a c e 1 £ \1 G s p e c t r a l c h a r a c t e r i z a t i o n a n d its a p p l i c a t i o n t o d i a g n o s t i c
c l a s s i f i c a t i o n . IEEE Trims. Biomcd. Eng.. 3 1 . 5 9 7 , 1 9 8 4 .
3 7 . Childers, D. G .. L a r y n g e a l p a th o l o g y d e t e c t i o n , CRC Crit. Rev. Bioeng... 2 . 3 7 5 . 1 9 7 7 .
3 8 . Cohen, A. and Zm ora, E., A u t o m a t i c c l a s s i f i c a t i o n o f i n f a n t s ' h u n g e r a n d p a i n c r y . in P '- < v . Int. Conf
Digital Signal P rocess . . C a p p e l l i n i . V . a n d C o n s t a n t i n i d e s . A . G . . E d s . . E l s e v i e r . A m s t e r d a m . 1 9 8 4 .
3 9 . Annon, J. I . a n d McGilfen, C . G ., O n th e c l a s s i f i c a t i o n o f s i n g l e e v o k e d p o te n t i a l u s i n g a q u a d r a tic -
c l a s s i H e r . Comput. Prog. Biomed., 1 4 . 2 9 . 19 8 2 .
40. Fukunaga, K. and Koontz, W. L. (J., A p p l i c a t i o n o f t h e K a r h u n e n - L o e v e e x p a n s i o n . IEEE Trans.
Comput., 19. 311. 1970.
41. Mausher, M . J . and Landgrebe, D. A ., T h e K-L e x p a ^ i o n a s a n e f f e c t i v e f e a tu r e o r d e r i n g te c h n i q u e
f o r l i m i t e d t r a i n i n g s a m p l e s i z e . IEF.L Trans. Geosci. Rem. Sens.. 2 1 . 4 3 8 , 1 9 8 3 .
42. Fernando, K. V. M . and Nicholson, H ., D i s c r e t e d o u b l e s i d e d K - L e x p a n s i o n . IEE Proc.. 127, 155,
1980.
4 3 . Bromm, B. a n d S c h a r e i n , E . , P r i n c i p a l c o m p o n e n t a n a l y s i s o f p a i n r e l a t e d c e r e b r a l p o t e n t i a l s t o m e c h a n i c a l
a n d e l e c t r i c a l s i m u l a t i o n in m a n . Electroencephalogr. Clin. Xcurophysiol., 5 3 . 9 4 , 1 9 8 2 .
4 4 . O ja, E . and K arhunen. J . , R e c u r s i v e c o n s t r u c t i o n o f K a r h u n c n - L o e v e e x p a n s i o n s f o r p a tte - m r e c o g n i t i o n
p u r p o s e s , in Proc. IEEE Pattern Recog. Conf.. M i a m i . 1 9 8 0 . 1 2 1 5 .
4 5 . Klemma, V. C . and Laub, A. J . , T h e S V D . its c o m p u t a t i o n a n d s o m e a p p l i c a t i o n s , IEEE Trans. Autom.
Control . 2 5 . 1 6 4 . 1 9 8 0 .
4 6 . Tou, J. T. and Heydorn, R . P., S o m e a p p r o a c h e s t o o p t i m u m f. itu r e e x t r a c t i o n , in Computers and
Information Sciences, V o l . 2 , T o u . J . T . E u . . A c a d e m i c P r e s s . N e w Y o r k , 1 9 6 7 .
86 Bi om ed ic al Signal Processing
47. Haimi-Cohen, R. and Cohen, A ., A microcomputer controlled system for stimulation and acquisition o f
evoked potentials, Comput. Biomed. Res,, in press.
48. Tufts, R. W „ Kumaresan, R ., and K irsteins, 1., Data adaptive signal estimation by SVD o f data matrix,
Proc. IEEE, 70, 684, {982
49- Tom inaga, S ., Analysis o f experimental curves using SV D, I E E E T r a n s . A c o u s t. Speech S ig n a l P ro ce ss . ,
2 9 ,4 2 9 . 3981.
50. Shlien, S ., A method for computing the partial iV D , IE E E T ra n s . P a t te r n A n a !. M a c k . In te llig e n c e . 4,
6 7 1 ,1 9 8 2 .
51. Ditnten. A. A. and van der Kam, J ., The use o f the SVD in electrocardiography, Med. Biol. Eng.
Comput.. 2 0 .4 7 3 , 1982.
52. Foley, D . H. and Sam mon, J . W ., An optimal set o f discriminant vectors, I E E E T ra n s . C o m p u t., 24,
28 !, 1975.
.53. Cox, J. R ., Nolle, IF. M ., and Arthur, R. Digital analysis o f the EEG. the blood pressure wave and
the ECG. Proc. IEEE, 60, 1137, 1972.
54. Noble, B. and Daniel, j . W,». Applied Linear A t zebra. 2nd ed., Prentice-Hall, Englewood Cliffs, N.J..
1977.
55. Haimi-Cohen, R. and Cohen, A ,, On-the-computation of partial-SVD. ... V......... .... d ig ita l Sig.
Proc.. Cappellini. V. and Conslantinides, A. G ., Eds., Elsevier, Amsterdam. 1984.
56. Cheung, R. S. and Eisenstein, B. A ., Feature selection via dynamic programming for text-independent
speaker identification. IEEE Trans. Acoust, Speech. Signal Process., 26. 397, 1978.
57. Chang. C. V., Dynamic programming as applied to feature subset selection in pattern recognition system.
IEEE Tram. Syst. Man Cvbern . 3. 166„ 1973.
5$: Shrdhar, M ., Baramecki, M .. and M ohanlerishm an, N ., A unified approach to speaker verification.
Speech Conunun.. 1. 103, 1982.
59. Cohen, A. and Froind, T ., Software package for interactive text-independent speaker verification. Paper
6.2.3. in ProcJ IEEE MELECON Conf., Tel-Aviv, Israel, 1981.
60 Sakoe. H. and Chiba, S ., Dynamic programming algorithm optimization for spoken word recognition.
IEEE Trans. Acoust. Spcech Signal Process., 26, 43, 1978.
61 . Sakoe, H .. Two level DP matching — a dynamic programming ba<ed pattern matching algorithm for
connected work recognition, IEEE Trans. Acoust. Speech Signal Process.. 27. 588, 1979.
62. Paliwal, K. K ., Agarwal, A ., and Sinha, S. S ., A modification over Sakoe and Chiba’s dynamic time
warping algorithm for isolated word recognition. Signal Process., 4 . 329. 1982.
63. Brown. M. K. and Rabiner, L. R ., An adaptive, ordered, graph search technique for dynamic ti ne
warping for isolated word recognition, / £ £ £ 7r««s. Acoust. Speech Signal Process., 30, 535, 1982.
64. Rabiner. L. R ., Rosenberg, A. £ . , and Levinson, S . E ., Considerations in dynamic time warping
algorithms for discrete word recognition. IEEE Trars. Acoust. Speech Signal Process., 26, 575, 1978.
65. Bellman. R. and Dreyfus, S ., E ds., Applied Dynamic Programm ir”, Princeton University Press, Princeton.
N.J., 1962.
Volume II: Compression and Automatic Recognition 87
Chapter 4
SYNTACTIC METHODS
I. INTRODUCTION
Two general approaches are known for the problem of pattern (and signal) recognition.
The first, and better know r one, is the decision-theoretic, or discriminant, approach (see
Chapter 3) and the second i:. the syntactic, or structural, approach.
In the first approach the signal is represented by a set of features describing the charac
teristics of the signal which are of particular interest. For example, when analyzing the
speech signal for the purpose of detecting laryngial disorders, features must be defined and
extracted which are independent of the text (as much as possible) and are dependent on the
anatomy of the physiological system under test. The features set serves to compiess the data
and reduce redundancy. This general method with its biomedical applications is discussed
in Chapter 3.
The syntactic m ethod1'* uses structural information to define and classify patterns. Syn
tactic methods have been applied to the general problem of scene analysis, e.g ., to the
automatic recognition of chromosomes and finger prints. It has also been applied to signal
analysis,4 y with applications to such areas as seismology10 and biomedical signal processing.
Syntactic m ethods have been applied to EEG analysis,1, M ECG analysis.15 " ^ the classi
fication of the carotid waveform.2' and to speech processing.-4
The syntactic approach has gained a lot of attention from researchers in the field of scene
analysis since this approach possesses the structure-handling capability which seems to be
important in analyzing patterns and scenes. For exactly the same reasons, this approach
seems to have a good potential in analyzing complex biological signals. The human interpreter
of biological signals, the electrocardiographer or electroencephalographer, e.g .. when ana
lyzing the signals, observes the structure of the waveforms for his diagnostic decision. The
human diagnosis process is thus more closely related to the syntactic approach than to the
more conventional decision-theoretic approach.
The syntactic approach is also known bv the terms linguistic, structural, and grammatical
approach. An analog between the structure o f a pattern and the syntax of a language can
be drawn. A pattern is described by the relationships between simple subpattems from which
it is composed. The rules describing the composition of the subpattems are expressed in a
similar m anner to grammar rules in linguistics.
The basic ideas behind the syntactic approach are similar in principal to those of the
decision-theoretic approach. In order to classify a signal into several known classes, the
structure o f each class must be learned. In a supervised learning mode, known samples of
signals from each class are provided such that the structural features (primitives) and the
rules of their combination into the given signal (grammar) can be estimated. These are stored
in the system. An unknown signal, to be analyzed and classified, is subjected to some
preprocessing in order to reduce noise, and its primitives are extracted. Classification is
made by applying the syntax of each one of the classes. By means of some measure, a
decision is m ade as to the best syntax that fits the signal. Figure 1 shows schematically a
general syntactic signal recognition system.
Example 4.1
As an exam ple consider the signal in Figure 2a with the primitives (features) defined in
Figure 2b. The signal can be described by a string of primitives in the example:
aabbccdeffgggfccccccaacbbcccc. The string representation may be sufficient to describe
simple waveform s. More complex waveforms are described by means of an hierarchical
88 Biomedical Signal Processing
TRAINING
Samptesfc—5“ “\
of classified]?; f \ P re - i Pnmiiive .-..-JK,
Gromniatsca!
signets Proccessmd m Extract inference
L U
Grammars
C L A S S IF IC A T IO N
Unknown i
^ ] _fejfP'8- i ._ „ Prs rmiive *•—-------zA Syntax
Stgndi ~ ; Proccessinaj Extract Analysis
\ __ l \
/
a b c
structural tree. Consider the signal in Figure 3a. It has been segmented into six sections.
Each section is to be described by the primitives presented in Figure 3b and the complete
waveform described by the structural tree o f Figure 3c. Another possible set c f • ^ ves
for this example is shown in Figure 3d.
V o l u m e H : C o m p r e s s i o n a n d Automatic Recognition 89
<a.)
/ \ / \ — - - ~
PSLM N SLM P S L H NSLH HOR CUP PEK CAP
lb .)
P -Q QRS S -T
T V—s
/ V
\1 ^ 1 / \ V x HOR • PSLM CAP NSLM HOR
PSL M CAP N SLM HOR ,
(C.1
(d.)
F I G U R E 3 . S y n ta c tic re p r e s e n ta tio n o f F C G . (a ) E C G m o d e l; (b ) a se t o f e ig h t
p r im itiv e s : (c ) r e la tio n a l tre e : (d ) a n a lte rn a tiv e s e t o f p rim itiv e s .
The description o f the structure of the signal is performed by grammars or syntax rules.
Grammar can also be used to describe all the signals (sentences) belonging to a given class
(language). Usually a class is represented by a given set of known signals (training set). It
is required Jo estimate the class generating grammar from the training set — a process known
as grammatical inference.
90 Biomedical Signal Processing
We shall denote the grammar, G , as the grammar that can model the source generating
the signal set \ \ ] the sentence that can be genera ed by the grammar G constitute the set
L(G) — the language generated bv G . A phrase structure gram m ar,1 G, is a quadruple given
by:
G = (VN,VT,R,<r) <4,1)
where:
1. Unrestricted grammars (type 0) are grammars with no restrictions placed on the pro
ductions. Type 0 grammars are too general and have not been applied much.
2. Context-sensitive grammars (type 1) are grammars in which the productions are re
stricted to the form: , _
<4.2)
where A e VN and £*,£>>3 € V*. The languages generated by type 1 grammars are
called context sensitive languages.
3. Context-free grammars (type 2) are grammars in which the productions are restricted
to the form:
A —> p (4.3)
V n2 = M
VT = {a,b}
and R2: t —» a c
Volume 11: Compression and Automatic Recognition 91
Example 4.3 . (
Consider the grammar Gc , = (V ^.V T .R ^a) where VN3 = {a,A}, VT = {a,b}, and
R3: a —* Ab
A - * Aa
A —> a
This is a context-free grammar, since it obeys Equation 4.3. The language generated by it
is the language:
which is the language consisting of strings with n “ a V followed by one *‘b*\ Note that
this language is the same as the finite state language L ( G , o f the previous example. Different
grammars can generate the same language.
Example 4.4
Consider another exam ple1 wiih Gi : = (VN4,V,-.R4.<r), where VN4 — {or.A.B}. Vr =
{a,b}, and
(2) (r -+ bA (6) B b
Grammar Gc: is context free since it obeys Equation 4.3, namely, each production in R4
has a nonterminal to its lefi and to its right a string of terminal and nonterminal symbols.
Examples o f sentences generated by G< 2 are {(ab)"} by activation (n - 1) times rules i and
7 followed by 8, or {ba} by activating rules 2 and 5. In general, the language L(GcO is the
set of all words with an equal number of a’s and b’s.
An alternate method to describe a finite state grammar is by a graph known also as the
state transition diagram. The graph consists of nodes and paths. The nodes correspond to
states, nam ely, the nonterminal symbols of VN. and a special node T (the terminal node).
Paths exist between nodes N; and N, for every production if R of the type N, —» aN,. Paths
to the terminal node T from node A, exist for each production A, —» a.
Example 4.5
Consider the finite state grammar Gl2 = {VN5,VT,Rs,a} with VN5 = {o\A,B}, VT =
{a,b}, and
(2) cj —» b ; (6) B aB
O) A —> bA ; (7) B b
(4) A —» aB :
F IG U R E 4 . F in ite s ta te g ra m m a r . G R .
HI. S Y N T A C T IC RECOGNT7F.RS
A. Liiroducium
The signals under resting are represented by strings that were generated by a grammar.
Each class has its own grammar. It is the task of the recognizer to determine which of the
grammars has produced the* given unknown string. Consider the case where M classes are
given. \v,, i - 1,2, . . ,M, each with its grammar, G,. The process known as syntax
analysis, oi parsing, is the process that decides whether the unclassified string x belongs to
■the. language L(G ). i = 1.2, . . . ,M. If it has been determined that x € U G j), then x is
classified into w;. _ .
We shall consider first th.e recognition of strings by automata. Recognizing automata have
octrn developed for the various types of phrase structure grammars. O f interest here are the
iinite automaton, used to recognize finite state grammars and the push-down automaton used
to recognize context-free grammars. The discussion of more general parsing methods will
follow.
where '2 is the alphabet — a final set of input symbols, Q is a final set of states, b is the
mapping operator. q0 is the start state, and F is a set of final states.
The automaton operation can be envisioned as a device, reading data from a tape. The
device is initially at state q0. The sequence, x, written.-on the tape is read, symbol by symbol,
by the device. The device moves to another state by the mapping operator:
5(q ,.,(*) = q2
£€£ (4.6A)
which is interpreted as: the automaton is in state q, and upon reading the symbol £ moves
to state q2. The string x is said to be accepted.by the automaton A, if upon reading the
complete string x, the automaton is in one of the final states.
The transformation operation (Equation 4.6) can be extended to include strings. The string
x will thus be accepted dr recognized by automaton A, if:
namely, starling from stale q0 and scanning the complete string x, the automaton A will
follow a sequence o f slates and will hah at state p which is one of the final states.
Example 4 .6
Consider a deterministic finite state automaton, A ,, given by
Q. • and r , « {q,,q4}
- Id,} 8 ( q ,.b ) ~ { q j
The state transition diagram of A, is given in Figure 5. Note that terminal slates are denoted
bv two concentric circles. The strings x, ~ {(abfb}, x2 = {(ab)ma2} are recognized by A,
since Siq^.x. j {q3} € F, i - 1,2; the string x, - {(ab)"ab} is recognized since 5(q0.x3)
= { q j g F.
A nondeterministic finite stale automaton is the same as the deterministic one. except for
the fact that ihe transformation §(q,£) is a set of state-, rather than a single slate, as indicated
in Equation 4.6. Hence, for the uondcicrminisiic automaton:
Example 4,7
Consider the nondeterministic finite state automaton A2 = ( 2 MQ2,8,q0,F,) with X, =
{a,b}, Q 2 - {q0,q ; ,q2,q.}. and F, = {q,}. The state transition mapping of A2 is given by
the transformations:
The state transition diagram of A2 is given in Figure 6. Note that the transformation S(q,,a)
= (q2,q3} makes the automaton a nondeterministic one.
The strings x4 = {abaab}, x5 = jab”’a"b} will be recognized by A2, since 5(q0,X;) = {q3},
i - 4.5. Note that the automaton A, will recognize aii strings generated by the grammar
GH2 given in Figure 4. it can be show n1 that for any nondeterministic finite state automaton,
accepting a set of strings, L, there exists a deterministic automaton recognizing the same
set of strings. It can also be shown that for any finite state grammar, G, there exists a finite
state automaton that recognizes the language generated by G.
94 B i om ed ic al Signal Processing
F IG U R E 6 . S ta te tra n s itio n d ia g ra m c f n o n -
- d e te r m in is tic f in ite s ta te a u to m a to n . A ,. —
Example 4.8
Consider now the ECG signal shown in Figure 3A with the primitives of Figure 3D (the
example is based on an example given by Gonzalez and Thompson2). A regular grammar
describing the normal ECG is given by
B —> qrs C C ^ bD D - » tF
D —» bE E tF F -* b
H —» pA
The normal ECG is defined here as the one having a basic complex (p qrs b t b) with normal
variations including one b between the p and qrs waves, an additional b between the qrs
Volume II: Compression and Automatic Recognition 95
and Cwaves, and additional one or two b’s between the t and the next p waves. A deterministic
finite state automaton that recognizes normal ECG is shown in Figure 7. In this diagram a
state q r has been added to denote the terminal state.
with 2 , Q, q(„ and F the same as in Equation 4.5 and with T a finite set o f push-down
symbols; Z0 € T, a start symbol initially appearing on the push-down storage. The operator
8(q,£,Z) is a mapping operator:
where q ,q ,,q 2> • . - ,qfll € Q are states, £ e 2 is an input symbol, Z is the current symbol at
the top of the stack, and y x,y 2............. ^ T are strings of push-down symbols. The trans
formation (Equation 4.9) is interpreted as follows: the control is in state q, with the symbol
% Biomedical Signal Processing
\npu1 String
F IG U R E 8 . P u sh -d o w n a
Z at the top of the push-down stack, and the input symbol ξ is read from the input string.
The control will choose one of the pairs (qi,γi), i = 1,2, . . ., m, say (qj,γj). It will replace
the symbol Z in the stack by the string γj, such that its leftmost symbol appears at the top
of the stack, will move to state qj, and will read the next symbol of the input string.
If ξ = λ (the null symbol), then (independent of the input string) the automaton, in state
q, will replace Z by γ in the stack and will remain in state q. When γ = λ the upper
symbol of the stack is cleared. If the automaton reaches a step δ(qi,ξ,ξ), namely, the
uppermost symbol of the stack matches the input symbol, ξ is popped from the stack,
exposing the next-in-line stack symbol, η, with the next transformation δ(qi,γ,η). If the
automaton reaches a combination of state, input, and stack symbols for which no transfor-
mation is defined, it halts and rejects the input.
Acceptance of strings by the push-down automaton can be expressed in two ways: (1)
the automaton reads all symbols of the input string without being halted and, after the final
input symbol has been read, moves into a state q belonging to the final set F; and (2) the automaton
reads all input symbols without being halted, and the transformation taken after reading the last
input symbol moves the automaton into a state q ∈ Q with an empty stack, namely, γ = λ.
This type of acceptance is called "acceptance by empty store". For this case it is convenient
to define the set of final states as the null set, F = ∅.
Example 4.9
Consider a signal with primitives {a,b,c,d} as shown in Figure 9. Consider a class of
signals generated with these primitives such that the language describing the class is the
nonregular (nonfinite state) context-free language {x | x = ab^n cd^n, n ≥ 0}. The members of
the class of signals are depicted in Figure 9. The context-free grammar that generates the
class of signals is
GC3 = (VN3,VT3,R3,σ) = ({σ,A},{a,b,c,d},R3,σ)
R3: σ → aA
A → bAd
A → c          (4.10)
FIGURE 9. Primitives and samples of L(GC3).
The push-down automaton that recognizes the language L(GC3) is the PDA M1 given by
M1 = (Σ,Q,Γ,δ,q0,Z0,F) = ({a,b,c,d},{q0},{σ,A,B,C,D},δ,q0,σ,∅)
Suppose that an input string x1 = {abc}, which does not belong to L(GC3), is checked by the
automaton M1. The following steps take place. Initially the stack holds the string σ; the
input symbol read first is a; hence the first transformation is invoked. The automaton has
to choose between replacing σ by DAB or by C (since this is a nondeterministic machine).
If there is a right choice, it is assumed the automaton will take it.
δ(q0,a,σ) = {(q0,DAB)}
The automaton remains in state q0; its stack now holds the string DAB with the symbol D
in the uppermost (reading) location and the next input symbol (b) is read. The second
transformation is invoked.
δ(q0,b,D) = {(q0,λ)}
The automaton remains in state q0; the symbol D is removed from the stack (replaced by
the null symbol λ); the stack contains AB with the symbol A pushed into the uppermost
stack location. The next input symbol, c, is read and the next transformation is
This transformation leaves the stack non-empty, holding the symbol B. The input string has been
read in full. It, however, cannot yet be rejected, since the first choice made may have been the
wrong choice.
Return now to the first step and take the other choice:
δ(q0,a,σ) = {(q0,C)}
Reading the next input symbol (b) with C at the top of the stack calls for the transformation δ(q0,b,C),
which is undefined, thus causing the automaton to halt. The string {abc} is not recognized
by the PDA, M1.
Consider now the input string x2 = {ab^2cd^2}, which does belong to the language L(GC3).
The following steps will be taken by M1:
δ(q0,a,σ) = {(q0,DAB)}
The last transformation replaces the symbol B in the stack by the null symbol λ and leaves
the stack clear at the end of the input string. The string x2 = {ab^2cd^2} is therefore accepted by M1.
In most applications of syntactic signal processing, a class of signals is given (by means
of their primitives and grammar) and a recognizer has to be designed to recognize the class
of signals of interest. It has been proven2 that for each context-free grammar, G, a PDA
can be constructed to recognize L(G). The inverse statement, namely, that for each PDA there
is a grammar generating the language recognized by the automaton, is also correct.
Consider the case where a context-free grammar, G, is given and it is desired to obtain
the PDA that recognizes it. One relatively simple algorithm2 is as follows: let G = (VN,VT,R,σ)
be the given context-free grammar. A PDA recognizing L(G) is A = (VT,{q0},Γ,δ,q0,σ,∅)
with the push-down symbols Γ being the union of VT and VN and with the transformations
δ obtained from the productions R by:
(1) If A → α is in R, then δ(q0,λ,A) contains (q0,α).
(2) For every terminal a in VT, δ(q0,a,a) = {(q0,λ)}.          (4.12)
Example 4.10
Consider the context-free grammar GC3 (Equation 4.10). The PDA, M2, designed to
recognize L(GC3) by the rules (Equation 4.12), is
M2 = ({a,b,c,d},{q0},{σ,a,b,c,d,A},δ,q0,σ,∅)
By rule (1):
δ(q0,λ,σ) → {(q0,aA)}
δ(q0,λ,A) → {(q0,bAd),(q0,c)}
By rule (2):
δ(q0,a,a) → {(q0,λ)}
δ(q0,b,b) → {(q0,λ)}
δ(q0,c,c) → {(q0,λ)}
δ(q0,d,d) → {(q0,λ)}
Consider again the input string x2 = {ab^2cd^2} belonging to L(GC3). The PDA will proceed
as follows:
δ(q0,λ,σ) = {(q0,aA)}
stack = bAd
stack = Ad
stack = bAdd
b is popped from the stack,
stack = Add
stack = dd
stack = d
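The construction of Equation 4.12 and the trace above lend themselves to a short simulation. The following Python sketch is illustrative only; it drives a nondeterministic push-down recognizer, accepting by empty store, from a transition table derived from GC3 in the manner of Equation 4.12 (the symbol S stands for the start symbol σ, and all names are assumptions).

    # Sketch of a nondeterministic push-down recognizer, accepting by empty store.
    # delta maps (state, input symbol or "" for a null move, top-of-stack) to a list
    # of (next state, string pushed in place of the popped top; "" clears it).
    DELTA = {
        ("q0", "", "S"): [("q0", "aA")],                  # S -> aA
        ("q0", "", "A"): [("q0", "bAd"), ("q0", "c")],    # A -> bAd | c
        ("q0", "a", "a"): [("q0", "")],
        ("q0", "b", "b"): [("q0", "")],
        ("q0", "c", "c"): [("q0", "")],
        ("q0", "d", "d"): [("q0", "")],
    }

    def accepts(string, state="q0", stack="S"):
        """Try every nondeterministic choice; accept when input and stack are empty."""
        if not stack:
            return not string                   # empty store acceptance
        top, rest = stack[0], stack[1:]
        # null moves (no input consumed)
        for nxt, push in DELTA.get((state, "", top), []):
            if accepts(string, nxt, push + rest):
                return True
        # moves that consume one input symbol
        if string:
            for nxt, push in DELTA.get((state, string[0], top), []):
                if accepts(string[1:], nxt, push + rest):
                    return True
        return False

    print(accepts("abbcdd"))   # ab^2cd^2 -> True
    print(accepts("abc"))      # -> False
    print(accepts("ac"))       # n = 0 -> True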
AT = (Σ,Q,Δ,δ,q0,F)          (4.13)
where Σ, Q, q0, and F are the same as in Equation 4.5, Δ is a finite output alphabet, and
δ is the mapping operator. The translator can be seen as a finite state automaton with an
additional output tape onto which the output mapping is written.
E. Parsing
General techniques for determining the sequence of productions used to derive a given
string x, of context-free language L(G), exist. These techniques are called parsing or syntax
"*/ X' * . #
^ S/Ab 7 \ y / \ \
>-^L-
\ ,b /^ i
qr^
^ 1>^s/Afe___ {qc / ?qc K"qrs/H
A Yu * ......>:,; 5
b/N\ /b/N
^ N ' x / ^ N \5 ^ A b / /
(q j ' \ / ; /
/p .b .q r s /A b ,
i
..t____
qT
-f '
analysis techniques. Two general approaches are known for parsing the string x. Bottom-
up parsing starts from the string x and applies productions of G, in reverse fashion, in order
to get to the starting symbol σ. Top-down parsing starts from the symbol σ and, by applying
the productions of G, tries to get to the string x. Efficient parsing algorithms, such as the
Cocke-Younger-Kasami algorithm, have been developed. The interested reader is referred to
the pattern recognition literature (e.g., References 1 and 2).
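As an illustration of bottom-up parsing, the following Python sketch implements the Cocke-Younger-Kasami table-filling scheme. It assumes the grammar has first been converted to Chomsky normal form (every production of the form A → BC or A → a); the toy grammar at the end is an assumption chosen for brevity and is not taken from the text.

    # Sketch of the Cocke-Younger-Kasami (CYK) bottom-up parser for CNF grammars.
    def cyk_accepts(tokens, terminal_rules, binary_rules, start="S"):
        """terminal_rules: set of (A, a); binary_rules: set of (A, B, C); tokens: terminals."""
        n = len(tokens)
        if n == 0:
            return False
        # table[i][j] = set of nonterminals deriving tokens[i : i + j + 1]
        table = [[set() for _ in range(n)] for _ in range(n)]
        for i, tok in enumerate(tokens):
            table[i][0] = {A for (A, a) in terminal_rules if a == tok}
        for span in range(2, n + 1):                 # substring length
            for i in range(n - span + 1):            # start position
                for split in range(1, span):         # split point
                    left = table[i][split - 1]
                    right = table[i + split][span - split - 1]
                    for A, B, C in binary_rules:
                        if B in left and C in right:
                            table[i][span - 1].add(A)
        return start in table[0][n - 1]

    # Toy CNF grammar for { a^n b^n : n >= 1 }: S -> A T | A B, T -> S B, A -> a, B -> b
    terminal = {("A", "a"), ("B", "b")}
    binary = {("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")}
    print(cyk_accepts(list("aabb"), terminal, binary))   # True
    print(cyk_accepts(list("aab"), terminal, binary))    # False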
A. Introduction
Most often signal processing is done in a stochastic environment. Noise and uncertainties
are introduced either due to the stochastic nature of the signal under test or due to the
acquisition and primitive extraction processes. The classes of signals to be analyzed may
overlap in the sense that a given signal may belong to several classes. Stochastic languages
are used to solve such problems. If n grammars, Gi, i = 1,2, . . ., n, are considered for the
string x (representing the signal), the conditional probabilities p(x|Gi), i = 1,2, . . ., n, are
required. The grammar that most likely produced the string x is the grammar for which
p(Gi|x) is the maximum.
A stochastic grammar is one in which probability values are assigned to the various
productions. The stochastic grammar Gs is a quintuple Gs = (VN,VT,R,P,σ), where VN,
VT, R, and σ are the same as in Equation 4.1 and P is a set of probabilities assigned to the
productions of R.
We shall deal here only with unrestricted and proper stochastic grammars. An unrestricted
stochastic grammar is one in which the probability assigned to a production does not depend
on previous productions. Consider a nonterminal, Ti, for which there are m productions: Ti
→ α1, Ti → α2, . . ., Ti → αm; the productions are assigned probabilities Pij, j = 1,2, . . ., m.
A proper stochastic grammar is one in which the probabilities assigned to the m productions
of each nonterminal sum to one.
Stochastic grammars are divided into four types, in a similar manner to nonstochastic
grammars (Equations 4.2 and 4.3). Therefore we speak of stochastic context-free and sto-
chastic finite state grammars and languages.
Example 4.12
Consider the proper stochastic context-free grammar:
Gs1 = ({σ},{a,b},R,{p1,(1 - p1)},σ)
R: (p1)        σ → aσa
(1 - p1)    σ → bb
where the first production is assigned the probability p1 and the second is assigned (1 -
p1). The grammar is clearly a proper stochastic grammar. The grammar Gs1 generates strings
of the form xn = a^n bb a^n, n ≥ 0. The probability of the string is p(xn) = p1^n(1 - p1).
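The following short Python sketch, with illustrative parameter values, shows how strings may be drawn from Gs1 and how the string probability p(xn) = p1^n(1 - p1) arises from the production probabilities.

    import random

    def sample_gs1(p1, rng):
        """Draw one string from Gs1: sigma -> a sigma a with probability p1,
        sigma -> bb with probability 1 - p1."""
        n = 0
        while rng.random() < p1:        # apply sigma -> a sigma a
            n += 1
        return "a" * n + "bb" + "a" * n, n

    def prob_gs1(n, p1):
        """Probability of the string x_n = a^n bb a^n under Gs1."""
        return (p1 ** n) * (1.0 - p1)

    p1 = 0.3
    strings = [sample_gs1(p1, random.Random(seed))[0] for seed in range(5)]
    print(strings)
    print(prob_gs1(2, p1))   # probability of "aabbaa" = 0.3^2 * 0.7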
B. Stochastic Recognizers
Finite state stochastic grammars can be recognized by a stochastic finite automaton. The
automaton is defined by the sextuple:
As = (Σ,Q,δ,q0,F,P)          (4.15)
where Σ, Q, q0, and F are the same as in Equation 4.5, P is a set of probabilities, and δ is
the mapping operator to which probabilities are assigned. The stochastic finite automaton
operates in a similar way to that of the finite one, except that the transition from one state
to another is a random process with given probabilities. For an unrestricted automaton the
probabilities do not depend on a previous state.
Example 4.13
Consider the finite state automaton o f Figure 7, designed to recognize normal ECG.
Assume that there is a probability o f 0.1 that there will be no “ t ” wave present. We would
still want to recognize this as a normal ECG. The automaton designed to recognize the
signal is a modification of the finite state one. Its state transition diagram is shown in Figure
11. Note that another path has been added between states qD and qG and that each path is
assigned an input symbol and a probability. When the automaton is in state qD and the input
symbol is “ b ” , it can move (with probability of 0.9) to qE or (with probability 0.1) to state
qG. The stochastic state transitions of the automaton are as follows:
δ(qA,qrs) = {qC}          p(qC|qrs,qA) = 1
FIGURE 11. State transition diagram of stochastic finite state
automaton for ECG analysis.
δ(qD,b) = {qE,qG}          p(qE|b,qD) = 0.9 ;  p(qG|b,qD) = 0.1
δ(qG,p) = {qA}          p(qA|p,qG) = 1
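A string probability under such an automaton can be computed by summing the probabilities of all state paths that consume the string and end in a final state. The Python sketch below reproduces only the transitions recoverable from the text (the 0.9/0.1 split at qD and the return path qG → qA); the remaining entries and state names are illustrative assumptions, not the complete machine of Figure 11.

    # Sketch: probability that a stochastic finite automaton accepts a string,
    # summed over all state paths. Entries beyond the 0.9/0.1 split at qD are assumptions.
    TRANS = {
        ("qD", "b"): [("qE", 0.9), ("qG", 0.1)],  # 0.1 chance the t wave is absent
        ("qE", "t"): [("qF", 1.0)],
        ("qF", "b"): [("qG", 1.0)],
        ("qG", "p"): [("qA", 1.0)],
    }

    def string_probability(symbols, state, final_states, trans=TRANS):
        """Sum of path probabilities that consume `symbols` and end in a final state."""
        if not symbols:
            return 1.0 if state in final_states else 0.0
        total = 0.0
        for nxt, p in trans.get((state, symbols[0]), []):
            total += p * string_probability(symbols[1:], nxt, final_states, trans)
        return total

    print(string_probability(["b", "t", "b"], "qD", {"qG"}))  # normal complex tail -> 0.9
    print(string_probability(["b"], "qD", {"qG"}))            # missing t wave     -> 0.1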
V. GRAMMATICAL INFERENCE
VI. EXAMPLES
FIGURE 12. Carotid blood pressure wave with relational tree.
A typical systole part may contain the following primitives: LLP, WPP, WPV, WPP, MLN;
and a typical diastole may contain NPP, WPP, TE.
A context-free grammar, Gp, has been chosen5 to describe the signal with
where:
VN = {(Carotid pulse), (Systole), (Diastole), (Maxima), (M1), (M2), (M3), (Di-
crotic wave), (Pos wave), (Neg wave)};
B. Syntactic Analysis of ECG
Several syntactic algorithms have been suggested for the analysis of the ECG signal,
especially for the problem of QRS complex detection.19-22 A simple syntactic QRS detection
algorithm, implemented on a small portable device, was suggested by Furno and Tompkins.21
A simple finite state automaton, AE, has been designed, given by
where:
The two terminal states, qQ and qN, correspond to a QRS wave and to noise. The state
transition rules of AE are
The state transition diagram of the automaton AE is depicted in Figure 13. The primitives
(normup, normdown, zero, and other) are calculated as follows. The ECG signal, x(t), is
sampled with sampling interval, T. The derivative of x(t) is approximated by the first
difference, s(k):
s(k) = [x(kT) - x(kT - T)]/T          (4.18)
The samples {s(k)} are grouped together into sequences. Each sequence consists of consec-
utive samples with the same sign. Consider, for example, the case where s(n - 1) < 0 and
s(n) > 0.
A new sequence of positive first differences is generated:
{s(n), s(n + 1), . . ., s(m)}          (4.19)
where s(m + 1) is the first sample to become negative. Two numbers are associated with
the sequence (Equation 4.19), the sequence length, SL, and the sequence sum, SM:
SL = m - n + 1
SM = s(n) + s(n + 1) + . . . + s(m)
Using predetermined thresholds on SL and SM, the primitives are extracted. The algorithm
has been reported to operate at about ten times real time.
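A sketch of the primitive-extraction step is given below in Python: first differences (Equation 4.18) are grouped into same-sign sequences, each sequence is characterized by its length SL and sum SM, and thresholds map it to a primitive. The threshold values, the test samples, and the exact mapping are placeholders, not those of the original device.

    # Sketch of primitive extraction for syntactic QRS detection.
    def first_difference(x, T):
        """s(k) = [x(kT) - x(kT - T)] / T, Equation 4.18."""
        return [(x[k] - x[k - 1]) / T for k in range(1, len(x))]

    def extract_primitives(s, sl_max=10, sm_min=50.0):
        """Map same-sign sequences of first differences to primitives via placeholder thresholds."""
        primitives = []
        k = 0
        while k < len(s):
            m = k
            while m + 1 < len(s) and (s[m + 1] >= 0) == (s[k] >= 0):
                m += 1                       # extend the same-sign sequence
            SL = m - k + 1                   # sequence length
            SM = sum(s[k:m + 1])             # sequence sum
            if abs(SM) < sm_min:
                primitives.append("zero")
            elif SL <= sl_max and SM > 0:
                primitives.append("normup")   # short, steep positive slope
            elif SL <= sl_max and SM < 0:
                primitives.append("normdown") # short, steep negative slope
            else:
                primitives.append("other")
            k = m + 1
        return primitives

    x = [0.0, 0.01, 0.0, 0.9, 1.8, 0.4, -0.9, -0.1, 0.0]   # crude QRS-like samples
    s = first_difference(x, T=0.004)
    print(extract_primitives(s))   # e.g. ['zero', 'zero', 'normup', 'normdown', 'normup']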
Table 1
PRIMITIVE EXTRACTION FOR QRS DETECTION
A more elaborate syntactic QRS detection algorithm has been suggested by Belforte et
al.19 Here a three-lead ECG was used. The first differences of the three signals were computed
(Equation 4.18), yielding si(k), i = 1,2,3. The energies of the first differences, si^2(k), i =
1,2,3, were used to extract the primitives. A threshold was determined for the energy and
only the pulses above this threshold were considered. The peak of a pulse was denoted a, and its
duration (time above threshold) was denoted d. The quantities a and d were roughly
quantized by means of Table 1, yielding the primitives a, b, c. Peaks were considered as
belonging to different events every time the interval between them was longer than 80 msec.
Strings were thus separated by the end-of-string symbol, w.
A sample o f one lead of the ECG, derivative, and energy are shown in Figure 14. Pulses
above threshold may belong to a QRS complex or may be the result of noise. A string,
from lead i. that may be the result o f a QRS complex is called a QRS hypothesis and is
denoted Q,. A grammar has been inferred from training samples that always appeared with
QRS complexes. This grammar was denoted GQ. Another grammar, Gz , has been introduced
representing strings that in the training set sometimes were from QRS complexes and some
times were not. The two grammars are given by
where
VNQ = {U1,U2,U3,U4,QRS}
VTQ = {a,b,c}
FIGURE 14. Syntactic QRS detection - ECG, derivative, and energy. (From Belforte, G., De-Mori, R.,
and Ferraris, F., IEEE Trans. Biomed. Eng., BME-26, 125, 1979 (© 1979, IEEE). With permission.)
U4 → a;  U4 → b;  U4 → c
and
GZ = (VNZ,VTZ,RZ,Z)          (4.22)
where
VNZ = {Y1,Y2,Z}
VTZ = {b,c}
Y1 → cY1    Y1 → bY2
Y2 → cY1    Y2 → b
For example, the strings {bcbcaa}, {bn}, and {bcnaa} are generated by GQ, and {cbb} and {bcnb}
are generated by GZ.
FIGURE 15. Syntactic QRS detection - three leads. (From Belforte, G., De-Mori,
R., and Ferraris, F., IEEE Trans. Biomed. Eng., BME-26, 125, 1979 (© 1979,
IEEE). With permission.)
The rule suggested by Belforte et al.19 for recognizing a QRS event is as follows. Let Qi,
i = 1,2,3, be a QRS hypothesis emitted under the control of grammar GQ in the time interval
{ti,1, ti,2}, where i denotes the lead number. Let also Zj, j = 1,2,3, be the hypothesis emitted
under the grammar GZ in the time interval {tj,1, tj,2}. For a given lead and time interval, only one
hypothesis can be emitted since the grammars GQ and GZ generate disjoint languages.
The hypotheses Qi and Zj, i,j = 1,2,3, whose time intervals partially overlap are used to
determine the presence or absence of a QRS complex. The decision rule suggested19 is
i ≠ j          (4.23)
where ∧ and ∨ are the logical "and" and "inclusive or" operators. A QRS is declared if
h = 1. The algorithm was checked, in real time, with a data base of 620 QRSs from 16
healthy and ill patients with no errors and less than 0.5% false alarm errors. Examples of
the three-lead ECG and detection results are shown in Figure 15.
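Since Equation 4.23 is not reproduced above, the following Python sketch is only an illustration of the kind of rule described: it assumes, purely as an example, that a QRS is declared when a GQ hypothesis from one lead overlaps in time with a GQ or GZ hypothesis from a different lead.

    # Illustrative sketch only: the exact decision rule (Equation 4.23) is not given above,
    # so the rule below is an assumption capturing its verbal description.
    def overlaps(a, b):
        """True if the time intervals a = (t1, t2) and b = (t1, t2) partially overlap."""
        return a[0] < b[1] and b[0] < a[1]

    def qrs_declared(q_hyp, z_hyp):
        """q_hyp, z_hyp: dicts lead -> list of (t_start, t_end) hypothesis intervals."""
        for i, q_intervals in q_hyp.items():
            for qi in q_intervals:
                for j in set(q_hyp) | set(z_hyp):
                    if j == i:
                        continue
                    others = q_hyp.get(j, []) + z_hyp.get(j, [])
                    if any(overlaps(qi, o) for o in others):
                        return 1       # h = 1 -> QRS declared
        return 0

    Q = {1: [(0.40, 0.48)], 2: [(0.41, 0.47)]}    # GQ hypotheses per lead (seconds)
    Z = {3: [(0.42, 0.46)]}                       # GZ hypotheses per lead
    print(qrs_declared(Q, Z))                     # -> 1
    print(qrs_declared({1: [(0.40, 0.48)]}, {}))  # single-lead hypothesis only -> 0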
Syntactic methods may have a good potential for EEG analysis since they utilize this information.
Syntactic analysis of EEG spectra has been suggested.11,14
The EEG was divided into nonoverlapping segments o f 1-sec duration. The spectrum of
each epoch was estimated (by AR modeling). Discriminant analysis of the training set
generated seven discriminant functions:
{AL,A,SL,S,L,NL,N} (4.24)
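A minimal Python sketch of the epoch-by-epoch AR spectral estimation mentioned above is given below, using the Yule-Walker equations solved by the Levinson-Durbin recursion. The sampling rate, model order, and test signal are illustrative assumptions, not those of the cited studies.

    # Sketch: AR-model spectrum of nonoverlapping 1-sec EEG epochs (Yule-Walker/Levinson-Durbin).
    import numpy as np

    def ar_yule_walker(x, order):
        """Return AR coefficients a[1..p] and driving-noise variance for one epoch."""
        x = np.asarray(x, dtype=float) - np.mean(x)
        r = np.array([np.dot(x[:len(x) - k], x[k:]) / len(x) for k in range(order + 1)])
        a = np.zeros(order)
        err = r[0]
        for k in range(order):                       # Levinson-Durbin recursion
            acc = r[k + 1] - np.dot(a[:k], r[k:0:-1])
            refl = acc / err
            if k > 0:
                a[:k] = a[:k] - refl * a[k - 1::-1]
            a[k] = refl
            err *= (1.0 - refl ** 2)
        return a, err

    def ar_psd(a, var, fs, nfreq=256):
        """AR power spectral density on a frequency grid 0..fs/2."""
        freqs = np.linspace(0, fs / 2, nfreq)
        z = np.exp(-2j * np.pi * freqs / fs)
        denom = 1.0 - sum(a[k] * z ** (k + 1) for k in range(len(a)))
        return freqs, var / np.abs(denom) ** 2

    fs = 128                                          # assumed sampling rate (Hz)
    t = np.arange(10 * fs) / fs
    eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(t.size)   # alpha-like test signal
    for epoch in eeg.reshape(-1, fs):                 # nonoverlapping 1-sec epochs
        a, var = ar_yule_walker(epoch, order=6)
        freqs, psd = ar_psd(a, var, fs)
        print("dominant frequency: %.1f Hz" % freqs[np.argmax(psd)])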
REFERENCES
Appendix A
I. INTRODUCTION
The typical levels and frequency ranges of various biomedical signals are briefly discussed
in this appendix. Only rough ranges are given because of the large variances that exist in
these types of signals, and the strong dependence on the acquisition method. Records of
typical signals are shown for most of the signals discussed here. A brief discussion on the
main processing methods and problems is presented. Because of the large amounts of
information available concerning the effects of various abnormalities on the signals, espe-
cially on the more important ones such as the ECG or EEG, it was impossible to present a
detailed discussion. Selected references are given that refer the reader to a more detailed
discussion for each signal. The signals have been divided into groups according to inherent
characteristics. In some cases, however, the division is not perfectly clear.
A. Action P otential
T his is the potential generated b\ the excitab le mem brane o f a nerve or m u sc le cell
(C hapter 2 . V o lu m e I). T he action potential generated by a sin gle cel! can be m easu red by
m eans o f a m icroelectrod e inserted into the cell and a reference electrode located ui the
extracellular flu id . T h e m icroeiccir^Je has a very high input im pedance. An a m p lifier w ith
a very lo w n o ise figu re and input capacitance m ust be u se d .1
In m ost a p p lica tio n s the shape oi the action potential is o f no interest. It is the interspike
intervals that are o f interest (Figure 1. Chapter 2). T he time o f occurrence o f the sp ik e is
detected and point p rocess m ethods are used (C hapter 2).
W hen the action p oten tials fiom m ore than on e unit are m onitored by the electro d e.
m u itisp ik e: train a n alysis techniques are required. T he action potentials from the various
neurons can be id en tified by tem plate m atching m ethods (Chapter 1) and m arked point
p ro cesses a n a ly sis can be applied. T ypical level range o f the action potential is 100 m V .
The band w id th required i.> about 2 kHz.
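The template matching step mentioned above can be sketched as follows; the normalized correlation score, the threshold, and the toy templates are illustrative assumptions rather than a prescribed method.

    # Sketch: assigning detected action potentials to units by template matching.
    import numpy as np

    def normalized_correlation(snippet, template):
        a = snippet - snippet.mean()
        b = template - template.mean()
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def classify_spike(snippet, templates, threshold=0.8):
        """Return the best-matching unit label, or None if no template matches well."""
        scores = {unit: normalized_correlation(snippet, tpl) for unit, tpl in templates.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] >= threshold else None

    t = np.linspace(0, 1, 40)
    templates = {
        "unit A": np.exp(-((t - 0.3) ** 2) / 0.005) - 0.4 * np.exp(-((t - 0.5) ** 2) / 0.01),
        "unit B": np.exp(-((t - 0.5) ** 2) / 0.02),
    }
    spike = templates["unit A"] + 0.1 * np.random.randn(t.size)   # noisy occurrence of unit A
    print(classify_spike(spike, templates))                       # typically "unit A"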
FIGURE 1. Sensory nerve action potentials evoked from the median nerve at the elbow and wrist
after stimulation of the index finger. (From Lenman, J. A. R. and Ritchie, A. E., Clinical Elec-
tromyography, Pitman Medical and Scientific, London, 1970. With permission.)
E. Electroencephalogram (EEG)
The recording of the electrical activity of the brain is known as electroencephalography
(EEG). It is widely used11,12 for clinical and research purposes. Methods have been developed
to investigate the functioning of the various parts of the brain by means of the EEG. Three
types of recordings are used. Depth recording is done by the insertion of needle electrodes
into the neural tissue of the brain. Electrodes can be placed on the exposed surface of the
brain, a method known as electrocorticogram. The most generally used method is the
noninvasive recording from the scalp by means of surface electrodes.
The investigation of the electrical activity of the brain is generally divided into two modes.
The first is the recording of spontaneous activity of the brain, which is the result of the
electrical field generated by the brain with no specific task assigned to it. The second is the
evoked potentials (EP). These are the potentials generated by the brain as a result of a
specific stimulus (such as a flash of light, an audio click, etc.). EPs are described in the
next section.
The surface recording of the EEG depends on the locations of the electrodes. In routine
clinical multiple EEG recordings, the electrodes are placed in agreed-upon locations in the
frontal (F), central (C), temporal (T), parietal (P), and occipital (O) regions, with two
common electrodes placed on the earlobes. Between 6 and 32 channels are employed, with
8 or 16 being the numbers most often used. Potential differences between the various electrodes
are recorded. There are three modes of recording: the unipolar, averaging reference, and
bipolar recordings (e.g., see Strong).17
The bandwidth range of the scalp EEG is DC to 100 Hz, with the major power distributed
in the range of 0.5 to 60 Hz. Amplitudes of the scalp EEG range from 2 to 100 μV. The
EEG power spectral density varies greatly with physical and behavioral states. EEG frequency
analysis has been a major processing tool in neurological diagnosis for many years. It has
been used for the diagnosis of epilepsy, head injuries, psychiatric malfunctions, sleep dis-
orders, and others. The major portion of the EEG spectrum has been subdivided into the following
bands.
The delta range - The part of the spectrum that occupies the frequency range of 0.5
to 4 Hz is the delta range. Delta waves appear in young children, in deep sleep, and in some
brain diseases. In the alert adult, delta activity is considered abnormal.
The theta range - The theta range is the part of the spectrum that occupies the frequency
range of 4 to 8 Hz. Transient components of theta activity have been found in normal
adult subjects in the alert state. The theta activity occurs mainly in the temporal and central
areas and is more common in children.
The alpha range - The alpha range is the part of the spectrum that occupies the range
of 8 to 13 Hz. These types of rhythms are common in normal subjects, best seen when the
subject is awake, with closed eyes, under conditions of relaxation. The source of the alpha
waves is believed to be in the occipital lobes. An example of the alpha activity can be seen
in Figure 2.
The beta range - The beta range is the part of the spectrum that occupies the range 13
to 22 Hz. The beta rhythms are recorded in the normal adult subject mainly from the precentral
regions, but may appear in other regions as well. The beta range has been subdivided into
two: beta I is the higher frequency range and beta II is the lower frequency range. Beta II
is present during intense activation of the CNS, while beta I is diminished by such activation.
Sedatives and various barbiturates cause an increase of beta activity, often up to amplitudes
of 100 μV.
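A short Python sketch of band-power computation over the ranges quoted above is given below; the sampling rate and the synthetic test record are assumptions, and the Welch estimator is used only as one convenient choice of power spectral density estimate.

    # Sketch: relative power in the EEG bands quoted above, from a Welch PSD estimate.
    import numpy as np
    from scipy.signal import welch

    BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 22)}

    def band_powers(x, fs):
        freqs, psd = welch(x, fs=fs, nperseg=min(len(x), 4 * fs))
        in_range = (freqs >= 0.5) & (freqs <= 22)
        total = np.trapz(psd[in_range], freqs[in_range])
        powers = {}
        for name, (lo, hi) in BANDS.items():
            mask = (freqs >= lo) & (freqs < hi)
            powers[name] = np.trapz(psd[mask], freqs[mask]) / total
        return powers

    fs = 128                                           # assumed sampling rate (Hz)
    t = np.arange(30 * fs) / fs
    x = 40e-6 * np.sin(2 * np.pi * 10 * t) + 10e-6 * np.random.randn(t.size)  # alpha-dominated
    print(band_powers(x, fs))   # relative power should peak in the alpha band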
Time domain analysis is also used for EEG processing, to detect short wavelets. This
has been applied mainly in sleep analysis. Sleep is a dynamic process which consists of
various stages. At the beginning of the process the subject is in a state of drowsiness where
widespread alpha activity appears. Light sleep, stage 1, is characterized by low voltages of
mixed frequencies. Sharp waves may appear in the EEG. These are the result of a response
to stimuli and are known as V-waves (Figure 3b). The spectrum at stage 1 of sleep is
dominated by theta waves. In stage 2, the slow activity is increased and sleep spindles appear.
These are bursts of about 3 to 5 cycles of alpha-like activity with amplitude of about 50 to
100 μV. In stages 3 (moderate sleep) and 4 (deep sleep), there is an increase in irregular
FIGURE 2. EEG recordings. (a) Subject with complete absence of alpha waves; (b) subject with alpha waves,
diminished for only about 1 sec following eye opening. (From Kiloh, L. G., McComas, A. J., Osselton, J. W.,
and Upton, A. R. M., Clinical Electroencephalography, 4th ed., Butterworths, London, 1981. With permission.)
FIGURE 3. EEG recordings, stages of drowsiness and sleep. (a) Early drowsiness, widespread alpha rhythm;
(b) light sleep (stage 1), note vertex sharp waves in response to sound stimulus at X; (c) light sleep, theta-dominant
stage; (d) stage 2, emergence of sleep spindles; (e) and (f) stages 3 and 4, increasing irregular delta activity, K-
complex responses to sound stimuli at X. (From Kiloh, L. G., McComas, A. J., Osselton, J. W., and Upton, A.
R. M., Clinical Electroencephalography, 4th ed., Butterworths, London, 1981. With permission.)
delta activity and the appearance of K-complexes. These complexes, most readily evoked
by an auditory stimulus, consist of a burst of one or two high-voltage (100 to 200 μV) slow
waves, sometimes accompanied or followed by a short episode of 12- to 14-Hz activity11
(Figure 3). Another sleep stage has been defined, the rapid eye movement (REM) stage.
The EEG of the REM stage is similar to that of stage 1 and early stage 2, but in which
REMs appear. It has also been termed the paradoxical sleep state (Figure 4).
FIGURE 4. Stages of wakefulness and sleep. Upper channel of each pair, eye movements plus
submental EMG; lower channel, EEG. (From Kiloh, L. G., McComas, A. J., Osselton, J. W.,
and Upton, A. R. M., Clinical Electroencephalography, 4th ed., Butterworths, London, 1981. With
permission.)
Several abnormalities are seen in the EEG. Epilepsy is a condition where uncontrolled
neural discharges take place in some location in the CNS. Such a seizure involuntarily
activates various muscles and other functions while inhibiting others. Several types of
epilepsies are known, among them the grand and petit mal, myoclonic epilepsy, and
others (Figure 5).
transient, having its maximum value in the region of the vertex. The response is similar for
all types of stimuli. It becomes less marked when the same stimulus is repeated. The V-
wave and K-complex discussed in the previous section are nonspecific EPs. The specific
response is initiated with some latency after the stimulus has been applied. It has its maximum
in a cortical area appropriate to the modality of stimulation.
The EP is very low in amplitude, in the range of 0.1 to 10 μV. The ongoing
EEG in which the EP is buried may be an order of magnitude larger. Synchronized averaging
techniques are usually used to detect the average evoked potential (AEP) (an abbreviation
used also for auditory evoked potentials). When the single EP is required,19 other methods
of signal to noise enhancement must be used (Chapter 1). There are essentially three major
types of evoked potentials in common use.
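Synchronized (stimulus-locked) averaging can be sketched as follows in Python; under the usual assumptions (a repeatable EP and uncorrelated ongoing EEG), the noise is reduced roughly by the square root of the number of repetitions. All numerical values below are illustrative.

    # Sketch: synchronized (stimulus-locked) averaging of evoked potentials.
    import numpy as np

    def synchronized_average(x, stim_samples, pre, post):
        """Average epochs of x taken from stim - pre to stim + post samples."""
        epochs = [x[s - pre: s + post] for s in stim_samples
                  if s - pre >= 0 and s + post <= len(x)]
        return np.mean(epochs, axis=0)

    fs = 1000                                    # assumed sampling rate (Hz)
    ep = 10e-6 * np.exp(-((np.arange(300) - 120) / 40.0) ** 2)   # 10 uV "evoked response"
    x = 50e-6 * np.random.randn(fs * 60)         # ongoing EEG, much larger than the EP
    stims = np.arange(500, len(x) - 500, 700)    # stimulus instants (samples)
    for s in stims:
        x[s: s + 300] += ep                      # bury the EP at each stimulus

    aep = synchronized_average(x, stims, pre=0, post=300)
    print("epochs averaged:", len(stims), "peak of AEP estimate (uV):", 1e6 * aep.max())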
Visual evoked potential (VEP) - The VEP is recorded19 from the scalp over the occipital
lobe. The stimuli are light flashes or visual patterns. The VEP has an amplitude range of 1
to 20 μV with a bandwidth of 1 to 300 Hz. The duration of the VEP is about 200 msec. The VEP
has been used for the diagnosis of multiple sclerosis (the optical nerve is commonly affected
by the disease), to check color blindness, to assess visual field deficits, and to check visual
acuity. Figure 6 shows a typical VEP.
Somatosensory evoked potential (SEP, SSEP) - The SEP is recorded23 with surface
electrodes placed over the sensory cortex. The stimulus may be electrical or mechanical.
The duration of the cortical SEP is about 25 to 50 msec, with a bandwidth of 2 to 3000 Hz.
Subcortical SEP is much longer and lasts about 200 msec. Figure 7 depicts cortical and
subcortical SEPs. The SEP is used to provide information concerning the dorsal column pathway
between the peripheral nerve fibers and the cortex.
Auditory evoked potential (AEP) - AEPs are recorded by electrodes placed at the
vertex.21,22 The auditory stimulus can be a click, tone burst, white noise, and others. The
AEP is divided15 into the first potential (latency of about a millisecond), the early potential
(eighth nerve and brainstem, 8 msec), the middle potential (8 to 50 msec), and the late
potential (50 to 500 msec). The initial 10-msec response has been associated with brainstem
activities. These brainstem auditory evoked potentials (BAEP) are very low in amplitude
(about 0.5 μV). The AEP has a bandwidth of 100 to 3000 Hz. AEPs have been used to
check hearing deficiencies, especially in children. Figure 7 depicts a typical cortical and
subcortical AEP.
Other evoked potentials - Potentials evoked by pain25 stimuli have been recorded. Such
a stimulus can be an intense thermal pulse from an IR laser beam (Figure 7B). Olfactory
evoked potentials have been reported, as well as vestibulospinal potentials.
The processing of the EEG and EP requires many of the methods discussed in this book.
The EEG is usually recorded in several channels for relatively long periods of time. Large
amounts of data are thus collected. Automatic analysis and data compression techniques are
needed. Time series analysis methods (Chapter 7, Volume I) have widely been applied to
EEG analysis. Most often the EEG is modeled by an AR model,26 and adaptive segmentation
methods27 are employed. The estimation of the EEG power spectral density (Chapter 8,
Volume I) is an important part in both clinical and research oriented EEG analysis. Automatic
classification methods (Chapter 3) have been applied to the EEG for automatic sleep staging,
depth of anesthesia monitoring, and others. Wavelet detection methods (Chapter 1) have
been used to detect K-complexes and spindles in the ongoing EEG. Principal components
and singular value decomposition methods (Chapter 3) have been used28 to analyze evoked
potentials.
G. Electromyography (EMG)
EMG is the recording of the electrical potential generated by the muscle.3,29 The activity
of the muscle can be monitored by means of surface electrodes placed on the skin. The
signal received yields information concerning the total electrical activity associated with the
muscle contraction. More detailed information is often needed for clinical diagnosis. Con-
centric needle electrodes are then inserted through the skin into the muscle. The signal
received is known as the motor unit action potential (MUAP). Higher resolution can be
achieved by the use of microelectrodes, by means of which single muscle fiber action
potentials are recorded. The three types of EMG signals are briefly discussed here.30
Single fiber electromyography (SFEMG) - The action potentials recorded from a single
muscle fiber have a duration of about 1 msec, with amplitudes of a few millivolts. The
bandwidth used to process the SFEMG is 500 Hz to 10 kHz. Although the SFEMG contains
low frequencies, it is advisable to cut off the low band so that contributions from more
distant fibers (having most of their power in the low range due to the volume conductor) can
a difficult task mainly due to the complex volume conductor. Most ECG analysis and
diagnosis,46 however, are performed directly from the surface recordings.
Conventional ECG consists of the PQRST complex with amplitudes of several millivolts.
It is usually processed in the frequency band of 0.05 to 100 Hz,47 where most of the energy
of the ECG is included.
The first step in ECG processing is the identification of the R wave. This is done in order
to synchronize consecutive complexes and for R-R interval (heart rhythm) analysis. Various
techniques of wavelet detection have been employed48 (Chapter 1, Volume II); the problem
is particularly severe when recording the ECG under active conditions where muscle signals
and other noise sources obscure the QRS complex. The analysis of the R-R interval is an
important part of heart patient monitoring. Several methods have been employed for the
analysis, among them autoregressive prediction49 and state estimation.50
Much effort has been placed on the development of algorithms for automatic processing51,52
of the ECG for monitoring, data compression, and classification. Optimal features53 of the
ECG have been discussed and a variety of methods,54-57 including linear prediction54,55 and
Karhunen-Loeve expansion,56 have been employed for compression and classification.
2. High-Frequency Electrocardiography
It has been found that the higher-frequency band of 100 to 1000 Hz, filtered out in the
normal ECG, does contain additional information.58-60 Waveforms known as notches and
slurs which are superimposed on the slowly varying QRS complexes have been recorded.
I. Electrogastrography (EGG)
The stomach, like the heart, possesses a pacemaker that generates a sequence of electrical
potentials. Unlike the heart, defined pacemaker cells have not been found in the stomach.
The cyclic electrical potentials are transmitted through the smooth muscle fibers, causing a
slow rhythmic (of the order of 0.05 Hz) mechanical motion. This motion is responsible for
mixing, grinding, and propelling the absorbed food.
Electrical potential changes generated by the stomach can be picked up65 by means of
FIGURE 11. Power spectral density functions of a dog's electrogastrogram (EGG), calculated from
a record of 107.7 min. The frequency at about 0.32 Hz is of duodenal origin. (From van der Schee,
E. J., Electrogastrography Signal Analytical Aspects and Interpretation, Doctoral thesis, University
of Rotterdam, The Netherlands, 1984. With permission.)
surface electrodes. The signal has a dominant frequency equal to the frequency of the gastric
electric control activity (ECA), which is about 0.05 Hz in man.65 The frequency bandwidth
of the signal is about 0.01 to 0.5 Hz. Optimal locations of electrodes for best recordings
have been suggested.66 Interferences due to the electrode-skin interface and motion and breathing
artifacts require signal to noise enhancement techniques. Correlation67 and adaptive filtering68
methods have been suggested. Autoregressive analysis of the signal69 has been used. Using
duodenally implanted electrodes, the automatic classification of the ingestion of three different
test meals was successfully demonstrated.70 Pattern recognition methods discussed in Chapter
3 were employed. Figure 11 shows an example of an EGG power spectral density function.
III. IMPEDANCE
A. Bioimpedance
Biological tissue obeys Ohm's law for current densities73 below about 1 mA/cm2.
The impedance of the tissue changes with time due to various phenomena such as blood
volume change, blood distribution change, blood impedance change with velocity, and tissue
impedance changes due to pressure, endocrine, or autonomic nervous system activity. Im-
portant information on the resistance of various tissues74 has been collected through the
years.
Bioimpedance measurements75 are usually performed with four electrodes: two for current
injection and two for the impedance measurement. At low frequencies, electrode polarization
causes some measurement problems. The range of 50 kHz to 1 MHz is usually employed.
Current densities must be kept low so as not to cause changes due to heating. The range of
currents used in practice is 20 μA to 20 mA.
B. Impedance Plethysmography
The use of impedance changes for the recording of peripheral volume pulses is known as
impedance plethysmography. The method has been applied75 to various locations of the body
such as the digits, limbs, head, thorax, and kidney. Since calibration of the impedance in
terms of blood flow is difficult, the method has been mainly used for relative monitoring.
An experiment on a dog at 50 kHz showed that a 1% change in blood volume generates
a change of about 0.16% in resistance, with an almost linear relationship over a range of ±30%
blood flow change.
C. Rheoencephalography (REG)
The measurement of impedance changes between electrodes placed on the scalp is known
as the rheoencephalogram (REG). The frequencies used are in the range of 1 to 500 kHz,
yielding a transcranial impedance of about 100 Ω. The pulsatile impedance change is on
the order of 0.1 Ω.
D. Impedance Pneumography
Electrodes placed on the surface of the chest are used to monitor respiration in the frequency
range of 50 to 600 kHz; the change in transthoracic impedance, from full inspiration to
maximum expiration, is almost entirely resistive, with a value of about 20 Ω. The changes
in impedance are related to the changes in lung air volumes. The method is also used as an
apnea monitor, to detect pauses in breathing.
F. Electroglottography
The measurement of the impedance across the neck is known as electroglottography.72
Variations in glottis size, as the vocal cords vibrate, cause impedance changes. The method
can thus be used to measure the pitch frequency.
B. Auscultation
The monitoring of sounds heard over the chest walls is known as auscultation. It has long
been used as one of the means by which pulmonary dysfunctions were diagnosed.81 During
respiration, gases flow through the various airways, emitting acoustical energy. This energy
is transmitted through the airway walls, the lung tissue, and the chest walls.
FIGURE 14. Abnormal heart sounds. (A) Midsystolic click; upper trace, ECG; lower trace, apexcardiogram
(ACG); (B) systolic ejection murmur. (Reproduced with permission from Tavel, M. E., Clinical Phonocardiography
and External Pulse Recording, 3rd ed., Copyright © 1978 by Year Book Medical Publishing, Chicago.)
Breath sounds are generated by the air entering the alveoli during inspiration (local or
vesicular noise) and while passing through the larynx (laryngeal or glottic hiss). Four types
of normal breath sounds have been defined: vesicular breath sounds (VBS), bronchial breath
sounds (BBS), broncho-vesicular breath sounds (BVBS), and tracheal breath sounds (TBS).
Each one of the above breath sounds is normally heard over certain areas of the thorax.
When heard over other than its normal place, it is considered abnormal. Figure 15 depicts
the characteristics of the four types of normal breath sounds.
There are several types of breath sounds which, when present, always indicate abnormality.
The abnormal breath sounds are known as the cogwheel breath sound (CO), the asthmatic
breath sound (AS), the amphoric breath sound (AM), and the cavernous breath sound (CA).
Verbal descriptions of the characteristics of the various breath sounds are used.81 A parametric
description and an automatic classification method have been suggested.82 Another type of
abnormal sounds are the adventitious sounds. These are called musical rales or wheezes and
nonmusical rales.
Auscultation is usually performed with the stethoscope. To get the full frequency range
and an electrical signal that can be processed, microphones are used. The frequency range
required is 20 Hz to 2 kHz.
C. Voice
Speech is produced by expelling air from the lungs through the trachea to the vocal cords.83
When uttering voiced sounds, the vocal cords are forced open by the air pressure.
The opening slit is known as the glottis. The pulse of air propagates through the vocal tract.
FIGURE 15. Typical time and frequency plots of normal breath sounds. Left: energy envelope;
middle: power spectral density, estimated by FFT (upper trace, midinspiration; lower trace, beginning of
inspiration); right: power density, estimated by LPC (midinspiration). (From Cohen, A. and Lands-
berg, D., IEEE Trans. Biol. Med. Eng., BME-31, 35, 1984 (© 1984, IEEE). With permission.)
The generated sound depends on the acoustical characteristics of the various tubes and
cavities of the vocal system. These are changing during the speech process by moving the
tongue, the lips, or the velum.
The frequency of oscillation of the vocal cords during voiced speech is called the fun-
damental frequency or pitch. This frequency is determined by the subglottal pressure and
by the characteristics of the cords: their elasticity, compliance, mass, length, and thickness.
When uttering unvoiced sounds, the vocal cords are kept open and do not take part in
the sound generation. Figure 16 depicts a record of speech signals including silent, unvoiced,
and voiced segments. The speech signal has been used as a diagnostic aid for laryngeal
pathology or disorder;80,84,85 among these are laryngitis, hyperplasia, cancer, paralysis, and
more. It has also been used as a diagnostic aid for some neurological diseases86 and as an
indicator of emotional states.87 The infant's cry has also been suggested as a diagnostic aid88
(e.g., see Figure 2, Chapter 3).
D. Korotkoff Sounds
The most common method for indirect blood pressure measurement is by means of the
sphygmomanometer. An inflatable cuff placed around the upper arm is used to occlude blood flow
to the arm. The pressure exerted by the cuff causes the artery to collapse. When the cuff pressure
is gradually released to the point where it is just below the arterial pressure, blood starts to flow
through the compressed artery. The turbulent blood flow generates sounds known as Ko-
rotkoff sounds. These are picked up by a microphone (or a stethoscope) placed over the
artery. The sounds continue, while decreasing the pressure, until no constriction is exerted
FIGURE 16. A sample of a speech signal demonstrating silent, unvoiced, and voiced segments.
on the artery. Most of the sound's power is in the frequency range of 150 to 500 Hz. Usually
piezoelectric microphones are used, yielding amplitudes of about 100 mV (peak to peak).
V. MECHANICAL SIGNALS
A. Pressure Signals
Blood pressure measurements53 are taken from the critically ill patient by the insertion of
a pressure transducer somewhere in the circulatory system. Figure 13 and Figure 12 in
Chapter 4 give typical examples of the carotid blood pressure signal. Pattern recognition
methods have been applied to the analysis of the blood pressure wave (Chapter 4). The
frequency bandwidth required is about DC to 50 Hz. Other biological pressure signals are
of clinical importance. Figure 17, for example, depicts the intrauterine pressure of a woman
in labor.
B. Apexcardiography (ACG)
Tavel77 has suggested the term apexcardiography to include a variety of methods used for
recording the movements of the precordium. Among the various methods are vibrocardio
graphy, kinetocardiography, ballistocardiography, and impulse cardiography. The motion
is detected by various transducers, accelerometers, strain gauges, or displacement devices
(LVDT). The frequency bandwidth required is about DC to 40 Hz. An example of the ACG
is shown in Figure 14A.
C. Pneumotachography
Pneumotachography is a method used to analyze flow rate for respiratory function eval-
uation. The flow rate signal has a bandwidth of about DC to 40 Hz.
FIGURE 17. Recording during labor. Upper trace: fetal heart rate; middle: abdominal pressure; lower: intrauterine
pressure. (Courtesy of Dr. Yarkoni, Soroka Medical Center.)
dilutions, the first curve must be estimated. The techniques for echo cancellation (Chapter 9,
Volume I) can be employed here. A similar technique, using the injection of a fluid having a
temperature different from that of the blood, is sometimes employed. It is known as thermal
dilution.
VI. BIOMAGNETIC SIGNALS
A. Magnetoencephalography (MEG)
Various organs such as the heart, lungs, and brain produce extremely weak magnetic fields.
The measurement of these magnetic fields is difficult. Magnetic measurements have been made
on nerve cells89 and from the brain.90 The MEG was reported to be different from the EEG and
to provide additional information.90 An example of the MEG signal is shown in Figure 18.
C. Magnetopneumography (MPG)
The monitoring of the magnetic fields generated over the lungs has also been suggested.92
VII. BIOCHEMICAL SIGNALS
Biochemical measurements1 are usually performed in the clinical laboratory. Blood gas
and acid-base measurements are routinely performed to evaluate partial pressure of oxygen
(pO2), partial pressure of CO2 (pCO2), and concentration of hydrogen ions (pH). These
measurements are usually done by means of electrodes. Other methods for the measurement
of organic and nonorganic chemical substances are used, such as chromatography, electro-
phoresis, flame photometry, atomic emission and absorption fluorometry, nuclear magnetic
resonance (NMR), and more. These methods most often provide DC signals. The problems
associated are mainly in the instrumentation and acquisition systems rather than in the
processing. Some processing problems do exist, for example, in methods like chromatog-
raphy, where sometimes close or overlapping peaks have to be identified.
Biochemical measurements are performed also in the clinic and in the research laboratory.
Specific ion microelectrodes have been developed which allow the recording of ion con-
centration variations of neural cells. Figure 20 is an example of such a signal.
Noninvasive, transcutaneous monitoring of pO2 and pCO2 can be conveniently performed
by means of special electrodes. This measurement is used in the clinic. Noninvasive blood
oxygenation monitoring is done by optical means (oximetry). These signals are very low-
frequency signals and usually require no special processing.
VIII. TWO-DIMENSIONAL SIGNALS
FIGURE 19. Magnetocardiogram obtained across the chest with 12-lead ECG and Frank
x,y,z leads. (From Cohen, D. and McCaughan, D., Am. J. Cardiol., 29, 678, 1972. With
permission.)
REFERENCES
1. Webster, J. G., Ed., Medical Instrumentation: Application and Design, Houghton Mifflin, Boston, 1978.
2. Abeles, M. and Goldstein, M. H., Multispike train analysis, Proc. IEEE, 65(5), 762, 1977.
3. Lenman, J. A. R. and Ritchie, A. E., Clinical Electromyography, Pitman Medical and Scientific, London,
1970.
4. Armington, J., The Electroretinogram, Academic Press, New York, 1974.
5. Gouras, P., Electroretinography: some basic principles, Invest. Ophthalmol., 9, 557, 1970.
6. Chatrian, G. E., Computer assisted ERG. I. Standardized method, Am. J. EEG Technol., 20(2), 57, 1980.
7. Larkin, R. M., Klein, S., Ogden, T. E., and Fender, D. H., Non-linear kernels of the human ERG,
Biol. Cybern., 35, 145, 1979.
8. Krill, A. E ., The electroretinogram and electro-oculogram: clinical applications. Invest. Ophthalmol., 9.
600, 1970.
9. North, A. W ., Accuracy and precision o f electro-oculographic recordings, Invest. Ophthalmol., 4, 343,
1965.
10. Kris, C ., Vision: electro-oculography, in M edical Physics, Vol. 3, Glasser, O ., Ed., Year Book Medical
Publishing, Chicago, 1960.
11. Kiloh, L. G., McComas, A. J., Osselton, J. W., and Upton, A. R. M., Clinical Electroencephalography,
4th ed., Butterworths, London, 1981.
12. Basar, E., EEG-Brain Dynamics, Elsevier/North-Holland, Amsterdam, 1980.
13. Cox, J. R ., Nolle, F. M ., and Arthur, R. M ., Digital analysis of the EEG, the blood pressure and the
ECG, Proc. IEEE, 60, 1137, 1972.
14. Barlow, J. S ., Computerized clinical EEG in perspective, IEEE Trans. Biol. M ed. E ng., 26, 277, 1979.
15. Gevins, A. S ., Pattern recognition o f human brain electrical potentials, IEEE Trans. Pattern Anal. Mach.
Intelligence, 2, 383, 1980.
16. Isaksson, A ., W ennberg, A ., and Zetterberg, L. H ., Computer analysis o f EEG signals with parametric
models, Proc. IEEE, 69, 451, 1981.
17. Strong, P ., Biophysical M easurements, Tektronix, Beaverton, Ore., 1970.
18. Childers, D. G ., Evoked responses: electrogenesis, models, methodology and wavefront reconstruction
and tracking analysis, Proc. IEEE, 65(5), 611, 1977.
19. McGillem, C. D. and Aunon, J. I., Measurements of signal components in single visually evoked brain
potentials, IEEE Trans. Biol. Med. Eng., 24, 232, 1977.
20. Sayers, B. McA., Beagley, H. A., and Riha, J., Pattern analysis of auditory evoked EEG potentials,
Audiology, 18, 1, 1979.
21. Jervis, B . W ., N ichols, M . J ., Johnson, T . ! ., Allen, E ., and Hudson, N. R ., A fundamental investigation
of the composition of auditory evoked potentials, IEEE Trans. Biol. Med. Eng., 30, 43, 1983.
22. Boston, J. R., Spectra of auditory brainstem responses and spontaneous EEG, IEEE Trans. Biol. Med.
Eng., 28, 334, 1981.
23. Sclabassi, R. J., Risch, H. A., Hinman, C. L., Kroin, J. S., Enns, N. F., and Namerow, N. S.,
Complex pattern evoked somatosensory responses in the study of multiple sclerosis, Proc. IEEE, 65(5),
626, 1977.
24. Berger, M. D., Analysis of sensory evoked potentials using normalized cross-correlation functions, Med.
Biol. Eng. Comput., 21, 149, 1983.
25. Carmon, A., Consideration of the cerebral response to painful stimulation: stimulus transduction versus
perceptual event, Bull. N.Y. Acad. Med., 55, 313, 1979.
26. Zetterberg, L. H., Estimation of parameters for a linear difference equation with application to EEG
analysis, Math. Biosci., 5, 227, 1969.
27. Praetorius, H. M., Bodenstein, G., and Creutzfeldt, O. D., Adaptive segmentation of EEG records: a
new approach to automatic EEG analysis, Electroencephalogr. Clin. Neurophysiol., 42, 84, 1977.
28. Haimi-Cohen, R. and Cohen, A., A microprocessor controlled system for stimulation and acquisition of
evoked potentials, Comput. Biomed. Res., in press.
29. Basmajian, J. V., Clifford, H., McLeod, W., and Nunnaly, H., Eds., Computers in Electromyography,
Butterworths, London, 1975.
30. Stalberg, E. and Antoni, L., Computer aided EMG analysis, in Computer Aided Electromyography,
Progress in Clinical Electromyography, Vol. 10, Desmedt, J. E., Ed., S. Karger, Basel, 1983, 186.
31. LeFever, R. S. and DeLuca, C. J., A procedure for decomposing the myoelectric signal into its constituent
action potentials, I. Technique, theory and implementation, IEEE Trans. Biol. Med. Eng., 29, 149, 1982.
32. LeFever, R. S ., Xenakis, A. P ., and Dei.uca, C , J ., A procedure for decomposing the myoelectric signal
into its constituent action potentials, II. Execution and test for accuracy. IEEE Trans Biol. Med. Eng.,
29, 158. 1982.
33. Nandedkar, S. D. and Sanders, D. B., Special purpose orthonormal basis functions - application to
motor unit action potentials, IEEE Trans. Biol. Med. Eng., 31, 374, 1984.
34. Berzuini, C., Maranzana-Figini, M., and Bernardinelli, C., Effective use of EMG parameters in the
assessment of neuromuscular diseases, Int. J. Biol. Med. Comput., 13, 481, 1982.
35. Kranz, H., Williams, A. M., Cassell, J., Caddy, D. J., and Silberstein, R. B., Factors determining
the frequency content of the EMG, J. Appl. Physiol. Respir. Environ. Exerc. Physiol., 55(2), 392, 1983.
36. Lindstrom, L. H. and Magnusson, R. I., Interpretation of myoelectric power spectra: a model and its
application, Proc. IEEE, 65, 653, 1977.
37. Inbar, G. F. and Noujaim, A. E., On surface EMG spectral characterization and its application to diagnostic
classifications, IEEE Trans. Biol. Med. Eng., 31, 597, 1984.
38. Journee, J. L., van Manen, J., and van der Meer, J. J., Demodulation of EMG's of pathological
tremors. Development and testing of a demodulator for clinical use, Med. Biol. Eng. Comput., 21, 172,
1983.
39. Stulen, F. B. and DeLuca, C. J., Muscle fatigue monitor: a non-invasive device for observing localized
muscular fatigue, IEEE Trans. Biol. Med. Eng., 29, 760, 1982.
40. G ross, D ., G rassino, A ., Ross, W . R. D ., and M acklem , P. T ., Electromyogram pattern o f diaphragmatic
fatigue. J. Appl. Physiol. Respir. Environ. Exerc. Physiol., 46(1), 1, 1979.
41. G raupe, D. and Cline, W. K ., Functional separation o f EMG signals via ARMA identification methods
for prosthesis control purposes, IEEE Trans. Syst. M an Cybern., 5, 252, 1975.
42. Doerschuk, P. C „ Gustafson, D. E ., and Willsky, A. S ., Upper extremity limb function, discrimination
using EMG signal analysis, IEEE Trans. Biol. Med. E ng., 30, 18. 1983.
43. Saridis, G . N. and Gootee, P. T ., EMG pattern analysis and classification for a prosthetic arm. IEEE
Trans. Biol. M ed. Eng,, 29, 403, 1982.
44. M artin, R . O ., Pilkington, T . C ., and Marrow, M . N ., Statistically constrained inverse electrocardiog
raphy, IEEE Trans. Biol. M ed. Eng.. 22. 487. 1975.
45. Yam ashita, Y ., Theoretical studies on the inverse problem in electrocardiography and the uniqueness of
the solution, IE E E Trans. Biol. M ed. Eng.. 29. 719, 1982.
46. Friedman, H . H ., Diagnostic Electrocardiography an d Vectorcardiography, McGraw-Hill, New York.
1977.
47. Riggs, T ., Isenstein, B ., and Thom as, C ., Spectral analysis of the normal ECG in children and adults.
J . E lectrocardiol. , 12(4). 377, 1979.
48. Ligtenberg, A. and Kunt, M., A robust digital QRS detection algorithm for arrhythmia monitoring,
Comput. Biomed. Res., 16, 273, 1983.
49. Haywood, L. Y., Saltzberg, S. A., Murthy, V. K., Huss, R., Harvey, G. A., and Kalaba, R., Clinical
use of R-R interval prediction for ECG monitoring: time series analysis by autoregressive models, Med.
Inst., 6, 111, 1972.
50. Ciocloda, G. H., Digital analysis of the R-R intervals for identification of cardiac arrhythmia, Int. J. Biol.
Med. Comput., 14, 155, 1983.
51. Caceres, C. A. and Dreifus, L. S., Eds., Clinical Electrocardiography and Computers, Academic Press,
New York, 1970.
52. Wolf, H. K. and MacFarlane, P. W., Eds., Optimization of Computer ECG Processing, North-Holland,
Amsterdam, 1980.
53. Jain, U., Rautaharju, P. M., and Warren, J., Selection of optimal features for classification of elec-
trocardiograms, J. Electrocardiol., 14(3), 239, 1981.
54. Shridhar, M. and Stevens, M. F., Analysis of ECG data for data compression, Int. J. Biol. Med.
Comput., 10, 113, 1979.
55. Ruttimann, U. E. and Pipberger, H. V., Compression of the ECG by prediction or interpolation and
entropy encoding, IEEE Trans. Biol. Med. Eng., 26, 163, 1979.
56. Womble, M. E., Halliday, J. S., Mitter, S. K., Lancaster, M. C., and Triebwasser, J. H., Data
compression for storing and transmitting ECG's/VCG's, Proc. IEEE, 65(5), 702, 1977.
57. Jain, U., Rautaharju, P. M., and Horacek, B. M., The stability of decision theoretic electrocardiographic
classifiers based on the use of discretized features, Comput. Biomed. Res., 13, 132, 1980.
58. Santopietro, R. F., The origin and characteristics of the primary signal, noise and interference sources in
the high frequency ECG, Proc. IEEE, 65(5), 707, 1977.
59. Chein, I. C., Tompkins, W. J., et al., Computer methods for analysing the high frequency ECG,
Med. Biol. Eng. Comput., 18, 303, 1980.
60. Kim, Y. and Tompkins, W. J., Forward and inverse high frequency ECG, Med. Biol. Eng. Comput.,
19, 11, 1981.
61. Wheeler, T., Murrills, A., and Shelley, T., Measurement of the fetal heart rate during pregnancy by a
new electrocardiographic technique, Br. J. Obstet. Gynaecol., 85, 12, 1978.
62. Bergveld, P. and Meijer, W. J. H., A new technique for the suppression of the MECG, IEEE Trans.
Biol. Med. Eng., 28, 348, 1981.
63. Flowers, N. C., Hand, R. C ., Orander, P. C. , Miller, C. R. , and Walden, M . O ., Surface recording
of electrical activity from the region of the bundle of His, Am. J. Cardiol., 33, 384, 1974.
64. Peper, A., Jonges, R., Losekoot, T. G., and Grimbergen, C., Separation of His-Purkinje potentials
from coinciding atrium signals: removal of the P-wave from the electrocardiogram, Med. Biol. Eng. Comput.,
20, 195, 1982.
65. van der Schee, E. J., Electrogastrography Signal Analytical Aspects and Interpretation, Doctoral thesis,
University of Rotterdam, The Netherlands, 1984.
66. Mirizzi, N. and Scafoglieri, U., Optimal direction of the EGG signal in man, M ed. Biol. Eng. Comput.,
2 1 , 3 8 5 ,1 9 8 3 .
67. Postaire, J. G ., van Houtte, N., and Devroede, G., A computer system for quantitative analysis of
gastrointestinal signals, Comput. Biol. M ed., 9, 295, 1979.
68. Kentie, M. A., van der Schee, E. J., Grashuis, J. L., and Smout, A. J. P. M ., Adaptive filtering of
canine EGG signals. II. Filter performance, Med. Biol. Eng. Comput., 19. 765, 1981.
69. Kwok, H. H. L., Autoregressive analysis applied to surface and serosal measurements of the human
stomach, IEEE Trans. Biol. M ed. Eng., 26, 405, 1979.
70. Reddy, S. N., Dumpala, S. R. , Sarna, S. K., and Northeott, P. G., Pattern recognition of canine
duodenal contractile activity, IEEE Trans. Biol. Med. Eng., 28, 696, 1981.
71. Vdow, M. R., Erwin, C. W., and Cipolat, A. L., Biofeedback control o f skin potential level, Biofeedback
Self Regul., 4(2), 133, 1979.
72. Askenfeld, A., A comparison of contact microphone and electroglottograph for the measurement of vocal
fundamental frequency, J. Speech Hearing Res., 23, 258, 1980.
73. Schwan, H. P., Alternating cuirent spectroscopy of biological substances, Proc. IRE, 47(11), 1941, 1959.
74. Geddes, L. A. and Baker, L. E., The specific resistance of biological material. A compendium of data
for the biomedical engineer and physiologist, Med. Biol. Eng. Comput., 5, 271, 1967.
75. Geddes, L. A. and Baker, L. E., Principles o f Applied Biomedical Instrumentation, John Wiley & Sons,
New York, 1968.
76. Lifshitz, K., Electrical impedance cephalography (rheoencephalography), in Biomedical Engineering Sys
tems, Clynes, M. and Milsum, J. H., Eds., McGraw-Hill, New York, 1970.
77. Tavel, M. E., Clinical Phonocardiography and External Pulse Recording, 3rd ed., Year Book Medical
Publishing, Chicago, 1967.
78. Iwata, A., Suzumura, N. , and Ikegaya, K., Pattern classification o f the phonocardiogram using linear
prediction analysis, Med. Biol. Eng. Comput., 15 , 407, 1977.
79. Joo, T. H., McClellan, J. H., Foaie, R. A., Myers, G. S., and Lees, R. S. , Pole-zero modeling and
classification of PCG, IEEE Trans. Biol. Med. Eng., 30, 110, 1983.
80. Childers, D. G., Laryngial pathology detection, CRC Crit. Rev. Bioeng., 2, 375, 1977.
81. Druger, G., The Chest: Its Signs and Sounds, Humetrics Corp., Los A.ngeles, 1973.
82. Cohen, A. and Landsberg, D., Analysis and automatic classification of breath sounds, IEEE Trans. Biol.
Med. Eng., 3 1 , 3 5 ,1 9 8 4 .
Volume II: Compression and Automatic Recognition 137
83. Schafer, R. W . and Market, J. D., Eds., S peech A n a ly s is . IEEE Press, New York, 1978.
84. Mezzalama, M ., Prinetto, P., and Morra, B., Experiments in automatic classification of laryngeal
pathology, M e d . B io l. E n g . C o m p u t., 21, 603, 1983.
85. Detier, J. R. and Anderson, D. J., Automatic classification of laryngeal dysfunction using the roots of
the digital inverse filler, I E E E T ra n s . B io l. M e d . E n g .. 27. 714, 1980.
86. Okada, M ., Measurement of speech patterns in neurological disease, M e d . B io l. E n g . C o m p u t., 21, 145,
1983.
87. Streeter, L. A., Macdonald, N. H., Apple, W., Krauss, R. M., and Galott, K. M., Acoustic and
perceptual indicators of emotional stress, J . A coust. S o c. A m ., 73(4), 1354, 1983.
88. Cohen, A. and Zmora, E., Automatic classification o f infants’ hunger and pain cry, in P r o c . In t. Conf.
D i g i t a l S ig n a l P ro c e s s ., Cappelini, V. and Constantinidcs. A. G., Eds,, Elsevier, Amsterdam, 1984.
89. Wlkswo, J. P., Barach, J. P., and Freeman, J. A., Magnetic field of a nerve impulse: first measurements,
S c ie n c e , 208, 53, 1980.
90. Cohen, D. and Cuffin, B. N., Demonstration of useful differences between magnetoencephaiogram and
electroencephalogram, Electroencephalogr. Clin. Neurophys., 56, 1983.
91. Cohen, D. and McCaughan, D., Magnetocardiograms and their variation over the chest in normal subjects.
Am. J. C ardiol., 29, 678, 1972.
92. Robinson, S. E., Magnetopneumography non-invasive imaging of magnetic particulate in the lung and
other organs. I E E E T ra n s . N u c l. S c i.. 28, 171, 1981.
93. Heinemann, U. and Gutnick, M. J., Relation between extracellular potassium concentration and neuronal
activities in cat thalamus (VPL) during projection o f cortical epileptiform discharge, Electroencephalogr.
Clin. N europhys., 47. 345, 1979.
V o l u m e II: C o m p r e s s i o n a n d Automatic Recognition 139
Appendix B
DATA AND LAG WINDOWS
I. INTRODUCTION
Any practical signal processing problem requires the use of a window. Since we cannot
process an infinitely long record, we must multiply it by a window that zeroes the signal
outside the observation period. The topics of window design and window application are
dealt with by most signal processing books1-8 and by many papers.9-20
A window,1 w(t), is a real and even function of time whose Fourier transform, W(ω) =
F{w(t)}, is also real and even. We also require that a window be normalized:
w(0) = 1
Windows are used for a variety of applications in continuous and discrete signal processing,
e.g., in the design6 of nonrecursive digital filters, in the application of the FFT, and in
power spectral density (PSD) function estimation (Chapter 8, Volume I).
In PSD estimation, a window is required to reduce spectral leakage.
Several figures of merit have been defined to evaluate and compare windows. To cancel
leakage completely we would need a window that behaves as a delta function in the frequency
domain; such a window is of course unrealizable. For a practical window
(Figure 1) we require that the main half width10 (MHW) of the main lobe and the side
lobe level10 (SLL) be as small as possible. Other criteria, such as the equivalent noise
bandwidth9 (ENBW), processing gain9 (PG), maximum energy concentration,1 and minimum
amplitude moment,1 have also been used.
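To make one of these figures of merit concrete, the equivalent noise bandwidth of a sampled window can be computed directly from its coefficients as N·Σw²(n)/(Σw(n))², expressed in DFT bins. The short sketch below (modern free-form Fortran, not part of the book's package; the 0.54/0.46 Hamming coefficients are an assumed example) evaluates this quantity:

      program enbw_demo
      implicit none
      integer, parameter :: n = 64
      real :: w(n), enbw, pi
      integer :: i
      pi = 4.0*atan(1.0)
      ! Hamming window samples (0.54/0.46 coefficients assumed)
      do i = 1, n
         w(i) = 0.54 - 0.46*cos(2.0*pi*real(i-1)/real(n-1))
      end do
      ! equivalent noise bandwidth in DFT bins: N*sum(w**2)/(sum(w))**2
      enbw = real(n)*sum(w*w)/(sum(w)**2)
      print '(a,f6.3,a)', 'ENBW = ', enbw, ' bins'
      end program enbw_demo

For the Hamming window the result is about 1.36 bins; the rectangular window, by comparison, gives exactly 1 bin.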
When considering PSD estimation, a window can be applied directly to the data (data
window or taper window) or to the autocorrelation function; the latter is known as the lag
window or quadratic window. Note that the data window does not preserve the energy of
the signal. The lag window, however, does preserve the energy, since r(0), the signal's
energy, is multiplied by w(0) = 1.
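The distinction can be checked numerically: tapering the data changes the lag-zero autocorrelation (the energy), whereas a lag window leaves it untouched because only r(0)·w(0) = r(0) enters. The following sketch (free-form Fortran, an illustration rather than package code; the sinusoidal test record and Hamming taper are assumptions) prints the three values of r(0):

      program window_energy
      implicit none
      integer, parameter :: n = 256
      real :: x(n), w(n), pi
      integer :: i
      pi = 4.0*atan(1.0)
      ! test record: a single sinusoid (arbitrary choice)
      do i = 1, n
         x(i) = sin(2.0*pi*10.0*real(i-1)/real(n))
      end do
      ! Hamming data window (assumed coefficients)
      do i = 1, n
         w(i) = 0.54 - 0.46*cos(2.0*pi*real(i-1)/real(n-1))
      end do
      ! r(0) of the raw record, of the tapered record, and after a lag window;
      ! a lag window multiplies r(m) by w(m), and since w(0)=1, r(0) is unchanged
      print *, 'r(0), raw record       :', sum(x*x)/real(n)
      print *, 'r(0), data (taper) win.:', sum((w*x)**2)/real(n)
      print *, 'r(0), lag window       :', sum(x*x)/real(n)
      end program window_energy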
A. Introduction
In this section we shall list a number of windows with their appropriate parameters. Plots
of the windows in the time and frequency domains are also given. To demonstrate the relative
behavior of the windows in the PSD estimation application, a simple experiment was conducted
by Harris.9 A signal was synthesized, composed of two sinusoids, one with frequency
10.5 fs/N and amplitude 1.00 and the other with frequency 16.0 fs/N and amplitude 0.01
(40.0 dB below the first), with N being the number of samples in the window. The PSD
of this signal was then estimated with the various windows (see, e.g., Figures 5B and 6B).
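A reader who wishes to repeat the experiment without the package can do so with a few lines of code. The sketch below (free-form Fortran; it uses a direct DFT instead of the FFT and assumes the usual 0.54/0.46 Hamming coefficients, so it reproduces only the flavor of Harris' figures) synthesizes the two-tone signal and prints the rectangular- and Hamming-windowed spectra in decibels relative to their peaks:

      program leakage_demo
      implicit none
      integer, parameter :: n = 64
      real :: x(0:n-1), w(0:n-1), prect(0:n/2), pham(0:n/2)
      real :: pi, dbr, dbh
      integer :: i, k
      pi = 4.0*atan(1.0)
      ! two-tone test signal: 10.5 bins at amplitude 1.0, 16 bins at 0.01 (-40 dB)
      do i = 0, n-1
         x(i) = sin(2.0*pi*10.5*real(i)/real(n)) + 0.01*sin(2.0*pi*16.0*real(i)/real(n))
         w(i) = 0.54 - 0.46*cos(2.0*pi*real(i)/real(n-1))   ! Hamming (assumed)
      end do
      do k = 0, n/2
         prect(k) = binpow(x, k)
         pham(k)  = binpow(w*x, k)
      end do
      ! bin number, rectangular and Hamming spectra in dB relative to the peak
      do k = 0, n/2
         dbr = 10.0*log10(max(prect(k)/maxval(prect), 1.0e-12))
         dbh = 10.0*log10(max(pham(k)/maxval(pham), 1.0e-12))
         print '(i4,2f10.2)', k, dbr, dbh
      end do
      contains
      real function binpow(y, k)
      ! squared magnitude of DFT bin k, computed by a direct (slow) DFT
      real, intent(in) :: y(0:n-1)
      integer, intent(in) :: k
      real :: re, im
      integer :: j
      re = 0.0
      im = 0.0
      do j = 0, n-1
         re = re + y(j)*cos(2.0*pi*real(k*j)/real(n))
         im = im - y(j)*sin(2.0*pi*real(k*j)/real(n))
      end do
      binpow = re*re + im*im
      end function binpow
      end program leakage_demo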
FIGURE 5A. The Hamming window. Upper trace, the window in the time domain; middle
trace, the window in the frequency domain, linear scale; lower trace, the window in the
frequency domain, logarithmic scale. (From Harris, F. J., Trigonometric Transforms, A
Unique Introduction to the FFT, Tech. Publ. DSP-005 (8-81), Spectral Dynamics Division,
Scientific Atlanta, San Diego, 1981. With permission.)
FIGURE 5B. The Hamming window. FFT power spectral density function estimation of synthesized
signal consisting of two sinewaves with frequencies of 10.5 and 16 fs/N and amplitudes
of 1.00 and 0.01, respectively. Data window was used (fs, sampling frequency; N, number of
samples in the window). (From Harris, F. J., Trigonometric Transforms, A Unique Introduction
to the FFT, Tech. Publ. DSP-005 (8-81), Spectral Dynamics Division, Scientific Atlanta, San
Diego, 1981. With permission.)
FIGURE 6B. The Dolph-Chebyshev window. FFT power spectral density function
estimation of synthesized signal consisting of two sinewaves with frequencies of 10.5 and 16
fs/N and amplitudes of 1.00 and 0.01, respectively. Data window was used (fs, sampling
frequency; N, number of samples in the window). (From Harris, F. J., Trigonometric Transforms,
A Unique Introduction to the FFT, Tech. Publ. DSP-005 (8-81), Spectral Dynamics Division,
Scientific Atlanta, San Diego, 1981. With permission.)
REFERENCES
Appendix C
COMPUTER PROGRAMS
I. INTRODUCTION
This appendix contains a number of computer programs and subroutines for biomedical
signal processing. The programs are written in the FORTRAN IV language and are used on the
VAX 11/750 computer, under the VMS operating system. The input to the various programs
are vectors containing the samples of the signals to be processed. These are read from data
files generated by the A/D converter. The files are therefore unformatted, integer files. In
order to make the software as compatible as possible with other machines, input and output
statements, file definitions, and structuring are done with input-output subroutines (RFILE,
WFILE, RFILEM, WFILEM). The user can adapt the software to another computer or use
different data files by just replacing these subroutines.
The programs use several mathematical subroutines, mainly for matrix operations. All of
these subroutines are given in this appendix except for the subroutine EIGEN, which computes
the eigenvalues and corresponding eigenvectors of a matrix. The listing of this subroutine
was not included in the appendix due to its length. Subroutines for eigenvalue and
eigenvector computations can be found in one of the well-known software libraries such
as the IBM System/360 Scientific Subroutine Package (SSP), the International Mathematical
and Statistical Libraries (IMSL), or the CERN Library.
The programs presented here are taken from the Bio-Medical Signal Processing Package
(BMSPP) of the Center for Bio-Medical Engineering, Ben Gurion University. The listing
of the complete package could not be presented here due to space limitations. The few
programs presented here were selected to allow the interested reader to implement some of
the processing methods discussed in this book.
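For readers porting the package, the only machine-dependent pieces are these input-output subroutines. The sketch below is a hypothetical, portable stand-in for RFILE written in modern free-form Fortran; its argument list (file name, integer sample vector, returned sample count, auxiliary buffer) is inferred from the calls that appear in the listings that follow, and it reads one integer per line from a plain text file instead of a VAX unformatted file:

      ! Hypothetical, portable replacement for the package's RFILE.
      ! The original takes the file name as a BYTE array and reads VAX
      ! unformatted records; this sketch takes a character string and a
      ! plain text file, and leaves the auxiliary buffer untouched.
      subroutine rfile(fname, isamp, nts, iaux)
      implicit none
      character(len=*), intent(in)  :: fname
      integer, intent(out)   :: isamp(*)   ! signal samples read from the file
      integer, intent(out)   :: nts        ! number of samples read
      integer, intent(inout) :: iaux(*)    ! unused here; kept for interface compatibility
      integer :: ios, val, lun
      nts = 0
      open(newunit=lun, file=fname, status='old', action='read')
      do
         read(lun, *, iostat=ios) val
         if (ios /= 0) exit
         nts = nts + 1
         isamp(nts) = val
      end do
      close(lun)
      end subroutine rfile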
II. MAIN PROGRAMS
PROGRAM NUSAMP
(VAX VMS VERSION)
C
C THIS PROGRAM PROVIDES THREE TYPES OF NON UNIFORM
C SAMPLING WITH APPLICATIONS TO BIOMEDICAL SIGNALS.
C DATA IS READ FROM UNFORMATTED INTEGER FILE.
C THE USER HAS A CHOICE OF ONE OUT OF THREE NON UNIFORM
C SAMPLING METHODS FOR DATA COMPRESSION:
C REFERENCES:
C READ INPUT FILE
10 CONTINUE
C
C OPEN AVERAGING WINDOW
C
JJ=0
REF=0
DO 14 I=1,IAW
14 REF=REF+IVEC(I)
REF=REF/IAW
XFIRST=REF ! INITIAL CONDITION TO BE SENT FOR RECON.
DO 11 I=1,(NOP-IAW),IAW
IAVER(1)=0
AVER(1)=0
DO 12 II=1,IAW
12 AVER(1)=AVER(1)+IVEC(II+I-1)
AVER(1)=AVER(1)/IAW
IF (ABS(AVER(1)-REF).LE.RMOD) GO TO 11
C
C SAMPLING POINT IS NEEDED
C
JJ=JJ+1
REF=AVER(1)
K(JJ)=I+IAW
11 CONTINUE
C
C PREPARING OUTPUT FOR PLOTTING-RECONSTRUCTION OF SIGNAL
C
JJ=1
IREC(1)=XFIRST
KP(1)=512
DO 701 II=2,NOP
KP(II)=0
IF (K(JJ).NE.II) GO TO 700
KP(II)=512
IREC(II)=IVEC(II)
JJ=JJ+1
GO TO 701
700 IREC(II)=IREC(II-1)
701 CONTINUE
GO TO 777
C
40 CONTINUE
C
C
C SAMPLING POINT IS NEEDED
JJ=JJ+1
K(JJ)=I+IAW
KP(I+IAW)=512
XDOTP=XDOT
60 L vO 600 I I » I * < I 0 * I ’
600 If* EC-, 11 > >+XD0Tf
nvfc&a>*-AVER<2*
46 XDOTP=XDOT
C
C OUTPUT FILE HAS 3 RECORDS:
C 1. ORIGINAL SIGNAL
C 2. LOCATIONS OF NON UNIF. SAMPLES
C 3. RECONSTRUCTED SIGNAL
C
C
777 CONTINUE
C
c
C WRITE RESULTS ON OUTPUT FILE
C
TYPE 211
211 FORMAT(1H$,'ENTER OUTPUT FILE NAME: ')
ACCEPT 119,NCH,(NAME1(I),I=1,11)
119 FORMAT(Q,11A1)
NOP2=NOP*2
CALL ASSIGN(2,NAME1,11)
DEFINE FILE 2(3,NOP2,U,IVAR)
WRITE(2'1) (IVEC(II),II=1,NOP)
WRITE(2'2) (KP(II),II=1,NOP)
WRITE(2'3) (IREC(II),II=1,NOP)
CALL CLOSE(2)
C
C PRINT PROGRAM'S STATISTICS
C
PRINT 900
900 FORMAT(/25X'RESULTS OF NUSAMP PROGRAM')
PRINT 901
901 FORMAT(25X'****************************'/)
PRINT 902,(NAME(I),I=1,11)
902 FORMAT(15X'INPUT FILE NAME: ',11A1)
PRINT 905,IAW
905 FORMAT(25X'NO. OF SAMPLES IN AVERAGING WINDOW= 'I4)
PRINT 908,R
908 FORMAT(25X'THRESHOLD LEVEL= 'E10.3)
PRINT 901
PRINT 906,ISM
906 FORMAT(15X'NON UNIFORM SAMPLING OF ORDER 'I1)
PRINT 907,(NAME2(I),I=1,11)
907 FORMAT(/15X'NAME OF OUTPUT FILE: '11A1)
PRINT 909,(JJ-1)
909 FORMAT(25X'NO. OF SAMPLES USED = 'I6)
C
C COMPRESSION RATIO (CR) IS THE RATIO BETWEEN
C NO. OF SAMPLES (12 BITS) OF ORIGINAL SIGNAL
C AND NO. OF SAMPLES OF NON UNIFORMLY SAMPLED
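The zero-order branch of NUSAMP can be summarized as follows: the record is scanned in short averaging windows, and a sample is retained only when the windowed mean departs from the last retained value by more than a threshold. The sketch below is a free-form Fortran reconstruction of that idea, with variable names echoing the listing; it is an interpretation of the fragment above, not the package code:

      ! Zero-order nonuniform (adaptive) sampling in the spirit of NUSAMP:
      ! transmit a sample only when the mean over a short averaging window
      ! departs from the last transmitted reference by more than RMOD.
      subroutine nusamp0(x, nop, iaw, rmod, kept, nkept)
      implicit none
      integer, intent(in)  :: nop, iaw    ! record length, averaging-window length
      real,    intent(in)  :: x(nop)      ! input samples
      real,    intent(in)  :: rmod        ! amplitude threshold
      integer, intent(out) :: kept(nop)   ! indices of retained samples
      integer, intent(out) :: nkept       ! number of retained samples
      real :: ref, avg
      integer :: i
      ref   = sum(x(1:iaw))/real(iaw)     ! initial reference (sent as side information)
      nkept = 0
      do i = 1, nop - iaw, iaw
         avg = sum(x(i:i+iaw-1))/real(iaw)
         if (abs(avg - ref) > rmod) then  ! reference exceeded: keep a sample
            nkept = nkept + 1
            kept(nkept) = i + iaw
            ref = avg
         end if
      end do
      end subroutine nusamp0

The compression ratio is then the number of original samples divided by nkept, as the statistics printed by NUSAMP indicate.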
PROGRAM SEGMNT
(VAX VMS VERSION)
C
C THIS PROGRAM PROVIDES ADAPTIVE SEGMENTATION OF
C A SAMPLED FUNCTION. SEGMENTATION IS PERFORMED BY
C ESTIMATING AN AR FILTER FOR AN INITIAL REFERENCE
C
C REFERENCE:
C
C
C READ INPUT FILE
C
CALL RFILE(NAME,ISMP,NTS,IAUX)
100 CONTINUE
C
C
C OPEN LPC & PAR OUTPUT FILE
DO 708 I=1,INN1
708 CORRES(I)=CORRES(I)*ENGRES
CTH=0.9*ENGRES
GO TO 706
C
C CALCULATE NEW RESIDUAL
C
705 CONTINUE
RESW=SMPR(IW)
DO 733 J=1,INN
733 RESW=RESW+SMPR(IW-J)*LPC(J)
CORRES(1)=CORRES(1)-RES(1)*RES(1)+RESW*RESW
C
C FIND CORRELATIONS ITERATIVELY FOR ALL SLIDING
C WINDOWS EXCEPT THE FIRST ONE
DO 707 J=2,INN1
707 CORRES(J)=CORRES(J)-RES(1)*RES(J)+RES(IW-J+2)*RESW
C
C SHIFT RESIDUALS VECTOR
C
DO 734 I=1,IW-1
734 RES(I)=RES(I+1)
RES(IW)=RESW
706 CONTINUE
C
C CLIP CORRELATIONS TO REMOVE SHORT TRANSIENT
C ARTIFACTS
C
DO 709 I=1,INN1
709 IF (CORRES(I).GT.CTH) CORRES(I)=CTH
C
C CALCULATION OF SEM
C
SUM=0.0
DO 710 I=2,INN1
710 SUM=SUM+(CORRES(I))*(CORRES(I))
SEM=(ERR/CORRES(1)-1)**2+2*SUM/(CORRES(1)*CORRES(1))
ISEM(ICW)=INT((SEM*409.6)+0.5)
C
C COMPARE SEM WITH THRESHOLD
C
IF (SEM.GT.SEMTH) GO TO 103 ! START A NEW SEGMENT
GO TO 104 ! STAY IN CURRENT SEGMENT
C SHIFT SLIDING WINDOW
711 CONTINUE ! END OF DATA VECTOR
C
C CLOSE LPC OUTPUT FILE
C
CALL CLOSE(2)
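The quantity driving the segmentation decision is the spectral error measure (SEM) computed from the autocorrelation of the prediction residual. The function below isolates that computation in free-form Fortran; the expression follows the SEM= statement in the listing, while the argument names are illustrative. A new segment is opened when the returned value exceeds the threshold (SEMTH in the listing):

      ! Spectral error measure used by SEGMNT to decide when the reference
      ! AR model no longer fits the incoming signal.  corres(1..m1) holds
      ! the autocorrelation of the current window's prediction residual
      ! (lag 0 first) and err is the residual energy of the reference window.
      real function sem_measure(corres, m1, err)
      implicit none
      integer, intent(in) :: m1           ! number of residual-correlation lags (INN1)
      real,    intent(in) :: corres(m1)   ! residual autocorrelation, lag 0 first
      real,    intent(in) :: err          ! reference residual energy
      real :: s
      integer :: i
      s = 0.0
      do i = 2, m1
         s = s + corres(i)*corres(i)
      end do
      sem_measure = (err/corres(1) - 1.0)**2 + 2.0*s/(corres(1)*corres(1))
      end function sem_measure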
NLS=NTS+NZPAD
N=1
NP=0
1 N=N*2
NP=NP+1
IF (N.LT.NLS) GO TO 1
TYPE 105,NLS
105 FORMAT(X'LENGTH OF PADDED DATA VECTOR (POWER OF 2): 'I4)
IF (NLS.GT.2048) TYPE 109
109 FORMAT(X'MAX. LENGTH OF DATA VECTOR IS 2048!!')
C
C
DO 3 II=1,NTS
3 COR(II)=ISAMP(II)
C ZERO PADDING
DO 5 I=NTS+1,NLS
5 COR(I)=0.0
C
C FFT CALCULATIONS
C
CALL FT01A(NLS,2,COR,CORI)
NLSH=NLS/2
DO 6 I=1,NLSH
6 COR(I)=SQRT(COR(I)*COR(I)+CORI(I)*CORI(I))
C
C ********* NORMALIZATION OF ESTIMATED PSD FUNCTION **********
C
CALL XTERM(COR,NLSH,CMAX,CMIN)
DO 7 I=1,NLSH
7 ISAMP(I)=INT((COR(I)/CMAX)*1024+0.5)
C
C ********* OUTPUT PROCEDURES *********
C
CALL WFILE(NAME1,ISAMP,NLSH,NORO)
PRINT 110
110 FORMAT(20X'RESULTS OF PROGRAM PERSPT'/25X
* 'ESTIMATION OF PSD BY THE PERIODOGRAM')
PRINT 111,(NAME(I),I=1,11)
PROGRAM PERSPT
(VAX VMS VERSION)
C
C REFERENCES:
C 1. COHEN,A., BIOMEDICAL SIGNAL PROCESSING,
C    CRC PRESS, CHAPTER 8
C 2. OTNES,R.K. AND ENOCHSON,L., DIGITAL
C    TIME SERIES ANALYSIS, WILEY, 1972
C
C LINKING: FT01A,XTERM,RFILE,WFILE
C
INTEGER ISAMP(2048),IAUX(2048)
REAL SAMP(4096),COR(2048),CORI(2048)
BYTE NAME(11),NAME1(11)
C READ INPUT FILE
CALL RFILE(NAME,ISAMP,NTS,IAUX)
TYPE 102
102 FORMAT(1H$,'GIVE NO. OF PADDING ZEROES: ')
ACCEPT *,NZPAD
PROGRAM WOSA
(VAX VMS VERSION)
C
C REFERENCES:
C
C INPUT FILE:
C
C OUTPUT FILE:
C
C LINKING: FT01A,XTERM,RFILE,WFILE
C
DIMENSION ISMP(16384),IAUX(2048)
INTEGER PEROV,ISPACE,NEWSIZ,NOOREC,NOS,NLS,NLSP,NOREC
BYTE NAME(13),AA(9),NAMEO(13)
REAL FRE(2048),FIM(2048),SPCT(2048)
DO 199 I=1,16384
ISMP(I)=0
199 CONTINUE
C
C READ INPUT FILE
C
CALL RFILE(NAME,ISMP,NOS,IAUX)
TYPE 3
3 FORMAT(1H$,'GIVE LENGTH OF SEGMENT (POWER OF 2) FOR FFT: ')
ACCEPT *,NLS
C
C CHECK IF POWER OF 2
C
N=1
NP=0
40 N=N*2
NP=NP+1
IF (N.LE.NLS) GOTO 40
NLS=N/2
TYPE 4,NLS
4 FORMAT(1X,'LENGTH OF SEGMENT (CLOSEST POWER OF 2) IS: ',I4)
TYPE 5
5 FORMAT(1H$,'GIVE PERCENT OVERLAPPING BETWEEN SEGMENTS: ')
ACCEPT *,PEROV
TYPE 8
8 FORMAT(1H$,'GIVE WINDOW: RECTAN.=1, TRIA.=2, HAMMING=3: ')
ACCEPT *,IW
C
C
C CALCULATING WITH PERCENTAGE OF OVERLAP (PEROV) AND SEGMENT
C LENGTH NLS THE NO. OF SEGMENTS AVAILABLE.
C
NLSP=(NLS*PEROV)/100
ITEMP=NLS
NOREC=1
29 IF (ITEMP.GT.NOS) GOTO 30
ITEMP=ITEMP-NLSP+NLS
NOREC=NOREC+1
GOTO 29
30 NOREC=NOREC-1 ! NOREC IS THE NO. OF OVERLAPPED SEGMENTS
C
C CALCULATE FFT OF EACH SEGMENT AND AVERAGE
C
NLSH=NLS/2
IB=NLSP-NLS+2
DO 197 I=1,NOREC
IB=IB+NLS-NLSP
DO 196 J=1,NLS
196 FRE(J)=ISMP(IB+J-1)
CALL FT01A(NLS,2,FRE,FIM)
DO 195 J=1,NLSH
195 SPCT(J)=SPCT(J)+SQRT(FRE(J)*FRE(J)+FIM(J)*FIM(J))
197 CONTINUE
DO 194 J=1,NLSH
194 SPCT(J)=SPCT(J)/NOREC
CALL XTERM(SPCT,NLSH,SMAX,SMIN)
DO 192 I=1,NLSH
192 ISMP(I)=INT((SPCT(I)/SMAX)*1024+0.5)
C
C WRITE OUTPUT FILE
C
CALL WFILE(NAMEO,ISMP,NLSH,NOR)
191 FORMAT(E12.5)
CALL DATE(AA)
PRINT 602,(AA(I),I=1,9)
602 FORMAT(/20X,'RESULTS OF "WOSA" PROGRAM, DATE: ',9A1)
C
PRINT 702
702 FORMAT(19X,'***************************************')
C
C
C
PRINT 604,(NAME(I),I=1,11)
604 FORMAT(/10X,'*****INPUT ORIGINAL DATA FILE: ',11A1)
C
C
PRINT 607,PEROV
607 FORMAT(/2X,'PERCENTAGE OF OVERLAP: ',I3)
C
PRINT 608,NLSP
608 FORMAT(/2X,'NUMBER OF POINTS OVERLAPPED BETWEEN SEGMENTS: ',I4)
C
PRINT 609,NOREC
609 FORMAT(/2X,'NUMBER OF SEGMENTS WILL BE USED IN "WOSA": ',I4)
C
C**********************************************************************
C
IF (IW-2) 610,620,630
C
610 PRINT 611
611 FORMAT(/5X,'TYPE OF WINDOW: RECTANGULAR')
GO TO 640
C
620 PRINT 621
621 FORMAT(/5X,'TYPE OF WINDOW: TRIANGULAR')
GO TO 640
C
630 PRINT 631
631 FORMAT(/5X,'TYPE OF WINDOW: HAMMING')
C
640 PRINT 641,(NAMEO(I),I=1,11)
641 FORMAT(/10X,'OUTPUT FILE NAME: ',11A1)
C
C
999 STOP
END
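In outline, WOSA forms the PSD estimate by windowing overlapping segments, transforming each, and averaging the segment spectra. The subroutine below restates that procedure in free-form Fortran with a direct DFT so it can be run without FT01A; the Hamming coefficients are assumed, the overlap must be smaller than the segment length, and, like the listing, it averages spectral magnitudes and leaves absolute scaling to the caller:

      ! Welch-style (WOSA) spectral estimate: window overlapping segments,
      ! take the magnitude of each segment's transform, and average.
      subroutine wosa_psd(x, n, nls, novl, psd)
      implicit none
      integer, intent(in) :: n, nls, novl     ! record length, segment length, overlap (novl < nls)
      real, intent(in)  :: x(n)
      real, intent(out) :: psd(nls/2)
      real :: w(nls), seg(nls), re, im, pi
      integer :: istart, k, i, nseg
      pi = 4.0*atan(1.0)
      do i = 1, nls                           ! Hamming data window (assumed)
         w(i) = 0.54 - 0.46*cos(2.0*pi*real(i-1)/real(nls-1))
      end do
      psd  = 0.0
      nseg = 0
      istart = 1
      do while (istart + nls - 1 <= n)
         seg = w*x(istart:istart+nls-1)
         do k = 1, nls/2
            re = 0.0
            im = 0.0
            do i = 1, nls                     ! direct DFT of the windowed segment
               re = re + seg(i)*cos(2.0*pi*real((k-1)*(i-1))/real(nls))
               im = im - seg(i)*sin(2.0*pi*real((k-1)*(i-1))/real(nls))
            end do
            psd(k) = psd(k) + sqrt(re*re + im*im)   ! magnitude, summed as in the listing
         end do
         nseg = nseg + 1
         istart = istart + (nls - novl)       ! advance by segment length minus overlap
      end do
      if (nseg > 0) psd = psd/real(nseg)
      end subroutine wosa_psd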
PROGRAM MEMSPT
(VAX VMS VERSION)
C REFERENCE:
C 1. COHEN,A., BIOMEDICAL SIGNAL PROCESSING,
C    CRC PRESS, CHAPTER 8
C INPUT:
C    UNFORMATTED INTEGER DATA FILE
C    NO. OF RECORDS AND SAMPLES DETERMINED
C    BY USER
C LINKING:
C    NACOR,DLPC,XTERM,RFILE,WFILE
INTEGER ISAMP(2048),IAUX(2048)
REAL SAMP(4096),COR(41),LPC(41),PAR(41),RHO(41),AUX(41)
BYTE NAME(11),NAME1(11)
DO 3 I=1,NTS
3 SAMP(I)=ISAMP(I)
887 TYPE 103
103 FORMAT(1H$,'GIVE ORDER OF AR MODEL: ')
ACCEPT *,NAR
IF (NAR.LE.40) GO TO 888
TYPE 109
C CALCULATE RHOI
C
DO 360 I=1,NAR-1
RHOI=LPC(I)
DO 340 J=1,NAR-I
340 RHOI=RHOI+(LPC(J))*(LPC(J+I))
RHO(I)=RHOI
360 CONTINUE
RHO(NAR)=LPC(NAR)
C
C CALCULATE THE DISCRETE SPECTRUM
C
PI2=8.0*ATAN(1.0)
IT2=2*IT
DO 400 K=1,IT
SIGMA=0.
DO 380 I=1,NAR
SIGMA=SIGMA+(RHO(I))*COS(PI2*I*K/IT2)
380 CONTINUE
SAMP(K)=ERR/(RHO0+2*SIGMA)
400 CONTINUE
C
C ********* NORMALIZATION OF ESTIMATED PSD FUNCTION **********
C
CALL XTERM(SAMP,IT,CMAX,CMIN)
ACMIN=ABS(CMIN)
IF (CMAX.LT.ACMIN) CMAX=ACMIN
DO 7 I=1,IT
7 ISAMP(I)=INT((SAMP(I)/CMAX)*1024+0.5)
C
C ********* OUTPUT PROCEDURES *********
C
CALL WFILE(NAME1,ISAMP,IT,NORX)
PRINT 110
110 FORMAT(20X'RESULTS OF PROGRAM MEMSPT'/25X
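The MEMSPT fragment above evaluates the AR (maximum entropy) spectrum as ERR/|A(e^jw)|^2, with the denominator expanded through the autocorrelation of the prediction-error filter coefficients. The sketch below repeats that computation in free-form Fortran; the assumed filter convention A(z) = 1 + a(1)z^-1 + ... + a(p)z^-p and the frequency grid are illustrative choices, not taken from the listing:

      ! AR (maximum entropy) spectrum from the LPC coefficients a(1..p)
      ! and the prediction-error power err, in the manner of MEMSPT.
      subroutine mem_spectrum(a, p, err, nfreq, psd)
      implicit none
      integer, intent(in) :: p, nfreq
      real, intent(in)  :: a(p)        ! AR (LPC) coefficients
      real, intent(in)  :: err         ! prediction-error power
      real, intent(out) :: psd(nfreq)  ! spectrum on nfreq points up to half the sampling rate
      real :: rho(0:p), sigma, pi2
      integer :: i, j, k
      rho(0) = 1.0 + sum(a*a)          ! lag-0 autocorrelation of {1, a(1), ..., a(p)}
      do i = 1, p-1
         rho(i) = a(i)
         do j = 1, p-i
            rho(i) = rho(i) + a(j)*a(j+i)
         end do
      end do
      rho(p) = a(p)
      pi2 = 8.0*atan(1.0)
      do k = 1, nfreq
         sigma = 0.0
         do i = 1, p
            sigma = sigma + rho(i)*cos(pi2*real(i*k)/real(2*nfreq))
         end do
         psd(k) = err/(rho(0) + 2.0*sigma)
      end do
      end subroutine mem_spectrum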
C INPUT:
C 1. UNFORMATTED INTEGER DATA FILE HOLDING
C    SIGNAL SAMPLES (PRIMARY INPUT)
C OUTPUT:
C LINK: LMS,WFILE,RFILE
C REFERENCES:
TYPE 5
5 FORMAT(1H$,'GIVE FILTERS PARAMETERS: MU, ORDER AND GAIN: ')
ACCEPT *,MU,ORD,GAIN
DO 102 I=1,ORD-1
102 X(I)=ISR(ORD-I)*GAIN
C INITIATE WEIGHTING VECTOR & EPSI
C
DO 103 I=1,ORD
103 W(I)=0.
EPSI=ISI(ORD-1)*GAIN
C INITIATE OUTPUT VECTOR
C
DO 106 I=1,ORD
106 ISO(I)=ISI(I)
C
104 CONTINUE
DO 105 J=ORD,NSAMI
C
C UPDATE REFERENCE VECTOR
C
DO 107 I=ORD,2,-1
107 X(I)=X(I-1)
X(1)=ISR(J)*GAIN
C GET NEW DESIRED OUTPUT SAMPLE
C
D=ISI(J)*GAIN
CALL LMS(X,ORD,W,MU,EPSI,Y,D)
ISO(J)=INT(EPSI/GAIN+0.5)
105 CONTINUE
C
C WRITE OUTPUT FILE
C
CALL WFILE(NAMEO,ISO,NSAMI,NORO)
C
C PROGRAM'S DETAILS
C
TYPE 201
201 FORMAT(//10X'RESULTS OF ADPCAN PROGRAM'/)
TYPE 200,(NAMEI(I),I=1,11)
200 FORMAT(5X'INPUT FILE NAME: '11A1)
TYPE 202,(NAMER(I),I=1,11)
202 FORMAT(/5X'REFERENCE FILE NAME: '11A1)
TYPE 203,(NAMEO(I),I=1,11)
203 FORMAT(/5X'OUTPUT FILE NAME: '11A1)
STOP
END
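ADPCAN delegates the adaptation itself to the subroutine LMS, called once per sample as CALL LMS(X,ORD,W,MU,EPSI,Y,D). A minimal realization of such a routine is sketched below in free-form Fortran; the weight update is the standard Widrow-Hoff recursion and is offered as an illustration of what the call performs, not as the package listing:

      ! One LMS step: filter the reference vector with the current weights,
      ! form the error against the desired (primary) sample, and move the
      ! weights along the instantaneous gradient (Widrow-Hoff recursion).
      subroutine lms(x, n, w, mu, epsi, y, d)
      implicit none
      integer, intent(in)    :: n        ! filter order
      real,    intent(in)    :: x(n)     ! reference vector, newest sample first
      real,    intent(inout) :: w(n)     ! adaptive weights
      real,    intent(in)    :: mu       ! adaptation step size
      real,    intent(in)    :: d        ! desired (primary) sample
      real,    intent(out)   :: y, epsi  ! filter output and error (canceller output)
      integer :: i
      y = 0.0
      do i = 1, n
         y = y + w(i)*x(i)
      end do
      epsi = d - y                       ! error = primary minus filtered reference
      do i = 1, n
         w(i) = w(i) + 2.0*mu*epsi*x(i)  ! weight update
      end do
      end subroutine lms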
c
PROGRAM CONLIM
C (VAX VMS VERSION)
C
C
C THIS PROGRAM DETECTS WAVELETS BY MEANS OF THE CONTOUR LIMITING
C METHOD. THE PROGRAM READS THE TEMPLATE SIGNAL (ITEMP) FROM
C A FILE.
C UPPER AND LOWER CONTOURS ARE DEFINED:
C
C UPPER CONTOUR(I)=ITEMP(I)+((EPSI)*ITEMP(I)+CON)
C LOWER CONTOUR(I)=ITEMP(I)-((EPSI)*ITEMP(I)+CON)
C
C INPUT FILES:
C 1. UNFORMATTED INTEGER TEMPLATE FILE WITH
C    ONE RECORD AND NOPT SAMPLES.
C 2. UNFORMATTED INTEGER SIGNAL FILE WITH NREC
C    RECORDS AND NOPS SAMPLES PER RECORD.
C OUTPUT FILES:
C 1. UNFORMATTED INTEGER FILE WITH 3 RECORDS
C    AND NOPT SAMPLES PER RECORD. THE RECORDS:
C    1. THE TEMPLATE
C    2. UPPER CONTOUR
C    3. LOWER CONTOUR
C 2. UNFORMATTED INTEGER FILE WITH NREC RECORDS,
C    NOPS SAMPLES PER RECORD. THE FILE CONTAINS
C    THE LOCATIONS OF DETECTED WAVELETS (DETECTED
C    WAVELET IS DENOTED BY A PULSE OF AMPLITUDE
C    OF 512)
C REFERENCE:
TYPE 100
100 FORMAT(1H$,'ENTER INPUT TEMPLATE FILE NAME: ')
ACCEPT 119,NCH1,(NAME1(I),I=1,11)
119 FORMAT(Q,11A1)
TYPE 101
TYPE 102
102 FORMAT(1H$,'ENTER NAME OF SIGNAL FILE: ')
ACCEPT 119,NCH2,(NAME2(I),I=1,11)
TYPE 103
103 FORMAT(1H$,'ENTER NO. OF RECORDS AND SAMPLES PER RECORD: ')
ACCEPT *,NREC,NOPS
NOPS2=NOPS*2
C PREPARE OUTPUT FILES
TYPE 104
104 FORMAT(1H$,'ENTER CONTOURS OUTPUT FILE NAME: ')
ACCEPT 119,NCH3,(NAME3(I),I=1,11)
TYPE 105
105 FORMAT(1H$,'ENTER SIGNAL OUTPUT FILE NAME: ')
ACCEPT 119,NCH4,(NAME4(I),I=1,11)
CALL ASSIGN(2,NAME2,11)
TYPE 106
106 FORMAT(1H$,'ENTER CONSTANT AND RELATIVE CONTOUR PARAMETERS: ')
ACCEPT *,CON,EPSI
NOPTH=INT(NOPT/2+0.5)
NOPTL=INT(NOPT*0.9+0.5)
NMODS=NOPS+NOPT-1 !NO. OF SAMPLES IN BUFFER
DO 9 K=1,NREC
9 CONTINUE
C
C END DETECTION
C
CALL CLOSE(2)
CALL CLOSE(4)
CALL ASSIGN(3,NAME3,11)
DEFINE FILE 3(3,NOPT2,U,IVAR)
WRITE(3'1) (ITEMP(I),I=1,NOPT)
DO 10 I=1,NOPT
10 ITEMP(I)=ITEMP(I)*(1.+EPSI)+CON
C
C
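The detection rule implied by the CONLIM listing can be restated compactly: build upper and lower contours around the template and declare a wavelet wherever a sufficient fraction of the aligned signal samples falls between them. The sketch below (free-form Fortran) is a simplified reconstruction; the 90% acceptance fraction mirrors NOPTL = INT(NOPT*0.9+0.5) above, and the use of the absolute template value in the tolerance band is an added safeguard for negative samples, not part of the listing:

      ! Contour-limiting wavelet detection in the spirit of CONLIM.
      subroutine conlim_detect(sig, ns, templ, nopt, epsi, con, detect)
      implicit none
      integer, intent(in) :: ns, nopt
      real, intent(in)  :: sig(ns), templ(nopt)
      real, intent(in)  :: epsi, con         ! relative and constant tolerances
      logical, intent(out) :: detect(ns)     ! .true. where a wavelet is declared
      real :: upper(nopt), lower(nopt)
      integer :: i, j, inside, noptl
      do i = 1, nopt                         ! contours around the template
         upper(i) = templ(i) + (epsi*abs(templ(i)) + con)
         lower(i) = templ(i) - (epsi*abs(templ(i)) + con)
      end do
      noptl = int(real(nopt)*0.9 + 0.5)      ! minimum number of in-band points
      detect = .false.
      do j = 1, ns - nopt + 1                ! slide the template along the signal
         inside = 0
         do i = 1, nopt
            if (sig(j+i-1) >= lower(i) .and. sig(j+i-1) <= upper(i)) inside = inside + 1
         end do
         if (inside >= noptl) detect(j) = .true.
      end do
      end subroutine conlim_detect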
PROGRAM COMPRS
C
C INPUT:
C 1. A DATA FILE HOLDING A MATRIX OF L1 VECTORS
C    OF DIMENSION N, CORRESPONDING TO THE MEMBERS
C    OF THE FIRST CLUSTER.
C
C OUTPUT:
C 1. A DATA FILE HOLDING M VECTORS OF DIMENSION N,
C    CORRESPONDING TO THE TRANSFORMATION MATRIX
C    OF THE COMPRESSION. IN THE CASE OF FISHER
C    METHOD M=1.
C
C
C REFERENCES:
C
C 1. COHEN,A., BIOMEDICAL SIGNAL PROCESSING,
C    CRC PRESS, CHAPTER 12
C
C 2. DUDA,R.O. AND HART,P.E., PATTERN
C    CLASSIFICATION AND SCENE ANALYSIS, WILEY
C    INTERSCIENCE, N.Y., 1973
C
C 3. FUKUNAGA,K., INTRODUCTION TO STATISTICAL
C    PATTERN RECOGNITION, ACADEMIC PRESS,
C    N.Y., 1972
C
C 4. TOU,J.T. AND GONZALEZ,R.C., PATTERN
C    RECOGNITION PRINCIPLES, ADDISON-WESLEY,
C    READING,MA., 1974
C
C
C LINKING: EIGEN,RFILEM,WFILEM,MEAN,COVA,ADD,INVER,MUL,SYMINV
C
C
DIMENSION X1(40,500),X2(40,500),XM(40),XM1(40),XM2(40)
DIMENSION R2(40,40),R1(40,40),C1(40,40),C2(40,40)
DIMENSION CINV(40,40),COR(40,40),COV(40,40)
DIMENSION DELTA(40),DEL(40,40),A(40,40),WR(40),WI(40)
INTEGER IAUX(80)
BYTE NAME1(11),NAME2(11)
TYPE 777
CALL RFILEM(NAME2,X2,40,500,L2,N)
C MEAN CALCULATIONS
CALL MEAN(X2,40,500,N,L2,XM2)
C
C COVARIANCE CALCULATION
C
CALL COVA(X1,40,500,N,L1,XM1,C1)
CALL COVA(X2,40,500,N,L2,XM2,C2)
C
C
C COMMON COVARIANCE
FA=0.5
CALL ADD(C1,C2,COV,40,40,N,N,1)
DO 480 I=1,N
DO 480 J=1,N
480 COV(I,J)=FA*COV(I,J)
C
C INVERSE OF COMMON COVARIANCE
C
CALL INVER(COV,40,40,N,CINV) ! CINV IS INVERSE OF COMMON COVARIANCE
N1=1
C
C
C FISHER METHOD
C
IF (ME.NE.3) GO TO 800
C
C PREPARE MEAN DIFFERENCE
C
CALL ADD(XM1,XM2,DELTA,40,1,N,N1,-1) ! DELTA IS
C THE DIFFERENCE IN CLUSTERS MEANS.
C
C CALCULATE FISHER VECTOR (CINV*DELTA)
C
CALL MUL(CINV,40,40,DELTA,40,1,DEL,40,1,N,N,N1)
C
C NORMALIZE FISHER VECTOR
XXN=0.
DO 810 J=1,N
810 XXN=XXN+DEL(J,1)*DEL(J,1)
XXN=SQRT(XXN)
DO 811 J=1,N
811 A(1,J)=DEL(J,1)/XXN ! FIRST ROW OF A HOLDS NORM. FISHER
GO TO 778
800 CONTINUE
C
C MINIMUM ENTROPY METHOD
C
IF (ME.NE.2) GO TO 801
CALL EIGEN(40,N,COV,WR,WI,A,IERR,WO)
GO TO 778
801 CONTINUE
C
C K-L METHOD
C
IF (ME.NE.1) GOTO 77
C
C COMMON CORRELATION
C
DO 600 I=1,N
600 XM(I)=0. ! XM IS A NULL VECTOR DUMMY MEAN
CALL COVA(X1,40,500,N,L1,XM,R1) !R1 IS CLUSTER 1 CORRELATION
CALL COVA(X2,40,500,N,L2,XM,R2) !R2 IS CLUSTER 2 CORRELATION
CALL ADD(R1,R2,COR,40,40,N,N,1) !COR IS THE COMMON CORRELATION
DO 451 I=1,N
DO 451 J=1,N
COR(I,J)=COR(I,J)/2.
451 CONTINUE
CALL EIGEN(40,N,COR,WR,WI,A,IERR,WO)
C
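In the Fisher branch of COMPRS the whole "transformation matrix" reduces to a single direction: the inverse of the pooled covariance applied to the difference of the class means, normalized to unit length. The sketch below computes that direction in free-form Fortran, replacing the package's INVER and MUL calls with a simple Gaussian elimination (no pivoting, which is adequate for a well-conditioned covariance matrix); it is an illustration, not the book's code:

      ! Fisher direction w = C**(-1)*(m1 - m2), normalized to unit length.
      subroutine fisher_direction(cov, delta, n, w)
      implicit none
      integer, intent(in) :: n
      real, intent(in)  :: cov(n,n)   ! pooled (common) covariance matrix
      real, intent(in)  :: delta(n)   ! difference of the class mean vectors
      real, intent(out) :: w(n)       ! normalized Fisher direction
      real :: a(n,n), b(n), f, s
      integer :: i, j, k
      a = cov
      b = delta
      do k = 1, n-1                   ! forward elimination (no pivoting)
         do i = k+1, n
            f = a(i,k)/a(k,k)
            a(i,k:n) = a(i,k:n) - f*a(k,k:n)
            b(i) = b(i) - f*b(k)
         end do
      end do
      do i = n, 1, -1                 ! back substitution
         s = b(i)
         do j = i+1, n
            s = s - a(i,j)*w(j)
         end do
         w(i) = s/a(i,i)
      end do
      w = w/sqrt(sum(w*w))            ! normalize, as the listing does
      end subroutine fisher_direction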
III. SUBROUTINES
SUBROUTINE LMS(X,N,W,MU,EPSI,Y,D)
C
C THIS SUBROUTINE COMPUTES THE LPC,
C THE PARCOR COEF. AND THE TOTAL SQUARED ERROR
C OF A SEQUENCE, OUT OF THE AUTOCORRELATION SEQUENCE.
C
C DESCRIPTION OF PARAMETERS:
C P.........DIMENSION OF 'LPC', 'PAR', 'AUX'
C COR.......P+1 AUTO-CORR. COEF. VECTOR.
C LPC.......LPC COEF. VECTOR.
C PAR.......PARCOR COEF. VECTOR.
C AUX.......WORKING AREA.
C ERR.......NORMALIZED PREDICTION ERROR.
C
C REFERENCE:
C LINK: NONE
RETURN
END
C
C THIS ROUTINE CALCULATES THE DISCRETE FOURIER TRANSFORM OF
C THE SEQUENCE F(N), N=0,1,...,IT-1.
C THE DATA IS TAKEN TO BE PERIODIC, NAMELY F(N+IT)=F(N),
C FOR WHICH:
C
C REFERENCE:
C
C 2 -- DIRECT TRANSFORM.
C
12 ‘ -K+l
I F ' K - I l > 1 1 »6 »6
INDEX OUTER LOOP ,
6 13=13+13
10^10-1
I 1 “ 1 1. / 2
: c ( I 1 ) 51 , 5 1 f 10
VNSCRAMBLE
Classification o f signals, 37— 86 Data windows, see also specific types, 139— 151
alternative hypothesis, 39 Decision rule, 41, 47, 49
applications, 38 Decision-theoretic approach, see also Classification
Bayes decision theory, 39—50 o f signals, 37— 86 j
feature selection, 75— 79 wavelet detection, 1
Fisher’s linear discriminant, 63— 66 Decision threshold, 41 j
Karhunen-Loeve expansions, 66— 75 Delta range, 115
k-nearest neighbor, 50—53 Depth recording, 115
linear discriminant functions, see also Linear dis Diastolic phase o f heart, 104
criminant functions, 53—63 Dicrotic notch. 104
null hypothesis, 39    Dirichlet window, 140, 142—143
statistical, 39— 53 Discriminant approach, 87
time warping, 79— 84 Discriminant functions. 38. 41— 43, 45— 46, 111
Class .separability, see Separability linear, see Linear discriminant functions
Cluster-seeking algorithms, 38 Divergence, 76, 78
Cogwheel breath sound (CO). 128 Dolph-Chebyshev window, 145. 150— 151
Color blindness, 118 DP, see Dynamic programming topics
Compression, 37, 60, 62, 66—69, 71, 124    Dye dilution, 130
Compression ratio, 63    Dye dilution curves, 14
Computer programs, 153— 188 Dynamic biomedical signals, characteristics of, see
Conditional density function, 23 also specific topics, 113— 137
Conditional intensity function, 23— 24, 34 Dynamic programming (DP) equation, 82— 83
Conditional probability, 23 Dynamic programming (DP) methods, 77— 79,
Conditional risk, 41—42 81— 84
Context-free grammar, 90— 92, 96, 98— 99
stochastic, 102— 103
Context-free languages, 95
E
Context-free push-down automata, 92, 95— 100
Context-sensitive grammar, 90 ECG, see Electro-cardiograms
Contour limiting EEG. see Electroencephalograms
QRS detection, 6 EGG, see Electrogastrography
wavelet detection, 5— 6 Eigenplanes, 70, 74
Convergence properties, 73 Eigenvalues, 61— 62. 66. 69. 72— 73
Coordinate reduction time encoding system Eigenvectors, 61, 66. 69— 73
(CORTES), 5 Ejection clicks, 127
Cornea, 114 EKG, see Electrocardiograms: Electrocardiography
Corneal-retinal potential, 114
Correlation, 125 Electrocardiograms (ECG), 37
Correlation analysis, point process, 20 adaptive wavelet detection o f QRS complexes,
Correlation coefficients, renewal process, 27 13— 16
Correlation function, spectral analysis, 24 analysis, 87— 89
CORTES, see Coordinate reduction time encoding finite transducer for, 100— 101
system high-frequency, 124
Cosine windows, 141, 143, 146— 147 point process, 19. 21
Counting canonical form, 22— 24 QRS complex, 1— 5
Counting process, 20, 22— 24 signal, 121— 124
Counts autocovariance, 26 finite state automata, 94— 95
Counts PSD function, 26 syntactic analysis of, 106— 110
Counts spectral analysis, 24— 26 Electrocardiography (ECG). see also Electrocardi
Cross correlator, 8 ograms, 1, 38. 121— 124
Cross covariance density, 34— 35 inverse problem. 123
Cross intensity function, 34 Electrocorticogram. 115
Cross spectral density function, 35 Electrodermal response (EDR), 125
Cube vectorcardiograms, 124 Electroencephalograms (EEG). 37, 114— 118
Cumulative distribution function, 23 alpha range, 115
Weibull distribution, 31 analysis, 37
aperiodic wavelets, 1, 3
beta range, 115
D delta range, 115
depth recording, 115
Data compression, see Compression k-nearest neighbor classification, 53
J    classifiers, 58—60
error rate, 42, 52
Jitters, 5    Minimum squared error method, 56—57
Joint interval histograms, 24    Motor unit, 120
Motor unit action potential (MUAP, MUP), 119—
122
K    point process model, 19
Majority decision, 51
Karhunen-Loeve Expansion (KLE), 66—75
Karhunen-Loeve Transformation (KLT)
K-complexes, 1, 116, 118—119
Kinetocardiography, 130
KLE, see Karhunen-Loeve Expansions
KLT, see Karhunen-Loeve Transformation
k-nearest neighbor (k-NN) classification, 1, 50—53
Knock-out method, 77
Kolmogorov-Smirnov statistics, 28, 30
Korotkoff sounds, 129—130
Myopathies, 19, 120
L
N
Lag windows, see also specific types, 139—151
Laplace transformation, renewal process, 27    Nerve conduction velocity, 119
Laryngeal disorders, 19, 87, 129    Nerve fiber damage, 113
Laryngitis, 129    Nerve fiber recordings, 119
Least squares method, adaptive wavelet detection, 9    Neural signals, 38
Lie detector, 125    Neural spike train, counts spectral analysis, 25
Light sleep, 115    Neural spike trains, see also Renewal processes,
Likelihood ratio, 41, 76    19—20
Linear discriminant functions, 53—63, 121    Gamma distribution, 32
advantages, 53    point processes, 26
entropy criteria methods, 60—63    regenerative type, 26
generalized, 55—56
geometrical meaning, 54    Neurogenic lesions, 120
minimum distance classifiers, 58—60    Neurological diseases, 129
minimum squared error method, 56—57    Neuromuscular diseases, 19, 120
Linear prediction, 124    Neurons
Linguistic approach, 87    action potentials, 33
Linguistics, 87    overlapping wavelets detection, 14
Logarithmic survivor function, 23    Nonhomogeneous Poisson process, 30
LPC, newborn's cry, 46, 50
Nonparametric trend test, 28
Nonstationarities, 26
M    Normal distribution, 27, 29
Normalized autocovariance, spectral analysis, 25
Machine recognition, 79    Notches, 124
Magnetocardiography (MCG), 131, 133    nth order interval, 20, 22
Magnetoencephalography (MEG), 131—132    Null hypothesis, 27, 39
Magnetopneumography (MPG), 131
Mahalanobis distance, 43, 47, 59, 76, 79
Main half width (MHW), 139
O
Mann-Whitney statistic, 28
Markov, 32    Observation window, 8, 14
context-free push-down automata. 92, 95— 100 Vector electrocardiography (VCG), 124
finite stale automata, 92— 95 Ventricular fibrillation, 123
parsing, 92. 100— 101 VEP, see Visual evoked potential
syntax-directed translation, 100 VER, see Visual evoked responses
Syntax, B7, 89 Vesicular breath sounds (VBS), 128
Syntax analysis, 92, 101— 104 Vestibulospinal potentials, 119
Syntax-directed translation, 100 Vibrocardiography, 130
Systolic phase of heart, 104 Visual acuity, 118
Visual evoked potential (VEP), 118
Visual evoked responses (VER), overlapping wave
T lets detection. 14
Visual fields deficits, 118
Tachycardia, 122 Vocal cord, 19. 126. 128
Taper window. 139 Vocal tract. 78. 129
TA wave, 121 Voice. 38. 128— 129
Template, 1, 3. 5, 8— 9, 58, 79 V-waves, 115, 118
correction, 12— 16
Template adaptation, wavelet detection. 10— 11
Template matching, 1    W
Terminal symbols, 90— 91
Tetrahedron vectorcardiograms. 124
Thermal dilution, 130— 131 Wald-Woliowitz run test, 30
Theta range. 115 Warping function, see Time warping function
Time interval lengths (TIL), 30 Waveform. 37
Time series analysis, 37, 119, 121 Wavelet detection. 1— 18, 119— 120. 124
Time warping. 1. 79— 84 adaptive, see also Adaptive wavelet detection.
Time warping algorithms. 80 9— 16
Time warping function. 80— 83 algorithms. 1— 5
Tracheal breath sounds (TBS), 128 alignment, 1. 5
Training set, 38, 44, 46. 51. 58. 89. 104. 107. 11 i amplitude zone time epoch coding. 5
Transmission, 37 baseline shifts. 3— 4
Tremors, 121 contour limiting. 5— 6
Trends, 20, 26 coordinate reduction time encoding system. 5
Triangle window, 140—141, 144—145    decision-theoretic, 1
T wave, 121    Fourier descriptors, 3
Two-class discrimination, 69 jitters. 5
Two-dimensional signals, 132 matched filtering. I. 6— 8
Two-state semi-Markov model (TSSM). 32— 33 multivariate point processes, 35
overlapping wavelets, see also Overlapping wave
lets detection, 14—17
U    pattern recognition, 1
polygonal approximations, 3
Univariate conditional intensity. 34 probability density functions, 1
Univariate point process analysis, 19 QRS complex, 1— 5
Univariate point processes, 33— 35 random variables. 1
Unrestricted grammar, 90    structural features, 1—6
Unrestricted stochastic grammar. 101— 102 syntactic, 1. 3
Unsupervised learning, 38 template. 1. 3, 5. 8— 9
U wave, 121 template matching. 1
time warping, 1
Weibull distribution, 19, 31
V Wheezes. 128
Windows, see specific types
Vectorcardiograms, 124 Within-class scatter matrix, 64— 66, 75