Reference Paper 4
Article
Development of an Electronic Stethoscope and a Classification
Algorithm for Cardiopulmonary Sounds †
Yu-Chi Wu 1, * , Chin-Chuan Han 2 , Chao-Shu Chang 3 , Fu-Lin Chang 1 , Shi-Feng Chen 1 , Tsu-Yi Shieh 4,5 ,
Hsian-Min Chen 6 and Jin-Yuan Lin 1
1 Department of Electrical Engineering, National United University, Miaoli City 36003, Taiwan;
a0976335098@[Link] (F.-L.C.); sisterw961o3y94rm4@[Link] (S.-F.C.); yuan@[Link] (J.-Y.L.)
2 Department of Computer Science and Information Engineering, National United University,
Miaoli City 36003, Taiwan; cchan@[Link]
3 Department of Information Management, National United University, Miaoli City 36003, Taiwan;
cschang@[Link]
4 Section of Clinical Training, Department of Medical Education, Taichung Veterans General Hospital,
Taichung City 40705, Taiwan; zuyihsieh@[Link]
5 Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital,
Taichung City 40705, Taiwan
6 Center for Quantitative Imaging in Medicine (CQUIM), Department of Medical Research, Taichung Veterans
General Hospital, Taichung City 40705, Taiwan; hsmin6511@[Link]
* Correspondence: ycwu@[Link]; Tel.: +886-939967722
† This paper is an extended version of our paper published in 2021 IEEE International Conference on Consumer
Electronics-Taiwan (ICCE-TW), Penghu, Taiwan, 15–17 September 2021.
Abstract: With conventional stethoscopes, the auscultation results may vary from one doctor to another due to a decline in hearing ability with age or differences in professional training, and the problematic cardiopulmonary sound cannot be recorded for analysis. In this paper, to resolve the above-mentioned issues, an electronic stethoscope was developed, consisting of a traditional stethoscope with a condenser microphone embedded in the head to collect cardiopulmonary sounds, and an AI-based classifier for cardiopulmonary sounds was proposed. Different deployments of the microphone in the stethoscope head with amplification and filter circuits were explored and analyzed using the fast Fourier transform (FFT) to evaluate the effects of noise reduction. After testing, the microphone placed in the stethoscope head surrounded by cork is found to have better noise reduction. For classifying normal (healthy) and abnormal (pathological) cardiopulmonary sounds, each sample of cardiopulmonary sound is first segmented into several small frames and then a principal component analysis (PCA) is performed on each small frame. The difference signal is obtained by subtracting the PCA reconstruction from the original signal. MFCC (Mel-frequency cepstral coefficients) and statistics are used for feature extraction based on the difference signal, and ensemble learning is used as the classifier. The final results are determined by voting based on the classification results of each small frame. After testing, two distinct classifiers, one for heart sounds and one for lung sounds, are proposed. The best voting threshold for heart sounds falls at 5–45% and the best voting threshold for lung sounds falls at 5–65%. The best accuracy of 86.9%, sensitivity of 81.9%, specificity of 91.8%, and F1 score of 86.1% are obtained for heart sounds using 2 s frame segmentation with a 20% overlap, whereas the best accuracy of 73.3%, sensitivity of 66.7%, specificity of 80%, and F1 score of 71.5% are yielded for lung sounds using 5 s frame segmentation with a 50% overlap.
diseases but also lung sounds to diagnose whether there are abnormalities in the lungs.
This is known as auscultation. Traditional stethoscopes utilized horn-shaped stethoscope
heads for listening to the sounds of the movements of the visceral organs. However, with
conventional stethoscopes, the auscultation results may vary from one doctor to another
due to a decline in his/her hearing ability with age or his/her different professional training
background, and the problematic cardiopulmonary sound cannot be recorded for further
analysis. Therefore, electronic stethoscopes that can record and analyze/classify cardiopul-
monary sounds are needed to cope with these issues. If an effective classification algorithm
can be embedded into electronic stethoscopes, doctors can use electronic stethoscopes to
obtain a prompt preliminary diagnosis in an emergency, and patients who need secondary prevention and home care after discharge from the hospital can receive long-term monitoring and early detection of abnormalities.
Based on the sensor used, air-conduction and contact-conduction stethoscopes are the two
common types of electronic stethoscopes. The air-conduction
electronic stethoscope utilizes an electromagnetic coil or electret capacitor as the sensor to
collect sound signals with high stability and strong reliability but with low sensitivity. The
contact-conduction electronic stethoscope mostly uses piezoelectric materials as a sound
sensor with improved anti-interference ability and sensitivity. However, any deforma-
tion or damage to the piezoelectric materials in the manufacturing process degrades the
sensitivity of the stethoscope.
In the category of air-conduction electronic stethoscopes, McLane et al. [1] developed
an advanced air-conduction electronic stethoscope using a microphone array, an external-
facing microphone, and an onboard signal processor to perform adaptive noise suppression
of lung auscultation in noisy clinical settings. Zhang et al. [2] designed an electronic
auscultation system for the graphic recording of heart, lung, and trachea (HLT) sounds
by placing microphones in a CNC-machined Delrin housing case covered by diaphragms.
Sixteen acoustic sensors, of which 14 were positioned in a memory foam pad and 2 were
placed directly on the heart and trachea, were used to record the acoustic data using a
LabView program, and the waveforms in the time and frequency domain, as well as a
spectrogram for visual examination, could be plotted using Matlab.
In the category of contact-conduction electronic stethoscopes, Toda and Thompson [3]
devised a contact-vibration sensor by bonding a piezoelectric polyvinylidene fluoride
(PVDF) film to a curved rubber piece having a front-contact face. Vibrations transmitted
from the front-contact face through the rubber to the film cause pressure normal to the
surface of the film and then an electric field is induced by the piezoelectric effect. Duan
et al. [4] proposed a double-sided diaphragm micro-electro-mechanical system (MEMS)
electronic stethoscope (DMES) based on a bionic lollipop-shaped MEMS sound-sensitive
sensor. Shi et al. [5] designed a stethoscope also based on the bionic lollipop-shaped MEMS
and developed an acquisition circuit and PC upper machine for real-time acquisition of
heart sound signals. Recently, forcecardiography (FCG), a non-invasive technique that
measures vibrations via force-sensing resistors (FSR), has been proposed [6,7] to measure
heart mechanical vibrations. The active area of FSR was fixed with a rigid dome and an
accelerometer was used to acquire the dorso-ventral seismocardiography (SCG) signal.
In [6], the FCG sensor and the accelerometer were firmly mounted on a plexiglass rigid
board. In [7], a lead-zirconate-titanate piezoelectric disk equipped with the same dome-
shaped mechanical coupler used for the FSR-based sensor was proposed for simultaneous
monitoring of respiration, infrasonic cardiac vibrations, and heart sounds.
Three steps are involved in the analysis or classification of auscultatory sounds: pre-
processing, feature extraction, and classifier design. Pre-processing deals with noise
processing or signal-frame decomposition. For noise processing, noise reduction and
de-noising of the samples are performed to enhance the noise immunity of the classification
algorithm. For cardiopulmonary signals, the location of the recorded cardiopulmonary
sounds may vary, but usually both cardiac and pulmonary signals are included. Therefore,
Chien et al. [8] used two microphones to collect signals from the left and right chest and
presented a fast independent component analysis (ICA) algorithm to separate heart and
lung sounds. Hadjileontiadis and Panas [9] and Liu et al. [10] used wavelet transform tech-
niques to denoise heart sounds from lung signals [9] and, conversely, filter out lung sounds
from heart sounds [10]. Mayorga et al. [11] proposed an empirical mode decomposition
(EMD) followed by Gaussian mixed models (GMM) to improve ICA for separating heart
and lung sounds. In addition to noise reduction, EMD is also a method to estimate the
frequency components [12–14] and each component can be used as a classification feature.
In [12], EMD was used to separate heart sounds from lung signals; in [13], EMD was used to
capture heart murmurs; and in [14], EMD was used to segment and capture the underlying
heart sounds (S1, S2). Varghees and Ramachandran [15] proposed an empirical wavelet
transform (EWT)-based heart sound signal decomposition method by integrating both
EMD and WT. Ntalampiras [16] transformed the lung sound signals to the frequency and
wavelet domains before performing the subsequent analysis.
The pre-processed acoustic signals then go through feature extraction for time- and/or
frequency-domain features. The main purpose of capturing time-domain features is to
observe inter-beat or inter-respiratory variations. The frequency-domain features are most
commonly expressed as coefficients of the frequency spectrum or the cepstrum; the signals
are transformed to the frequency domain to observe their spectral changes. The Mel-frequency
cepstral coefficient is widely used in
sound processing [17,18]. Potes et al. [17] used frequency-domain features together with
time-domain features. Chowdhury et al. [18] denoised and compressed phonocardiography
(PCG) signals using a multi-resolution analysis based on the discrete wavelet transform
(DWT), segmented PCG signals using the Shannon energy envelope and zero-crossing into
four parts and extracted features from PCG signals using a Mel-scaled power spectrogram
and Mel-frequency cepstral coefficients (MFCC). Kumar et al. [19] used a set of features:
time, frequency, and statistical or phase space features.
The extracted features are finally fed into a designed classifier. In [17], a total of
124 time-frequency features were extracted from the PCG and input to a variant of the
AdaBoost classifier, and PCG cardiac cycles decomposed into four frequency bands were
input to a second classifier using a convolutional neural network (CNN). An ensemble
of classifiers combining the outputs of AdaBoost and the CNN was designed to classify
normal/abnormal heart sounds. In [18], a five-layer feed-forward deep neural network
(DNN) model was used. In [19], a support vector machine (SVM) classifier was trained and
applied for each of the feature sets. Li et al. [20] applied a wavelet scattering transform and
multidimensional scaling method and then presented a twin SVM (TWSVM) to classify
heart sound signals. Gjoreski et al. [21] utilized classic machine-learning (ML) to learn from
expert features and end-to-end deep learning (DL) to learn from a spectro-temporal repre-
sentation of the signal. Shuvo et al. [22] proposed a lightweight end-to-end convolutional
recurrent neural network (CRNN) architecture for the automatic detection of five classes of
cardiac auscultation using raw PCG signals. This model was tested on the PhysioNet/CinC
2016 challenge dataset [23], achieving an accuracy of 86.57%.
In this study, a low-cost electronic stethoscope with noise reduction is proposed, and
an effective classification algorithm, based on principal component analysis (PCA), MFCC,
and ensemble learning, is developed for auscultatory sounds. A graphical user interface is
established to save or replay the recorded sounds in Raspberry Pi. The presented system
can be useful in medical care.
Figure 1. System architecture of electronic stethoscope.
A condenser microphone that could be easily found on the market at a reasonable price was utilized in this study. Condenser microphones, which have no magnet and coil, generate voltage changes with changing distances between the two diaphragms in the capacitor and present the advantages of being lightweight, small in size, and highly sensitive, and they are often used in high-quality recording. Condenser microphones with higher sensitivity can record more sound details but also pick up environmental noise more easily; therefore, they are more suitable for use in quiet studios. Figure 2a shows our design of the electronic stethoscope head [24] where the condenser microphone (mic 1) is placed. Figure 2b shows the head with mic 1 inside, the cork, and the diaphragm. Here, a round disk cork is presented and is used to cover the back of the head if desired. Figure 2c illustrates the edge of the head (face up) surrounded by the cork. No gasket is stuffed inside the head. Figure 2d shows the back of the head attached to a microphone (mic 2) to collect environmental noise. Figure 2e depicts the front of the head encased with the diaphragm; the edge of the head is surrounded by the cork. A shielded line was used to connect the microphone and amplifier circuits.

Two-stage amplification/filtering is adopted to process the small signals collected by the microphones, and more accurate heart/lung sound waveforms are obtained when compared with only one-stage amplification/filtering. The frequency range of the heart sounds is from 1 Hz to 800 Hz, the human ear is sensitive in the range of 40 Hz to 400 Hz, and most of the signals below 20 Hz are inaudible [25]. Therefore, the filtering band of heart sounds in this study is below 400 Hz and the low-pass filter is designed to attenuate the signals above 400 Hz. The main frequency range of the lung sounds is from 100 Hz to 2000 Hz, and the high-pass filter and low-pass filter are designed to filter out signals below 100 Hz and above 2000 Hz. Therefore, after amplification, the heart or lung sound is filtered by the corresponding filter to remove the unwanted frequencies. As the two processed heart/lung signals and noise signals need subtraction, the circuit signal-processing delay would affect noise reduction. To solve this problem, two-stage amplification/filtering circuits with different gains were applied for noise sounds. Figures 3 and 4 illustrate the amplification circuits for heart/lung sounds and noise sounds, respectively. Figure 5 depicts the filter circuits for heart and lung sounds. The same filters were applied to noise sounds. Figure 6 shows the whole circuit diagram, where the amplifiers and filters are marked for better understanding.
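Before committing to hardware, the filtering bands named above can also be prototyped in software. The sketch below is only an illustrative digital analogue of those bands (a low-pass at 400 Hz for heart sounds and a 100–2000 Hz band-pass for lung sounds), not the authors' analog circuit; the 8 kHz sampling rate and the SciPy Butterworth design are assumptions made for the example.

```python
# Illustrative digital analogue of the heart/lung filtering bands described above.
# Assumes a sampling rate of 8 kHz; the cut-off frequencies follow the text
# (heart sounds: low-pass at 400 Hz; lung sounds: band-pass 100-2000 Hz).
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 8000  # assumed sampling rate in Hz

def heart_filter(x, fs=FS, order=4):
    """Low-pass filter that attenuates content above 400 Hz."""
    sos = butter(order, 400, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def lung_filter(x, fs=FS, order=4):
    """Band-pass filter that keeps the 100-2000 Hz lung-sound band."""
    sos = butter(order, [100, 2000], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

if __name__ == "__main__":
    t = np.arange(0, 2.0, 1 / FS)
    demo = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
    print(heart_filter(demo).shape, lung_filter(demo).shape)
```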
Figure 2. Condenser microphone placed in stethoscope. (a) A mic put into a stethoscope head, (b) a stethoscope head with mic 1 inside, the cork, and the diaphragm, (c) the edge of a head (face up) surrounded by the cork, (d) the back of the head attached to a microphone (mic 2), and (e) the front of the head encased with the diaphragm; the edge of the head is surrounded by the cork.
Figure 3. (a) First-stage amplification circuit and (b) second-stage amplification circuit for heart/lung sounds.

Figure 4. (a) First-stage amplification circuit and (b) second-stage amplification circuit for noise sounds.

Figure 5. (a) Filter circuit for heart sounds and (b) filter circuit for lung sounds.
Figure 6. Circuit diagram of the presented system.

A graphical user interface (GUI) for recording sounds was also developed during the course of this study. For noise reduction, we designed 5 different models of electronic stethoscope heads as shown in Table 1 [24], where a commercial electronic stethoscope was included. Two microphones were used in Models 1~2, one inside the stethoscope head for collecting heart/lung sounds and the other on the back of the stethoscope head for collecting environmental noise. The difference between these two models is that one is without a cork (Model 1) and the other one is surrounded by a cork (Model 2, shown in Figure 2c,e). Two stethoscope heads connected back-to-back are used in Models 3 and 4, and each stethoscope has its own microphone inside the head. The difference between these two models is that one is without a cork (Model 3) and the other one is surrounded by a cork (Model 4) as in Model 2. The difference between Models 1–2 and Models 3–4 is the way the microphone is used to collect environmental noise. Mic 2 in Models 1–2 is exposed to the air, whereas the one in Models 3–4 is inside the stethoscope head. Therefore, by comparing Models 1 and 3 (or Models 2 and 4), we can see the effects of noise reduction by the subtracter when a different way of collecting environmental noise is used. When compared with Model 1, Model 2 is designed to see the effects of noise isolation by the cork. When compared with Model 3, the design of Model 4 is also to see the effects of noise isolation by the cork. One stethoscope head (one microphone inside the head) fully covered
by a cork is used in Model 5, which is different from Model 2 by missing one microphone on the back of the head and adding an additional piece of cork (as shown in Figure 2b) to cover the back of the head; therefore, the design of Model 5 is to see the effects of noise isolation by the cork without using the subtracter.
Table 1. Six different models of electronic stethoscope heads.

Model 1: one stethoscope, one microphone (without cork).
Model 2: one stethoscope, one microphone (with cork).
Model 3: two stethoscopes (without cork).
Model 4: two stethoscopes (with cork).
Model 5: one stethoscope fully covered by a cork (no noise microphone).
Model 6: commercial electronic stethoscope.
Figure 8. Schematic diagram of experimental site II.
2.2. Heart/Lung Sound Classification

In this subsection, we focus on the design of a heart/lung sound classification algorithm. In Taiwan, collecting patient data for analysis must first be approved by the Institutional Review Board (IRB). Moreover, even if approval is granted, collecting enough data is time- and labor-consuming. For the convenience of research without going through an Institutional Review Board (IRB) review, the study of the presented algorithm is focused on the heart and lung sound samples from the public domain [28,29]. These cardiopulmonary samples were first pre-processed by segmentation, dimensionality reduction, and signal processing; then the time- and frequency-domain features of the samples were extracted and the classifier model was designed by using ensemble learning; and finally, these feature vectors were fed into the classifier model for training and testing.

2.2.1. Pre-Processing

In this study, the heartbeat frequency (50–80 beats per minute) and respiration frequency (12–20 breaths per minute) were assumed for segmentation. To keep the complete information of one heartbeat or one respiration in a small frame, different lengths of small frames were tested. The heart and lung sound signals were segmented into small frames. Different lengths of small frames were tested to see which length would give the better performance: 1, 1.5, and 2 s for heart sounds, and 3, 4, and 5 s for lung sounds. In addition, overlapping ratios, such as 0%, 20%, and 50%, were adopted to segment the original samples in order to increase the number of samples and to see if the overlapping segmentation approach would give better recognition results. After segmentation, the small frame signals were subjected to principal component analysis (PCA) [30] and then the original signals were subtracted based on PCA to obtain the difference signals (frames after PCA). In our experience, abnormal heart sounds can be divided into three categories: heartbeat components, abnormal beating components, and noise. The heartbeat components refer to major cardiac cycles. For a normal or an abnormal heart sound, there exist these cardiac cycles: diastole, systole, diastole, systole. For an abnormal heart sound in terms of PCG, there exist irregular vibrating waves during these cycles. These vibrations here refer to abnormal beating components. Basically, the heartbeat sound will account for more than 80% of the total input signal and these signals will often dominate the classification results. The noise or abnormal heart/lung sounds account for a small amount of the original signal. However, what we need and are concerned about are the abnormal beating sounds. Therefore, we use principal component analysis to remove the principal components accounting for the first 85–95% of the contribution, reducing the classification impact dominated by the heartbeat sound and leaving the signal data with the abnormal sound as the main component for feature extraction. Figure 9 shows the sound frames before and after PCA for normal and abnormal heart sounds. Figure 10 shows the sound frames before and after PCA for normal and abnormal lung sounds. The horizontal axis represents the sampling points and the vertical axis represents the sound amplitudes in a wav format. In addition, the number of components to be removed can be decided according to the degree of contribution, and usually the principal components with a cumulative contribution of 85–95% are taken. In this study, a 95% cumulative contribution was used.
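The segmentation and PCA difference-signal step described above can be sketched as follows. This is a minimal illustration under stated assumptions: scikit-learn's PCA stands in for the paper's PCA step, each frame is reshaped into short sub-segments so that PCA can be applied, and the 2 s frame length, 20% overlap, sub-segment length, and 95% cumulative-contribution threshold are example parameters rather than the authors' exact implementation.

```python
# Minimal sketch of the pre-processing step: segment a recording into overlapping
# frames, remove the dominant (heartbeat-like) structure with PCA, and keep the
# difference signal. Sub-segment length and thresholds are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

def segment(signal, fs, frame_sec=2.0, overlap=0.2):
    """Split a 1-D signal into overlapping frames."""
    frame_len = int(frame_sec * fs)
    hop = int(frame_len * (1 - overlap))
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def pca_difference(frame, sub_len=400, cumulative=0.95):
    """Subtract the PCA reconstruction (95% cumulative contribution) from a frame."""
    n_sub = len(frame) // sub_len
    mat = frame[:n_sub * sub_len].reshape(n_sub, sub_len)
    pca = PCA(n_components=cumulative)        # keep components up to 95% variance
    recon = pca.inverse_transform(pca.fit_transform(mat))
    return (mat - recon).ravel()              # difference signal (frame after PCA)

if __name__ == "__main__":
    fs = 2000
    rng = np.random.default_rng(0)
    heart = np.tile(np.hanning(200), 20)      # crude periodic "heartbeat" stand-in
    noisy = heart + 0.05 * rng.standard_normal(heart.size)
    frames = segment(noisy, fs)
    print(len(frames), pca_difference(frames[0]).shape)
```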
absolute deviation, median, first quartile, third quartile, interquartile range, skewness,
kurtosis, Shannon entropy, and spectral entropy.
Secondly, the frequency-domain information of sound signal processing is more likely
to show differences than the time-domain features because the cardiopulmonary signals
have periodic characteristics and different sounds have different frequencies. The Mel-
frequency cepstral coefficients (MFCC) [31] are suitable frequency-domain features that
are closer to the characteristics of the human ear in analyzing sound than the general
spectrum or the inverse spectrum coefficients. The reason is that the human auditory
system responds approximately linearly to frequencies below 1 kHz but logarithmically at
higher frequencies. By using this relationship, the MFCC are spectral features
and are obtained as follows. The sound signal is first pre-reinforced, such as passing the
signal through a high-pass filter to enhance the information of a high frequency. This is
because the energy of a high frequency is usually smaller than that of a low frequency.
Then, Fourier transform is performed to obtain the power spectrum. The power spectrum
obtained from each audio frame is then passed through a Mel filter to obtain the Mel scale.
Forty Mel filters are usually used. Then, the logarithmic energy is computed for each Mel
scale, and a discrete cosine transform into the cepstral domain is performed. Since the
coefficients after filtering are highly correlated, the discrete cosine transform removes this
correlation and reduces the dimensionality, and the Mel-frequency cepstral coefficients are
the amplitudes of the resulting Mel-frequency cepstrum. In this study, 12 coefficients were
used and the energy of the audio frame was superimposed to form the 13th coefficient.
In addition, the maximum value in the power spectrum, the frequency of the maximum
value in the power spectrum, and the percentage of the maximum energy in the power
spectrum to the total energy were calculated. A total of 16 frequency-domain features
together with 11 statistical features were used as the input feature vectors for the classifier.
So, the dimension is reduced from 240,000 to 27 for a 2 s sound frame.
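A 27-dimensional feature vector of the kind described above (13 MFCC-related values, 3 power-spectrum summaries, and 11 statistics) can be assembled roughly as follows. The sketch uses librosa for the Mel-frequency cepstral coefficients and SciPy for the statistics; the exact statistic list and parameter choices are assumptions for illustration, not the authors' code.

```python
# Rough sketch of the 27-dimensional feature vector discussed above:
# 13 MFCC values (12 coefficients + frame energy), 3 power-spectrum summaries,
# and 11 simple statistics of the difference signal.
import numpy as np
import librosa
from scipy.stats import skew, kurtosis, entropy

def extract_features(frame, fs):
    # 12 MFCCs averaged over the frame, plus the frame energy as a 13th value
    mfcc = librosa.feature.mfcc(y=frame.astype(float), sr=fs, n_mfcc=12, n_mels=40)
    mfcc_feats = np.r_[mfcc.mean(axis=1), np.sum(frame ** 2)]

    # Power-spectrum summaries: peak value, its frequency, and its energy share
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1 / fs)
    k = int(np.argmax(spectrum))
    spec_feats = [spectrum[k], freqs[k], spectrum[k] / np.sum(spectrum)]

    # 11 statistics (mean, std, mean absolute deviation, median, quartiles, IQR,
    # skewness, kurtosis, Shannon entropy of the normalized magnitude, spectral entropy)
    q1, med, q3 = np.percentile(frame, [25, 50, 75])
    p = np.abs(frame) / (np.sum(np.abs(frame)) + 1e-12)
    ps = spectrum / (np.sum(spectrum) + 1e-12)
    stats = [frame.mean(), frame.std(), np.mean(np.abs(frame - frame.mean())),
             med, q1, q3, q3 - q1, skew(frame), kurtosis(frame),
             entropy(p), entropy(ps)]

    return np.r_[mfcc_feats, spec_feats, stats]   # 13 + 3 + 11 = 27 values
```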
2.2.3. Classifier
The classification algorithm is used to classify normal (healthy) and abnormal (patho-
logical) heart sounds and normal and abnormal lung sounds. In this study, we adopted the
concept of ensemble learning [32] to design the classifier model and experimented with
several classical ensemble learning methods, including Bagging (bootstrap aggregating),
AdaBoost (adaptive boosting), GentleBoost (gentle adaptive boosting), LogitBoost (adap-
tive logistic), and RUSBoost (random under-sampling boosting), to see which learning
method is most suitable for the database selected for this study. Each sample segmentation
approach (e.g., 0% overlap in 1 s, 20% overlap in 1 s, etc.) is trained 5 times, and an ensemble
learning method is randomly selected each time. The final classifier model is determined
based on the best observation obtained by Bayesian optimization. The original sample was
divided into many small frames for training and the model identified the results of the
classification of the small frames for voting. If the number of abnormal frames exceeds a
specific threshold, the original sample is regarded as an abnormal cardiopulmonary sample;
otherwise, it is regarded as a normal one. The experimental results are displayed according to
different proportions (e.g., 5%, 10%, 15%, . . . , 95%) of the voting results and it is observed
which proportion has the best voting results. In addition, if the length of an original sample
could not be divided into more than one small frame, the sample is ignored and not counted.
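A condensed sketch of this classifier stage is given below: frame-level features are classified with a scikit-learn ensemble (AdaBoost is used here as one of the boosting variants named above), the frame decisions are aggregated by a voting threshold, and the evaluation metrics defined in the following paragraphs are computed. The data layout, the fixed 45% threshold, and the choice of AdaBoost are illustrative assumptions; the paper selects the ensemble method by Bayesian optimization and sweeps the voting proportion.

```python
# Sketch of frame-level ensemble classification with sample-level voting.
# X_frames: (n_frames, 27) features, y_frames: 0/1 frame labels,
# groups: sample index of each frame. All names here are illustrative.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def train_frame_classifier(X_frames, y_frames):
    clf = AdaBoostClassifier(n_estimators=200, random_state=0)
    return clf.fit(X_frames, y_frames)

def vote_per_sample(clf, X_frames, groups, threshold=0.45):
    """Label a sample abnormal when the abnormal-frame ratio exceeds the threshold."""
    frame_pred = clf.predict(X_frames)
    sample_pred = {}
    for g in np.unique(groups):
        ratio = frame_pred[groups == g].mean()
        sample_pred[g] = 1 if ratio > threshold else 0
    return sample_pred

def metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 score for sample-level results."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn) if tp + fn else 0.0
    sp = tn / (tn + fp) if tn + fp else 0.0
    pr = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * se * pr / (se + pr) if se + pr else 0.0
    return acc, se, sp, f1
```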
To evaluate the performance of the algorithm proposed in this paper, four common
evaluation metrics are used (Equations (1)–(5)), including accuracy (Acc), sensitivity (Se),
specificity (Sp), and F1 score, which are commonly used to evaluate the performance
of “abnormality recognition algorithms” (e.g., disease diagnosis). Accuracy is the most
intuitive metric and reflects the overall correctness of the algorithm. Specificity focuses on the
misdiagnosis, and higher specificity means lower misdiagnosis. Sensitivity evaluates the
ability to detect patients, and higher sensitivity means better ability to identify patients.
The F1 score is the harmonic mean of sensitivity and precision and is used to summarize the
overall performance of the algorithm.
$$\text{accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{1}$$

$$\text{specificity} = \frac{TN}{FP + TN} \tag{2}$$

$$\text{sensitivity} = \frac{TP}{TP + FN} \tag{3}$$

$$\text{precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\text{F1 Score} = \frac{2 \times \text{sensitivity} \times \text{precision}}{\text{sensitivity} + \text{precision}} \tag{5}$$
where TP stands for true positive (with disease and classified as abnormal), TN stands for true negative (without disease and classified as normal), FP stands for false positive (without disease but classified as abnormal), and FN stands for false negative (with disease but classified as normal).

3. Results
3.1. Design of Electronic Stethoscope

Figure 11a shows the PCB circuit of the presented system in accordance with the size of Raspberry Pi 3 Model B (85 mm × 55 mm). Figure 11b [24] shows the whole designed system. The gain of the first amplifier for the heart/lung sounds is designed at 1.0 so that the gain of the first amplifier for mic 2 (noise) does not need to be adjusted to a high gain. The gain of the second amplifier for the heart/lung sounds was adjusted to roughly 3.0. For the experimental site II (Figure 7), two different noise sources were played. Figure 12 shows the 10 s sound waveforms and FFTs measured by mic 1, mic 2, and the subtracter (noise reduction) for the ward construction noise that was played from the computer with a 40% volume and a speaker with the largest volume. Based on the FFT spectra, the noise amplitude peak before noise reduction is about 0.0013, whereas it is 0.0003 after noise reduction. Apparently, the volume after noise reduction is 4 times lower than before noise reduction; the noise reduction is about 12.74 dB based on the formula for the signal-to-noise ratio (SNR). The 10 s sound waves and FFTs measured by mic 1, mic 2, and noise reduction for the airport noise are shown in Figure 13. For the airport noise played from the computer with the maximal volume and a speaker with the largest volume, the noise amplitude peak before noise reduction is about 0.0013 and about 0.00025 after noise reduction. In this case, the volume after noise reduction is 5.2 times lower than before noise reduction; the noise reduction is about 14.32 dB.

Figure 11. Physical PCB circuit. (a) PCB of designed circuit. (b) Whole designed system.
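The noise-reduction figures quoted above (about 12.74 dB and 14.32 dB) follow from comparing FFT amplitude peaks before and after subtraction. A minimal sketch of that comparison is shown below; the peak-amplitude ratio expressed as 20·log10, the helper names, and the sampling-rate handling are assumptions for illustration.

```python
# Sketch of the FFT comparison used to quantify noise reduction:
# compare the dominant noise amplitude of mic 1 with that of the subtracted signal.
import numpy as np

def fft_peak(x, fs):
    """Return the peak FFT amplitude and its frequency for a 1-D signal."""
    spec = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    k = int(np.argmax(spec[1:])) + 1          # skip the DC bin
    return spec[k], freqs[k]

def noise_reduction_db(mic1, noise_reduced, fs):
    """Noise reduction in dB from the peak-amplitude ratio (20*log10)."""
    peak_before, _ = fft_peak(mic1, fs)
    peak_after, _ = fft_peak(noise_reduced, fs)
    return 20 * np.log10(peak_before / peak_after)

if __name__ == "__main__":
    # With peaks of roughly 0.0013 before and 0.0003 after subtraction,
    # 20*log10(0.0013/0.0003) is about 12.7 dB, matching the value quoted above.
    print(20 * np.log10(0.0013 / 0.0003))
```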
Figure 12. Noise reduction effect for ward construction noise. (a) Signal waveform and FFT spectrum measured by mic 1. (b) Signal waveform and FFT spectrum measured by mic 2. (c) Signal waveform and FFT spectrum after noise reduction.
Figure 13. Noise reduction effect for airport noise. (a) Signal waveform and FFT spectrum measured by mic 1. (b) Signal waveform and FFT spectrum measured by mic 2. (c) Signal waveform and FFT spectrum after noise reduction.
For the direct measurements of the human body in experimental site II (Figure 8), ward construction noise was used as the noise source and the test volume was adjusted to the maximal. The different models of stethoscopes listed in Table 1 were tested. For conciseness, 10 s waveforms and FFTs of measurements 1–6 only for Model 2 are shown in Figure 14. For Model 2, Figure 14a shows that the frequency band of the signals received from the contact of the stethoscope head with the skin of the lower leg is in 0~100 Hz. In comparison with Figure 14b, the frequency band of the noise received at the stethoscope end on the skin of the lower leg is in 150~450 Hz and the audio frequency peak appears at 300 Hz. Figure 14c reveals that the frequency band of the noise received by mic 2 at the ambient end distributes in 150~550 Hz and the audio frequency peak appears at 375 Hz. Figure 14d shows that the frequency band of the heart sound signals in a quiet environment is in 0~150 Hz and is mixed with the noise of the skin. Figure 14e reveals that the frequency
Figure 14. Noise reduction effect for ward construction noise. (a) Measurement 1: stethoscope on left leg skin in quiet space. (b) Measurement 2: stethoscope on the left leg skin in noise space. (c) Measurement 3 on lower leg skin: mic for environmental noise. (d) Measurement 4: stethoscope on left chest in quiet environment. (e) Measurement 5: stethoscope on left chest in noisy environment. (f) Measurement 6 on left chest after noise reduction.
Table 2. Frequency bands (in Hz) of different measurements for different models.
Measurement 1: 0~100 (Models 1~5).
Measurement 2: 150~450, peak at 300 (Models 1~5).
Measurement 3: Model 1: 150~550, peak at 375; Model 2: 150~550, peak at 375; Model 3: 150~450, peak at 350; Model 4: 250~450, peak at 350; Model 5: n/a.
Measurement 4: 0~150 (Models 1~5).
Measurement 5: Model 1: 200~450, peak at 325; Model 2: 200~400, peak at 325; Model 3: 240~420, peak at 310; Model 4: 250~400, peak at 310; Model 5: 250~400, peak at 310.
Measurement 6: 0~150 and 250~450 (Models 1~4); Model 5: n/a.
Table 3 shows the FFT spectra of the heart sound signals with and without the sub-
tracter, measured at the stethoscope end using four different models. Model 5 has no sub-
Table 3 shows the FFT spectra of the heart sound signals with and without the sub-
tracter and Model 6 does not provide the option of using the subtracter. Therefore, Models
tracter,
5 and 6 measured at in
are not listed the stethoscope
Table end of
3. The results using
noisefour different
reduction models.
using Modelcircuit
the subtracter 5 has no
subtracter and Model 6 does not provide the option of using the subtracter.
for these four models are not satisfactory. However, when comparing the FFT spectra of Therefore,
Models
the heart5 and 6 areobtained
sounds not listed in Table
using 3. The results
the subtracter, Modelof noise reduction
4 shows using
the lowest the ampli-
noise subtracter
circuit
tude. When the subtracter is not used, the FFT spectra of the heart sound signals of thethe
for these four models are not satisfactory. However, when comparing sixFFT
spectra of the heart sounds obtained using the subtracter, Model 4 shows the lowest noise
amplitude. When the subtracter is not used, the FFT spectra of the heart sound signals of
the six Models (including the electronic stethoscope from Thinklabs One) are shown in
Sensors 2022,
Sensors2022,
Sensors 22,
22,xxxFOR
2022,22, FOR PEER
FORPEER REVIEW
PEERREVIEW
REVIEW 17 of
17 of
17 25
of 25
25
Sensors 2022, 22, 4263 17 of 25
Sensors2022,
Sensors
Sensors 2022,22,
2022, 22,xxxFOR
22, FORPEER
FOR PEERREVIEW
PEER REVIEW
REVIEW 17of
17
17 of25
of 25
25
Sensors 2022, 22, x FOR PEER REVIEW 17 of 25
Model 4 shows the lowest noise amplitude when measuring heart sounds in a noisy environment. Figure 15 also shows that the cork can reduce the noise by 1.9 dB between Model 1 and Model 2 and by 2.8 dB between Model 3 and Model 4. Model 4 can reduce the noise by 4.7 dB when compared to the commercial one, Model 6.
Table 3. Comparison of FFT spectra of heart sound signals with and w/o the subtracter. (The table shows, for each of Models 1–4, the FFT spectrum measured w/o the subtracter and with the subtracter; the spectrum plots themselves are not reproducible in text.)
Figure 15. Comparison of FFT spectra of heart sound signals without using the subtracter.
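As an illustration of the spectral comparison summarized in Table 3 and Figure 15, the short Python sketch below emulates the subtracter digitally: it computes one-sided FFT magnitude spectra of the heart-sound channel before and after subtracting the noise channel and reports the in-band amplitude difference in dB. This is not the authors' implementation; the sampling rate, the 100–1000 Hz band, and the synthetic test signals are illustrative assumptions.

```python
import numpy as np

def magnitude_spectrum(x, fs):
    """One-sided FFT magnitude spectrum and its frequency axis."""
    window = np.hanning(len(x))
    return np.fft.rfftfreq(len(x), d=1.0 / fs), np.abs(np.fft.rfft(x * window))

def noise_reduction_db(heart_mic, noise_mic, fs, band=(100, 1000)):
    """Mean in-band amplitude ratio (dB) without vs. with a digital 'subtracter'."""
    f, mag_raw = magnitude_spectrum(heart_mic, fs)              # no subtraction
    _, mag_sub = magnitude_spectrum(heart_mic - noise_mic, fs)  # noise subtracted
    sel = (f >= band[0]) & (f <= band[1])
    return 20.0 * np.log10(mag_raw[sel].mean() / mag_sub[sel].mean())

# Synthetic demo: a 50 Hz "heart" tone plus a 300 Hz "noise" tone and weak broadband noise.
fs = 8000
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
noise = 0.5 * np.sin(2 * np.pi * 300 * t)
heart = np.sin(2 * np.pi * 50 * t) + noise + 0.01 * rng.standard_normal(t.size)
print(f"Noise reduction: {noise_reduction_db(heart, noise, fs):.1f} dB")
```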
3.2. Heart/Lung Sound Classification
Two sample databases were adopted from the Internet [28,29]. The database [28] contains 3240 heart sounds in wav format, including 2548 normal heart sounds and 692 abnormal heart sounds, with lengths of between 5 and 120 s and a sampling rate of 2000 Hz. The Lung Sound Dataset [29] is a collection of 920 lung sound recordings in wav format with lengths ranging from 10 to 90 s, created by two research teams in Portugal and Greece, containing 35 normal lung sounds, 1 asthma, 16 bronchiectasis, 13 occlusive bronchiolitis, 793 chronic obstructive pulmonary disease (COPD), 37 pneumonia, 23 upper respiratory tract infection (URTI), and 2 lower respiratory tract infection (LRTI) sounds. The samples containing no noise from [28] were split into five folds and the samples for training and testing were randomly selected. The sample sizes for training and testing are shown in Table 4. In addition, samples containing noise in the database were categorized as test samples to evaluate the tolerance of the proposed algorithm to noise. The score of each index was evaluated by averaging the 5-fold classification results. The classification results of the heart sound dataset with different segmentation lengths (1 s, 1.5 s, and 2 s), different overlap ratios (0%, 20%, and 50%), and different voting ratios (5–95%) are presented in Figures 16–19 in terms of four metrics (accuracy, sensitivity, specificity, and F1 score).
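As a minimal sketch of the segmentation and cross-validation procedure described above, the code below splits one recording into overlapping frames (e.g., 2 s frames with 20% overlap) and builds a 5-fold split over whole recordings. The segment length, overlap, sampling rate, and fold count follow the settings reported here, while the random shuffling, helper names, and array shapes are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

def segment(signal, fs, seg_len_s=2.0, overlap=0.2):
    """Split one recording into overlapping frames, e.g. 2 s frames with 20% overlap."""
    frame = int(seg_len_s * fs)
    hop = int(frame * (1.0 - overlap))
    return np.array([signal[i:i + frame]
                     for i in range(0, len(signal) - frame + 1, hop)])

def five_fold_indices(n_recordings, seed=0):
    """5-fold split over whole recordings (frames from one recording stay together)."""
    kf = KFold(n_splits=5, shuffle=True, random_state=seed)
    return list(kf.split(np.arange(n_recordings)))

# Demo: a fake 10 s heart-sound recording sampled at 2000 Hz.
fs = 2000
recording = np.random.randn(10 * fs)
print(segment(recording, fs).shape)      # (6, 4000): six overlapping 2 s frames
print(len(five_fold_indices(100)))       # 5 (train/test index pairs)
```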
Figure 16. Heart sound classification results: Accuracy. (a) Testing samples. (b) Noisy samples.
Figure 17. Heart sound classification results: Sensitivity. (a) Testing samples. (b) Noisy samples.
Figure 18. Heart sound classification results: Specificity. (a) Testing samples. (b) Noisy samples.
Figure 19. Heart sound classification results: F1 Score. (a) Testing samples. (b) Noisy samples.
Figure 21. Lung sound classification results: Sensitivity.
Figure 22. Lung sound classification results: Specificity.
Figure 23. Lung sound classification results: F1 Score.
4. Discussion
4.1. Design of Electronic Stethoscope
The presented stethoscope is cost-effective and easily implemented. The condenser microphone used in the presented stethoscope can be found on the market at a reasonable price, and the filter and amplification circuits are simple. From the test results (Figures 12 and 13) for experimental site I, the microphone at the noise-receiving end over-received the sound at low frequencies; therefore, a superposition of noise signals in the low-frequency section was observed. Even so, the noise reduction effect is still satisfactory, with the noise being 4–6 times lower than in the case not using the subtracter. However, this phenomenon affected the noise reduction when the stethoscope was applied to human skin. In the human body test for experimental site II, the noise reduction effect was far different from the test results conducted in experimental site I. This is because the noise peak frequency received at the stethoscope end on the skin of the lower leg (Measurement 2) is at 300 Hz, whereas the noise peak frequency received by mic 2 in the ambient air above the skin of the lower leg (Measurement 3) is at 350 Hz for Models 3 and 4 or 375 Hz for Models 1 and 2, as listed
in Table 2. There is a frequency shift due to different sound media. The media for the
microphone inside the stethoscope are the skin, diaphragm, and the air. The medium for
the microphone of Models 1 and 2 to collect environmental noise is the air. The media for
the microphone of Models 3 and 4 to collect environmental noise are the diaphragm and the
air. When heart sounds were mixed with the noise, the noise peak frequency (Measurement
5) became 325 Hz for Models 1 and 2 or 310 Hz for Models 3 and 4 (see Table 2). This
peak frequency shift resulted in some noise signals being superposed, as shown in Table 3,
when the subtraction of two signals was performed. Nevertheless, when Models 1–4 all
used the subtracter, the design of Model 4 outperformed the others (Models 1–3) with the
lowest noise amplitudes. Even when the subtracter was not used for Models 1–4, Model 4
still performed better than the others. The cork surrounding the double heads in Model 4 provides a certain degree of noise isolation.
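The peak frequencies quoted above (300 Hz on the skin versus 350–375 Hz in the ambient air) can be located by taking the FFT bin with the largest magnitude. The sketch below is a generic illustration, not the measurement procedure used in the paper; the single-channel input, sampling rate, and 20 Hz lower cutoff are assumptions.

```python
import numpy as np

def peak_frequency(x, fs, fmin=20.0):
    """Frequency (Hz) of the largest spectral peak above fmin."""
    mag = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    mag[f < fmin] = 0.0            # ignore DC and very low-frequency content
    return f[np.argmax(mag)]

# Demo: a synthetic 300 Hz noise tone is recovered as the spectral peak.
fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
print(peak_frequency(np.sin(2 * np.pi * 300 * t), fs))   # -> 300.0
```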
From the results shown in Figures 16b, 17b, 18b and 19b for noisy samples of heart
sounds, at 25% of the voting ratio, segment lengths of 1.5 and 2 s and overlap ratios of 20%
and 50% provide better performance in terms of accuracy; segment lengths of 1.5 and 2 s
and overlap ratios of 20% offer better performance in terms of sensitivity; segment lengths
of 1.5 and 2 s and overlap ratios of 50% give better performance in terms of specificity; and
segment lengths of 1.5 and 2 s and overlap ratios of 20% and 50% offer better performance
in terms of F1 score. Therefore, based on these four metrics, the model using a segmentation
length of 2 s and an overlap ratio of 20% obtains the highest overall score: 69% for accuracy,
68.3% for sensitivity, 69.7% for specificity, and 68.8% for F1 score.
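For reference, the four metrics reported throughout this section can be computed from sample-level labels and predictions as follows. This is a generic sketch rather than the authors' code, and the convention that 1 denotes an abnormal (positive) sample is an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 score for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * sensitivity / (precision + sensitivity)
          if (precision + sensitivity) else 0.0)
    return accuracy, sensitivity, specificity, f1

print(binary_metrics([1, 0, 1, 0, 1], [1, 0, 0, 0, 1]))   # (0.8, 0.666..., 1.0, 0.8)
```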
In summary, the signal segmentation length of 2 s and the overlap of 20% or 50% were
found to be the most effective for the heart sounds. This indicates that a 2 s segmentation
length can contain the most complete information for a single heartbeat. In addition, some
degree of overlap is also effective in improving the overall classification accuracy.
As shown in Figure 20 for the lung sound classification, accuracy shows no significant change across the voting ratios, and the trend is similar. From Figures 21 and 22, the
sensitivity falls in the range of 53.3–70% and the specificity falls in the range of 50–93.3%.
The voting ratio does not have much influence on these scores; only a slight upward or downward trend occurs at around 30–35% and 70–75% of the voting ratios. The best overall
F1 score, as seen in Figure 23, falls at 56.7–71.5% at 5–30% of the voting ratios. Based on
these four metrics, the best overall score occurs at 5–30% of the voting ratios. However,
from the experimental results, the voting ratio did not have much effect on the scores
and there was no significant change from 5% to 65% of the voting ratios. This might be
caused by the high proportion of abnormal lung sound samples or uneven distribution.
The segmentation length of 5 s with 50% overlap gave the best classification result: 73.3%
for accuracy, 66.7% for sensitivity, 80% for specificity, and 71.5% for F1 score. This indicates
that a segmentation length of 5 s can contain the most complete information about one
respiration. In addition, allowing a certain degree of overlap in the segmentation, as in the case of heart sounds, can effectively improve the overall classification accuracy.
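One plausible reading of the voting step discussed above is a simple threshold on the fraction of frames classified as abnormal: a recording is declared abnormal once that fraction reaches the chosen voting ratio. The sketch below encodes this reading and assumes frame-level predictions (1 = abnormal) are already available; it is an illustration, not necessarily the exact decision rule used by the authors.

```python
def vote(frame_predictions, voting_ratio=0.25):
    """Label a recording abnormal (1) if at least `voting_ratio` of its frames
    are predicted abnormal; otherwise label it normal (0)."""
    abnormal_fraction = sum(frame_predictions) / len(frame_predictions)
    return 1 if abnormal_fraction >= voting_ratio else 0

# Demo: 3 of 10 frames flagged abnormal -> abnormal at a 25% voting ratio.
print(vote([0, 0, 1, 0, 1, 0, 0, 1, 0, 0], voting_ratio=0.25))   # 1
print(vote([0, 0, 1, 0, 0, 0, 0, 0, 0, 0], voting_ratio=0.25))   # 0
```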
In summary, the above results show that the voting ratio usually correlates with
the metrics. There are three factors that affect the trends in the metrics. The first is the
percentage of abnormal heart sounds or lung sounds in the samples. In medical terms, even if a heart sound or lung sound is diagnosed as abnormal, not every heartbeat in that heart sound, or every respiration in that lung sound, is necessarily abnormal. The second is the total number of small frames segmented from a given sample. The higher the number of frames, the smoother the score curve, the easier it is to see the trend, and the more accurate the results. The last is the accuracy of the classifier on the individual frames. The higher the
accuracy of the frames, the higher the accuracy of the subsequent voting.
5. Conclusions
In this paper, the design of an electronic stethoscope and an AI classification algorithm
for cardiopulmonary sounds were addressed. Five models of electronic stethoscopes
have been proposed and tested. In Models 1 and 2, a microphone is installed inside the
stethoscope head to collect heart and lung sounds and a second microphone is attached
to the back of the stethoscope head for collecting environmental noise. In Models 3 and 4, double stethoscope heads, each with a microphone installed, are connected back to back: one head collects heart and lung sounds and the other collects environmental noise. Cork is used in Models 2 and 4 to isolate environmental
sounds. In Model 5, only one stethoscope head covered by cork is used. The collected
sounds are processed through two-stage amplification and filter circuits. Each processed
heart/lung sound then optionally subtracts the processed noise sound for noise reduction
and a Raspberry Pi is used to record the final sound. The effect of noise reduction for the
presented electronic stethoscopes was tested in two different experimental sites. When
subtraction is not used, Model 4 presents better performance with lower noise amplitudes. When subtraction is used for Models 1–4, Model 4 still outperforms the other three models with the lowest subtracted noise signals. The cork surrounding the double heads in Model 4 provides a certain degree of noise isolation. However, frequency shifts due to different sound media
associated with the stethoscope head and the microphone for the noise were observed
during the course of the study. Therefore, this issue may need further investigation.
For the cardiopulmonary sound classifications, a voting ensemble learning approach
combined with PCA and MFCC was developed. Two public databases were used for
training and testing. Different segmentation lengths (1 s, 1.5 s, and 2 s) and different
overlap ratios (0%, 20%, and 50%) were applied to segment one sample into several small
frames. Four common metrics (accuracy, sensitivity, specificity, and F1 score) were used to
evaluate the performance of the algorithm. After testing, the best voting for heart sounds
falls at 5–45% and the best voting for lung sounds falls at 5–65%. Based on the results for
the heart sound testing samples containing no noise, the best overall score is obtained using
2 s frame segmentation with a 20% overlap: 86.9% for accuracy, 81.9% for sensitivity, 91.8%
for specificity, and 86.1% for F1 score. For the lung sound testing samples, the best overall
score is yielded using 5 s frame segmentation with a 50% overlap: 73.3% for accuracy, 66.7%
for sensitivity, 80% for specificity, and 71.5% for F1 score. The signal segmentation length is
long enough to cover one heartbeat or one respiration and a certain degree of overlap in
segmentation would effectively improve the overall classification performance.
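As a rough, self-contained illustration of the feature pipeline summarized above (a PCA-based difference signal followed by MFCC and statistical features), the sketch below uses scikit-learn and librosa. The way the frame is reshaped for PCA, the number of components, the MFCC settings, and the chosen statistics are assumptions made for illustration and may differ from the authors' exact configuration.

```python
import numpy as np
import librosa
from sklearn.decomposition import PCA

def difference_signal(frame, n_components=1, sub_len=100):
    """Residual between a frame and its reconstruction from the leading principal
    components, used here as a stand-in for the paper's PCA difference signal."""
    n = (len(frame) // sub_len) * sub_len
    segments = frame[:n].reshape(-1, sub_len)       # short sub-segments as PCA samples
    pca = PCA(n_components=n_components)
    approx = pca.inverse_transform(pca.fit_transform(segments))
    return (segments - approx).reshape(-1)

def frame_features(frame, fs=2000, n_mfcc=13):
    """Mean MFCCs of the difference signal plus a few simple statistics."""
    diff = difference_signal(frame)
    mfcc = librosa.feature.mfcc(y=diff.astype(np.float32), sr=fs, n_mfcc=n_mfcc)
    stats = np.array([diff.mean(), diff.std(), diff.min(), diff.max()])
    return np.concatenate([mfcc.mean(axis=1), stats])

# Demo on a fake 2 s frame sampled at 2000 Hz: a 13 + 4 dimensional feature vector.
print(frame_features(np.random.randn(4000)).shape)   # (17,)
```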
Author Contributions: Conceptualization, Y.-C.W., C.-C.H., C.-S.C., T.-Y.S., H.-M.C. and J.-Y.L.;
methodology, Y.-C.W., C.-C.H., C.-S.C., F.-L.C. and S.-F.C.; software, F.-L.C. and S.-F.C.; validation,
Y.-C.W., C.-C.H., C.-S.C., F.-L.C., S.-F.C., T.-Y.S., H.-M.C. and J.-Y.L.; formal analysis, Y.-C.W., C.-C.H.,
F.-L.C. and S.-F.C.; investigation, Y.-C.W., C.-C.H., C.-S.C., F.-L.C. and S.-F.C.; resources, Y.-C.W.
and C.-C.H.; data curation, Y.-C.W., C.-C.H., F.-L.C. and S.-F.C.; writing—original draft preparation,
Y.-C.W., C.-C.H., F.-L.C. and S.-F.C.; writing—review and editing, Y.-C.W.; visualization, Y.-C.W.,
C.-C.H., F.-L.C. and S.-F.C.; supervision, Y.-C.W. and C.-C.H.; project administration, Y.-C.W., C.-C.H.,
C.-S.C., T.-Y.S. and H.-M.C.; funding acquisition, Y.-C.W., C.-C.H., T.-Y.S. and H.-M.C. All authors
have read and agreed to the published version of the manuscript.
Funding: This research was partially funded by the Ministry of Science and Technology, Taiwan,
grant number MOST 107-2622-E-239-003-CC3 and by the Taichung Veterans General Hospital, Taiwan,
grant numbers TCVGH-NUU1098901 and TCVGH-NUU1108901.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.
References
1. McLane, I.; Emmanouilidou, D.; West, J.E.; Elhilali, M. Design and Comparative Performance of a Robust Lung Auscultation
System for Noisy Clinical Settings. IEEE J. Biomed. Health Inform. 2021, 25, 2583–2594. [CrossRef] [PubMed]
2. Zhang, X.; Maddipatla, D.; Narakathu, B.B.; Bazuin, B.J.; Atashbar, M.Z. Development of a Novel Wireless Multi-Channel
Stethograph System for Monitoring Cardiovascular and Cardiopulmonary Diseases. IEEE Access 2021, 9, 128951–128964.
[CrossRef]
3. Toda, M.; Thompson, M.L. Contact-type Vibration Sensors Using Curved Clamped PVDF Film. IEEE Sens. J. 2006, 6, 1170–1177.
[CrossRef]
4. Duan, S.; Wang, W.; Zhang, S.; Yang, X.; Zhang, Y.; Zhang, G. A Bionic MEMS Electronic Stethoscope with Double-Sided
Diaphragm Packaging. IEEE Access 2021, 9, 27122–27129. [CrossRef]
5. Shi, P.; Li, Y.; Zhang, W.; Zhang, G.; Cui, J.; Wang, S.; Wang, B. Design and Implementation of Bionic MEMS Electronic Heart
Sound Stethoscope. IEEE Sens. J. 2022, 22, 1163–1172. [CrossRef]
6. Andreozzi, E.; Fratini, A.; Esposito, D.; Naik, G.; Polley, C.; Gargiulo, G.D.; Bifulco, P. Forcecardiography: A Novel Technique to
Measure Heart Mechanical Vibrations onto the Chest Wall. Sensors 2020, 20, 3885. [CrossRef] [PubMed]
7. Andreozzi, E.; Gargiulo, G.D.; Esposito, D.; Bifulco, P. A Novel Broadband Forcecardiography Sensor for Simultaneous Monitoring
of Respiration, Infrasonic Cardiac Vibrations and Heart Sounds. Front. Physiol. 2021, 18, 725716. [CrossRef] [PubMed]
8. Chien, J.; Huang, M.; Lin, Y.; Chong, F. A study of heart sound and lung sound separation by independent component analysis
technique. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New
York, NY, USA, 30 August–3 September 2006; pp. 5708–5711.
9. Hadjileontiadis, L.J.; Panas, S.M. A wavelet-based reduction of heart sound noise from lung sounds. Int. J. Med. Inform. 1998, 52,
183–190. [CrossRef]
10. Liu, F.; Wang, Y.; Wang, Y. Research and Implementation of Heart Sound Denoising. Phys. Procedia 2012, 25, 777–785.
[CrossRef]
11. Mayorga, P.; Valdez, J.A.; Druzgalski, C.; Zeljkovic, V.; Magana-Almaguer, H.; Morales-Carbajal, C. Cardiopulmonary sound
sources separation. In Proceedings of the 2021 Global Medical Engineering Physics Exchanges/Pan American Health Care
Exchanges, Sevilla, Spain, 15–20 March 2021.
12. Lin, L.; Tanumihardja, W.A.; Shih, H. Lung-heart sound separation using noise assisted multivariate empirical mode decomposi-
tion. In Proceedings of the 2013 International Symposium on Intelligent Signal Processing and Communication Systems, Naha,
Japan, 12–15 November 2013; pp. 726–730.
13. Jusak, J.; Puspasari, I.; Susanto, P. Heart murmurs extraction using the complete ensemble empirical mode decomposition and the
Pearson distance metric. In Proceedings of the 2016 International Conference on Information & Communication Technology and
Systems (ICTS), Surabaya, Indonesia, 12 October 2016; pp. 140–145.
14. Papadaniil, C.D.; Hadjileontiadis, L.J. Efficient Heart Sound Segmentation and Extraction Using Ensemble Empirical Mode
Decomposition and Kurtosis Features. IEEE J. Biomed. Health Inform. 2014, 18, 1138–1152. [CrossRef] [PubMed]
15. Varghees, V.N.; Ramachandran, K.I. Effective Heart Sound Segmentation and Murmur Classification Using Empirical Wavelet
Transform and Instantaneous Phase for Electronic Stethoscope. IEEE Sens. J. 2017, 17, 3861–3872. [CrossRef]
16. Ntalampiras, S. Collaborative Framework for Automatic Classification of Respiratory Sounds. IET Signal Process. 2020, 14,
223–228. [CrossRef]
17. Potes, C.; Parvaneh, S.; Rahman, A.; Conroy, B. Ensemble of feature-based and deep learning-based classifiers for detection
of abnormal heart sounds. In Proceedings of the 2016 Computing in Cardiology Conference, Vancouver, BC, Canada, 11–14
September 2016; pp. 621–624.
18. Chowdhury, T.H.; Poudel, K.N.; Hu, Y. Time-frequency Analysis, Denoising, Compression, Segmentation, and Classification of
PCG Signals. IEEE Access 2020, 8, 160882–160890. [CrossRef]
19. Kumar, D.; Carvalho, P.; Antunes, M.; Paiva, R.P.; Henriques, J. Heart murmur classification with feature selection. In Proceedings
of the 32nd Annual International Conference of the IEEE Engineering Medicine and Biology Society, Buenos Aires, Argentina, 31
August–4 September 2010.
20. Li, J.; Ke, L.; Du, Q.; Ding, X.; Chen, X.; Wang, D. Heart Sound Signal Classification Algorithm: A Combination of Wavelet
Scattering Transform and Twin Support Vector Machine. IEEE Access 2019, 7, 179339–179348. [CrossRef]
21. Gjoreski, M.; Gradisek, A.; Budna, B.; Gams, M. Machine Learning and End-to-end Deep Learning for the Detection of Chronic
Heart Failure from Heart Sounds. IEEE Access 2020, 8, 20313–20324. [CrossRef]
22. Shuvo, S.B.; Ali, S.N.; Swapnil, S.I.; Al-Rakhami, M.S.; Gumaei, A. CardioXNet: A Novel Lightweight Deep Learning Framework
for Cardiovascular Disease Classification Using Heart Sound Recordings. IEEE Access 2021, 9, 36955–36967. [CrossRef]
23. Liu, C.; Springer, D.; Li, Q.; Moody, B.; Juan, R.A.; Chorro, F.J.; Castells, F.; Roig, J.M.; Silva, I.; Johnson, A.E.; et al. An Open
Access Database for the Evaluation of Heart Sound Algorithms. Physiol. Meas. 2016, 37, 2181–2213. [CrossRef] [PubMed]
24. Wu, Y.-C.; Chang, F.-L. Development of an electronic stethoscope using raspberry. In Proceedings of the 2021 IEEE International
Conference on Consumer Electronics-Taiwan, Penghu, Taiwan, 15–17 September 2021.
25. Heart Sounds. Available online: [Link] (accessed on 11 April 2022). (In Chinese)
26. Ward Construction Noise. Available online: [Link] (accessed on 10 April 2022).
27. Airport Noise. Available online: [Link] (accessed on 10 April 2022).
28. Classification of Heart Sound Recordings: The PhysioNet/Computing in Cardiology Challenge 2016. Available online: https://[Link]/content/challenge-2016/1.0.0/ (accessed on 10 April 2022).
29. Respiratory Sound Database. Available online: [Link]
(accessed on 10 April 2022).
30. Jolliffe, I.T.; Cadima, J. Principal Component Analysis: A Review and Recent Developments. Philos. Trans. R. Soc. A 2016, 374,
20150202. [CrossRef] [PubMed]
31. Mel Frequency Cepstral Coefficient (MFCC) Tutorial. Available online: [Link]
machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/ (accessed on 11 April 2022).
32. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P.N. Ensemble Deep Learning: A Review. arXiv 2022,
arXiv:2104.02395v2.
33. Zabihi, M.; Rad, A.B.; Kiranyaz, S.; Gabbouj, M.; Katsaggelos, A.K. Heart sound anomaly and quality detection using ensemble
of neural networks without segmentation. In Proceedings of the 2016 Computing in Cardiology Conference, Vancouver, BC,
Canada, 11–14 September 2016; pp. 613–616.
34. Kay, E.; Agarwal, A. DropConnected neural network trained with diverse features for classifying heart sounds. In Proceedings of
the 2016 Computing in Cardiology Conference, Vancouver, BC, Canada, 11–14 September 2016; pp. 617–620.