
sensors

Article
Development of an Electronic Stethoscope and a Classification
Algorithm for Cardiopulmonary Sounds †
Yu-Chi Wu 1,*, Chin-Chuan Han 2, Chao-Shu Chang 3, Fu-Lin Chang 1, Shi-Feng Chen 1, Tsu-Yi Shieh 4,5,
Hsian-Min Chen 6 and Jin-Yuan Lin 1

1 Department of Electrical Engineering, National United University, Miaoli City 36003, Taiwan;
a0976335098@[Link] (F.-L.C.); sisterw961o3y94rm4@[Link] (S.-F.C.); yuan@[Link] (J.-Y.L.)
2 Department of Computer Science and Information Engineering, National United University,
Miaoli City 36003, Taiwan; cchan@[Link]
3 Department of Information Management, National United University, Miaoli City 36003, Taiwan;
cschang@[Link]
4 Section of Clinical Training, Department of Medical Education, Taichung Veterans General Hospital,
Taichung City 40705, Taiwan; zuyihsieh@[Link]
5 Division of Allergy, Immunology and Rheumatology, Taichung Veterans General Hospital,
Taichung City 40705, Taiwan
6 Center for Quantitative Imaging in Medicine (CQUIM), Department of Medical Research, Taichung Veterans
General Hospital, Taichung City 40705, Taiwan; hsmin6511@[Link]
* Correspondence: ycwu@[Link]; Tel.: +886-939967722
† This paper is an extended version of our paper published in 2021 IEEE International Conference on Consumer
Electronics-Taiwan (ICCE-TW), Penghu, Taiwan, 15–17 September 2021.

Abstract: With conventional stethoscopes, the auscultation results may vary from one doctor to
another due to a decline in his/her hearing ability with age or his/her different professional training,
and the problematic cardiopulmonary sound cannot be recorded for analysis. In this paper, to resolve
the above-mentioned issues, an electronic stethoscope was developed consisting of a traditional
stethoscope with a condenser microphone embedded in the head to collect cardiopulmonary sounds
and an AI-based classifier for cardiopulmonary sounds was proposed. Different deployments of
the microphone in the stethoscope head with amplification and filter circuits were explored and
analyzed using fast Fourier transform (FFT) to evaluate the effects of noise reduction. After testing,
the microphone placed in the stethoscope head surrounded by cork is found to have better noise
reduction. For classifying normal (healthy) and abnormal (pathological) cardiopulmonary sounds,
each sample of cardiopulmonary sound is first segmented into several small frames and then a
principal component analysis is performed on each small frame. The difference signal is obtained by
subtracting PCA from the original signal. MFCC (Mel-frequency cepstral coefficients) and statistics
are used for feature extraction based on the difference signal, and ensemble learning is used as the
classifier. The final results are determined by voting based on the classification results of each small
frame. After the testing, two distinct classifiers, one for heart sounds and one for lung sounds, are
proposed. The best voting for heart sounds falls at 5–45% and the best voting for lung sounds falls at
5–65%. The best accuracy of 86.9%, sensitivity of 81.9%, specificity of 91.8%, and F1 score of 86.1%
are obtained for heart sounds using 2 s frame segmentation with a 20% overlap, whereas the best
accuracy of 73.3%, sensitivity of 66.7%, specificity of 80%, and F1 score of 71.5% are yielded for lung
sounds using 5 s frame segmentation with a 50% overlap.

Keywords: electronic stethoscope; cardiopulmonary sound classification; principal component
analysis; Mel-frequency cepstral coefficients; ensemble learning

Citation: Wu, Y.-C.; Han, C.-C.; Chang, C.-S.; Chang, F.-L.; Chen, S.-F.; Shieh, T.-Y.; Chen, H.-M.; Lin, J.-Y.
Development of an Electronic Stethoscope and a Classification Algorithm for Cardiopulmonary
Sounds. Sensors 2022, 22, 4263. [Link]

Academic Editor: Leopoldo Angrisani

Received: 18 April 2022
Accepted: 1 June 2022
Published: 3 June 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://[Link]/licenses/by/4.0/).

1. Introduction
Stethoscopes are an extremely important diagnostic tool in the medical community.
The stethoscope can be used to listen to not only heart sounds to determine heart-related
diseases but also lung sounds to diagnose whether there are abnormalities in the lungs.
This is known as auscultation. Traditional stethoscopes utilized horn-shaped stethoscope
heads for listening to the sounds of the movements of the visceral organs. However, with
conventional stethoscopes, the auscultation results may vary from one doctor to another
due to a decline in his/her hearing ability with age or his/her different professional training
background, and the problematic cardiopulmonary sound cannot be recorded for further
analysis. Therefore, electronic stethoscopes that can record and analyze/classify cardiopul-
monary sounds are needed to cope with these issues. If an effective classification algorithm
can be embedded into electronic stethoscopes, doctors can use electronic stethoscopes to
obtain a prompt preliminary diagnosis in an emergency, or those who need secondary pre-
vention after discharge from the hospital and care at home can have long-term monitoring
and early detection of abnormalities.
Based on the sensor used in electronic stethoscopes, air- and contact-conduction elec-
tronic stethoscopes are two common types of electronic stethoscopes. The air-conduction
electronic stethoscope utilizes an electromagnetic coil or electret capacitor as the sensor to
collect sound signals with high stability and strong reliability but with low sensitivity. The
contact conduction electronic stethoscope mostly uses piezoelectric materials as a sound
sensor with improved anti-interference ability and sensitivity. However, any deforma-
tion or damage to the piezoelectric materials in the manufacturing process degrades the
sensitivity of the stethoscope.
In the category of air-conduction electronic stethoscopes, McLane et al. [1] developed
an advanced air-conduction electronic stethoscope using a microphone array, an external-
facing microphone, and an onboard signal processor to perform adaptive noise suppression
of lung auscultation in noisy clinical settings. Zhang et al. [2] designed an electronic
auscultation system for the graphic recording of heart, lung, and trachea (HLT) sounds
by placing microphones in a CNC-machined Delrin housing case covered by diaphragms.
Sixteen acoustic sensors, of which 14 were positioned in a memory foam pad and 2 were
placed directly on the heart and trachea, were used to record the acoustic data using a
LabView program, and the waveforms in the time and frequency domain, as well as a
spectrogram for visual examination, could be plotted using Matlab.
In the category of contact-conduction electronic stethoscopes, Toda and Thompson [3]
devised a contact-vibration sensor by bonding a piezoelectric polyvinylidene fluoride
(PVDF) film to a curved rubber piece having a front-contact face. Vibrations transmitted
from the front-contact face through the rubber to the film cause pressure normal to the
surface of the film and then an electric field is induced by the piezoelectric effect. Duan
et al. [4] proposed a double-sided diaphragm micro-electro-mechanical system (MEMS)
electronic stethoscope (DMES) based on a bionic lollipop-shaped MEMS sound-sensitive
sensor. Shi et al. [5] designed a stethoscope also based on the bionic lollipop-shaped MEMS
and developed an acquisition circuit and PC upper machine for real-time acquisition of
heart sound signals. Recently, forcecardiography (FCG), a non-invasive technique that
measures vibrations via force-sensing resistors (FSR), has been proposed [6,7] to measure
heart mechanical vibrations. The active area of FSR was fixed with a rigid dome and an
accelerometer was used to acquire the dorso-ventral seismocardiography (SCG) signal.
In [6], the FCG sensor and the accelerometer were firmly mounted on a plexiglass rigid
board. In [7], a lead-zirconate-titanate piezoelectric disk equipped with the same dome-
shaped mechanical coupler used for the FSR-based sensor was proposed for simultaneous
monitoring of respiration, infrasonic cardiac vibrations, and heart sounds.
Three steps are involved in the analysis or classification of auscultatory sounds: pre-
processing, feature extraction, and classifier design. Pre-processing deals with the noise-
processing or signal-frame decomposition. For noise processing, noise reduction and
de-noising of the samples are performed to enhance the noise immunity of the classification
algorithm. For cardiopulmonary signals, the location of the recorded cardiopulmonary
sounds may vary but usually, both cardiac and pulmonary signals are included. Therefore,
Chien et al. [8] used two microphones to collect signals from the left and right chest and
presented a fast independent component analysis (ICA) algorithm to separate heart and
lung sounds. Hadjileontiadis and Panas [9] and Liu et al. [10] used wavelet transform
techniques to denoise heart sounds from lung signals [9] and, conversely, filter out lung sounds
from heart sounds [10]. Mayorga et al. [11] proposed an empirical mode decomposition
(EMD) followed by Gaussian mixture models (GMM) to improve ICA for separating heart
and lung sounds. In addition to noise reduction, EMD is also a method to estimate the
frequency components [12–14] and each component can be used as a classification feature.
In [12], EMD was used to separate heart sounds from lung signals; in [13], EMD was used to
capture heart murmurs; and in [14], EMD was used to segment and capture the underlying
heart sounds (S1, S2). Varghees and Ramachandran [15] proposed an empirical wavelet
transform (EWT)-based heart sound signal decomposition method by integrating both
EMD and WT. Ntalampiras [16] transformed the lung sound signals to the frequency and
wavelet domains before performing the subsequent analysis.
The pre-processed acoustic signals then go through feature extraction for time- and/or
frequency-domain features. The main purpose of capturing time-domain features is to
observe inter-beat or inter-respiratory variations. The frequency-domain features are most
commonly expressed as coefficients of frequency spectrum or inverse frequency spectrum,
and the signals are transformed to the frequency domain to observe the changes in the
signals in the frequency domain. The Mel-frequency cepstral coefficient is widely used in
sound processing [17,18]. Potes et al. [17] used frequency-domain features together with
time-domain features. Chowdhury et al. [18] denoised and compressed phonocardiography
(PCG) signals using a multi-resolution analysis based on the discrete wavelet transform
(DWT), segmented PCG signals using the Shannon energy envelope and zero-crossing into
four parts and extracted features from PCG signals using a Mel-scaled power spectrogram
and Mel-frequency cepstral coefficients (MFCC). Kumar et al. [19] used a set of features:
time, frequency, and statistical or phase space features.
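
The mel mapping that underlies MFCC can be sketched as follows. This is only an illustration: the 2595/700 formula is the commonly used convention and is assumed here, since the text does not state which variant is used, and the function names and band settings are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Map a frequency in Hz to mels (assumed 2595/700 convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_low: float, f_high: float, n_bands: int) -> list:
    """Return n_bands + 2 edge frequencies (Hz), equally spaced in mels.

    Consecutive triples of edges would define the triangular filters of a
    mel filter bank; the filter bank itself is not built here.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Illustrative call: 10 bands between 20 Hz and 2000 Hz (values are assumptions).
edges = mel_band_edges(20.0, 2000.0, 10)
```

Because the edges are equally spaced in mels, the bands are narrow at low frequencies and widen toward high frequencies, which is the property that makes the scale attractive for low-frequency body sounds.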
The extracted features are finally fed into a designed classifier. In [17], a total of
124 time-frequency features were extracted from the PCG and input to a variant of the
AdaBoost classifier, and PCG cardiac cycles decomposed into four frequency bands were
input to a second classifier using a convolutional neural network (CNN). An ensemble
of classifiers combining the outputs of AdaBoost and the CNN was designed to classify
normal/abnormal heart sounds. In [18], a five-layer feed-forward deep neural network
(DNN) model was used. In [19], a support vector machine (SVM) classifier was trained and
applied for each of the feature sets. Li et al. [20] applied a wavelet scattering transform and
multidimensional scaling method and then presented a twin SVM (TWSVM) to classify
heart sound signals. Gjoreski et al. [21] utilized classic machine-learning (ML) to learn from
expert features and end-to-end deep learning (DL) to learn from a spectro-temporal repre-
sentation of the signal. Shuvo et al. [22] proposed a lightweight end-to-end convolutional
recurrent neural network (CRNN) architecture for the automatic detection of five classes of
cardiac auscultation using raw PCG signals. This model was tested on the PhysioNet/CinC
2016 challenge dataset [23], achieving an accuracy of 86.57%.
In this study, a low-cost electronic stethoscope with noise reduction is proposed, and
an effective classification algorithm, based on principal component analysis (PCA), MFCC,
and ensemble learning, is developed for auscultatory sounds. A graphical user interface is
established to save or replay the recorded sounds on a Raspberry Pi. The presented system
can be useful in medical care.
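
The frame segmentation and frame-level voting summarized in the abstract can be pictured with the minimal sketch below. It rests on stated assumptions: the voting percentage is read as the minimum share of frames labelled abnormal for a recording to be called abnormal, frames that do not fit completely are dropped, and all function names are hypothetical. The PCA, MFCC, and ensemble steps are not shown.

```python
def frame_bounds(n_samples, fs, frame_s, overlap):
    """Return (start, end) sample indices of fixed-length overlapping frames.

    frame_s is the frame length in seconds and overlap is a fraction in [0, 1).
    Trailing samples that do not fill a whole frame are dropped (assumption).
    """
    frame_len = int(round(frame_s * fs))
    hop = int(round(frame_len * (1.0 - overlap)))
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and hop must be positive")
    last_start = n_samples - frame_len
    return [(s, s + frame_len) for s in range(0, last_start + 1, hop)]


def vote(frame_labels, threshold):
    """Combine per-frame labels (1 = abnormal, 0 = normal) into one decision.

    The recording is called abnormal when the share of abnormal frames
    reaches the threshold (assumed reading of the voting percentage).
    """
    if not frame_labels:
        raise ValueError("no frames to vote on")
    share = sum(frame_labels) / len(frame_labels)
    return 1 if share >= threshold else 0


# Illustrative use: a 10 s recording at 1 kHz, 2 s frames with 20% overlap.
bounds = frame_bounds(n_samples=10_000, fs=1000, frame_s=2.0, overlap=0.2)
decision = vote([0, 0, 1, 0, 0, 0], threshold=0.15)
```

A low threshold makes the decision more sensitive to a few abnormal frames, which is one way to read why the reported best voting ranges start at 5%.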

2. Materials and Methods


2.1. Design of Electronic Stethoscope
Figure 1 shows the system architecture of the designed electronic stethoscope. Two
microphones are used, one is installed inside the stethoscope head for heart/lung sounds,
and the other is optionally attached to the back of the stethoscope head for noise reduction.
Sound signals are then amplified through the first amplifier circuits. Different filters are
designed for heart and lung sounds. After the first filtering, the signals are amplified
through the second amplifier circuits for the second filtering. The two processed sounds
are then subtracted by a differential amplifier and transmitted to a Raspberry Pi through a
16-bit ADC with a sampling rate of 44.1 kHz.

Figure 1. System architecture of electronic stethoscope.
A condenser microphone that could be easily found on the market at a reasonable price
was utilized in this study. Condenser microphones, which have no magnet or coil, generate
voltage changes as the distance between the two diaphragms in the capacitor changes. They
are lightweight, small in size, and highly sensitive, and they are often used in high-quality
recording. Condenser microphones with higher sensitivity could record more sound details
but would also easily absorb noise in the environment; therefore, they are more suitable for
use in quiet studios. Figure 2a shows our design of the electronic stethoscope head [24] where
the condenser microphone (mic 1) is placed. Figure 2b shows the head with mic 1 inside, the
cork, and the diaphragm. Here, a round disk cork is presented and is used to cover the back of
the head if desired. Figure 2c illustrates the edge of the head (face up) surrounded by the
cork. No gasket is stuffed inside the head. Figure 2d shows the back of the head attached to a
microphone (mic 2) to collect environmental noise. Figure 2e depicts the front of the head
encased with the diaphragm; the edge of the head is surrounded by the cork. A shielded line
was used to connect the microphone and amplifier circuits. Two-stage amplification/filtering
is adopted to process the small signals collected by microphones, and more accurate heart/lung
sound waveforms are obtained when compared with only one-stage amplification/filtering.
The frequency of the heart sounds is from 1 Hz to 800 Hz and the human ear is sensitive in the
range of 40 Hz to 400 Hz, as most of the signals below 20 Hz are inaudible [25]. Therefore, the
filtering band of heart sounds in this study is below 400 Hz and the low-pass filter is designed
to attenuate the signals above 400 Hz. The main frequency range of the lung sounds is from
100 Hz to 2000 Hz, and the high-pass filter and low-pass filter are designed to filter out
signals below 100 Hz and above 2000 Hz. Therefore, after amplification, the heart or lung
sound is filtered by the corresponding filter to filter out the unwanted frequencies of sound.
As the two processed heart/lung signals and noise signals need subtraction, the circuit
signal-processing delay would affect noise reduction. To solve this problem, two-stage
amplification/filtering circuits with different gains were applied for noise sounds. Figures 3
and 4 illustrate amplification circuits for heart/lung sounds and noise sounds, respectively.
Figure 5 depicts the filter circuits for heart and lung sounds. The same filters were applied
to noise sounds. Figure 6 shows the whole circuit diagram where amplifiers and filters are
marked for better understanding.
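
The pass bands stated above (below 400 Hz for heart sounds, 100 Hz to 2000 Hz for lung sounds) can be illustrated with a small digital sketch. It is not the circuit of Figure 5: the filters in this study are analog, their order is not given in the text, and the first-order RC discretization, function names, and test tones below are assumptions chosen for simplicity.

```python
import math

FS = 44100  # sampling rate in Hz, as stated for the ADC


def low_pass(x, fc, fs=FS):
    """Discretized first-order RC low-pass with cutoff fc (assumed order)."""
    rc = 1.0 / (2.0 * math.pi * fc)
    dt = 1.0 / fs
    alpha = dt / (rc + dt)
    y, prev = [], 0.0
    for s in x:
        prev += alpha * (s - prev)
        y.append(prev)
    return y


def high_pass(x, fc, fs=FS):
    """Discretized first-order RC high-pass with cutoff fc (assumed order)."""
    rc = 1.0 / (2.0 * math.pi * fc)
    dt = 1.0 / fs
    alpha = rc / (rc + dt)
    y, prev_y, prev_x = [], 0.0, 0.0
    for s in x:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y


def heart_filter(x):
    """Keep content below about 400 Hz."""
    return low_pass(x, 400.0)


def lung_filter(x):
    """Keep content between about 100 Hz and 2000 Hz."""
    return low_pass(high_pass(x, 100.0), 2000.0)


def tone(freq, seconds=0.5, fs=FS):
    """Unit-amplitude sine test tone."""
    n = int(seconds * fs)
    return [math.sin(2.0 * math.pi * freq * i / fs) for i in range(n)]


def rms(x):
    """Root-mean-square level of a sample list."""
    return math.sqrt(sum(s * s for s in x) / len(x))


# Compare an in-band tone with an out-of-band tone for the heart filter.
in_band = rms(heart_filter(tone(50.0)))
out_of_band = rms(heart_filter(tone(4000.0)))
```

A first-order section rolls off at only 20 dB per decade, so a practical design would likely use steeper filters; the sketch is meant only to make the band limits concrete.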

Figure 2. Condenser microphone placed in stethoscope. (a) a mic put into a stethoscope head, (b) a
stethoscope head with mic 1 inside, the cork, and the diaphragm, (c) the edge of a head (face up)
surrounded by the cork, (d) the back of the head attached to a microphone (mic 2), and (e) the
front of the head encased with the diaphragm; the edge of the head surrounded by the cork.

Figure 3. (a) First-stage amplification circuit and (b) second-stage amplification circuit for
heart/lung sounds.

Figure 4. (a) First-stage amplification circuit and (b) second-stage amplification circuit for noise
sounds.

Figure 5. (a) Filter circuit for heart sounds and (b) filter circuit for lung sounds.

Figure 6. Circuit diagram of the presented system.

A graphical user interface (GUI) for recording sounds was also developed during the
course of this study. For noise reduction, we designed 5 different models of electronic
stethoscope heads as shown in Table 1 [24], which also includes a commercial electronic
stethoscope. Two microphones were used in Models 1 and 2, one inside the stethoscope
head for collecting heart/lung sounds and the other on the back of the stethoscope head
for collecting environmental noise. The difference between these two models is that one is
without a cork (Model 1) and the other one is surrounded by a cork (Model 2, as shown
in Figure 2c,e). Two stethoscope heads connected back-to-back are used in Models 3 and
4 and each stethoscope has its own microphone inside the head. The difference between
these two models is that one is without a cork (Model 3) and the other one is surrounded
by a cork (Model 4) as in Model 2. The difference between Models 1–2 and Models 3–4
is the way the microphone is used to collect environmental noise. Mic 2 in Models 1–2 is
exposed to the air, whereas the one in Models 3–4 is inside the stethoscope head. Therefore,
by comparing Models 1 and 3 (or Models 2 and 4), we can see the effects of noise reduction
by the subtracter when a different way of collecting environmental noise is used. When
compared with Model 1, Model 2 is designed to show the effects of noise isolation by the
cork. When compared with Model 3, Model 4 is likewise designed to show the effects of noise
isolation by the cork. One stethoscope head (one microphone inside the head) fully covered
by a cork is used in Model 5, which differs from Model 2 by omitting the microphone
on the back of the head and adding an additional piece of cork (as shown in Figure 2b) to
cover the back of the head; therefore, Model 5 is designed to show the effects of noise
isolation by the cork without using the subtracter.
the subtracter.
Table 1. Six different models of electronic stethoscope heads.

Model 1: One stethoscope, one microphone (without cork)
Model 2: One stethoscope, one microphone (with cork)
Model 3: Two stethoscopes (without cork)
Model 4: Two stethoscopes (with cork)
Model 5: One stethoscope covered with cork
Model 6: Thinklabs One Digital Stethoscope

Note: Areas with slashes represent cork.
To test the noise reduction effect of the designed circuits, two experimental sites were set up. Figure 7 shows the schematic diagram of experimental site I in which a thick book is used for simulating the human chest wall. A regular speaker for PCs, a MECMAR speaker (Boxe PC Mecmar 2.0, 240 W PMPO) with a dimension of 70 mm (width) × 163 mm (height) × 70 mm (depth), was used. Ward construction noise [26] and airport noise [27] were selected for observation. The distance of the stethoscope head from the sound source is about 1 m for the measurement. The audio sounds of noise received from the stethoscope head and the microphone and the audio sound after noise-canceling were analyzed by FFT for comparison. Figure 8 shows the experimental site II in which the stethoscope on a human body is tested. The MECMAR speaker was utilized for playing the noise source. Ward construction noise [26] was selected as the noise source, the sound volume was adjusted to the maximal, and the distance of the measurement position is 1 m away from the speaker. The sound source [26] at 0:30~0:40 was measured for all testing. In order to isolate the interference of heart sounds, the skin of the lower leg, which is the farthest from the heart, was chosen as the medium for the measurement. The sound measurement (measurement 1) using the stethoscope on the skin of the lower leg was performed in a quiet environment.
Then, the sound measurement on the skin of the lower leg was performed again in a noisy
environment to collect both sounds from the stethoscope head (measurement 2) and from
the microphone designed for collecting the environmental noise (measurement 3). The
noise frequencies of these two sounds were compared. The measurements for heart sounds
were conducted by placing the stethoscope head on the left chest (second intercostal space
at the left edge of the sternum) in a quiet environment (measurement 4) and then in a noisy
environment (measurements 5 and 6). When in a noisy environment, the measurements
from the stethoscope without using the subtracter (measurement 5) and with using the
subtracter (measurement 6) were compared by FFT.
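
The FFT comparison described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the sampling rate, the two sine tones standing in for body sound and ambient noise, and the ideal sample-by-sample subtraction are all assumptions made for illustration, and the function names are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(signal, fs):
    """Return (frequencies, amplitudes) of the one-sided DFT of `signal`."""
    n = len(signal)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT bin; adequate for a short illustrative frame.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        # DC and Nyquist bins are not doubled in a one-sided spectrum.
        scale = 1.0 if k in (0, n / 2) else 2.0
        freqs.append(k * fs / n)
        amps.append(scale * abs(coeff) / n)
    return freqs, amps


def peak(signal, fs):
    """Return (peak frequency in Hz, peak amplitude) of the spectrum."""
    freqs, amps = amplitude_spectrum(signal, fs)
    k = max(range(len(amps)), key=amps.__getitem__)
    return freqs[k], amps[k]


fs = 1000                      # assumed sampling rate (Hz)
t = [i / fs for i in range(200)]
body = [0.5 * math.sin(2 * math.pi * 50 * ti) for ti in t]    # stand-in for body sound
noise = [1.0 * math.sin(2 * math.pi * 300 * ti) for ti in t]  # stand-in for ambient noise

head_signal = [b + v for b, v in zip(body, noise)]   # analogous to measurement 5
ambient_signal = noise                               # analogous to measurement 3
# Idealized subtracter: assumes both microphones pick up identical noise.
subtracted = [h - a for h, a in zip(head_signal, ambient_signal)]  # analogous to measurement 6

peak_before = peak(head_signal, fs)
peak_after = peak(subtracted, fs)
print("peak before subtraction:", peak_before)
print("peak after subtraction: ", peak_after)
```

In this idealized case the dominant spectral peak moves from the noise tone to the body-sound tone after subtraction; a real circuit would only attenuate the noise, which is why the peaks before and after are compared.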

Figure 7. Schematic diagram of the experimental site I.


Figure 8. Schematic diagram of experimental site II.
2.2. Heart/Lung Sound Classification
In this subsection, we focus on the design of a heart/lung sound classification algorithm. In Taiwan, collecting patient data for analysis must first be approved by the Institutional Review Board (IRB). Moreover, even if the approval is granted, collecting enough data is time- and labor-consuming. For the convenience of research without going through an IRB review, the study of the presented algorithm is focused on the heart and lung sound samples from the public domain [28,29]. These cardiopulmonary samples were first pre-processed by segmentation, dimensionality reduction, and signal processing; then the time- and frequency-domain features of the samples were extracted and the classifier model was designed by using ensemble learning; finally, these feature vectors were fed into the classifier model for training and testing.
2.2.1. Pre-Processing
In this study, the heartbeat frequency (50–80 beats per minute) and respiration frequency (12–20 breaths per minute) were assumed for segmentation. The heart and lung sound signals were segmented into small frames, each intended to keep the complete information of one heartbeat or one respiration. Different lengths of small frames were tested to see which length would give the better performance: 1, 1.5, and 2 s for heart sounds, and 3, 4, and 5 s for lung sounds. In addition, overlapping ratios, such as 0%, 20%, and 50%, were adopted to segment the original samples in order to increase the number of samples and to see if the overlapping segmentation approach would give better recognition results. After segmentation, the small frame signals were subjected to principal component analysis (PCA) [30] and then the original signals were subtracted based on PCA to obtain the difference signals (frames after PCA).
In our experience, abnormal heart sounds can be divided into three categories: heartbeat components, abnormal beating components, and noise. The heartbeat components refer to major cardiac cycles. For a normal or an abnormal heart sound, there exist these cardiac cycles: diastole, systole, diastole, and systole. For an abnormal heart sound in terms of PCG, there exist irregular vibrating waves during these cycles. These vibrations here refer to abnormal beating components. Basically, the heartbeat sound will account for more than 80% of the total input signal and these signals will often dominate the classification results. The noise or abnormal heart/lung sounds account for a small amount of the original signal. However, what we need and are concerned about are the abnormal beating sounds. Therefore, we use principal component analysis to remove the leading principal components, which account for the first 85–95% of the cumulative contribution, to reduce the classification impact dominated by the heartbeat sound, leaving the signal data with abnormal sound as the main component for feature extraction. Figure 9 shows the sound frames before and after PCA for normal and abnormal heart sounds. Figure 10 shows the sound frames before and after PCA for normal and abnormal lung sounds. The horizontal axis represents the sampling points and the vertical axis represents the sound amplitudes in a wav format. In addition, the number of components to be removed can be decided according to the degree of contribution and usually, the principal components with a cumulative contribution of 85–95% are taken. In this study, 95% cumulative contribution was used.
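
The segmentation and PCA-based subtraction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are treated as the observations (rows), the frame mean is removed together with the leading components, the components are found by power iteration with deflation, and the synthetic "beat" with one small click is invented for the demonstration. The names `segment`, `top_component`, and `pca_residual` are hypothetical.

```python
import math


def segment(signal, frame_len, overlap):
    """Split `signal` into frames of `frame_len` samples with fractional `overlap`."""
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def top_component(rows, iters=50):
    """Leading principal direction of mean-centred `rows` via power iteration."""
    dim = len(rows[0])
    v = [1.0 + 0.1 * j for j in range(dim)]          # arbitrary non-zero start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        # w = X^T (X v): the (unnormalised) covariance matrix applied to v.
        proj = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(p * r[j] for p, r in zip(proj, rows)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def pca_residual(frames, contribution=0.95):
    """Remove leading components until `contribution` of the variance is removed."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[j] for f in frames) / n for j in range(dim)]
    resid = [[f[j] - mean[j] for j in range(dim)] for f in frames]
    total = sum(x * x for r in resid for x in r)
    removed = 0.0
    while total > 0.0 and removed / total < contribution:
        v = top_component(resid)
        scores = [sum(r[j] * v[j] for j in range(dim)) for r in resid]
        energy = sum(s * s for s in scores)
        if energy <= 1e-12 * total:
            break
        # Deflation: subtract each frame's projection onto the component.
        resid = [[r[j] - s * v[j] for j in range(dim)]
                 for r, s in zip(resid, scores)]
        removed += energy
    return resid


# Synthetic periodic "beat" with one small extra click at sample 107.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
signal[107] += 0.3
frames = segment(signal, frame_len=20, overlap=0.5)
residual = pca_residual(frames, contribution=0.95)
energies = [sum(x * x for x in r) for r in residual]
print("frames:", len(frames))
print("frame with largest residual energy:", max(range(len(energies)), key=energies.__getitem__))
```

In the synthetic example the dominant periodic pattern is removed and the frames containing the click keep the largest residual energy, which mirrors the stated goal of suppressing the heartbeat component so that abnormal sounds dominate the difference signal.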

Figure 9. (a) Normal heart sounds before/after PCA and (b) abnormal heart sounds before/after PCA.

Figure 10. (a) Normal lung sounds before/after PCA and (b) abnormal lung sounds before/after PCA.
2.2.2. Feature Extraction
As mentioned in the previous section, although the difference signals can be directly observed by the naked eye, they are still too complex for computers. Therefore, the difference signals need to be characterized for easier identification. In this study, two kinds of features, time-domain features and frequency-domain features, were extracted.
First of all, in terms of time-domain features, the original difference signals are too high-dimensional and not easy to handle. With a sampling rate of 2000 Hz, a 2 s sound frame would have 2 × 60 × 2000 (240,000) points, which is a high-dimension vector. This high-dimension vector can be processed by MFCC and statistical feature extraction to reduce its dimension. Statistical time-domain features were used to effectively simplify the high-dimensional difference signals while retaining the characteristics of the original signals. In this study, 11 statistical features were selected: mean, standard deviation, mean absolute deviation, median, first quartile, third quartile, interquartile range, skewness, kurtosis, Shannon entropy, and spectral entropy.
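
A minimal sketch of the 11 statistical features is given below. The paper does not state the exact definitions, so several choices here are assumptions: population (not sample) moments, non-excess kurtosis, inclusive quartiles, Shannon entropy computed from a 16-bin amplitude histogram, and spectral entropy computed from the normalised one-sided power spectrum. The function names are hypothetical.

```python
import cmath
import math
import statistics


def entropy(weights):
    """Shannon entropy (bits) of non-negative weights normalised to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def power_spectrum(frame):
    """One-sided power spectrum via a naive DFT (short frames only)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def statistical_features(frame, bins=16):
    """Return the 11 statistical features of one frame as a dict."""
    n = len(frame)
    mean = statistics.fmean(frame)
    std = statistics.pstdev(frame)                       # population std (assumption)
    q1, _, q3 = statistics.quantiles(frame, n=4, method="inclusive")
    centred = [x - mean for x in frame]
    skew = (sum(c ** 3 for c in centred) / n) / std ** 3 if std else 0.0
    kurt = (sum(c ** 4 for c in centred) / n) / std ** 4 if std else 0.0
    # Amplitude histogram for the Shannon entropy (bin count is an assumption).
    lo, hi = min(frame), max(frame)
    width = (hi - lo) / bins or 1.0
    hist = [0] * bins
    for x in frame:
        hist[min(int((x - lo) / width), bins - 1)] += 1
    return {
        "mean": mean,
        "std": std,
        "mean_abs_dev": sum(abs(c) for c in centred) / n,
        "median": statistics.median(frame),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "skewness": skew,
        "kurtosis": kurt,
        "shannon_entropy": entropy(hist),
        "spectral_entropy": entropy(power_spectrum(frame)),
    }


features = statistical_features([1.0, 2.0, 3.0, 4.0, 5.0])
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```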
Secondly, frequency-domain information is more likely to reveal differences between sound signals than time-domain features because cardiopulmonary signals are periodic and different sounds have different frequencies. The Mel-frequency cepstral coefficients (MFCC) [31] are suitable frequency-domain features that are closer to the way the human ear analyzes sound than the ordinary spectrum or cepstral coefficients. The reason is that the human auditory system responds approximately linearly only to frequencies below 1 kHz and approximately logarithmically at higher frequencies. The MFCC are spectral features that exploit this relationship and are obtained as follows. The sound signal is first pre-emphasized by passing it through a high-pass filter to enhance the high-frequency information, because the energy at high frequencies is usually smaller than that at low frequencies. Then, a Fourier transform is performed to obtain the power spectrum. The power spectrum obtained from each audio frame is then passed through a bank of Mel filters to map it onto the Mel scale. Forty Mel filters are usually used. Then, the logarithmic energy is extracted for each Mel band, and a discrete cosine transform (DCT) is performed to move to the cepstral domain. Since the filter-bank outputs are highly correlated, the DCT decorrelates them and reduces their dimension; the Mel-frequency cepstral coefficients are the amplitudes of the resulting Mel-frequency cepstrum. In this study, 12 coefficients were used and the energy of the audio frame was appended as the 13th coefficient. In addition, the maximum value in the power spectrum, the frequency of the maximum value in the power spectrum, and the percentage of the maximum energy in the power spectrum to the total energy were calculated. A total of 16 frequency-domain features together with 11 statistical features were used as the input feature vector for the classifier. So, the dimension is reduced from 240,000 to 27 for a 2 s sound frame.
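
Parts of this pipeline can be sketched in a few lines. The example below covers only the pre-emphasis step, a Hz-to-Mel mapping, and the three additional power-spectrum features; it does not implement the full Mel filter bank or the DCT. The pre-emphasis coefficient of 0.97 and the 2595·log10(1 + f/700) Mel mapping are common conventions assumed here, not values stated in the paper, and the function names are hypothetical.

```python
import cmath
import math


def pre_emphasis(frame, alpha=0.97):
    """First-order high-pass filter y[i] = x[i] - alpha * x[i-1] (alpha assumed)."""
    return [frame[0]] + [frame[i] - alpha * frame[i - 1]
                         for i in range(1, len(frame))]


def hz_to_mel(f_hz):
    """Widely used Mel mapping: roughly linear below 1 kHz, logarithmic above."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def spectral_peak_features(frame, fs):
    """Return (max power, frequency of max power in Hz, share of total power in %)."""
    n = len(frame)
    # One-sided power spectrum via a naive DFT (short frames only).
    power = [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                     for i, x in enumerate(frame))) ** 2 / n
             for k in range(n // 2 + 1)]
    k_max = max(range(len(power)), key=power.__getitem__)
    total = sum(power)
    share = 100.0 * power[k_max] / total if total > 0 else 0.0
    return power[k_max], k_max * fs / n, share


fs = 2000                                   # sampling rate used in the paper (Hz)
tone = [math.sin(2 * math.pi * 100 * i / fs) for i in range(200)]
max_power, peak_hz, share_pct = spectral_peak_features(tone, fs)
print("peak frequency (Hz):", peak_hz)
print("share of total power (%):", round(share_pct, 2))
print("1000 Hz in mel:", round(hz_to_mel(1000.0), 1))
```

These three values correspond to the extra features that, together with the 13 MFCC-related coefficients, make up the 16 frequency-domain features.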

2.2.3. Classifier
The classification algorithm is used to classify normal (healthy) and abnormal (pathological) heart sounds and normal and abnormal lung sounds. In this study, we adopted the concept of ensemble learning [32] to design the classifier model and experimented with several classical ensemble learning methods, including Bagging (bootstrap aggregating), AdaBoost (adaptive boosting), GentleBoost (gentle adaptive boosting), LogitBoost (adaptive logistic regression), and RUSBoost (random under-sampling boosting), to see which learning method is most suitable for the database selected for this study. Each sample segmentation approach (e.g., 1 s frames with 0% overlap, 1 s frames with 20% overlap, etc.) was trained 5 times, with an ensemble learning method randomly selected each time. The final classifier model was determined based on the best observation obtained by Bayesian optimization. Each original sample was divided into many small frames for training, and the model's classifications of the small frames were then used for voting: if the proportion of abnormal frames exceeded a specific threshold, the original sample was regarded as an abnormal cardiopulmonary sample; otherwise, it was regarded as a normal one. The experimental results are reported for different threshold proportions (e.g., 5%, 10%, 15%, . . . , 95%) to observe which proportion gives the best voting results. In addition, if an original sample was too short to be divided into more than one small frame, it was ignored and not counted.
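
The frame-level voting rule can be sketched as follows. This is an illustrative reading of the text, not the authors' code: frame labels are assumed to be 0 (normal) or 1 (abnormal), a recording with fewer than two frames is skipped, and "exceeded" is taken as a strict inequality. The function name `classify_recording` is hypothetical.

```python
def classify_recording(frame_labels, threshold):
    """Vote over per-frame labels (1 = abnormal, 0 = normal).

    Returns "abnormal" if the share of abnormal frames strictly exceeds
    `threshold`, "normal" otherwise, and None for recordings that are too
    short to give more than one frame (these are ignored).
    """
    labels = list(frame_labels)
    if len(labels) < 2:
        return None
    abnormal_share = sum(labels) / len(labels)
    return "abnormal" if abnormal_share > threshold else "normal"


# Sweep the voting threshold from 5% to 95% in steps of 5%.
thresholds = [p / 100 for p in range(5, 100, 5)]
example_labels = [1, 0, 0, 1, 1, 0, 0, 0, 0, 0]   # 30% of frames abnormal
decisions = {th: classify_recording(example_labels, th) for th in thresholds}
for th, decision in decisions.items():
    print(f"threshold {th:.2f}: {decision}")
```

Sweeping the threshold in this way shows how the recording-level decision changes with the chosen proportion, which is what the reported results compare.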
To evaluate the performance of the algorithm proposed in this paper, four common evaluation metrics are used (Equations (1)–(5)), including accuracy (Acc), sensitivity (Se), specificity (Sp), and F1 score, which are commonly used to evaluate the performance of "abnormality recognition algorithms" (e.g., disease diagnosis). Accuracy is the most intuitive measure: the fraction of all samples that are classified correctly. Specificity focuses on misdiagnosis of healthy subjects, and higher specificity means fewer healthy subjects classified as abnormal. Sensitivity evaluates the ability to detect patients, and higher sensitivity means better ability to identify patients. The F1 score is the harmonic mean of sensitivity and precision and is used to summarize the overall performance of the algorithm.

\[ \mathrm{accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \tag{1} \]
\[ \mathrm{specificity} = \frac{TN}{FP + TN} \tag{2} \]
\[ \mathrm{sensitivity} = \frac{TP}{TP + FN} \tag{3} \]
\[ \mathrm{precision} = \frac{TP}{TP + FP} \tag{4} \]
\[ \mathrm{F1\ Score} = \frac{2 \times \mathrm{sensitivity} \times \mathrm{precision}}{\mathrm{sensitivity} + \mathrm{precision}} \tag{5} \]
where TP stands for true positive (with disease and classified as abnormal), TN stands for true negative (without disease and classified as normal), FP stands for false positive (without disease but classified as abnormal), FN stands for false negative (with disease but classified as normal).
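
Equations (1)–(5) translate directly into code. The sketch below is illustrative only; the confusion-matrix counts are made up for the example and are not results from the paper, and the function name `evaluate` is hypothetical.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(tp, tn, fp, fn):
    """Compute the metrics of Equations (1)-(5) from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + fn + fp + tn),
        "specificity": _ratio(tn, fp + tn),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": _ratio(2 * sensitivity * precision, sensitivity + precision),
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = evaluate(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```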
3. Results
3.1. Design of Electronic Stethoscope
Figure
Figure 11a shows thePCB
11a shows the PCBcircuit
circuitof ofthe
thepresented
presentedsystem
systemin inaccordance
accordancewith withthe thesize
size
of Raspberry Pi 3 Model B (85 mm × 55 mm). Figure 11b [24]
of Raspberry Pi 3 Model B (85 mm × 55 mm). Figure 11b [24] shows the whole designed shows the whole designed
system.
[Link] ofthe
thefirst
firstamplifier
amplifier for for the
the heart/lung
heart/lungsounds
soundsisisdesigned
designedatat1.0 1.0sosothat
that
the
the gain of the first amplifier for mic 2 (noise) does not need to be adjusted to a highgain.
gain of the first amplifier for mic 2 (noise) does not need to be adjusted to a high gain.
The
Thegain
gainofof the
the second
second amplifier
amplifier forfor the
the heart/lung
heart/lungsounds
soundswas wasadjusted
adjustedtotoroughly
roughly3.0. 3.0.
For experimental site I (Figure 7), two different noise sources were played. Figure 12 shows the 10 s sound waveforms and FFTs measured by mic 1, mic 2, and the subtracter (noise reduction) for the ward construction noise, which was played from the computer at 40% volume through a speaker set to its maximum volume. Based on the FFT spectra, the noise amplitude peak is about 0.0013 before noise reduction and about 0.0003 after noise reduction. The noise amplitude after noise reduction is therefore about 4.3 times lower than before; the noise reduction is about 12.74 dB based on the formula for the signal-to-noise ratio (SNR). The 10 s sound waveforms and FFTs measured by mic 1, mic 2, and the subtracter for the airport noise are shown in Figure 13. For the airport noise, played from the computer at maximum volume through a speaker set to its maximum volume, the noise amplitude peak is about 0.0013 before noise reduction and about 0.00025 after noise reduction. In this case, the noise amplitude after noise reduction is 5.2 times lower than before; the noise reduction is about 14.32 dB.
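The decibel figures quoted for experimental site I can be checked with a short calculation. The sketch below assumes the SNR formula takes the usual amplitude-ratio form, 20·log10(A_before/A_after); the function name `amplitude_ratio_db` is illustrative and not taken from the paper.

```python
import math


def amplitude_ratio_db(before: float, after: float) -> float:
    """Return 20*log10(before/after), an amplitude ratio expressed in dB."""
    if before <= 0 or after <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(before / after)


# FFT peak amplitudes quoted in the text (before / after the subtracter).
ward = amplitude_ratio_db(0.0013, 0.0003)
airport = amplitude_ratio_db(0.0013, 0.00025)

print(f"ward construction noise: {ward:.2f} dB")
print(f"airport noise: {airport:.2f} dB")
```

Under this assumption the two results round to 12.74 dB and 14.32 dB, matching the reported values.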

Figure 11. Physical PCB circuit. (a) PCB of designed circuit. (b) Whole designed system.

Figure 12. Noise reduction effect for ward construction noise. (a) Signal waveform and FFT spectrum measured by mic 1. (b) Signal waveform and FFT spectrum measured by mic 2. (c) Signal waveform and FFT spectrum after noise reduction.

For the direct measurements of the human body in experimental site II (Figure 8), ward construction noise was used as the noise source and the test volume was set to the maximum. The different models of stethoscopes listed in Table 1 were tested. For conciseness, only the 10 s waveforms and FFTs of measurements 1–6 for Model 2 are shown in Figure 14. For Model 2, Figure 14a shows that the frequency band of the signals received from the contact of the stethoscope head with the skin of the lower leg is 0~100 Hz. By comparison, Figure 14b shows that the frequency band of the noise received at the stethoscope end on the skin of the lower leg is 150~450 Hz, with the peak at 300 Hz. Figure 14c reveals that the noise received by mic 2 at the ambient end is distributed over 150~550 Hz, with the peak at 375 Hz. Figure 14d shows that the frequency band of the heart sound signals in a quiet environment is 0~150 Hz and is mixed with the noise of the skin. Figure 14e reveals that the frequency band of the noise received at the stethoscope end is 200~400 Hz, with the peak at 325 Hz. Figure 14f shows that the frequency band of the residual noise after noise reduction is 250~450 Hz. Table 2 summarizes the observations of the tests.
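As an illustration of how a frequency band and a peak frequency can be read from a spectrum, the following sketch computes a magnitude spectrum and reports the span of bins above a relative threshold. It is only a sketch under stated assumptions: the paper does not say how the bands were determined (they may have been read from the plots), the 10% threshold and the helper names `dft_magnitudes` and `band_and_peak` are hypothetical, and the test signal is synthetic. A plain DFT keeps the example dependency-free; with NumPy, `numpy.fft.rfft` would serve the same purpose.

```python
import cmath
import math


def dft_magnitudes(signal, fs):
    """Plain DFT: return (frequencies in Hz, magnitudes) for bins 0..N/2."""
    n = len(signal)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        freqs.append(k * fs / n)
        mags.append(abs(acc) / n)
    return freqs, mags


def band_and_peak(freqs, mags, rel_threshold=0.1):
    """Return (low, high, peak) in Hz.

    low/high bound the bins whose magnitude is at least rel_threshold
    times the largest magnitude; peak is the frequency of that maximum.
    """
    peak_idx = max(range(len(mags)), key=mags.__getitem__)
    cutoff = rel_threshold * mags[peak_idx]
    active = [f for f, m in zip(freqs, mags) if m >= cutoff]
    return active[0], active[-1], freqs[peak_idx]


# Synthetic noise-like signal (assumption): strong 300 Hz tone plus
# weaker components at 150 Hz and 450 Hz, sampled at 1000 Hz.
fs = 1000
n = 200  # 0.2 s of data, i.e. 5 Hz frequency resolution
signal = [
    1.0 * math.sin(2 * math.pi * 300 * i / fs)
    + 0.3 * math.sin(2 * math.pi * 150 * i / fs)
    + 0.3 * math.sin(2 * math.pi * 450 * i / fs)
    for i in range(n)
]

freqs, mags = dft_magnitudes(signal, fs)
low, high, peak = band_and_peak(freqs, mags)
print(f"band: {low:.0f}~{high:.0f} Hz, peak at {peak:.0f} Hz")
```

The reported band depends on the chosen threshold, so a value read this way is comparable across recordings only if the same threshold is used throughout.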

Figure 13. Noise reduction effect for airport noise. (a) Signal waveform and FFT spectrum measured by mic 1. (b) Signal waveform and FFT spectrum measured by mic 2. (c) Signal waveform and FFT spectrum after noise reduction.

Figure 14. Noise reduction effect for ward construction noise. (a) Measurement 1: stethoscope on left leg skin in quiet space. (b) Measurement 2: stethoscope on the left leg skin in noise space. (c) Measurement 3 on lower leg skin: mic for environmental noise. (d) Measurement 4: stethoscope on left chest in quiet environment. (e) Measurement 5: stethoscope on left chest in noisy environment. (f) Measurement 6 on left chest after noise reduction.

Table 2. Frequency bands (in Hz) of different measurements for different models.

                 Model 1        Model 2        Model 3        Model 4        Model 5
Measurement 1    0~100          0~100          0~100          0~100          0~100
Measurement 2    150~450        150~450        150~450        150~450        150~450
                 peak at 300    peak at 300    peak at 300    peak at 300    peak at 300
Measurement 3    150~550        150~550        150~450        250~450        -
                 peak at 375    peak at 375    peak at 350    peak at 350
Measurement 4    0~150          0~150          0~150          0~150          0~150
Measurement 5    200~450        200~400        240~420        250~400        250~400
                 peak at 325    peak at 325    peak at 310    peak at 310    peak at 310
Measurement 6    0~150          0~150          0~150          0~150          -
                 250~450        250~450        250~450        250~450

Table 3 shows the FFT spectra of the heart sound signals with and without the subtracter, measured at the stethoscope end using four different models. Model 5 has no subtracter and Model 6 does not provide the option of using the subtracter. Therefore, Models 5 and 6 are not listed in Table 3. The results of noise reduction using the subtracter circuit for these four models are not satisfactory. However, when comparing the FFT spectra of the heart sounds obtained using the subtracter, Model 4 shows the lowest noise amplitude. When the subtracter is not used, the FFT spectra of the heart sound signals of the six Models (including the electronic stethoscope from Thinklabs One) are shown in Figure 15. Model 4 shows the lowest noise amplitude when measuring heart sounds in a noisy environment. Figure 15 also shows that the cork can reduce the noise by 1.9 dB between Model 1 and Model 2 and by 2.8 dB between Model 3 and Model 4. Model 4 can reduce the noise by 4.7 dB when compared to the commercial one, Model 6.

Table 3. Comparison of FFT spectra of heart sound signals with and w/o the subtracter.
[Table body: FFT spectrum plots for Models 1, 2, 3, and 4, each shown w/o subtracter and with subtracter.]

Figure 15. Comparison of FFT spectra of heart sound signals without using the subtracter.

3.2. Heart/Lung Sound Classification
Two sample databases were adopted from the Internet [28,29]. The database [28] contains 3240 heart sounds in wav format, including 2548 normal heart sounds and 692 abnormal heart sounds, with lengths between 5 and 120 s and a sampling rate of 2000 Hz. The Lung Sound Dataset [29] is a collection of 920 lung sound recordings in wav format with lengths ranging from 10 to 90 s, created by two research teams in Portugal and Greece, containing 35 normal lung sounds, 1 asthma, 16 bronchiectasis, 13 occlusive bronchiolitis, 793 chronic obstructive pulmonary disease (COPD), 37 pneumonia, 23 upper respiratory tract infection (URTI), and 2 lower respiratory tract infection (LRTI) sounds. The samples containing no noise from [28] were split into five folds and the samples for training and testing were randomly selected. The sample sizes for training and testing are shown in Table 4. In addition, the samples containing noise in the database were set aside as additional test samples (the noisy samples in Table 4) to evaluate the tolerance of the proposed algorithm to noise. The score of each index was evaluated by averaging the 5-fold classification results. The classification results of the heart sound dataset with different segmentation lengths (1 s, 1.5 s, and 2 s), different overlap ratios (0%, 20%, and 50%), and different voting ratios (5–95%) are presented in Figures 16–19 in terms of four metrics (accuracy, sensitivity, specificity, and F1 score).
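The roles of the segmentation length, overlap ratio, and voting ratio can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a recording is cut into fixed-length windows, that each window is labeled by some per-segment classifier (not shown), and that the voting ratio is the fraction of segments labeled abnormal needed to call the whole recording abnormal. The function names `segment` and `vote` are hypothetical.

```python
def segment(signal, fs, seg_seconds, overlap):
    """Split a recording into fixed-length windows.

    Consecutive windows share an `overlap` fraction (0.0 to <1.0) of
    their samples; a trailing remainder shorter than a window is dropped.
    """
    seg_len = int(round(seg_seconds * fs))
    hop = max(1, int(round(seg_len * (1.0 - overlap))))
    return [signal[s:s + seg_len]
            for s in range(0, len(signal) - seg_len + 1, hop)]


def vote(segment_labels, voting_ratio):
    """Return 1 (abnormal) when the fraction of abnormal segments
    reaches voting_ratio, otherwise 0 (normal). Assumed rule."""
    if not segment_labels:
        raise ValueError("no segments to vote on")
    abnormal_fraction = sum(segment_labels) / len(segment_labels)
    return 1 if abnormal_fraction >= voting_ratio else 0


# A 5 s recording at the 2000 Hz sampling rate of database [28].
fs = 2000
recording = [0.0] * (5 * fs)

windows = segment(recording, fs, seg_seconds=2.0, overlap=0.5)
print(len(windows), "segments of", len(windows[0]), "samples")

# Hypothetical per-segment predictions (1 = abnormal, 0 = normal).
labels = [0, 1, 0, 0]
print("voting ratio 0.25 ->", vote(labels, 0.25))
print("voting ratio 0.50 ->", vote(labels, 0.50))
```

Under this reading, a low voting ratio makes the recording-level decision more sensitive to a few abnormal segments, while a high ratio makes it more conservative.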

Table 4. Sizes of heart sound samples.

Training Samples        Testing Samples         Noisy Samples
Abnormal    Normal      Abnormal    Normal      Abnormal    Normal
411         1940        103         485         151         150

Figure 16. Heart sound classification results: Accuracy. (a) Testing samples. (b) Noisy samples.

Figure 17. Heart sound classification results: Sensitivity. (a) Testing samples. (b) Noisy samples.

Figure 18. Heart sound classification results: Specificity. (a) Testing samples. (b) Noisy samples.

Figure 19. Heart sound classification results: F1 Score. (a) Testing samples. (b) Noisy samples.

Furthermore, sounds from the lung sound database [29] were used as the training and testing samples for the proposed algorithm. This database contains several types of lung sounds. However, the number of samples of some types is too small. Therefore, only 30 normal lung sounds and 30 abnormal lung sounds (6 bronchiectasis, 6 occlusive bronchiolitis, 6 chronic obstructive pulmonary disease, 6 pneumonia, and 6 upper respiratory tract infection) were used. The samples were split into three folds and classified in a binary way, i.e., all lung-related diseases were considered as abnormal lung sounds. The number of training and test samples was 20 and 10 per class, respectively. As with the heart sounds, the results were evaluated by four metrics: accuracy, sensitivity, specificity, and F1 score, as shown in Figures 20–23, respectively.
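For reference, the four metrics can be computed from the confusion matrix of a binary classifier and then averaged over folds. The sketch below uses the standard definitions and assumes that "abnormal" is the positive class (label 1); the function names and the example predictions are hypothetical and are not results from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 for labels in {0, 1}.

    Label 1 (abnormal) is treated as the positive class (assumption).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


def average_over_folds(fold_metrics):
    """Average each metric across a list of per-fold metric dictionaries."""
    return {name: sum(m[name] for m in fold_metrics) / len(fold_metrics)
            for name in fold_metrics[0]}


# One illustrative fold: 10 abnormal and 10 normal test samples,
# with made-up predictions (8 of 10 abnormal and 9 of 10 normal correct).
y_true = [1] * 10 + [0] * 10
y_pred = [1] * 8 + [0] * 2 + [0] * 9 + [1] * 1

fold = binary_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in fold.items()})
print(average_over_folds([fold, fold, fold]))
```

Sensitivity and specificity are reported separately here because, with unequal class sizes such as those in Table 4, accuracy alone can hide poor performance on the smaller class.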

Figure 20. Lung sound classification results: Accuracy.

Figure 21. Lung sound classification results: Sensitivity.

Figure 22. Lung sound classification results: Specificity.

Figure 23. Lung sound classification results: F1 Score.

4. Discussion
4.1. Design of Electronic Stethoscope
The presented stethoscope is cost-effective and easily implemented. The condenser microphone
used in the presented stethoscope can be found on the market at a reasonable price
and the filter and amplification circuits are simple. From the test results (Figures 12 and 13)
for experimental site I, the microphone at the noise-receiving end over-received sound
at low frequencies; therefore, a superposition of noise signals in the low-frequency section was
observed. Even so, the noise reduction effect is still satisfactory, being 4~6 times lower
than in the case not using the subtracter. However, this phenomenon affected the noise
reduction when the stethoscope was applied to human skin. With the human body test
for experimental site II, the noise reduction effect was far different from the test results
conducted in experimental site I. It is because the noise peak frequency received at the
stethoscope end on the skin of the lower leg (Measurement 2) is at 300 Hz, whereas the
noise peak frequency received by mic 2 at the ambient above the skin of the lower leg
(Measurement 3) is at 350 Hz for Models 3 and 4 or 375 Hz for Models 1 and 2, as listed
in Table 2. There is a frequency shift due to different sound media. The media for the
microphone inside the stethoscope are the skin, diaphragm, and the air. The medium for
the microphone of Models 1 and 2 to collect environmental noise is the air. The media for
the microphone of Models 3 and 4 to collect environmental noise are the diaphragm and the
air. When heart sounds were mixed with the noise, the noise peak frequency (Measurement
5) became 325 Hz for Models 1 and 2 or 310 Hz for Models 3 and 4 (see Table 2). This
peak frequency shift resulted in some noise signals being superposed, as shown in Table 3,
when the subtraction of two signals was performed. Nevertheless, when Models 1–4 all
used the subtracter, the design of Model 4 outperformed the others (Models 1–3) with the
lowest noise amplitudes. Even when the subtracter was not used for Models 1–4, Model 4
still performed better than the others. The cork surrounding the double heads in Model 4
provides a degree of noise isolation.
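
The effect of a peak-frequency shift on the subtraction can be illustrated numerically. The following is a minimal sketch, not the analog subtracter circuit used in the stethoscope: it assumes idealized unit-amplitude sinusoidal noise, a hypothetical sampling rate of 8000 Hz, and a one-second window. It compares the residual after subtraction when the reference noise has the same peak frequency as the noise at the skin (300 Hz) with the case where the reference is shifted to 350 Hz.

```python
import math

FS = 8000         # assumed sampling rate in Hz (hypothetical, for illustration only)
DURATION_S = 1.0  # assumed analysis window in seconds


def tone(freq_hz, fs=FS, duration_s=DURATION_S):
    """Return a unit-amplitude sine wave as a list of samples."""
    n = int(fs * duration_s)
    return [math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]


def rms(x):
    """Root-mean-square amplitude of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def subtract(a, b):
    """Sample-by-sample difference a - b (idealized digital subtracter)."""
    return [u - v for u, v in zip(a, b)]


noise_at_skin = tone(300.0)  # noise picked up at the stethoscope end
ref_matched = tone(300.0)    # reference noise with the same peak frequency
ref_shifted = tone(350.0)    # reference noise with a shifted peak frequency

print("noise RMS before subtraction:", round(rms(noise_at_skin), 3))
print("residual RMS, matched ref   :", round(rms(subtract(noise_at_skin, ref_matched)), 3))
print("residual RMS, shifted ref   :", round(rms(subtract(noise_at_skin, ref_shifted)), 3))
```

Under these idealized assumptions, the matched reference cancels the noise, whereas the shifted reference leaves a residual that is larger than the original noise, which illustrates how a frequency shift between the two microphones can turn subtraction into superposition.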

4.2. Heart/Lung Sound Classification


For the testing samples of the heart sounds in Table 4, Figure 16a shows that the best
overall accuracy, 84.3–86.9%, is obtained at voting ratios of 20–40%. Sensitivity and
specificity trade off against each other: when one is higher, the other is lower. The
sensitivity falls in the range of 42.2–88.3% and the specificity in the range of 76.6–98.8%.
The trend shows that the sensitivity is greater than 76.6% only when the voting ratio is
less than 25%. When the specificity is also considered, i.e., Figures 17a and 18a are
considered together, the best results are obtained at voting ratios of 20% and 25%. At a
voting ratio of 20%, the average sensitivity and the average specificity are 81.2% and 89.4%,
respectively; at a voting ratio of 25%, they are 79.8% and 90.7%, respectively. The best
overall F1 score, 83.1–86.2%, is obtained at voting ratios of 10–20%, and the highest overall
score was obtained at a voting ratio of 20%, as shown in Figure 19a. Based on the above four
metrics (accuracy, sensitivity, specificity, and F1 score), the best overall score is obtained
at a voting ratio of 20%.
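
The trade-off between sensitivity and specificity across voting ratios can be sketched as follows. This is a minimal illustration that assumes a sample is labeled abnormal when the share of its frames classified as abnormal reaches the voting ratio. The function names and the frame-level predictions are hypothetical and are not the data used in this study.

```python
def classify_sample(frame_preds, voting_ratio):
    """Label a sample abnormal (1) if the share of abnormal frames reaches
    the voting ratio, otherwise normal (0). Assumed rule for illustration."""
    abnormal_share = sum(frame_preds) / len(frame_preds)
    return 1 if abnormal_share >= voting_ratio else 0


def sensitivity_specificity(samples, voting_ratio):
    """samples: list of (true_label, frame_preds) pairs, with 1 = abnormal."""
    tp = fn = tn = fp = 0
    for true_label, frame_preds in samples:
        pred = classify_sample(frame_preds, voting_ratio)
        if true_label == 1:
            tp += pred
            fn += 1 - pred
        else:
            fp += pred
            tn += 1 - pred
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical frame-level predictions (1 = frame classified as abnormal).
samples = [
    (1, [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]),  # abnormal sample, 80% abnormal frames
    (1, [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]),  # abnormal sample, 30% abnormal frames
    (1, [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]),  # abnormal sample, 10% abnormal frames
    (0, [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),  # normal sample, 0% abnormal frames
    (0, [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]),  # normal sample, 10% abnormal frames
    (0, [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]),  # normal sample, 20% abnormal frames
]

for ratio in (0.05, 0.20, 0.40, 0.90):
    sens, spec = sensitivity_specificity(samples, ratio)
    print(f"voting ratio {ratio:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With this rule, raising the voting ratio can only lower the sensitivity and raise the specificity, which matches the opposing trends described above.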
For the noisy samples of the heart sounds, the best overall accuracy, 66.1–70.4%, is
obtained at voting ratios of 25–80%, as seen in Figure 16b. The sensitivity ranges from
34.3 to 72.7% and the specificity ranges from 58.5 to 92.7%. From the trend in Figure 17b,
the sensitivity is greater than 58.5% only when the voting ratio is less than 25%. When the
specificity in Figure 18b is also considered, the best performance occurs at a voting ratio
of 25%, with an average sensitivity of 65.1% and an average specificity of 71.3%. From
Figure 19b, the best overall F1 score, 63.6–69.5%, is obtained at voting ratios of 5–40%.
Based on the four metrics, the best overall score is obtained at a voting ratio of 25%.
The experimental results of different segmentation methods for heart sounds were
compiled and discussed according to different voting ratios. Although nine different
segmentation methods were presented, only the top three scoring methods are discussed
here. From the results shown in Figures 16a, 17a, 18a and 19a for the testing samples, the
best accuracy was obtained with a 20% voting ratio and a segmentation length of 2 s; the
best sensitivity with a segmentation length of 1 or 1.5 s and an overlap ratio of 50%; the
best specificity with a segmentation length of 2 s and an overlap ratio of 50%; and the
best F1 score with a segmentation length of 1 or 2 s and an overlap ratio of 20% or 50%.
Based on these four metrics, the model using a segmentation length of 2 s and an overlap
ratio of 20% achieves the best overall score. The
scores are 86.9% for accuracy, which is slightly higher than that of [22], 81.9% for sensitivity,
91.8% for specificity, and 86.1% for F1 score. For further comparison considering the sample
source, we chose the top three models [17,33,34] in the 2016 PhysioNet/CinC Heart Sound
Classification Challenge [28]. Because the original training code of the authors participating
in the 2016 competition was not available, and only the final classification program and the
trained weights of the neural network could be obtained, it was not possible to compare
the results fairly. The comparison results are shown in the last four rows of Table 5. The
results show that the recognition of normal heart sounds using our model is a little better
than those using the other methods but the recognition of abnormal heart sounds is a
little worse. Although the overall score differed by about 4–7% from the top three in the
competition, the classification methods they used were very complex DNN methods that
require special hardware with high computing power. In addition to the higher cost and
the inconvenience of porting the models, these methods cannot be executed on embedded
systems with low computing power, so outpatient physicians are unable to obtain the
patient’s heart sound recognition results in real time to assist them in making correct
judgments. Our model, on the other hand, is not complicated and, with reasonable
performance, can be deployed on an embedded system such as a Raspberry
Pi. The performance indices of other models [17,18,20–22,33,34] that were reviewed in the
Introduction are also listed in rows 1–7 of Table 5. However, since we do not have access to
the code of these models, Table 5 only lists the test results collected from the literature.

Table 5. Comparison results.

Models                             Training #: Testing #   Accuracy (%)   Sensitivity (%)   Specificity (%)   F1 (%)
Adaptive Boosting + CNN [17]       9:1                     86.02          94.24             77.81             –
DNN [18]                           9:1                     97.10          99.26             94.86             –
WST + PCA + 2SVM [20] *            7:3                     93.06          –                 –                 –
Classic ML + DL [21]               9:1                     92.9           82.3              96.2              –
1D CNN + BiLSTM [22]               9:1                     86.57          91.78             59.05             91.78
Ensemble-NN [33]                   9:1                     91.5           94.23             88.76             –
DropConnected-NN [34]              9:1                     84.1           84.8              93.3              –
Adaptive Boosting + CNN [17] **    –                       89.6           93.7              85.6              90
Ensemble-NN [33] **                –                       93.0           94.5              91.4              93.1
DropConnected-NN [34] **           –                       93.1           94.5              91.7              93.1
Presented model                    4:1                     86.9           81.9              91.8              86.1
#: number(s). DNN: deep neural network. ML: machine learning. DL: deep learning. NN: neural network. CNN:
convolutional neural network. WST + 2SVM: Wavelet scattering transformation (WST) + twin support vector
machine (2SVM). 1D CNN + BiLSTM: 1D CNN (1DCNN) + bi-directional long short-term memory (BiLSTM). 9:1
means 10-fold evaluation approach in PhysioNet dataset. *: dataset A in PhysioNet dataset. **: The test using the
program and trained weights of NN was obtained from the PhysioNet website [28]. –: data not available.

From the results shown in Figures 16b, 17b, 18b and 19b for noisy samples of heart
sounds, at 25% of the voting ratio, segment lengths of 1.5 and 2 s and overlap ratios of 20%
and 50% provide better performance in terms of accuracy; segment lengths of 1.5 and 2 s
and overlap ratios of 20% offer better performance in terms of sensitivity; segment lengths
of 1.5 and 2 s and overlap ratios of 50% give better performance in terms of specificity; and
segment lengths of 1.5 and 2 s and overlap ratios of 20% and 50% offer better performance
in terms of F1 score. Therefore, based on these four metrics, the model using a segmentation
length of 2 s and an overlap ratio of 20% obtains the highest overall score: 69% for accuracy,
68.3% for sensitivity, 69.7% for specificity, and 68.8% for F1 score.
In summary, the signal segmentation length of 2 s and the overlap of 20% or 50% were
found to be the most effective for the heart sounds. This indicates that a 2 s segmentation
length can contain the most complete information for a single heartbeat. In addition, some
degree of overlap is also effective in improving the overall classification accuracy.
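
The segmentation with overlap discussed above can be sketched as follows. This is a minimal illustration rather than the implementation used in this study: the function name, the sampling rate of 2000 Hz, the 10 s placeholder recording, and the choice to drop trailing samples that do not fill a whole frame are all assumptions.

```python
def segment_signal(signal, fs, length_s, overlap_ratio):
    """Split a 1-D signal into fixed-length frames with fractional overlap.

    Trailing samples that do not fill a whole frame are dropped
    (an assumption made for this sketch).
    """
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")
    frame_len = int(round(length_s * fs))
    hop = int(round(frame_len * (1.0 - overlap_ratio)))
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame length and hop must be positive")
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


fs = 2000                      # assumed sampling rate in Hz
recording = [0.0] * (10 * fs)  # placeholder 10 s recording

for overlap in (0.0, 0.2, 0.5):
    frames = segment_signal(recording, fs, length_s=2.0, overlap_ratio=overlap)
    print(f"overlap {overlap:.0%}: {len(frames)} frames of {len(frames[0])} samples")
```

For a recording of fixed duration, a larger overlap ratio yields more frames of the same length, so more votes are available per sample.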
As shown in Figure 20 for the lung sound classification, the accuracy does not change
significantly with the voting ratio, and the trends are similar. From Figures 21 and 22, the
sensitivity falls in the range of 53.3–70% and the specificity falls in the range of 50–93.3%.
The voting ratio does not have much influence on these scores; only a slight upward or
downward trend occurs at voting ratios of around 30–35% and 70–75%. The best overall
F1 score, as seen in Figure 23, is 56.7–71.5% at voting ratios of 5–30%. Based on
these four metrics, the best overall score occurs at voting ratios of 5–30%. However,
the experimental results show that the voting ratio did not have much effect on the scores,
with no significant change at voting ratios from 5% to 65%. This might be caused by the
high proportion of abnormal lung sound samples or by an uneven distribution of the samples.
The segmentation length of 5 s with 50% overlap gave the best classification result: 73.3%
for accuracy, 66.7% for sensitivity, 80% for specificity, and 71.5% for F1 score. This indicates
that a segmentation length of 5 s can contain the most complete information about one
respiration. In addition, as in the case of heart sounds, allowing a certain degree of overlap
in the segmentation can effectively improve the overall classification accuracy.
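
As a consistency check on the reported lung sound scores, the four metrics can be recomputed from a pooled confusion matrix. The counts below are an assumption and are not reported in this paper: they suppose three folds with 10 abnormal and 10 normal test samples each, with 20 of the 30 abnormal samples and 24 of the 30 normal samples classified correctly, which reproduces the reported sensitivity and specificity.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and F1 score from confusion counts
    (abnormal is treated as the positive class)."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # recall on abnormal samples
    specificity = tn / (tn + fp)  # recall on normal samples
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, f1


# Assumed pooled counts over three folds (hypothetical, inferred as described above).
acc, sens, spec, f1 = binary_metrics(tp=20, fn=10, tn=24, fp=6)
print(f"accuracy    {acc:.1%}")
print(f"sensitivity {sens:.1%}")
print(f"specificity {spec:.1%}")
print(f"F1 score    {f1:.1%}")
```

With these assumed counts, the accuracy, sensitivity, and specificity match the reported 73.3%, 66.7%, and 80%. The pooled F1 score is about 71.4%, slightly below the reported 71.5%; one possible explanation is that the reported value was averaged over folds rather than pooled.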
In summary, the above results show that the voting ratio usually correlates with
the metrics. There are three factors that affect the trends in the metrics. The first is the
percentage of abnormal heart sounds or lung sounds in the samples. In medical terms, even if
a heart sound or lung sound is diagnosed as abnormal, not every heartbeat in that heart
sound or respiration in that lung sound is necessarily abnormal. The second is the total
number of small frames segmented from a given sample. The higher the number of frames, the
smoother the score curve, the easier it is to see the trend, and the more accurate the
results. The third is the frame-level accuracy of the trained classifier. The higher the
accuracy on the frames, the higher the accuracy of the subsequent voting.
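
The second factor can be made concrete. With n frames per sample, the share of abnormal frames can only take the values 0, 1/n, ..., 1, so many different voting ratios lead to the same sample-level decision when n is small. The following minimal sketch assumes the decision rule "abnormal if the share of abnormal frames is at least the voting ratio"; the function name and the frame counts are hypothetical.

```python
import math


def min_abnormal_frames(n_frames, voting_ratio):
    """Smallest number of abnormal frames that makes a sample abnormal
    under the assumed rule: abnormal_frames / n_frames >= voting_ratio."""
    # A small epsilon guards against floating-point noise in the product.
    return max(0, math.ceil(n_frames * voting_ratio - 1e-9))


ratios = [r / 100 for r in range(5, 100, 5)]  # 5%, 10%, ..., 95%

for n_frames in (4, 20):
    thresholds = [min_abnormal_frames(n_frames, r) for r in ratios]
    distinct = len(set(thresholds))
    print(f"{n_frames} frames: {distinct} distinct decision rules "
          f"over {len(ratios)} voting ratios")
```

With 4 frames, the 19 voting ratios collapse into only 4 distinct decision rules, so the score curve moves in coarse steps; with 20 frames, each of the 19 voting ratios gives a different rule and the curve is smoother.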

5. Conclusions
In this paper, the design of an electronic stethoscope and an AI classification algorithm
for cardiopulmonary sounds were addressed. Five models of electronic stethoscopes
have been proposed and tested. In Models 1 and 2, a microphone is installed inside the
stethoscope head to collect heart and lung sounds and a second microphone is attached
to the back of the stethoscope head for collecting environmental noise. In Models 3 and
4, two stethoscope heads, each with a microphone installed, are connected back-to-back:
one stethoscope head collects heart and lung sounds and the other collects
environmental noise. Cork is used in Models 2 and 4 to isolate environmental
sounds. In Model 5, only one stethoscope head covered by cork is used. The collected
sounds are processed through two-stage amplification and filter circuits. The processed
noise signal is then optionally subtracted from each processed heart/lung sound for noise
reduction, and a Raspberry Pi is used to record the final sound. The effect of noise
reduction for the presented electronic stethoscopes was tested in two different
experimental sites. When subtraction is not used, Model 4 performs better, with lower
noise amplitudes. When subtraction is used for Models 1–4, Model 4 still outperforms the
other three models, with the lowest subtracted noise signals. The cork surrounding the
double heads in Model 4 provides a degree of noise isolation. However, frequency shifts
due to the different sound media associated with the stethoscope head and the noise
microphone were observed during the course of the study. Therefore, this issue may need
further investigation.
For the cardiopulmonary sound classifications, a voting ensemble learning approach
combined with PCA and MFCC was developed. Two public databases were used for
training and testing. Different segmentation lengths (1 s, 1.5 s, and 2 s) and different
overlap ratios (0%, 20%, and 50%) were applied to segment one sample into several small
frames. Four common metrics (accuracy, sensitivity, specificity, and F1 score) were used to
evaluate the performance of the algorithm. After testing, the best voting ratio for heart
sounds falls at 5–45% and the best voting ratio for lung sounds falls at 5–65%. Based on the
results for the heart sound testing samples containing no noise, the best overall score is
obtained using 2 s frame segmentation with a 20% overlap: 86.9% for accuracy, 81.9% for
sensitivity, 91.8% for specificity, and 86.1% for F1 score. For the lung sound testing
samples, the best overall score is obtained using 5 s frame segmentation with a 50% overlap:
73.3% for accuracy, 66.7% for sensitivity, 80% for specificity, and 71.5% for F1 score. A
signal segmentation length long enough to cover one heartbeat or one respiration, together
with a certain degree of overlap in segmentation, effectively improves the overall
classification performance.

Author Contributions: Conceptualization, Y.-C.W., C.-C.H., C.-S.C., T.-Y.S., H.-M.C. and J.-Y.L.;
methodology, Y.-C.W., C.-C.H., C.-S.C., F.-L.C. and S.-F.C.; software, F.-L.C. and S.-F.C.; validation,
Y.-C.W., C.-C.H., C.-S.C., F.-L.C., S.-F.C., T.-Y.S., H.-M.C. and J.-Y.L.; formal analysis, Y.-C.W., C.-C.H.,
F.-L.C. and S.-F.C.; investigation, Y.-C.W., C.-C.H., C.-S.C., F.-L.C. and S.-F.C.; resources, Y.-C.W.
and C.-C.H.; data curation, Y.-C.W., C.-C.H., F.-L.C. and S.-F.C.; writing—original draft preparation,
Y.-C.W., C.-C.H., F.-L.C. and S.-F.C.; writing—review and editing, Y.-C.W.; visualization, Y.-C.W.,
C.-C.H., F.-L.C. and S.-F.C.; supervision, Y.-C.W. and C.-C.H.; project administration, Y.-C.W., C.-C.H.,
C.-S.C., T.-Y.S. and H.-M.C.; funding acquisition, Y.-C.W., C.-C.H., T.-Y.S. and H.-M.C. All authors
have read and agreed to the published version of the manuscript.
Funding: This research was partially funded by the Ministry of Science and Technology, Taiwan,
grant number MOST 107-2622-E-239-003-CC3 and by the Taichung Veterans General Hospital, Taiwan,
grant numbers TCVGH-NUU1098901 and TCVGH-NUU1108901.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.

References
1. McLane, I.; Emmanouilidou, D.; West, J.E.; Elhilali, M. Design and Comparative Performance of a Robust Lung Auscultation
System for Noisy Clinical Settings. IEEE J. Biomed. Health Inform. 2021, 25, 2583–2594. [CrossRef] [PubMed]
2. Zhang, X.; Maddipatla, D.; Narakathu, B.B.; Bazuin, B.J.; Atashbar, M.Z. Development of a Novel Wireless Multi-Channel
Stethograph System for Monitoring Cardiovascular and Cardiopulmonary Diseases. IEEE Access 2021, 9, 128951–128964.
[CrossRef]
3. Toda, M.; Thompson, M.L. Contact-type Vibration Sensors Using Curved Clamped PVDF Film. IEEE Sens. J. 2006, 6, 1170–1177.
[CrossRef]
4. Duan, S.; Wang, W.; Zhang, S.; Yang, X.; Zhang, Y.; Zhang, G. A Bionic MEMS Electronic Stethoscope with Double-Sided
Diaphragm Packaging. IEEE Access 2021, 9, 27122–27129. [CrossRef]
5. Shi, P.; Li, Y.; Zhang, W.; Zhang, G.; Cui, J.; Wang, S.; Wang, B. Design and Implementation of Bionic MEMS Electronic Heart
Sound Stethoscope. IEEE Sens. J. 2022, 22, 1163–1172. [CrossRef]
6. Andreozzi, E.; Fratini, A.; Esposito, D.; Naik, G.; Polley, C.; Gargiulo, G.D.; Bifulco, P. Forcecardiography: A Novel Technique to
Measure Heart Mechanical Vibrations onto the Chest Wall. Sensors 2020, 20, 3885. [CrossRef] [PubMed]
7. Andreozzi, E.; Gargiulo, G.D.; Esposito, D.; Bifulco, P. A Novel Broadband Forcecardiography Sensor for Simultaneous Monitoring
of Respiration, Infrasonic Cardiac Vibrations and Heart Sounds. Front. Physiol. 2021, 18, 725716. [CrossRef] [PubMed]
8. Chien, J.; Huang, M.; Lin, Y.; Chong, F. A study of heart sound and lung sound separation by independent component analysis
technique. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New
York, NY, USA, 30 August–3 September 2006; pp. 5708–5711.
9. Hadjileontiadis, L.J.; Panas, S.M. A wavelet-based reduction of heart sound noise from lung sounds. Int. J. Med. Inform. 1998, 52,
183–190. [CrossRef]
10. Liu, F.; Wang, Y.; Wang, Y. Research and Implementation of Heart Sound Denoising. Phys. Procedia 2012, 25, 777–785.
[CrossRef]
11. Mayorga, P.; Valdez, J.A.; Druzgalski, C.; Zeljkovic, V.; Magana-Almaguer, H.; Morales-Carbajal, C. Cardiopulmonary sound
sources separation. In Proceedings of the 2021 Global Medical Engineering Physics Exchanges/Pan American Health Care
Exchanges, Sevilla, Spain, 15–20 March 2021.
12. Lin, L.; Tanumihardja, W.A.; Shih, H. Lung-heart sound separation using noise assisted multivariate empirical mode decomposi-
tion. In Proceedings of the 2013 International Symposium on Intelligent Signal Processing and Communication Systems, Naha,
Japan, 12–15 November 2013; pp. 726–730.
13. Jusak, J.; Puspasari, I.; Susanto, P. Heart murmurs extraction using the complete ensemble empirical mode decomposition and the
Pearson distance metric. In Proceedings of the 2016 International Conference on Information & Communication Technology and
Systems (ICTS), Surabaya, Indonesia, 12 October 2016; pp. 140–145.
14. Papadaniil, C.D.; Hadjileontiadis, L.J. Efficient Heart Sound Segmentation and Extraction Using Ensemble Empirical Mode
Decomposition and Kurtosis Features. IEEE J. Biomed. Health Inform. 2014, 18, 1138–1152. [CrossRef] [PubMed]
15. Varghees, V.N.; Ramachandran, K.I. Effective Heart Sound Segmentation and Murmur Classification Using Empirical Wavelet
Transform and Instantaneous Phase for Electronic Stethoscope. IEEE Sens. J. 2017, 17, 3861–3872. [CrossRef]
16. Ntalampiras, S. Collaborative Framework for Automatic Classification of Respiratory Sounds. IET Signal Process. 2020, 14,
223–228. [CrossRef]
17. Potes, C.; Parvaneh, S.; Rahman, A.; Conroy, B. Ensemble of feature-based and deep learning-based classifiers for detection
of abnormal heart sounds. In Proceedings of the 2016 Computing in Cardiology Conference, Vancouver, BC, Canada, 11–14
September 2016; pp. 621–624.
18. Chowdhury, T.H.; Poudel, K.N.; Hu, Y. Time-frequency Analysis, Denoising, Compression, Segmentation, and Classification of
PCG Signals. IEEE Access 2020, 8, 160882–160890. [CrossRef]
19. Kumar, D.; Carvalho, P.; Antunes, M.; Paiva, R.P.; Henriques, J. Heart murmur classification with feature selection. In Proceedings
of the 32nd Annual International Conference of the IEEE Engineering Medicine and Biology Society, Buenos Aires, Argentina, 31
August–4 September 2010.
20. Li, J.; Ke, L.; Du, Q.; Ding, X.; Chen, X.; Wang, D. Heart Sound Signal Classification Algorithm: A Combination of Wavelet
Scattering Transform and Twin Support Vector Machine. IEEE Access 2019, 7, 179339–179348. [CrossRef]
21. Gjoreski, M.; Gradisek, A.; Budna, B.; Gams, M. Machine Learning and End-to-end Deep Learning for the Detection of Chronic
Heart Failure from Heart Sounds. IEEE Access 2020, 8, 20313–20324. [CrossRef]
22. Shuvo, S.B.; Ali, S.N.; Swapnil, S.I.; Al-Rakhami, M.S.; Gumaei, A. CardioXNet: A Novel Lightweight Deep Learning Framework
for Cardiovascular Disease Classification Using Heart Sound Recordings. IEEE Access 2021, 9, 36955–36967. [CrossRef]
23. Liu, C.; Springer, D.; Li, Q.; Moody, B.; Juan, R.A.; Chorro, F.J.; Castells, F.; Roig, J.M.; Silva, I.; Johnson, A.E.; et al. An Open
Access Database for the Evaluation of Heart Sound Algorithms. Physiol. Meas. 2016, 37, 2181–2213. [CrossRef] [PubMed]
24. Wu, Y.-C.; Chang, F.-L. Development of an electronic stethoscope using raspberry. In Proceedings of the 2021 IEEE International
Conference on Consumer Electronics-Taiwan, Penghu, Taiwan, 15–17 September 2021.
25. Heart Sounds. Available online: [Link] (accessed on 11 April 2022). (In Chinese)
26. Ward Construction Noise. Available online: [Link] (accessed on 10 April 2022).
27. Airport Noise. Available online: [Link] (accessed on 10 April 2022).
28. Classification of Heart Sound Recordings: The PhysioNet/Computing in Cardiology Challenge 2016. Available online: https:
//[Link]/content/challenge-2016/1.0.0/ (accessed on 10 April 2022).
29. Respiratory Sound Database. Available online: [Link]
(accessed on 10 April 2022).
30. Jolliffe, I.T.; Cadima, J. Principal Component Analysis: A Review and Recent Developments. Philos. Trans. R. Soc. A 2016, 374,
20150202. [CrossRef] [PubMed]
31. Mel Frequency Cepstral Coefficient (MFCC) Tutorial. Available online: [Link]
machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/ (accessed on 11 April 2022).
32. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Sugantha, P.N. Ensemble Deep Learning: A Review. arXiv 2022,
arXiv:2104.02395v2.
33. Zabihi, M.; Rad, A.B.; Kiranyaz, S.; Gabbouj, M.; Katsaggelos, A.K. Heart sound anomaly and quality detection using ensemble
of neural networks without segmentation. In Proceedings of the 2016 Computing in Cardiology Conference, Vancouver, BC,
Canada, 11–14 September 2016; pp. 613–616.
34. Kay, E.; Agarwal, A. DropConnected neural network trained with diverse features for classifying heart sounds. In Proceedings of
the 2016 Computing in Cardiology Conference, Vancouver, BC, Canada, 11–14 September 2016; pp. 617–620.
