Original article
Higher agreement between readers with deep learning CAD software for
reporting pulmonary nodules on CT
H.L. Hempel a, M.P. Engbersen b, J. Wakkie b, B.J. van Kelckhoven a, W. de Monyé a,*
a Department of Radiology, Spaarne Gasthuis Hospital, Hoofddorp, the Netherlands
b Aidence B.V., Amsterdam, the Netherlands
ARTICLE INFO

Keywords: Lung nodules; BTS; Computer aided detection; CT; Deep-learning

ABSTRACT

Purpose: The aim was to evaluate the impact of CAD software on the pulmonary nodule management recommendations of radiologists in a cohort of patients with incidentally detected nodules on CT.
Methods: For this retrospective study, two radiologists independently assessed 50 chest CT cases for pulmonary nodules to determine the appropriate management recommendation, twice, unaided and aided by CAD with a 6-month washout period. Management recommendations were given in a 4-point grade based on the BTS guidelines. Both reading sessions were recorded to determine the reading times per case. A reduction in reading times per session was tested with a one-tailed paired t-test, and a linear weighted kappa was calculated to assess interobserver agreement.
Results: The mean age of the included patients was 65.0 ± 10.9 years. Twenty patients were male (40 %). Reading time was significantly reduced with CAD for readers 1 and 2, by 33.4 % and 42.6 %, respectively (p < 0.001 for both). The linear weighted kappa between readers was 0.61 unaided and improved to 0.84 with the aid of CAD. The mean reading time per case was 226.4 ± 113.2 s and 320.8 ± 164.2 s unaided, and 150.8 ± 74.2 s and 184.2 ± 125.3 s aided by CAD software, for readers 1 and 2, respectively.
Conclusion: A dedicated CAD system for aiding in pulmonary nodule reporting may help improve the uniformity
of management recommendations in clinical practice.
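
As a quick arithmetic check, the reported reductions of 33.4 % and 42.6 % are consistent with the relative change in the mean reading times per case given above. The sketch below assumes the reduction is expressed relative to the unaided mean; the source does not state how the percentages were derived, and the names `mean_times` and `relative_reduction` are illustrative only.

```python
# Mean reading times per case in seconds, as reported in the abstract.
mean_times = {
    "reader 1": {"unaided": 226.4, "aided": 150.8},
    "reader 2": {"unaided": 320.8, "aided": 184.2},
}


def relative_reduction(unaided: float, aided: float) -> float:
    """Return the reduction in reading time as a percentage of the unaided time.

    Assumption: the reduction is computed on the means, not per case.
    """
    return 100.0 * (unaided - aided) / unaided


# Round to one decimal to match the precision used in the abstract.
reductions = {
    reader: round(relative_reduction(t["unaided"], t["aided"]), 1)
    for reader, t in mean_times.items()
}

for reader, pct in reductions.items():
    print(f"{reader}: {pct} % shorter mean reading time with CAD")
```

Both values match the abstract to one decimal place, which suggests, but does not prove, that the percentages were computed on the mean reading times.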
1. Introduction

The increasing demand for ultrasound, computed tomography (CT), and magnetic resonance imaging (MRI) has dramatically increased the workload of radiologists over the last decades. The number of cross-sectional studies needing reporting from radiologists increased two-fold in the period 1999–2010 [1], and for CT specifically, the radiologist’s workload during on-call hours was reported to have quadrupled from 2006 to 2020 [2]. This pressure on the radiologist’s practice can increase missed cases and diagnostic errors [3,4].

Some of this increased workload can be attributed to pulmonary nodules, a prevalent CT finding. One or more pulmonary nodules have been reported as an incidental finding in 14–31 % of patients undergoing chest CT imaging for any clinical indication [5–7], and pulmonary nodules were found at baseline in 51 % of lung cancer screening trial participants [8]. Considering that more than 95 % of these findings are benign, it is crucial that pulmonary nodules are managed safely and cost-effectively to prevent unnecessary patient burden and healthcare utilization but still allow for the early detection of lung cancer or lung metastases.

Specific nodule characteristics help radiologists stratify the risk of malignancy. Characteristics such as size, composition, and location are implemented in malignancy risk prediction methods, like the Brock or PanCan risk prediction model [9,10], to help determine the level of risk for developing lung cancer. In addition, guidelines such as the 2015 British Thoracic Society (BTS) guidelines and the 2015 Fleischner society guidelines give recommendations regarding appropriate follow-up [11,12]. Despite this, a low to moderate interobserver agreement is often reported between radiologists on pulmonary nodule management recommendations [13–16].
Abbreviations: BTS, British Thoracic Society; CAD, computer-aided detection; CT, computed tomography; kVp, peak kilovoltage; MRI, magnetic resonance imaging; PACS, picture archiving and communication system.
* Corresponding author at: Department of Radiology, Spaarne Gasthuis Hospital, Spaarnepoort 1, 2134 TM, Hoofddorp, the Netherlands.
E-mail address: [email protected] (W. de Monyé).
https://doi.org/10.1016/j.ejro.2022.100435
Received 9 April 2022; Received in revised form 21 July 2022; Accepted 28 July 2022
Available online 2 August 2022
2352-0477/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
H.L. Hempel et al. European Journal of Radiology Open 9 (2022) 100435
Computer-aided detection (CAD) systems have been developed to support radiologists in several tasks for reporting pulmonary nodules on chest CT, and some of these systems are commercially available. These CAD systems have shown high sensitivities on their own [17] and as a second or concurrent reader and have been shown to improve a radiologist’s sensitivity for reporting pulmonary nodules [18–20]. How CAD software affects pulmonary nodule management recommendations remains to be determined. Therefore, this study aimed to evaluate the effect of CAD software on interobserver agreement of pulmonary nodule management recommendations.

2. Methods

Institutional review board approval was obtained for this single-center study and informed consent was waived due to its retrospective nature (reference number: 2018.0061). The study was performed in a large teaching hospital in the Netherlands. To prevent any diagnostic or treatment impact on patients as a result of the study, only scans acquired more than 5 years before the start of the study were included. The image database of the institution was manually consulted for eligible studies between July 2013 and September 2013 by a resident radiologist. Fifty adult patients scanned with chest CT were selected for pulmonary nodule assessment. Eligibility was determined based on the initial radiology reports and the availability of prior scans in PACS. Predetermined stratification criteria ensured a patient cohort containing cases with and without nodules, as well as with or without prior imaging. The stratification criteria were as follows: (a) no pulmonary nodules, (b) pulmonary nodules without prior scans, (c) pulmonary nodules with prior scans which do not contain actionable nodules, or (d) pulmonary nodules with prior scans which include actionable nodules that require follow-up. Five, ten, five, and thirty patients were included in groups a to d, respectively, for a cohort size of 50 patients. Patients with CT scans reporting more than 5 pulmonary nodules, a pulmonary mass (>30 mm in largest axial diameter), or interstitial lung disease were excluded from this study.

The chest CT scans were performed on various multislice systems: Aquilion One (n = 56; Toshiba Medical Systems, Otawara, Japan), Sensation 16 (n = 25; Siemens Medical Solutions, Forchheim, Germany), and Gemini 16 (n = 4; Philips Medical Systems, Best, the Netherlands). Scans were performed at 100, 120, or 140 kVp at variable mAs. The image data were reconstructed with a lung filter kernel at a slice thickness setting of either 2.0 mm (n = 73) or 3.0 mm (n = 12). The convolution kernels used were FC08 (n = 2), FC18 (n = 13), FC55 (n = 15) and FC56 (n = 26) for Toshiba systems, B31f (n = 1) and B24f (n = 24) for Siemens systems, and A (n = 2), B (n = 1), L (n = 1) for Philips systems. Routine nonionic intravenous contrast was applied in 63/85 (74.1 %) main and prior scans (300 mgI/ml Omnipaque, GE Healthcare, IL, USA).

2.2. CT assessment

All scans of the study cohort were anonymized and migrated to a local test workstation which was identical to the workstation used in clinical practice. Two readers assessed all scans twice (two reading sessions) with a washout period of 6 months. The order in which the scans were to be reported was randomized at the start of each reading session. Reader 1 is a thoracic radiologist with 15 years of experience in reporting pulmonary nodules on chest CTs and reader 2 is a general radiologist with 13 years of experience in reporting pulmonary nodules on chest CTs. The workstation included AGFA Enterprise Imaging 8.1.2 (AGFA Healthcare N.V., Mortsel, Belgium) and Vitrea Enterprise Solution (Vital Images Inc, Minnetonka, Minnesota, United States) (“VITREA”), which includes a semi-automated volumetry tool but no volume doubling time calculator; for this, a web-based tool was available (http://www.chest-xray.com/index.php/calculators/doublingtime). The first reading session was performed without a CAD system (unaided) and the second session was performed with the availability of the CAD outputs (aided) (Veye Chest v2.15.3, Aidence B.V., Amsterdam, NL). The CAD system automatically detects and segments pulmonary nodules and provides information such as nodule composition (solid, sub-solid), diameter, volume, and volumetric changes over time (growth percentage and volume doubling time). The CAD outputs are made available to the radiologists after processing within the reader’s workstation as two separate DICOM series of the original scan study. One series contains a single summary image of the nodule findings and the other contains the original axial chest series with an overlay highlighting the CAD’s nodule findings. Each reading session was recorded with screen recording software (Camtasia, TechSmith, Okemos, Michigan, United States).

The main scan of each of the 50 patients was assessed together with its prior scan where applicable (35 prior scans in total) as one case. The readers were tasked to read the scans to determine the pulmonary nodule management recommendation, to report the relevant pulmonary nodules that contributed to their management decision, and to disregard any concurrent abnormalities. The readers reported the relevant nodules’ location, composition, volume, and, if applicable, nodule growth percentage and volume doubling time. If volumetry was not deemed reliable, the longest axial diameters were reported. An actionable nodule was defined as a non-calcified pulmonary nodule with a volume between 65 mm³ and 14,000 mm³ or with a largest axial diameter between 5 mm and 30 mm that requires follow-up according to the reader. Finally, a nodule management recommendation grade based on the 2015 British Thoracic Society guidelines was determined for each case [12]. Figs. S1 and S2, included in the Supplementary materials, present the flow diagrams used to arrive at the recommended patient management on a 4-point grade (A-D). After both reading sessions had been completed, all cases with discrepant BTS grades between readers were re-evaluated during a consensus meeting and a consensus BTS grade was determined between the two readers.

Reading time was determined by at least two reviewers independently from the screen recordings. The start of the reading was defined as the moment the main scan is opened in the viewer and the end was defined as the moment a new main scan is opened or the screen recording has ended. Discrepant reading times were re-evaluated by another reviewer to determine a final reading time.

2.4. Statistical analysis

To summarize patient demographics and radiological findings, continuous or discrete variables are presented as mean and standard deviation or median and range, where appropriate. Categorical variables are summarized in frequencies and percentages of the whole. To determine whether the mean reading time per scan was reduced by CAD, a one-sided paired t-test was performed. A linear weighted kappa was used to assess the agreement of the BTS grade between readers and consensus. Confusion matrix analysis with exact binomial confidence limits of the BTS grades was performed to evaluate the diagnostic performance of readers versus the consensus reading. Statistical analyses were performed with R statistical software (version 4.1.1, R Foundation for Statistical Computing, Vienna, Austria) and Python programming language (version 3.9.7, Python Software Foundation, Delaware, USA).

3. Results

The mean age in years of the fifty included patients was 65.0 ± 10.9 (range 32–84) at the time of the main scan. Twenty patients were male (40 %). A total of 64 and 63 nodules were reported by readers 1 and 2
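
The linear weighted kappa named in the statistical analysis can be illustrated with a minimal sketch. It assumes the standard definition with weights w_ij = 1 − |i − j| / (k − 1) over k ordered categories, here the four BTS grades A to D. The function name and the example grades are hypothetical and are not the study data; the authors' own R and Python code is not given in the source.

```python
from collections import Counter

GRADES = ["A", "B", "C", "D"]  # 4-point ordinal grade, ordered from A to D


def linear_weighted_kappa(ratings_1, ratings_2, categories=GRADES):
    """Linear weighted kappa for two raters grading the same cases."""
    if len(ratings_1) != len(ratings_2) or not ratings_1:
        raise ValueError("both raters must grade the same, non-empty set of cases")

    k = len(categories)
    n = len(ratings_1)
    index = {grade: i for i, grade in enumerate(categories)}

    # Joint and marginal counts of the grades given by each rater.
    pairs = Counter((index[a], index[b]) for a, b in zip(ratings_1, ratings_2))
    row = Counter(index[a] for a in ratings_1)
    col = Counter(index[b] for b in ratings_2)

    def weight(i, j):
        # Full credit on the diagonal, linearly less credit per grade apart.
        return 1.0 - abs(i - j) / (k - 1)

    observed = sum(weight(i, j) * count / n for (i, j), count in pairs.items())
    expected = sum(
        weight(i, j) * (row[i] / n) * (col[j] / n)
        for i in range(k)
        for j in range(k)
    )
    if abs(1.0 - expected) < 1e-12:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades for ten cases, for illustration only.
reader_1_grades = list("AABBCCDDAB")
reader_2_grades = list("ABBBCDDDAB")
print(f"linear weighted kappa: {linear_weighted_kappa(reader_1_grades, reader_2_grades):.2f}")
```

Linear weights penalise a one-grade disagreement less than a three-grade disagreement, which suits an ordinal management grade better than unweighted kappa would.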
Fig. 1. Illustrative example of the CAD output as shown to the readers during the aided session of a growing part-solid nodule found in a 57-year-old female patient.

Fig. 2. Boxplots of the reading times of each reader during unaided and aided sessions. Each box represents the median (bold horizontal dash) and the interquartile range. The tails and additional data points represent the full range of the reading times. The notch represents the 95 % confidence interval of the median.

The reading times in this study were comparable to the reading times reported by Hsu et al. and Beyer et al., and both studies reported a significant reduction in reading time with CAD-aided reading [19,24]. One study demonstrated a reduction of 15.8–29 % in reading times aided by CAD by six radiologists [24] and the other only 6.9 % on average over four radiologists [19]. This study showed higher reductions aided by CAD (33–43 %). There could be several reasons for this. One is that our cohort included 35 cases with prior scans to consider and only 5 cases without nodules. Beyer et al. and Hsu et al. included 50 % and 35 % of cases without nodules, respectively, and no cases with prior imaging. The current study included 20 % of patients without nodules described in the original report and 30 % of patients with prior imaging. Also, differences in the CAD systems used may have played a role.

A radiologist’s workload has substantially increased over the past decades due to higher demands of CT, among others. The prospect of population screening programs for lung cancer with low-dose CT [25,26] will introduce even more pressure. A reduction in reading time with CAD could help radiologists keep up with demand. At our institution, approximately 11,200 new chest CTs are reported per year, of which 55 % have prior imaging. Although our research suggests an average reduction in reading time of about two minutes when reporting pulmonary nodules, our cohort is not directly representative of the actual radiologist’s workload and thus further research is warranted to determine the cost-effectiveness of CAD systems in the clinic.

5. Conclusion

A dedicated CAD system for pulmonary nodule reporting may improve the interobserver agreement on the management recommendations, which can contribute to the effectiveness of triage algorithms for detecting early-stage lung cancer patients.

CRediT authorship contribution statement

HL Hempel: Writing – original draft preparation, Data analysis, Data curation. MP Engbersen: Writing – original draft preparation, Visualization, Reviewing and Editing. J Wakkie: Conceptualization, Reviewing and Editing. BJ van Kelckhoven: Reviewing and Editing, Methodology, Investigation. W de Monyé: Conceptualization, Methodology, Reviewing and Editing, Investigation, Supervision.

Declaration of Competing Interest

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: MPE and JW declare being employed by Aidence BV; the other authors have nothing to declare.

Acknowledgements

We would like to extend our gratitude to C. de Monyé, T. Salimans, and G. Van Veenendaal for their support and expertise in making this study possible.
Ethics statement

Institutional review board approval was obtained for this single-center cohort study and informed consent was waived due to its retrospective nature (reference number: 2018.0061).

Funding statement

No research funding was received for this study. Aidence BV provided payment for the article processing charges (APC).

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.ejro.2022.100435.

References

[1] R.J. McDonald, K.M. Schwartz, L.J. Eckel, F.E. Diehn, C.H. Hunt, B.J. Bartholmai, et al., The effects of changes in utilization and technological advancements of cross-sectional imaging on radiologist workload, Acad. Radiol. 22 (2015) 1191–1198.
[2] R.J.M. Bruls, R.M. Kwee, Workload for radiologists during on-call hours: dramatic increase in the past 15 years, Insights Imaging 11 (2020) 121.
[3] E.A. Krupinski, K.S. Berbaum, R.T. Caldwell, K.M. Schartz, M.T. Madsen, D.J. Kramer, Do long radiology workdays affect nodule detection in dynamic CT interpretation? J. Am. Coll. Radiol. 9 (2012) 191–198.
[4] E. Sokolovskaya, T. Shinde, R.B. Ruchman, A.J. Kwak, S. Lu, Y.K. Shariff, et al., The effect of faster reporting speed for imaging studies on the number of misses and interpretation errors: a pilot study, J. Am. Coll. Radiol. 12 (2015) 683–688.
[5] J. Robertson, S. Nicholls, P. Bardin, R. Ptasznik, D. Steinfort, A. Miller, Incidental pulmonary nodules are common on CT coronary angiogram and have a significant cost impact, Heart Lung Circ. 28 (2019) 295–301.
[6] M.K. Gould, T. Tang, I.-L.A. Liu, J. Lee, C. Zheng, K.N. Danforth, et al., Recent trends in the identification of incidental pulmonary nodules, Am. J. Respir. Crit. Care Med. 192 (2015) 1208–1214.
[7] C. Iribarren, M.A. Hlatky, M. Chandra, J.M. Fair, G.D. Rubin, A.S. Go, et al., Incidental pulmonary nodules on cardiac computed tomography: prognosis and use, Am. J. Med. 121 (2008) 989–996.
[8] P.B. Bach, J.N. Mirkin, T.K. Oliver, C.G. Azzoli, D.A. Berry, O.W. Brawley, et al., Benefits and harms of CT screening for lung cancer: a systematic review, JAMA 307 (2012) 2418–2429.
[9] A. McWilliams, M.C. Tammemagi, J.R. Mayo, H. Roberts, G. Liu, K. Soghrati, et al., Probability of cancer in pulmonary nodules detected on first screening CT, N. Engl. J. Med. 369 (2013) 910–919.
[10] K. Chung, O.M. Mets, P.K. Gerke, C. Jacobs, A.M. den Harder, E.T. Scholten, et al., Brock malignancy risk calculator for pulmonary nodules: validation outside a lung cancer screening population, Thorax 73 (2018) 857–863.
[11] A.A. Bankier, H. MacMahon, J.M. Goo, G.D. Rubin, C.M. Schaefer-Prokop, D.P. Naidich, Recommendations for measuring pulmonary nodules at CT: a statement from the Fleischner Society, Radiology 285 (2017) 584–600.
[12] M.E.J. Callister, D.R. Baldwin, A.R. Akram, S. Barnard, P. Cane, J. Draffan, et al., British Thoracic Society guidelines for the investigation and management of pulmonary nodules, Thorax 70 (Suppl 2) (2015) ii1–ii54.
[13] S.J. van Riel, C.I. Sánchez, A.A. Bankier, D.P. Naidich, J. Verschakelen, E.T. Scholten, et al., Observer variability for classification of pulmonary nodules on low-dose CT images and its effect on nodule management, Radiology 277 (2015) 863–871.
[14] S.J. van Riel, C. Jacobs, E.T. Scholten, R. Wittenberg, M.M. Winkler Wille, B. de Hoop, et al., Observer variability for Lung-RADS categorisation of lung cancer screening CTs: impact on patient management, Eur. Radiol. 29 (2019) 924–931.
[15] A. Penn, M. Ma, B.B. Chou, J.R. Tseng, P. Phan, Inter-reader variability when applying the 2013 Fleischner guidelines for potential solitary subsolid lung nodules, Acta Radiol. 56 (2015) 1180–1186.
[16] D.S. Gierada, T.K. Pilgram, M. Ford, R.M. Fagerstrom, T.R. Church, H. Nath, et al., Lung cancer: interobserver agreement on interpretation of pulmonary findings at low-dose CT screening, Radiology 246 (2008) 265–272.
[17] C.O. Martins Jarnalo, P.V.M. Linsen, S.P. Blazís, P.H.M. van der Valk, D.B.M. Dickerscheid, Clinical evaluation of a deep-learning-based computer-aided detection system for the detection of pulmonary nodules in a large teaching hospital, Clin. Radiol. 76 (2021) 838–845.
[18] Y. Zhao, G.H. de Bock, R. Vliegenthart, R.J. van Klaveren, Y. Wang, L. Bogoni, et al., Performance of computer-aided detection of pulmonary nodules in low-dose CT: comparison with double reading by nodule volume, Eur. Radiol. 22 (2012) 2076–2084.
[19] F. Beyer, L. Zierott, E.M. Fallenberg, K.U. Juergens, J. Stoeckel, W. Heindel, et al., Comparison of sensitivity and reading time for the use of computer-aided detection (CAD) of pulmonary nodules at MDCT as concurrent or second reader, Eur. Radiol. 17 (2007) 2941–2947.
[20] L. Vassallo, A. Traverso, M. Agnello, C. Bracco, D. Campanella, G. Chiara, et al., A cloud-based computer-aided detection system improves identification of lung nodules on computed tomography scans of patients with extra-thoracic malignancies, Eur. Radiol. 29 (2019) 144–152.
[21] S.J. van Riel, C. Jacobs, E.T. Scholten, R. Wittenberg, M.M. Winkler Wille, B. de Hoop, et al., Observer variability for Lung-RADS categorisation of lung cancer screening CTs: impact on patient management, Eur. Radiol. 29 (2019) 924–931.
[22] C.A. Ridge, A. Yildirim, P.M. Boiselle, T. Franquet, C.M. Schaefer-Prokop, D. Tack, et al., Differentiating between subsolid and solid pulmonary nodules at CT: inter- and intraobserver agreement between experienced thoracic radiologists, Radiology 278 (2016) 888–896.
[23] K. Martini, T. Ottilinger, B. Serrallach, S. Markart, N. Glaser-Gallion, C. Blüthgen, et al., Lung cancer screening with submillisievert chest CT: potential pitfalls of pulmonary findings in different readers with various experience levels, Eur. J. Radiol. 121 (2019) 108720.
[24] H.-H. Hsu, K.-H. Ko, Y.-C. Chou, Y.-C. Wu, S.-H. Chiu, C.-K. Chang, et al., Performance and reading time of lung nodule identification on multidetector CT with or without an artificial intelligence-powered computer-aided detection system, Clin. Radiol. 76 (2021) 626.e23–626.e32.
[25] M. Oudkerk, A. Devaraj, R. Vliegenthart, T. Henzler, H. Prosch, C.P. Heussel, et al., European position statement on lung cancer screening, Lancet Oncol. 18 (2017) e754–e766.
[26] R.A. Smith, K.S. Andrews, D. Brooks, S.A. Fedewa, D. Manassaram-Baptiste, D. Saslow, et al., Cancer screening in the United States, 2017: a review of current American Cancer Society guidelines and current issues in cancer screening, CA Cancer J. Clin. 67 (2017) 100–121.