Journal of Pathology Informatics 13 (2022) 100136

Contents lists available at ScienceDirect

Journal of Pathology Informatics


journal homepage: www.elsevier.com/locate/jpi

Review Article

Lost in digitization – A systematic review about the diagnostic test accuracy of digital pathology solutions

Olsi Kusta a,b,⁎, Charlotte Vestrup Rift c, Torsten Risør d,e, Eric Santoni-Rugiu f,g, John Brandt Brodersen h,i

a Department of Public Health, University of Copenhagen, Øster Farimagsgade 5 opg. B, Building: 15-0-11, 1014 Copenhagen, Denmark
b Centre for Research in Assessment and Digital Learning (CRADLE), Deakin University, Melbourne, Australia
c Department of Pathology, Rigshospitalet (Copenhagen University Hospital), Blegdamsvej 9, 2100 Copenhagen, Denmark
d Centre for General Practice, Department of Public Health, University of Copenhagen, Øster Farimagsgade 5 opg. Q, Building: 24-1, 1014 Copenhagen, Denmark
e Norwegian Centre for E-health Research, UiT The Arctic University of Norway, Tromsø, Norway
f Department of Pathology, Rigshospitalet (Copenhagen University Hospital), Blegdamsvej 9, 2100 Copenhagen, Denmark
g Department of Clinical Medicine, University of Copenhagen, Blegdamsvej 9, 2100 Copenhagen, Denmark
h Centre for General Practice, Department of Public Health, University of Copenhagen, Øster Farimagsgade 5 opg. Q, Building: 24-1-21, 1014 Copenhagen, Denmark
i Primary Health Care Research Unit, Region Zealand, Øster Farimagsgade 5 opg. Q, Building: 24-1-21, 1014 Copenhagen, Denmark

ARTICLE INFO

Keywords: Human pathology; Whole slide imaging (WSI); Validation studies; Diagnostic test accuracy; Diagnostic concordance; Overdiagnosis

ABSTRACT

Introduction: Digital pathology solutions are increasingly implemented for primary diagnostics in departments of pathology around the world. This has sparked a growing engagement in validation studies to evaluate the diagnostic performance of whole slide imaging (WSI) regarding safety, reliability, and accuracy. The aim of this review was to evaluate the performance of digital pathology for diagnostic purposes compared to light microscopy (LM) in human pathology, based on validation studies designed to assess such technologies.

Methods: In this systematic review based on PRISMA guidelines, we analyzed validation studies of WSI compared with LM. We included studies of diagnostic performance of WSI regarding diagnostic test accuracy (DTA) indicators, degree of overdiagnosis, diagnostic concordance, and observer variability as a secondary outcome. Overdiagnosis is, for example, detecting a pathological condition that will either not progress or progress very slowly. Thus, the patient will never get symptoms from this condition and the pathological condition will never be the cause of death. From a search of four databases (PubMed, EMBASE, Cochrane Library, and Web of Science) covering the period 2010–2021, we screened and selected 12 peer-reviewed articles that fulfilled our selection criteria. Risk of bias was assessed with the QUADAS-2 tool, and data analysis and synthesis were performed in a qualitative format.

Results: We found that diagnostic performance of WSI was not inferior to LM for DTA indicators, concordance, and observer variability. The degree of overdiagnosis was not explicitly reported in any of the studies, while the term itself was used in one study and could be implicitly calculated in another.

Conclusion: WSI had an overall high diagnostic accuracy based on traditional accuracy measurements; however, the degree of overdiagnosis is unknown.

Contents

Introduction
Materials and methods
Results
  Study characteristics and quality assessment
  Primary and additional outcomes
    Diagnostic test accuracy indicators
    Diagnostic concordance
    Degree of overdiagnosis
    Additional outcomes
Discussion
  Study design
  Subspeciality
  Sample preparation
  Overdiagnosis
  Shortcomings of the systematic review
  Implications for practice
Conclusion
Funding support
Authors’ contributions
Conflicts of interests
Acknowledgements
Appendix A. Supplementary data
References

⁎ Corresponding author.
E-mail addresses: [email protected] (O. Kusta), [email protected] (C.V. Rift), [email protected] (T. Risør), [email protected] (E. Santoni-Rugiu), [email protected] (J.B. Brodersen).

http://dx.doi.org/10.1016/j.jpi.2022.100136
Received 3 June 2022; Received in revised form 30 August 2022; Accepted 31 August 2022
Available online 6 September 2022
2153-3539/© 2022 The Author(s). Published by Elsevier Inc. on behalf of Association for Pathology Informatics. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Introduction

In the era of precision medicine, pathology departments face multiple challenges in relation to the complexity of companion diagnostics, and strict deadlines for timely diagnoses within cancer, chronic inflammatory, and degenerative diseases,1 yielding an increased workload. Many departments in different countries are using digital pathology for their routine work as one potential solution to the above challenges.2 In Denmark, for instance, healthcare policy documents claim that this digital solution could facilitate faster response rates, better collaboration with clinicians, and in the future the opportunity to use artificial intelligence to assist diagnosis.3

Digital pathology, based on whole slide imaging (WSI) technologies, encompasses mainly 3 major components: information systems, image management system (IMS), and image analysis tools.4 There are several advantages of using WSI for clinical purposes, such as fast consultations (specialists providing second opinions or supervision of residents), remote interpretation of frozen sections in surgical pathology, and telepathology for primary diagnosis.5 Other advantages that make digital pathology appealing are biomarker research6 and the potential advantages of using artificial intelligence (AI).7

Using this technology for in vitro diagnostics (IVD) entails a validation process regarding the reliability, safety, and accuracy of these devices.8 The new European regulation for IVD medical devices (2017/746) stipulates that they require a performance evaluation to be approved for clinical use. This evaluation entails 3 main reported steps: scientific validity, analytical performance, and clinical performance.8 The latter is based on diagnostic test accuracy (DTA) indicators as also elaborated in the Cochrane collaboration.9 The most commonly referred measures of DTA are sensitivity, specificity, predictive values (of negative or positive test results), likelihood ratios, receiver operating characteristics (ROC) curves, and area under the ROC curve (AUC).

The Food and Drug Administration10 (FDA) puts forth additional guidelines for the validation process of WSI based on College of American Pathologists (CAP) recommendations,11 such as pathologists trained with WSI, a representative number of cases, an adequate time interval between the use of LM and WSI for the same case, diagnostic concordance (i.e., intraobserver variability), and that all the material in the glass slide is present in the digital format. In the evaluation and approval of the Philips IntelliSite Pathology Solution (PIPS), FDA considered the diagnostic concordance (96.5%) of WSI as non-inferior to LM in the clinical performance report.12 We have selected the studies for review based on the accuracy measurements as elaborated in both European and US regulations.

However, the use of devices with high resolution potentially introduces a risk of overdiagnosis. Overdiagnosis is detecting a cancer, for instance, that will not progress (or progress very slowly) to harm the patient or be the cause of death.13 In relation to high resolution imaging devices, the presence of overdiagnosis will cause the sensitivity and the positive-predictive value to be artificially inflated. If there is a substantial risk of overdiagnosis, the traditional DTA measures would be distorted resulting in biased performance of the diagnostic test.14,15 The main problem is that overdiagnosis cannot be captured in the traditional accuracy measurements based on the Bayesian (2x2) table as misdiagnosis or underdiagnosis, as it fulfills the pathological criteria of abnormality.16

Therefore, our research question was: what is the diagnostic performance, including the degree of overdiagnosis, of WSI compared to conventional LM? Thus, the aim of this study was to evaluate the performance through diagnostic test accuracy (DTA) indicators, degree of overdiagnosis, diagnostic concordance, and observer variability as a secondary outcome. This was done through a systematic review of validation studies of WSI versus LM.

Materials and methods

This systematic review was based on PRISMA-P guidelines,17 with the protocol registered in PROSPERO (CRD42021243403). A PRISMA flow diagram was created to present the selection process for this systematic review (Fig. 1). Two authors (CVR and OK), independently of each other, screened the databases, extracted the data, assessed the quality of the studies, analyzed, and provided a synthesis for the results. In cases of disagreement during these steps, JBB was consulted to arbitrate.

The evaluation of WSI versus LM was based on 3 main outcomes: DTA indicators,9 diagnostic concordance, and degree of overdiagnosis. For the latter, we screened for its 2 main causes: overdetection and overdefinition. The first is defined as finding pathological abnormalities that will never progress to do any harm or progress very slowly, thus not being the cause of death.16 Overdefinition, the other subtype, can either be lowering the threshold for a risk factor without evidence of any beneficial effects or expanding the disease definition including, e.g., milder symptoms.16 The additional outcome included here was observer variability.

Our focus was only on human pathology, including all the tissue specimen preparations such as biopsies, resected specimens, frozen sections, and cytology samples; and all the stains used for diagnostic purposes, such as hematoxylin and eosin (HE), immunohistochemical stains (IHC), and special stains. Only WSI systems were considered and no additional system tools, i.e., image analysis algorithms.4 We included only peer-reviewed articles regarding clinical evaluation or validation studies and no gray literature.

We searched 4 databases during May, and August–October 2021: PubMed, EMBASE, Cochrane Library, and Web of Science – including articles published during the period 2010–2021. The main simplified search string was: Digital Pathology (whole slide imaging OR digital microscope OR virtual microscope) OR Digital Slides (digitized slides OR virtual slides) AND Diagnostic Accuracy (DTA OR diagnostic performance OR accuracy) AND NOT Image Processing, Computer Assisted [Mesh terms] (machine learning OR artificial intelligence OR algorithms).
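
As a worked illustration of the accuracy measures discussed in the Introduction, the sketch below derives sensitivity, specificity, and predictive values from a 2x2 table of index test against reference standard, and then shows the direction in which these measures shift if some true positives are regarded as overdiagnosed. The class `TwoByTwo`, the function `reclassify_overdiagnosed`, and all counts are hypothetical. Treating overdiagnosed cases as clinically non-diseased is an assumption made only for this illustration; it is not a method used in this review or in the included studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index test results (e.g. WSI) against a reference standard."""

    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def reclassify_overdiagnosed(table: TwoByTwo, overdiagnosed: int) -> TwoByTwo:
    """Move `overdiagnosed` cases from true positives to false positives.

    Illustrative assumption: an overdiagnosed case fulfils the pathological
    criteria (so the 2x2 table counts it as a true positive) but would never
    have harmed the patient, so here it is treated as clinically non-diseased.
    """
    if not 0 <= overdiagnosed <= table.tp:
        raise ValueError("overdiagnosed must be between 0 and the number of true positives")
    return TwoByTwo(
        tp=table.tp - overdiagnosed,
        fp=table.fp + overdiagnosed,
        fn=table.fn,
        tn=table.tn,
    )


# Hypothetical counts, not taken from any included study.
conventional = TwoByTwo(tp=90, fp=5, fn=10, tn=95)
adjusted = reclassify_overdiagnosed(conventional, overdiagnosed=20)

print(f"sensitivity: {conventional.sensitivity():.3f} -> {adjusted.sensitivity():.3f}")
print(f"PPV:         {conventional.ppv():.3f} -> {adjusted.ppv():.3f}")
```

With these made-up counts, the conventional table gives a sensitivity of 0.900 and a PPV of about 0.947, whereas the adjusted table gives 0.875 and about 0.737. The size of the shift depends entirely on the assumed number of overdiagnosed cases, which is exactly the quantity the conventional 2x2 table does not record.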

Fig. 1. Flowchart based on Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)a guidelines.
a The figure was drafted based on a freely available template at http://prisma-statement.org/documents/PRISMA%202009%20flow%20diagram.pdf.

The quality of the selected studies was assessed through the modified Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool.18 The assessment of bias in the studies was based on 4 domains: patient selection, index test, reference standard, and flow and timing (the flow of patients in the study and the timing of the intervention(s)).19

Primary and secondary outcomes are reported in a tabular form, while the other extracted data are provided as supplementary material. We did not conduct a meta-analysis because of the heterogeneity of the studies.

Results

Study characteristics and quality assessment

We identified 2402 unique records in our literature search, of which 71 articles were included for full text reading and possible eligibility for the study (Fig. 1). Among the 71 articles, 12 fulfilled the main selection criterion for our study, that is, reporting at least 2 of the primary outcomes (i.e., DTA indicators, diagnostic concordance, and overdiagnosis). Of the 12 studies in our review, 4 did not specify the kind of study20–23; 3 were retrospective studies,24–26 2 comparative studies,27,28 and the remaining 3 were a randomized,29 an evaluation,30 and a validation study,31 respectively. The characteristics of the studies are presented in the Supplementary Tables 1 and 2.

Of emphasis concerning digitization of slides is that only 2 studies reported minor technical discrepancies. One study elaborated on a technical issue where 11 of 124 slides needed a rescan and 4 were excluded due to failed digitization31; while another stated that 6 slides had loss of diagnostic material on the fine needle biopsy.21 The most used WSI scanner, as reported in 4 studies, was Aperio ScanScope XT (Aperio Technologies, Vista, Calif., USA),22,24,26,28 followed by iScan Coreo (Ventana, Tucson, Ariz., USA) used in 3 studies.20,23,29 In the remaining studies, there were diverse scanners used such as Mirax scanner (Carl Zeiss MicroImaging, Jena, Germany),25,30 NanoZoomer S260 (Hamamatsu photonics, Japan),27 Navigo (Visia Imaging, Arezzo, Italy),31 and digital camera with NetCam software (Olympus America, Center Valley, PA).21

Regarding the quality assessment of the selected studies, overall there was a low risk of bias and applicability concerns (for more details see Tables 1 and 2, and Fig. 2).

Table 1
Judgement for Risk of Bias summarized for domains (QUADAS 2)a.
Authors | Patient selection | Index test | Reference standard | Flow and timing
Ammendola et al.27 ? ?
Brunyé et al.20
Cima et al.31
Elmore et al.29
Larghi et al.24
Nielsen et al.30 ?
Perez et al.21 ?
Ribback et al.25 ?
Tawfik et al.28 ?
Tawfik et al.26 ?
Tissier et al.22 ? ?
Zoroquiain et al.23 ?
a Table adapted from the freely available template at https://view.officeapps.live.com/op/view.aspx?src=http%3A%2F%2Fwww.bristol.ac.uk%2Fmedia-library%2Fsites%2Fquadas%2Fmigrated%2Fdocuments%2Ftable.docx&wdOrigin=BROWSELINK.

Table 2
Applicability concerns for the respective domains (QUADAS 2)a.
Authors | Patient selection | Index test | Reference standard
Ammendola et al.27
Brunyé et al.20
Cima et al.31 ?b
Elmore et al.29
Larghi et al.24
Nielsen et al.30
Perez et al.21
Ribback et al.25
Tawfik et al.28 c
Tawfik et al.26
Tissier et al.22 ? ?
Zoroquiain et al.23
a Table adapted from the freely available templates at https://view.officeapps.live.com/op/view.aspx?src=http%3A%2F%2Fwww.bristol.ac.uk%2Fmedia-library%2Fsites%2Fquadas%2Fmigrated%2Fdocuments%2Ftable.docx&wdOrigin=BROWSELINK.
b Because final FS-FFPE diagnosis based on frozen sections (FS) or formalin-fixed and paraffin embedded (FFPE) biopsies may differ from the original assessment even during routine use of LM with frozen section.
c This refers to the comparison of accuracy of WSI with LM to identify microorganisms and not human cells.

Primary and additional outcomes

The primary outcomes that we extracted concerning diagnostic performance of WSI were DTA indicators, diagnostic concordance, and degree of overdiagnosis. As emphasized earlier, the main criterion for selecting the studies was the combination of at least 2 of these outcomes. The additional outcome, observer variability, was extracted as an important accuracy measure for validating WSI as elaborated by CAP guidelines.11 Four studies reported on the diagnostic performance of both LM and WSI.24,27,29,30 Below, we briefly describe these outcomes.

Diagnostic test accuracy indicators

The main DTA indicators reported for WSI in 10 studies were sensitivity, specificity, positive-predictive values, and negative-predictive values, while in 1 study AUC was reported as a probability.27 One study did not specify any DTA indicators, but only diagnostic concordance.20 Of the 12 selected studies, 5 were based on histology preparations,22,23,27,29,30 3 used cytology preparations,21,26,28 1 study used both histology and cytology samples,24 while 2 used frozen sections.25,31 The studies selected encompassed several pathology subspecialties, with 2 of them reporting on multiple25,31 and 1 not specifying the subspecialty.21

All the results regarding the primary outcomes of accuracy measurements are shown in Table 3. At least 7 studies reported a very good performance of WSI based on DTA indicators.21–26,30,31 In these studies, sensitivity ranged from 86% to 100%, specificity from 75% to 100%, positive-predictive values from 92% to 99%, and negative-predictive values from 75% to 100%. Cima et al., examining frozen sections for intraoperative cancer staging and transplant organs, had a drop in specificity and negative-predictive values (both 75%), due to 4 discordant cases (compared to LM) in examining kidney and liver donor transplant organs.31

In a study of pancreatic pathology, Larghi et al., besides the overall good performance of WSI for sensitivity, specificity, and positive-predictive values, also reported a poor performance for negative-predictive values for both LM and WSI (51% and 52%, respectively).24 However, the authors do not explain the reasons for this poor performance.

One study of gynecological pathology, diagnosing several diseases according to the 2001 Bethesda Report, stated a poor sensitivity of WSI for each of the individual diseases (23.5%–58.3%, see Table 3 for more details).28 However, they report a higher average sensitivity (82.1%) that is adjusted to the number of cases for each diagnostic category. Similarly, in a study of surgical neuropathology, Ammendola et al. reported a poor performance of both LM and WSI based on AUC (from 0.50 to 0.72) for several diagnostic features of meningioma.27
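
When reading the predictive values reported above, it may help to recall a general property of these indicators: unlike sensitivity and specificity, positive- and negative-predictive values also depend on the proportion of cases with the target condition in the study sample. The sketch below illustrates this with Bayes' rule. The function name `predictive_values` and all input values are hypothetical, and the example is not offered as an explanation of the results of any included study.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied to a sample with the given prevalence.

    All three inputs are proportions between 0 and 1. The expected fractions of
    the four 2x2 cells are built first, then the predictive values are read off.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")

    tp = sensitivity * prevalence              # diseased, test positive
    fn = (1.0 - sensitivity) * prevalence      # diseased, test negative
    tn = specificity * (1.0 - prevalence)      # non-diseased, test negative
    fp = (1.0 - specificity) * (1.0 - prevalence)  # non-diseased, test positive

    if tp + fp == 0.0 or tn + fn == 0.0:
        raise ValueError("predictive value undefined: no positive or no negative test results")
    return tp / (tp + fp), tn / (tn + fn)


# Same hypothetical test (90% sensitivity, 90% specificity) in two assumed samples.
for prevalence in (0.5, 0.9):
    ppv, npv = predictive_values(0.9, 0.9, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

In this made-up example the negative-predictive value falls from about 0.90 to about 0.50 when the assumed prevalence rises from 50% to 90%, although sensitivity and specificity are unchanged. Predictive values from studies with different case mixes are therefore not directly comparable.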

Fig. 2. The proportion of the Risk of Bias and Applicability Concerns (QUADAS 2)a.
a The drafted figure is a template freely available at https://view.officeapps.live.com/op/view.aspx?src=http%3A%2F%2Fwww.bristol.ac.uk%2Fmedia-library%2Fsites%2Fquadas%2Fmigrated%2Fdocuments%2Fgraphs.xlsx&wdOrigin=BROWSELINK.
Table 3
Primary outcomes of diagnostic test accuracy (DTA) indicators and diagnostic concordance.

Source: Ammendola et al.27
Subspecialty: Surgical neuropathology
Diagnostic purpose: Grading of meningioma
Primary outcomes: Area Under the Curve (AUC)a
Histopathological featuresb | Observer 1 LM | Observer 1 WSI | Observer 2 LM | Observer 2 WSI | Observer 3 LM | Observer 3 WSI | Observer 4 LM | Observer 4 WSI
Brain invasion | 0.50 | 0.50 | 0.51 | 0.51 | 0.53 | 0.55 | 0.50 | 0.55
High mitotic index | 0.64 | 0.72 | 0.60 | 0.61 | 0.58 | 0.65 | 0.56 | 0.68
Hypercellularity | 0.54 | 0.52 | 0.58 | 0.58 | 0.50 | 0.50 | 0.54 | 0.50
Sheeting | 0.57 | 0.52 | 0.59 | 0.59 | 0.55 | 0.59 | 0.50 | 0.62
Macronucleoli | 0.53 | 0.51 | 0.55 | 0.53 | 0.51 | 0.53 | 0.53 | 0.53
Small cells | 0.55 | 0.51 | 0.63 | 0.61 | 0.54 | 0.53 | 0.52 | 0.54
Spontaneous necrosis | 0.51 | 0.52 | 0.61 | 0.61 | 0.51 | 0.51 | 0.56 | 0.54

Source: Brunyé et al.20
Subspecialty: Breast pathology
Diagnostic purpose: Classification of breast neoplasms
Primary outcomes: Diagnostic concordance (95% CI)
Consensus diagnosis | Mean concordance | Abovec | Belowd
Benign | 71% (61–82%) | 29% (20–40%) | –
Atypia | 37% (29–45%) | 21% (15–28%) | 43% (35–50%)
Ductal Carcinoma in Situ (DCIS) | 52% (43–61%) | 17% (12–23%) | 31% (25–39%)
Invasive breast cancer | 94% (88–99%) | – | 6% (2–14%)

Source: Cima et al.31
Subspecialty: Multiple subspecialties and organs
Diagnostic purpose: Cancer staging (surgical margins, tumor biology, lymph node status) and organ quality for transplantation
Primary outcomes | Cancer (WSI) | Transplant (WSI)
Sensitivity | 100% | 96%
Specificity | 96% | 75%
Positive-predictive values | 95% | 96%
Negative-predictive values | 100% | 75%
Diagnostic concordance | 97% (κ=0.96, CI: 0.941–0.985) | 86% (κ=0.91, CI: 0.877–0.958)

Source: Elmore et al.29
Subspecialty: Breast pathology
Diagnostic purpose: Diagnosis of breast cancer
Primary outcomes: Predictive values
Pathologist interpretatione | LM (95% CI) | WSI (95% CI)
Benign without atypia | 97.1% (96.7–97.4%) | 95.7% (95.0–96.4%)
Atypia | 37.8% (33.6–42.7%) | 27.8% (23.9–32.5%)
Ductal Carcinoma in situ (DCIS) | 69.6% (64.4–75.3%) | 57.1% (50.6–64.8%)
Invasive breast cancer | 97.7% (96.5–98.7%) | 97.2% (95.6–98.6%)

Source: Larghi et al.24
Subspecialty: Pancreatic pathology
Diagnostic purpose: Diagnostic classification according to the Papanicolaou Society of Cytopathology system for reporting pancreatobiliary cytology
Primary outcomes | LM (95% CI) | WSI (95% CI)
Sensitivity | 92% | 93%
Specificity | 96% | 88%
Positive-predictive values | 99% | 99%
Negative-predictive values | 51% | 52%
Diagnostic concordance | 92% | 92%

Source: Nielsen et al.30
Subspecialty: Dermatopathology
Diagnostic purpose: Diagnosing neoplasms of the skin: benign, premalignant, and malignant
Primary outcomes | LM | WSI
Sensitivity | 92% (85–96%) | 86% (78–91%)
Specificity | 99.5% (97–99.5%) | 99% (97–99.5%)
Positive-predictive values | 93% (86–96.5%) | 92% (84.5–95.5%)
Negative-predictive values | 98% (97–99%) | 97% (96–98%)
Diagnostic concordancef | 72.4% | 69.6%

Source: Perez et al.21
Subspecialty: Not specified
Diagnostic purpose: Diagnosing neoplasms: benign, suspicious, and malignant
Primary outcomes | WSI
Sensitivity | 87.9%
Specificity | 95.7%
Positive-predictive values | 97.1%
Negative-predictive values | 82.7%
Diagnostic concordance | 87% (163/186)g

Source: Ribback et al.25
Subspecialty: Urology, gynecology, and dermatopathology
Diagnostic purpose: Tumor diagnosis and assessment of surgical margin
Primary outcomes | WSI
Sensitivity | 92.6%
Specificity | 99.0%
Positive-predictive values | 98.3%
Negative-predictive values | 97.7%
Diagnostic concordance | 98.35%

Source: Tawfik et al.26
Subspecialty: Gynecological pathology
Diagnostic purpose: Assessing if negative for intraepithelial lesion or malignancy
Primary outcomes: Sensitivity (95% CI)
Diagnosis | WSI
Bacterial vaginosis | 92%
Trichomonas vaginalis | 91%
Fungi | 95%

Source: Tawfik et al.28
Subspecialty: Gynecological pathology
Diagnostic purpose: Diagnosing for neoplasms, cellular changes, and infectious agents according to 2001 Bethesda reporting system and terminology
Primary outcomes: Weighted average for WSI (95% CI)
Diagnosis | Sensitivity | Specificity
Atypical squamous cells of undetermined significance (ASCUS) | 58.3% | 85.1%
Low-grade squamous intraepithelial lesions (LSIL) | 54.1% | 93.9%
High-grade squamous intraepithelial lesions (HSIL) | 51.8% | 98.8%
Atypical glandular cells of undetermined significance (AGUS) | 32.8% | 99.1%
Atypical squamous cells, cannot exclude high-grade squamous intraepithelial lesion (ASC-H) | 23.5% | 99.5%
Any conditionh | 82.1% | 86.2%

Source: Tissier et al.22
Subspecialty: Nephropathology
Diagnostic purpose: Classification of adrenocortical tumor by Weiss scorei
Primary outcomes | Reading 1j | Reading 2
Sensitivity (95% CI) | 86% | 94%
Specificity (95% CI) | 100% | 93%

Source: Zoroquiain et al.23
Subspecialty: Ocular pathology
Diagnostic purpose: Identification of prognostic factors for retinoblastoma
Primary outcomes | Morphological risk factors: Optic nerve invasion | Morphological risk factors: Invasion and spread | Classic morphological features: Growth pattern of retinoblastoma | Classic morphological features: Calcification
Sensitivity | 100% | 100% | 100% | 97.8%
Specificity | 100% | 100% | 100% | 100%

a Area under the curve (AUC) is the probability that a case with the target condition will have a higher test value than a case without the target condition. It is represented with values from 0 to 1 and not in percentage.23
b Histopathological features are the main diagnostic findings that help to grade meningioma.
c Above consensus means over-interpretation of the test to a higher breast cancer stage.
d Below consensus is the opposite, under-interpretation to a lower stage.
e Pathologist interpretation is used to denote the comparison during the validation study between WSI and LM, where pathologists have used both technologies.
f Range of percentages in diagnostic concordance not reported.
g Range of diagnostic concordance consists in the ratio of the cases that agreed with the consensus diagnosis and the total number of cases.
h This is the average performance of WSI for all the above diagnostic categories but adjusted for the number of cases for each of the category.
i Weiss score is a reference method to distinguish between a benign and a malignant adrenocortical tumor (ACT).
j The study was designed in two stages of using WSI for the examination of the sample and the term ‘reading’ is used by the authors.
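
The probabilistic reading of AUC given in footnote a of Table 3 can be made concrete with a small sketch. It compares every case with the target condition against every case without it and counts how often the former has the higher test value. The function name `auc_by_pairs` and the scores are hypothetical, and counting ties as half a win is a common convention assumed here rather than something stated in the source.

```python
from itertools import product


def auc_by_pairs(scores_with_condition, scores_without_condition):
    """Estimate AUC as the share of (with, without) pairs ranked correctly.

    A pair counts 1 if the case with the target condition has the higher
    score, 0.5 if the two scores are tied, and 0 otherwise.
    """
    pairs = list(product(scores_with_condition, scores_without_condition))
    if not pairs:
        raise ValueError("both groups must contain at least one score")
    wins = sum(
        1.0 if with_score > without_score
        else 0.5 if with_score == without_score
        else 0.0
        for with_score, without_score in pairs
    )
    return wins / len(pairs)


# Hypothetical scores, for illustration only.
print(auc_by_pairs([2, 3], [0, 1]))   # perfect separation of the two groups
print(auc_by_pairs([1, 2], [1, 2]))   # identical score distributions
```

The first call returns 1.0 and the second 0.5. An AUC of 0.5 therefore means that the scores do not separate cases with and without the target condition better than chance, which is the context for values in the 0.50–0.72 range.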
Table 4
Additional outcomes for intra- and interobserver variability
Source Secondary outcome
O. Kusta et al.

Ammendola et al.27 Surgical Intraobserver variability between LM & WSI


neuropathology
Histopathological features Observer 1 Observer 2 Observer 3 Observer 4 Median

Atypical meningioma 91% 86% 74% 94% 89%


Brain invasion 100% 91% 86% 97% 94%
High mitotic index 80% 79% 77% 71% 78%
Hypercellularity 94% 82% 97% 91% 93%
Sheeting 97% 97% 77% 94% 96%
Macronucleoli 94% 82% 100% 83% 89%
Small cells 97% 94% 97% 91% 96%
Spontaneous necrosis 97% 91% 94% 94% 94%

Interobserver variability between all observers (AO) and senior pathologists (SP)a

LM WSI

Parameter All observers Senior pathologists All observers Senior pathologists

Atypical meningioma 54% 63% 60% 74%


Atypical for major criteria 69% 86% 80% 86%
Atypical for minor criteria 46% 60% 63% 77%
Brain invasion 83% 97% 93% 97%
High mitotic index 80% 86% 69% 80%
Hypercellularity 74% 77% 86% 86%
Sheeting 57% 74% 66% 77%
Macronucleoli 37% 49% 40% 51%
Small cells 34% 49% 34% 49%
Spontaneous necrosis 26% 51% 31% 54%

7
Interobserver variability for all observers

Parameter LM WSI

Brain invasion 83% 89%


High mitotic index 80% 69%
Hypercellularity 74% 86%
Sheeting 57% 66%
Macronucleoli 37% 40%
Small cells 34% 34%
Spontaneous necrosis 27% 31%

Elmore et al. 201729 Breast pathology Interventionb Intraobserver variability

LM VS LM 79%
WSI VS WSI 73%
LM VS WSI 77%
WSI VS LM 76%

Larghi et al.24 Pancreatic pathology Intraobserver variability Interobserver variability

Parametersc LM-WSI LM WSI


d
Diagnostic classification к = 0.87, 95% CI 84.5% [к 0.79; CI 0.71–0.88] 83.5% [к 0.78; CI 0.69–0.87]
0.81−0.93
Presence of core tissue к = 0.68, 95% CI 79.3% [к 0.59; CI 0.45–0.72] 76.3% [к 0.53; CI 0.40–0.66]
0.59−0.77
Number of lesional cells к = 0.67, 95% CI 74.3% [к 0.62; CI 0.52–0.71] 68.7% [к 0.53; CI 0.43–0.63]
0.56−0.77
Percentage of lesional cells к = 0.77, 95% CI 50.2% [к 0.40; CI 0.30–0.50] 50.2% [к 0.38; CI 0.28–0.47]
0.71−0.83
Mean 78.3% [к 0.67; CI 0.57–0.78] 77.8% [к 0.67; CI 0.57–0.77]

(continued on next page)


Journal of Pathology Informatics 13 (2022) 100136
Nielsen et al. [30], Dermatopathology: intraobserver and interobserver variability (к statistics)

Intervention | Intraobserver: Pathologist 1 | Intraobserver: Pathologist 2 | Intraobserver: Pathologist 3 | Intraobserver: Pathologist 4 | Interobserver: Reading 1 (e) | Interobserver: Reading 2
LM | 0.91 | 0.94 | 0.91 | 0.97 | 0.84 | 0.81
WSI | 0.97 | 0.86 | 0.95 | 0.95 | 0.85 | 0.82

Tawfik et al. [26], Gynecological pathology: interobserver variability (к statistics, LM vs WSI), values given as к (95% CI)

Diagnosis | Reviewer 1 | Reviewer 2 | Reviewer 3 | Reviewer 4 | Reviewer 5 | Weighted mean
Negative (f) | 0.74 (0.67–0.80) | 0.49 (0.39–0.60) | 0.63 (0.52–0.73) | 0.79 (0.70–0.87) | 0.61 (0.52–0.70) | 0.68
Atypical squamous cells of undetermined significance (ASCUS) | 0.46 (0.39–0.52) | 0.21 (0.10–0.32) | 0.36 (0.25–0.46) | 0.45 (0.36–0.44) | 0.33 (0.24–0.43) | 0.39
Low-grade squamous intraepithelial lesions (LSIL) | 0.53 (0.47–0.59) | 0.41 (0.31–0.52) | 0.52 (0.42–0.63) | 0.55 (0.46–0.64) | 0.51 (0.42–0.60) | 0.51
High-grade squamous intraepithelial lesions (HSIL) | 0.58 (0.52–0.64) | 0.36 (0.26–0.46) | 0.42 (0.31–0.52) | 0.58 (0.49–0.67) | 0.54 (0.45–0.63) | 0.52

Tissier et al. [22], Nephropathology: intraobserver and interobserver variability (Weiss score (g) criteria reading)

Diagnostic features | Intraobserver: Reading 1 | Interobserver: Reading 1 | Interobserver: Reading 2
Weiss ≥3 vs 0–2 | 0.83 | 0.70 (0.67–0.74) | 0.75 (0.72–0.79)
Necrosis | 0.75 | 0.78 (0.74–0.81) | 0.83 (0.79–0.86)
≤25% clear cells | 0.42 | 0.71 (0.68–0.75) | 0.80 (0.77–0.83)
Venous invasion | 0.58 | 0.54 (0.50–0.57) | 0.54 (0.50–0.57)
Mitotic figures | 0.42 | 0.54 (0.50–0.57) | 0.65 (0.62–0.69)
Capsular invasion | 0.25 | 0.49 (0.45–0.52) | 0.50 (0.47–0.54)
Diffuse architecture | 0.33 | 0.41 (0.37–0.44) | 0.50 (0.46–0.53)
Nuclear grade | 0.25 | 0.39 (0.36–0.43) | 0.45 (0.41–0.48)
Atypical mitotic figures | 0.25 | 0.29 (0.26–0.33) | 0.46 (0.43–0.50)
Sinusoidal invasion | 0 | 0.40 (0.37–0.44) | 0.30 (0.27–0.33)
Weiss modified by Aubert et al. ≥3 vs 0–2 | 0.50 | 0.67 (0.64–0.70) | 0.75 (0.72–0.78)
(a) Interobserver concordance was measured between all the observers (pathologists), and also between senior pathologists versus all the observers that participated in the validation study.
(b) Here, all possible combinations of comparisons between LM and WSI were tried based on intraobserver agreement.
(c) Besides the diagnostic classification, other diagnostic features were considered in this study; therefore we use the term “parameters”.
(d) Kappa (к) statistics are used to assess observer agreement for intervention(s).
(e) Nielsen et al. use the term ‘review’ instead of ‘reading’. We have chosen the latter for consistent terminology (as used, e.g., in Tissier et al.).
(f) The case does not have the target condition.
(g) The Weiss score is a reference method to distinguish between a benign and a malignant adrenocortical tumor (ACT).
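
The к values in Table 4 are chance-corrected agreement coefficients. As a minimal sketch of how Cohen's к can be computed for two readings of the same cases, the Python example below uses hypothetical labels that are not taken from any included study. The function name `cohen_kappa` and the variable names are illustrative assumptions, and the included studies may have used weighted or multi-rater variants that this sketch does not cover.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: share of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: one pathologist reads 10 slides with LM and again with WSI.
lm_reading = ["neg", "neg", "neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos"]
wsi_reading = ["neg", "neg", "neg", "neg", "pos", "pos", "pos", "pos", "pos", "neg"]

kappa = cohen_kappa(lm_reading, wsi_reading)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical case the raw agreement is 80%, but half of that agreement would be expected by chance alone, so к is noticeably lower than the raw percentage. This is why к and percentage agreement in Table 4 cannot be compared directly.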

Elmore et al., focusing on breast cancer, report a high predictive value for both LM and WSI in identifying benign without atypia (97.1% vs 95.7%) and invasive breast cancer (97.7% vs 97.2%) [29]. However, they report an average performance for ductal carcinoma in situ (DCIS) (69.6% LM vs 57.1% WSI) and a poor performance for atypia (37.8% vs 27.8%).

Diagnostic concordance

Six of the 12 studies reported the diagnostic concordance of WSI with LM [20,21,24,25,30,31] (Table 3). Four of these reported a high diagnostic concordance for WSI, in the range 86%–98.35%. Nielsen et al., conducting a study in dermatopathology, report an average concordance for both LM and WSI, 72.4% vs 69.6%, respectively [30]. The authors briefly elaborate on the poor performance of WSI for premalignant changes, where the main problems with accuracy (and concordance) were observed. This might explain the average concordance as opposed to an otherwise very good performance for DTA indicators (see the subsection above and Table 3). Finally, a study of breast cancer reported a varying mean concordance for different stages of breast cancer [20]. As in the other breast cancer study [29], poor concordance was observed for atypia (37%) and very good concordance for invasive breast cancer (94%) [20].

Degree of overdiagnosis

The degree of overdiagnosis was not explicitly reported in any of the 12 studies. There are ongoing and recent discussions about whether overdiagnosis should be defined as a diagnostic error [32] and thereby captured by Bayesian reasoning (2×2 table). As Brodersen et al. remark, overdiagnosis is not a false-positive result classified as diagnostic error that with further investigation can be determined as such; it is an abnormality that meets the pathological criteria of a disease [16]. In one of the selected studies, Elmore and colleagues elaborate on overinterpretation for several grades of breast cancer on both WSI and LM [29]. The term overinterpretation was used to denote the incorrect classification of a lesion to a higher stage. The authors of this study calculated that 3% of the cases were overinterpreted as invasive breast cancer with WSI, and thereby overdiagnosed.

Additional outcomes

Six of the 12 studies reported on observer variability [22,24,26,27,29,30] (Table 4). Of these, 4 studies tested intra- or interobserver variability with Cohen’s kappa (к) statistics [22,24,26,30], and 2 in percentage [27,29]. Two studies calculated intra- and interobserver variability based on к statistics, with values for both LM and WSI within к 0.67–0.97 [24,30]. The 2 other studies calculated к jointly for LM-WSI for different diagnostic features or categories, where interobserver variability ranged from к 0.21 to 0.83 [22,26].

Two studies reported the percentage of observer variability for LM and WSI, where intraobserver variability ranged from 73% to 100% for both [27,29]. Ammendola et al. also calculated interobserver variability for senior pathologists (range 49%–97%) vs all observers (range 26%–93%), and for all observers with LM (range 27%–83%) and WSI (31%–89%) [27].

Discussion

The selected studies in this systematic review displayed a low risk of bias and applicability concerns as measured with QUADAS-2 [18,19]. We found that WSI was not inferior to LM regarding diagnostic performance. In addition, in 4 studies reporting both LM and WSI, their performances were comparable [24,27,29,30]. Moreover, 8 out of 12 studies state an overall very good performance of WSI regarding DTA and diagnostic concordance. However, the degree of overdiagnosis was not reported in any of the selected studies, which might artificially increase the apparent performance of WSI, as with other newer imaging tests. In this regard, Heleno et al., assessing the accuracy of low-dose CT scans for lung cancer screening, found that overdiagnosis inflated sensitivity and positive-predictive values [13].

The 12 studies included in the present review displayed a high heterogeneity, and from the analysis of the data extracted it seems that this has implications for the diagnostic performance of WSI in the validation studies of pathology. There are 3 main aspects, in addition to the risk of overdiagnosis, where heterogeneity played an important role regarding performance: study design, subspecialty, and sample preparation.

Study design

The designs of the included studies were quite diverse regarding the main CAP recommendations, such as the number of samples, pathologists, washout period, order of examination with LM and WSI, and the comparison between them. Therefore, a reliable diagnostic performance is directly related to the quality of the validation study, as also remarked in another systematic review comparing WSI with LM [33]. In line with Goacher et al., the quality of the evidence regarding WSI performance is hampered by the heterogeneity of the study design, despite the evidence that WSI was not inferior to LM [34]. Thus, in our review 4 studies did not have the sufficient number of samples (60 cases) recommended by CAP [20,22,23,27], which might have increased the uncertainty due to broader confidence intervals. Notwithstanding the low risk of bias and applicability concerns, 6 studies did not report confidence intervals for the diagnostic performance of WSI or LM [21,23,25,27,30,31]. This raises further questions about the sample size and whether it is representative of the population.

Subspecialty

The 12 included studies represent different pathology subspecialties, and 2 even report on multiple subspecialties [25,31]. Each subspecialty involves specific challenges regarding the number and type of diagnostic categories, as well as those cases requiring additional molecular tests for the final diagnosis.

For instance, Ammendola et al. reported AUC values (for both LM and WSI) evaluating atypical meningioma mostly in the range of 0.50–0.60 [27]. These values indicate a poor performance regarding test accuracy. Nonetheless, the authors concluded that the suboptimal performance regarding the grading of meningioma was due to the diagnostic challenges that this disease poses for pathologists. In this case, more experienced senior pathologists performed significantly better than younger ones. This finding has implications for the role of clinical reasoning in diagnostic accuracy, where the literature suggests expertise might be related to experience, especially with the pattern recognition that is important in visual diagnostics [32,35,36].

Parallel to the increasing complexity of examinations, the subspecialty of gynecological pathology was challenged by a high diagnostic workload [37]. In 2 studies of this subspecialty, the authors, assessing the performance of WSI based on DTA indicators, evaluated 335 [28] and 1110 [26] slides. In one of the studies, WSI showed high sensitivity for assessing intraepithelial lesions or malignancies [28]. The other study displayed an inconsistent sensitivity for multiple diagnostic categories, but stated that their method of assessment was as sensitive as the standard reference method [26].

Girolami et al. asserted that diagnostic performance is related to the time for making the diagnosis in cytology-based subspecialties [37]. In this regard, Tawfik et al. reported an average scanning and reviewing time of 5.5 min with WSI for cytology-based gynecological pathology [26]. In 3 other studies measuring the time for diagnosis with WSI, 2 stated that turnaround time (time from the arrival of the specimen until the communication of diagnosis) was comparable between LM and WSI [25,31], while Larghi et al. reported a comparable time for reviewing slides with LM and WSI, 84 and 108 s, respectively [24].

Sample preparation

Sample preparation techniques pose specific challenges for slide digitization that might affect the performance of WSI, both regarding accuracy and time. One such example is cytology preparations, where smear thickness, overlapping cells, and obscuring backgrounds require multiplane


(z-stacking) focusing for digital slides [28]. From the selected articles, 3 were based on cytology preparations [21,26,28], 1 involved both cell-blocks (cytology) and histology samples [24], while 2 of them used frozen sections [25]. Despite the difficulties of sample preparation, all these studies reported a comparable performance of WSI with LM.

This important aspect of using WSI with z-stacking for routine work with cytology preparations was also emphasized in a systematic review of digital pathology for cytopathology [37]. However, one study of surgical neuropathology based on histology preparations used 7 z-stack planes and a technique for optimizing the digital slide [27]. Notwithstanding the fact that histology is less challenging for digitization, the performance of pathologists was not more accurate than with LM. However, even with single or multiple z-stacking, cytopathology and frozen sections are still difficult to digitize with the high image quality that can be achieved with histopathology slides.

Overdiagnosis

Adding to the challenges relating to diagnostic performance and the role of heterogeneity, overdiagnosis poses other difficulties. Although its degree was not reported explicitly, it was briefly addressed in the 2 breast cancer studies [20,29]. Brunyé et al. mention the notion of overdiagnosis by elaborating on the unnecessary and costly treatment and intervention procedures it entails, for instance, when a biopsy is interpreted as ductal carcinoma in situ (DCIS) when it is in fact atypia [20]. Conversely, Elmore et al. calculated the number of cases incorrectly classified to a higher stage (per hundred cases), showing that 3% of cases with WSI and 2% with LM (as the reference standard) were overinterpreted as invasive breast cancer [29]. However, this was a validation study scenario, where clinical outcomes were not calculated, but only the performance of the pathologists involved in this study. In this regard, future studies should evaluate the DTA of WSI by including patient-relevant outcomes, and thereby overdiagnosis, in a randomized design to encompass the full spectrum of cases [29].

While there are 5 cancers documented with high risk of overdiagnosis, the reasons for each of them are different, such as screening (i.e., breast cancer, prostate cancer, and melanoma), incidental findings (renal cancer), or both incidental findings and excessive investigation (thyroid cancer) [38]. However, there are other cases such as lung cancer, where overdiagnosis is possible if screening for lung cancer is implemented [39]. In this review, we focused on pathological diagnostics by comparing WSI to LM and not on the above factors for overdiagnosis. In this regard, the Cochrane Collaboration has launched a new research field regarding the use of evidence to tackle overdiagnosis and its consequences [40].

Shortcomings of the systematic review

The heterogeneity of the included studies hindered the possibility of conducting a meta-analysis, thereby limiting the comparative power of our study. While a meta-analysis could have provided a quantitative summary of the diagnostic performance of WSI in comparison to LM, the descriptive analysis in this review provided a qualitative account of it. The combination of at least 2 primary outcomes as the main criteria for selection limited the number of included studies. However, this was a methodological choice to include several accuracy measurements (i.e., DTA indicators, diagnostic concordance, and observer variability) for assessing the diagnostic performance of WSI. Ultimately, the question whether WSI should be implemented for routine work in pathology depends on how WSI addresses the logistical and organizational challenges that pathology departments face and the opportunities they afford. While the opportunities of using digital pathology solutions are increasingly related to the use of AI for image analysis [6,7], we do not address this aspect in this review.

Implications for practice

With a continuing shortage of pathologists and the multiple challenges that these departments face, digital pathology presents some opportunities to address them. Remote work and consultations [5] through WSI are often presented as a good solution to address the lack of pathologists and a growing workload. Following this, the possibility to train residents and pathologists with this digital solution adds to the capacity building needed to tackle these challenges [2]. Finally, the prospect of using AI algorithms for quantitative measuring, counting, and computer-assisted diagnosis might contribute to better diagnostic accuracy and save time for pathologists [4,7].

Conclusion

We found that WSI was not inferior to LM regarding DTA and diagnostic concordance. However, the degree of overdiagnosis was not systematically reported and is thereby unknown. The diverse subspecialties and their laboratory tasks raise the important question of whether it is possible to compare LM and WSI across all these subspecialties, or whether LM perhaps has advantages in some and WSI in others. When considering the implementation of digital pathology, departments should also take into account the advantages for remote diagnosis and consultations, cancer research, digital multidisciplinary case conferences, supervision of residents, and storage of digital slides. However, the designers of the validation studies and the participating pathologists should be careful in those areas where the risk of overdiagnosis exists.

Funding support

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Authors’ contributions

OK and JBB conceptualized the systematic review. The other authors helped to refine the conceptualization before submitting the protocol. Database search, screening, data extraction, risk of bias, and data analysis and synthesis were conducted independently by CVR and OK. JBB acted as an arbiter in cases of disagreement. ESR helped with the terminology in the study and contributed his expertise as a senior pathologist throughout the different steps. TR helped with writing and reviewing the manuscript of the review. OK and CVR wrote the first draft and all the other authors helped during the writing, editing, and reviewing process.

Conflicts of interests

The authors declare no conflicts of interests.

Acknowledgements

We are grateful to Susie Rimborg for her help on how to conduct advanced searches in the medical databases. We also thank Klaus Høyer, Margaret Bearman, and Radhika Gorur for their help and support during the process of conducting this review. Finally, we would like to thank all those who helped with their comments to improve this review.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jpi.2022.100136.

References

1. Williams BJ, Bottoms D, Clark D, Treanor D. Future-proofing pathology part 2: building a business case for digital pathology. J Clin Pathol Mar 2019;72(3):198–205. https://doi.org/10.1136/jclinpath-2017-204926.
2. Bongaerts O, Clevers C, Debets M, et al. Conventional microscopical versus digital whole-slide imaging-based diagnosis of thin-layer cervical specimens: a validation study. J Pathol Inform 2018;9:29. https://doi.org/10.4103/jpi.jpi_28_18.
3. Meeting of Executive. Committee of Capital Region, Denmark. 2019.
4. Hanna MG, Pantanowitz L. Digital pathology. Encyclopedia of Biomedical Engineering; 2019. p. 524–532.


5. Pantanowitz L, Farahani N, Parwani A. Whole slide imaging in pathology: advantages, limitations, and emerging perspectives. Pathol Lab Med Int 2015. https://doi.org/10.2147/plmi.S59826.
6. Aeffner F, Zarella MD, Buchbinder N, et al. Introduction to digital image analysis in whole-slide imaging: a white paper from the digital pathology association. J Pathol Inform 2019;10:9. https://doi.org/10.4103/jpi.jpi_82_18.
7. Niazi MKK, Parwani AV, Gurcan MN. Digital pathology and artificial intelligence. Lancet Oncol 2019;20(5):e253–e261. https://doi.org/10.1016/s1470-2045(19)30154-8.
8. Garcia-Rojo M, De Mena D, Muriel-Cueto P, Atienza-Cuevas L, Dominguez-Gomez M, Bueno G. New European union regulations related to whole slide image scanners and image analysis software. J Pathol Inform 2019;10:2. https://doi.org/10.4103/jpi.jpi_33_18.
9. Deeks JJTY, Macaskill P, Bossuyt PM. Chapter 5: Understanding test accuracy measures. Draft version (29 October 2021). Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. Cochrane; 2021.
10. FaD Administration. Medical devices; hematology and pathology devices. Classification of Blood Establishment Computer Software and Accessories, 83. ; 2018. p. 23212.0097-6326.
11. Pantanowitz L, Sinard JH, Henricks WH, et al. Validating whole slide imaging for diagnostic purposes in pathology: guideline from the College of American Pathologists Pathology and Laboratory Quality Center. Arch Pathol Lab Med Dec 2013;137(12):1710–1722. https://doi.org/10.5858/arpa.2013-0093-CP.
12. Evaluation of Automatic. CLass III Designation for Philips IntelliSite Pathology Solution (PIPS) (FDA). 2017:1-19.
13. Heleno B. Quantification of harms in cancer screening: are numbers available and what do they mean?. PhD thesis. Faculty of Health and Medical Sciences, University of Copenhagen. 2015.
14. Rogers WA, Mintzker Y. Casting the net too wide on overdiagnosis: benefits, burdens and non-harmful disease. J Med Ethics Nov 2016;42(11):717–719. https://doi.org/10.1136/medethics-2016-103715.
15. Brodersen J, Schwartz LM, Woloshin S. Overdiagnosis: how cancer screening can turn indolent pathology into illness. APMIS Aug 2014;122(8):683–689. https://doi.org/10.1111/apm.12278.
16. Brodersen J, Schwartz LM, Heneghan C, O’Sullivan JW, Aronson JK, Woloshin S. Overdiagnosis: what it is and what it isn’t. BMJ Evid-Based Med 2018;23(1):1–3. https://doi.org/10.1136/ebmed-2017-110886.
17. Shamseer L, Moher D, Clarke M, et al. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015: elaboration and explanation. BMJ Jan 2 2015;350:g7647. https://doi.org/10.1136/bmj.g7647.
18. Whiting PF, Rutjes AWS, Westwood ME, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 2011;155(8):529–536. https://doi.org/10.7326/0003-4819-155-8-201110180-00009.
19. Uo Bristol. QUADAS2: background document. 2014.
20. Brunye TT, Mercan E, Weaver DL, Elmore JG. Accuracy is in the eyes of the pathologist: the visual interpretive process and diagnostic accuracy with digital whole slide images. J Biomed Inform Feb 2017;66:171–179. https://doi.org/10.1016/j.jbi.2017.01.004.
21. Perez D, Stemmer MN, Khurana KK. Utilization of dynamic telecytopathology for rapid onsite evaluation of touch imprint cytology of needle core biopsy: diagnostic accuracy and pitfalls. Telemed J E Health May 2021;27(5):525–531. https://doi.org/10.1089/tmj.2020.0117.
22. Tissier F, Aubert S, Leteurtre E, et al. Adrenocortical tumors: improving the practice of the Weiss system through virtual microscopy: a National Program of the French Network INCa-COMETE. Am J Surg Pathol 2012;36(8):1194–1201. https://doi.org/10.1097/PAS.0b013e31825a6308.
23. Zoroquiain P, Logan P, Bravo-Filho V, et al. Diagnosing pathological prognostic factors in retinoblastoma: correlation between traditional microscopy and digital slides. Ocul Oncol Pathol Jun 2015;1(4):259–265. https://doi.org/10.1159/000381155.
24. Larghi A, Fornelli A, Lega S, et al. Concordance, intra- and inter-observer agreements between light microscopy and whole slide imaging for samples acquired by EUS in pancreatic solid lesions. Dig Liver Dis Nov 2019;51(11):1574–1579. https://doi.org/10.1016/j.dld.2019.04.019.
25. Ribback S, Flessa S, Gromoll-Bergmann K, Evert M, Dombrowski F. Virtual slide telepathology with scanner systems for intraoperative frozen-section consultation. Pathol Res Pract Jun 2014;210(6):377–382. https://doi.org/10.1016/j.prp.2014.02.007.
26. Tawfik O, Davis M, Dillon S, et al. Whole-slide imaging of pap cellblock preparations is a potentially valid screening method. Acta Cytol 2015;59(2):187–200. https://doi.org/10.1159/000430082.
27. Ammendola S, Bariani E, Eccher A, et al. The histopathological diagnosis of atypical meningioma: glass slide versus whole slide imaging for grading assessment. Virchows Arch Apr 2021;478(4):747–756. https://doi.org/10.1007/s00428-020-02988-1.
28. Tawfik O, Davis M, Dillon S, Tawfik L, Diaz FJ, Fan F. Whole slide imaging of pap cell block preparations versus liquid-based thin-layer cervical cytology: a comparative study evaluating the detection of organisms and nonneoplastic findings. Acta Cytol 2014;58(4):388–397. https://doi.org/10.1159/000365046.
29. Elmore JG, Longton GM, Pepe MS, et al. A randomized study comparing digital imaging to traditional glass slide microscopy for breast biopsy and cancer diagnosis. J Pathol Inform 2017;8:12. https://doi.org/10.4103/2153-3539.201920.
30. Nielsen PS, Lindebjerg J, Rasmussen J, Starklint H, Waldstrom M, Nielsen B. Virtual microscopy: an evaluation of its validity and diagnostic performance in routine histologic diagnosis of skin tumors. Hum Pathol Dec 2010;41(12):1770–1776. https://doi.org/10.1016/j.humpath.2010.05.015.
31. Cima L, Brunelli M, Parwani A, et al. Validation of remote digital frozen sections for cancer and transplant intraoperative services. J Pathol Inform 2018;9:34. https://doi.org/10.4103/jpi.jpi_52_18.
32. Balogh EP, Miller BT. In: Ball JR, ed. Improving Diagnosis in Health Care/Committee on Diagnostic Error in Health Care. Washington (DC): The National Academies Press; 2015.
33. Araujo ALD, Arboleda LPA, Palmier NR, et al. The performance of digital microscopy for primary diagnosis in human pathology: a systematic review. Virchows Arch Mar 2019;474(3):269–287. https://doi.org/10.1007/s00428-018-02519-z.
34. Goacher E, Randell R, Williams B, Treanor D. The diagnostic concordance of whole slide imaging and light microscopy: a systematic review. Arch Pathol Lab Med Jan 2017;141(1):151–161. https://doi.org/10.5858/arpa.2016-0025-RA.
35. Eva KW. What every teacher needs to know about clinical reasoning. Med Educ Jan 2005;39(1):98–106. https://doi.org/10.1111/j.1365-2929.2004.01972.x.
36. Norman GR, Eva KW. Diagnostic error and clinical reasoning. Med Educ Jan 2010;44(1):94–100. https://doi.org/10.1111/j.1365-2923.2009.03507.x.
37. Girolami I, Pantanowitz L, Marletta S, et al. Diagnostic concordance between whole slide imaging and conventional light microscopy in cytopathology: a systematic review. Cancer Cytopathol Jan 2020;128(1):17–28. https://doi.org/10.1002/cncy.22195.
38. Glasziou PP, Jones MA, Pathirana T, Barratt AL, Bell KJ. Estimating the magnitude of cancer overdiagnosis in Australia. Med J Aust Mar 2020;212(4):163–168. https://doi.org/10.5694/mja2.50455.
39. Brodersen J, Voss T, Martiny F, Siersma V, Barratt A, Heleno B. Overdiagnosis of lung cancer with low-dose computed tomography screening: meta-analysis of the randomised clinical trials. Breathe (Sheff) Mar 2020;16(1), 200013. https://doi.org/10.1183/20734735.0013-2020.
40. Mahase E. Cochrane launches new research field to tackle overdiagnosis and medical excess. BMJ Dec 6 2019;367:l6817. https://doi.org/10.1136/bmj.l6817.
