
Original Research

DIGITAL HEALTH
Volume 9: 1–13
© The Author(s) 2023
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/20552076231216681
journals.sagepub.com/home/dhj

Artificial intelligence (AI) in restorative dentistry: Performance of AI models designed for detection of interproximal carious lesions on primary and permanent dentition

Amr Ahmed Azhari1, Narmin Helal2, Leena M Sabri3 and Abeer Abduljawad3

Abstract

Objective: The objective of this study was to evaluate the effectiveness of deep learning methods in detecting dental caries from radiographic images.

Methods: A total of 771 bitewing radiographs were divided into two groups: adult (n = 554) and pediatric (n = 217). Two distinct semantic segmentation models were constructed for each group. The radiographs were manually labeled by general dentists for semantic segmentation, and the inter-examiner reliability of the two examiners was measured. Finally, the models were trained using a transfer learning methodology along with advanced computer science tools, namely ensemble U-Nets with ResNet50, ResNext101, and Vgg19 as the encoders, all pretrained on ImageNet weights, using a training dataset.

Results: The intersection over union (IoU) score was used to evaluate the outcomes of the deep learning model. For the adult dataset, the IoU averaged 98%, 23%, 19%, and 51% for zero, primary, moderate, and advanced carious lesions, respectively. For pediatric bitewings, the IoU averaged 97%, 8%, 17%, and 25% for zero, primary, moderate, and advanced caries, respectively. Advanced caries was more accurately detected than primary caries on adult and pediatric bitewings (P < 0.05).

Conclusions: The proposed deep learning models can accurately detect advanced caries in permanent or primary bitewing radiographs. Misclassification mostly occurs between primary and moderate caries. Although the model performed well in correctly classifying the lesions, it can misclassify one as the other or does not accurately capture the depth of the lesion at this early stage.

Keywords

Deep learning, artificial intelligence, bitewings, caries, x-rays, dental caries, bioinformatics
Submission date: 20 March 2023; Acceptance date: 8 November 2023

1 Department of Restorative Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia
2 Department of Pediatric Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia
3 Internship Training Program, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia

Corresponding author:
Amr Ahmed Azhari, Department of Restorative Dentistry, Faculty of Dentistry, King Abdulaziz University, Jeddah, Saudi Arabia.
Email: [email protected]

Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).

Introduction

Dental caries is one of the most common chronic human diseases in the world1; it is a multifactorial, infectious oral disease caused primarily by the complex interaction of cariogenic oral biofilm on the tooth surface with fermentable dietary carbohydrates over time.2a The occurrence of dental caries is the ramification of the breakdown of food remains by bacteria. Acidic substances eventually degrade tooth structure, creating cavities.

Two types of bacteria are responsible for this phenomenon: Streptococcus mutans and lactobacilli.2b In severe cases, dental caries can lead to toothache, tooth structure loss, intra-oral abscess, or facial swelling.3 Although dental radiography (including panoramic, periapical, and bitewing views) and explorers (or dental probes) are broadly utilized and regarded as exceptionally dependable diagnostic devices for the recognition of dental caries,4 they are subject to practitioner exposure, experience, and fatigue levels.

The importance of AI in caries detection lies in its potential to assist clinicians in detecting tooth caries quickly and reliably in routine clinical practice. The accuracy of early diagnosis of dental caries remains a challenge for dentists.5 Machine learning is a computational tool that utilizes algorithms and data inputs to self-improve and learn automatically via experience and exposure to a wide range of samples and variables. These algorithms are based on a specific computational model that directs them to retrieve results related to pre-determined tasks.7,8 Lee et al. developed a deep CNN model called GoogLeNet for detecting dental caries in periapical radiographs.9 The deep CNN algorithm demonstrated good detection and diagnostic performance, with the molar models achieving diagnostic accuracies of 88.0%, 89.0%, and 82.0%, respectively. The premolar model provided the highest AUC, significantly higher than the other models. Mao et al. developed a conventional network model called AlexNet for restoration and caries determinations, with accuracies of 95.56% and 90.30%, respectively.10 These studies highlight the potential of deep learning in dental caries detection and restoration.

Cantu et al. developed a CNN (U-Net) to evaluate deep learning models against individual dentists in detecting carious lesions,11 with a mean accuracy of 0.80 compared to the dentists' mean accuracy of 0.71. Furthermore, Lian et al. developed CNN models, nnU-Net and DenseNet121, to detect caries lesions and classify radiographic extensions on panoramic films.12 The results showed comparable performance between expert dentists and neural networks. Moran et al. evaluated the effectiveness of deep CNN algorithms for detecting and diagnosing dental caries on periapical radiographs.13 Within the 480 teeth images obtained, the CNN identified 18 incipient and 16 advanced lesions, with less experienced dentists reporting statistically indistinguishable results. Singh et al. developed a CNN-LSTM model using 1500 dental images as training data and 300 as testing data.14 The CNN-LSTM model demonstrated high accuracy and reliability. Likewise, Lee et al. developed a CNN model for early dental caries detection on bitewing radiographs, using 304 bitewing radiographs and 50 radiographs for training.15 The model's performance evaluation showed improved diagnostic accuracy, but more stable results are needed. Bayrakdar et al. explored the use of CNN-based AI algorithms for accurate tooth caries detection and segmentation in bitewing radiographs.17 They developed automatic caries detection and segmentation models using the VGG-16 and U-Net architectures, achieving sensitivity, precision, and F-measure rates of 0.84, 0.81, 0.84, 0.86, and 0.84, 0.84, respectively. However, the model faced limitations, such as being trained with the same parameters and having a smaller sample size. Casalegno et al.18 presented a deep-learning model based on a CNN for automated tooth caries perception in NILT images, achieving an average IoU rate of 72.7%. Devito et al. evaluated the success of radiographic diagnosis of proximal caries using extracted teeth on bitewing radiographs, finding a diagnostic improvement of 39.4% using the AI model.

Because of the different anatomical morphologies of teeth and shapes of restorations, no substantial improvement can be achieved in the demonstrative strategy for distinguishing dental caries. Therefore, by using deep learning in caries detection, dental professionals can potentially detect caries at an earlier stage, leading to more effective and less invasive treatment. It can also help to reduce the workload on dental professionals, allowing them to focus on other aspects of patient care.

Early-stage interproximal caries can be missed by visual examination in both primary and permanent dentition. Consequently, it can progress into an irreversible situation in which it becomes difficult for preventive measures to remineralize the lesion. Thus, artificial intelligence and machine learning have been used to increase the accuracy of dental practitioners' diagnosis.6 In this research paper, the objectives of the study were to evaluate the effectiveness of deep learning methods in the detection of dental caries from bitewing radiographic images collected from the King Abdulaziz University Hospital database in primary and permanent dentition, and to determine whether the application of deep learning methods improved the dentists' accuracy in detecting proximal dental carious lesions on intra-oral bitewing radiographs.

Materials and methods

Datasets

This research study was conducted at the Faculty of Dentistry of King Abdulaziz University Dental Hospital in Jeddah, Saudi Arabia. The bitewing radiographic image datasets were obtained from dental practitioners' histories and the Electronic Medical Record System (CS R4 Practice Management Software) of the University Hospital between August 2021 and April 2022. All images were obtained with permission from the radiology department of King Abdulaziz University Dental Hospital (proposal number 032-02-22). Labeling was performed after the dataset was collected. The dataset consisted of adult and pediatric bitewings, which included primary, moderate, and advanced dental caries. All images were selected using the following criteria.

1. The inclusion criteria were bitewing images, enamel and dentin carious lesions, and primary and permanent teeth.
2. The exclusion criteria were periapical images, overlapping images, radiographs with distortions and shadows, and images with full crowns only, as well as bridges.

Criteria for excluding images from the training dataset were as follows: (1) images with poor quality, such as those with artifacts, blurriness, or extreme overexposure; (2) images that contain identifiable patient information, to comply with privacy regulations and ethical standards; (3) duplicate images, which were excluded to prevent overrepresentation and potential bias in the dataset; (4) bitewing images with very low resolution; and (5) images with confounding factors, such as the presence of dental crowns/fixed dental prostheses on all the teeth or orthodontic appliances that might interfere with caries detection.

The collected data were used in a manner consistent with ethical principles and legal requirements. Data were secured on a separate computer at the hospital and never accessed by any external sources. Participants were informed and consented, during the file-opening stage, to the use of their radiographic images for research and education purposes. The collected and labeled dataset consisted of nearly 554 adult bitewings and 217 pediatric bitewings. All images were converted from the original format to JPEG file format to unify the quality of the image processing procedure. The dataset was split into three subsets: training, validation, and test sets for adults as well as pediatrics. Using adult and pediatric bitewings to train the same model would lead to misclassification owing to differences in tooth density between adults and pediatric patients; therefore, we decided to train each group separately. The label distribution for each group is shown in the following figures (Figures 1–3).

After collecting the radiographic images, we measured the reliability of the two examiners, blinded from the study, to ensure that all examiners were calibrated by labeling the radiographic images and to avoid bias. Inter- and intra-rater reliability tests were performed between the two examiners and graded by a restorative dentistry consultant. Examiners were tested by labeling 10 radiographs per examiner. The radiographs contained a mixture of primary, moderate, and advanced caries, and the Kappa between the examiners was over 0.8. All images were manually labeled according to the International Caries Detection and Assessment System (ICDAS) stratification according to lesion depth:

1. Primary caries: radiolucency extending into the inner enamel junction (E1, E2).
2. Moderate caries: radiolucency extending into the dentin-enamel junction and middle one-third of the dentin (D1 and D2).
3. Advanced caries: radiolucency extending into the inner one-third of dentin (D3).

Tools and environment

The obtained radiographic images were processed using Python in addition to Python-friendly environments such as Google Colab. Python is a widely used programming language; it is a complete language and platform that can be used for research and production system development. Google Colab enables programmers to write and execute arbitrary Python code through their web browsers. It was used throughout this study because it is well-suited to machine learning and data analysis. Additionally, the Keras and Segmentation_models libraries were used to develop the current model. The main features of this library can be summarized as having a high-level API, implying that complicated tasks can be performed with extremely few lines of code, which significantly reduces the time needed for code writing and enables the implementation of more experiments. In addition, the library has four model architectures for binary and multiclass image segmentation, with 25 available backbones for each architecture, all pretrained on ImageNet weights; finally, it has a helpful set of segmentation losses (Jaccard, Dice, Focal) and metrics (IoU, F-score).

Cleaning, augmentation and labeling

Duplicated images, as well as images where caries are challenging to identify because of technical errors that might interfere with caries identification, were omitted from the dataset to ensure better accuracy. Image augmentation was applied to images with low brightness or contrast. Then, all images in the dataset were manually labeled by dentists.

Labeling is a process in which data are structured in a way that a computer can understand, and the labels correspond to the intended outcome of the machine learning model. Labeling tasks can be classified into four main types: categorization, segmentation, sequencing, and mapping.

Categorization refers to the task in which an image is assigned to a category, which could be binary or multiclass labeling; for example, classifying image data of moles into cancer or noncancer.

Segmentation is a task in which data are divided into segments. This approach can be applied to various data types. Using image data, segmentation identifies the pixels in an image that belong to a specific object or object class. For example, in a medical scan, segmentation can label different organs separately.
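To make the segmentation labeling just described concrete, the minimal sketch below shows how a pixel-wise label for a bitewing can be represented with the four classes used in this study (no caries, primary, moderate, advanced). The class indices, lesion positions, and one-hot encoding are illustrative assumptions, not the authors' exact scheme.

```python
# Illustrative only: a semantic-segmentation label assigns one of the four study
# classes to every pixel. Class indices and lesion positions below are assumptions.
import numpy as np

CLASSES = {0: "no caries", 1: "primary", 2: "moderate", 3: "advanced"}

# Integer mask with one class index per pixel (background = 0, i.e. no caries).
mask = np.zeros((512, 512), dtype=np.uint8)
mask[100:120, 200:215] = 1   # a small enamel (primary) lesion
mask[300:340, 240:270] = 3   # a deeper (advanced) lesion

# Keras-style segmentation models are trained on one-hot masks of shape (H, W, n_classes).
one_hot = np.eye(len(CLASSES), dtype=np.float32)[mask]
print(one_hot.shape)  # (512, 512, 4)
```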

Figure 1. Adults' label distribution: primary: 879, moderate: 559, advanced: 422. According to the International Caries Detection and Assessment System (ICDAS), primary caries is initial enamel caries, moderate caries is defined by the extension of proximal caries into the outer 2/3 of dentin, and advanced caries is defined by the extension of proximal caries into the inner 1/3 of dentin.

Figure 2. Pediatric label distribution: primary: 116, moderate: 294, advanced: 321. According to the International Caries Detection and Assessment System (ICDAS), primary caries is initial enamel caries, moderate caries is defined by the extension of proximal caries into the outer 2/3 of dentin, and advanced caries is defined by the extension of proximal caries into the inner 1/3 of dentin.

Sequencing describes the progression of items in a series of data. This approach is particularly common when time-series modeling is used to predict future events.

Mapping operates by mapping one piece of data to another. This labeling technique is common in language-to-language translation, wherein a word in one language is mapped to a similar word in another language.

In this paper, segmentation was used to achieve the required outcome. Specifically, a type of segmentation task called "semantic segmentation" enables us to determine the location, size, and shape of an object in a given image. The goal of semantic image segmentation is to label each pixel of an image with a corresponding label, as shown in the labeled radiographic image in Figure 4.

Figure 3. Training specifications of reliability.

Figure 4. Semantic image segmentation during labeling of bitewing radiographs; "blue" is early caries, "yellow" is moderate caries, and "red" is advanced caries.

Research model training

A semantic segmentation model was then developed. All training was performed within Google Colab using a GPU runtime. Notably, the method of training is highly dependent on the size and quality of the data. Therefore, transfer learning, which is defined as the use of previous outcomes as a reference for future activities, was employed to obtain better accuracy with limited time and resources. To elaborate further, a pretrained model, which is a saved network that was previously trained on a sufficiently large and general dataset, was utilized; this model effectively served as a generic model of the visual world. Finally, these learned feature maps can be leveraged without starting from scratch by training a large model on a large dataset.

In this study, training was performed using the U-Net. U-Net is an architecture used for semantic segmentation. It consists of a contracting path and an expansive path. The contracting path follows the typical architecture of a convolutional network: the repeated application of two 3 × 3 convolutions (unpadded convolutions), each followed by a rectified linear unit (ReLU) and a 2 × 2 max pooling operation with stride 2 for downsampling. At each downsampling step, the number of feature channels is doubled. Every step in the expansive path consists of an up-sampling of the feature map followed by a 2 × 2 convolution ("up-convolution") that halves the number of feature channels, a concatenation with the correspondingly cropped feature map from the contracting path, and two 3 × 3 convolutions, each followed by a ReLU. Cropping is necessary because of the loss of border pixels in each convolution. In the final layer, a 1 × 1 convolution was used to map each 64-component feature vector to the desired number of classes. The network had 23 convolutional layers (Figure 5). Prior to training the models, all images were resized to 512 × 512 pixels. Additionally, the contrast was increased to make primary caries easier for the model to detect.

Figure 5. U-Net architecture. Three U-Net models were assembled to obtain the final results. ResNet50, ResNext101, and Vgg19 were used as encoders, which were all pretrained on the ImageNet weights. Ensemble learning is an approach that combines several weak models to produce a model with stronger predictive power.

The training specifications are presented in the following table (Table 1).

Table 1. Hyperparameters used in the individual and ensemble models.

Epochs          60
Callbacks       Early stopping
Learning rate   0.0001
Optimizer       Adam
Loss            Focal and Dice loss
Batch size      2
Encoders        ResNet50, ResNext101, Vgg19

Hyperparameters are training variables that are set manually with a predetermined value before starting the training. Each hyperparameter is defined as follows.

1. Epochs: This is the number of times a network is to be trained during the entire training process; the training should stop when the validation loss becomes significantly greater than the training loss (overfitting).
2. Callbacks: Based on the Keras documentation, a callback is a set of functions to be applied at given stages of the training procedure. During training, callbacks can be used to obtain a view of the internal states and statistics of the model. These are defined to automate some of the training processes.
3. Learning rate: The learning rate defines the rate at which a network updates its parameters. A lower learning rate helps the model converge smoothly and may reduce overfitting, although it decelerates the training process. By contrast, a higher learning rate accelerates the training process but can miss the global minima and increase the chance of overfitting. A learning rate of 0.0001 was used for the previously stated reasons (the default value is 0.01).
4. Optimizer: Optimization is the process of finding a set of inputs for an objective function that results in a maximum or minimum function evaluation. Many machine learning algorithms, ranging from fitting logistic regression models to training artificial neural networks, are based on this difficult problem.
5. Loss: The loss functions measure the distance between an estimated value and its true value. A loss function connects decisions to costs. Loss functions are dynamic and alter depending on the job at hand and the desired outcome.
6. Batch size: The batch size is the size of the sub-samples given to the network after which parameter updates occur. Here, a batch size of two was used because of limited resources.
7. Encoders: The encoder is the first half of the U-Net architecture. It is usually a pretrained classification network, such as VGG or ResNet, wherein convolution blocks are applied followed by max pool downsampling to encode the input image into feature representations at multiple different levels.

Another model configuration that was used as a callback is "early stopping". This is a well-established regularization technique that reduces overfitting (the model memorizes data instead of learning from it). Another benefit of early stopping is that it provides a mechanism for preventing the waste of resources when training is not improving.

The two types of losses were combined to address class imbalance, which cannot be avoided because of the difference in lesion size across classes. For example, it is natural for advanced caries to have more pixels classified because of its depth.
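As a concrete illustration of the setup summarized in Table 1 and Figure 5, the sketch below builds the three pretrained U-Nets with the segmentation_models library and compiles each with the combined Dice and focal loss, the Adam optimizer at a learning rate of 0.0001, a batch size of 2, up to 60 epochs, and early stopping. The data-loading step and the early-stopping patience are assumptions; the backbones, losses, and hyperparameters follow the text.

```python
# Sketch of the training configuration in Table 1 using the segmentation_models
# library (Keras/TensorFlow). Data loading and the patience value are assumptions.
import segmentation_models as sm
import tensorflow as tf

sm.set_framework("tf.keras")

BACKBONES = ["resnet50", "resnext101", "vgg19"]   # the three pretrained encoders
N_CLASSES = 4                                     # no caries, primary, moderate, advanced

def build_model(backbone):
    # U-Net with an ImageNet-pretrained encoder and a softmax output over 4 classes.
    return sm.Unet(backbone, encoder_weights="imagenet", classes=N_CLASSES,
                   activation="softmax", input_shape=(512, 512, 3))

def train_ensemble(x_train, y_train, x_val, y_val):
    """x_*: images shaped (N, 512, 512, 3); y_*: one-hot masks shaped (N, 512, 512, 4)."""
    # Dice loss targets foreground/background imbalance; focal loss up-weights
    # hard-to-classify pixels (e.g. primary vs moderate lesions).
    total_loss = sm.losses.DiceLoss() + sm.losses.CategoricalFocalLoss()
    metrics = [sm.metrics.IOUScore(), sm.metrics.FScore()]
    early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                                  restore_best_weights=True)
    models = []
    for backbone in BACKBONES:
        model = build_model(backbone)
        model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                      loss=total_loss, metrics=metrics)
        model.fit(x_train, y_train, validation_data=(x_val, y_val),
                  batch_size=2, epochs=60, callbacks=[early_stop])
        models.append(model)
    return models
```

Because the adult and pediatric bitewings are trained separately, a routine of this kind would be run once per group.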

First, focal loss applies a modulating term to the cross-entropy loss (the most common loss used in classification problems) to provide more weight to the difficult-to-classify examples instead of the easy-to-classify ones. This is a dynamically scaled cross-entropy loss, where the scaling factor decays to zero as the confidence in the correct class increases. Second, Dice loss is widely used in medical image segmentation tasks to address the data imbalance problem. The issue with Dice loss is that it only handles the imbalance problem between the foreground and background, but ignores another imbalance between easy and difficult examples that also severely affects the training process of a learning model. Therefore, in this study, a combination of both was implemented to overcome these weaknesses.

To elaborate on the encoders, ResNet50 is a variant of the ResNet model that has 48 convolution layers, one max pooling layer, and one average pooling layer. It has 3.8 × 10^9 floating-point operations. ResNeXt repeats the building block that aggregates a set of transformations with the same topology. Compared to ResNet, it exposes a new dimension, cardinality (the size of the set of transformations) C, as an essential factor in addition to the dimensions of depth and width. The CNN Inception-ResNet-v2 was trained on over a million photos from the ImageNet collection [1]. The 164-layer network can classify photos into 1000 object categories, including keyboards, mice, pencils, and a variety of animals.

Statistical analysis

Model performance evaluation

To evaluate the model's performance, the data of the test set and statistical analysis tools were utilized, including true positives (TP), false positives (FP), false negatives (FN), and IoU. To illustrate this, each individual radiograph was compared to the ground truth labeling provided previously. Notably, the testing dataset was not observed by the model during the training phase. This step is important to ensure the ability of the model to generalize to all future data. In this study, all evaluation metrics were derived from the confusion matrix, which can be represented as follows for each label (C):

1. TPs of C are all C instances classified as C.
2. True negatives of C are all non-C instances not classified as C.
3. FPs of C are all non-C instances classified as C.
4. FNs of C are all C instances not classified as C.

IoU, also known as the Jaccard index, was leveraged as the primary evaluation metric. It is a widely used metric that quantifies the similarity between the predicted area and the ground truth area, in which the intersection is divided by the union of the two areas. It can be rephrased in terms of true/false positives/negatives as follows:

Jaccard = IoU = TP / (TP + FP + FN)

The following metrics were also calculated to evaluate model performance:

1. Recall (sensitivity) = TP / (TP + FN)
2. Precision = TP / (TP + FP)
3. F1-score = (2 × Precision × Recall) / (Precision + Recall)

Notably, the following definitions explain the aforementioned quantities:

1. TP: true positive results.
2. FP: false positive results.
3. FN: false negative results.
4. Recall: this metric quantifies the proportion of actual positive cases that are correctly predicted as positive.
5. Precision: this metric quantifies the proportion of positive predictions that are correct.
6. F1-score: this metric combines recall and precision into a single score by calculating the harmonic mean of the precision and recall. It is used instead of accuracy owing to the imbalance observed in the datasets, which is natural and inevitable because primary caries are more common in adult bitewings and advanced caries are more common in pediatric bitewings, as discussed in the following sections.

Results

Two separate models were created during model creation. One model was used for adult bitewings and the other was used for pediatric bitewings. This approach is important to avoid the misclassification that usually occurs because of the difference in teeth density. The two models have the same architecture, although they are trained on different datasets, namely, the adults' and pediatrics' datasets. The results were evaluated using the IoU score and F1 score.

1. For adults, the IoU averaged 98%, 23%, 19%, and 51% for no caries, primary caries, moderate caries, and advanced caries, respectively.
2. For pediatric bitewings, the IoU averaged 97%, 8%, 17%, and 25% for no caries, primary caries, moderate caries, and advanced caries, respectively.

The following tables show the scores of the three models separately and when ensemble learning was implemented. All calculations were performed on the test set.
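For reference, the IoU, recall, precision, and F1 formulas given in the statistical analysis section can be computed per class from predicted and ground-truth masks as in the minimal sketch below; the integer class encoding is an illustrative assumption.

```python
# Per-class metrics from the TP/FP/FN counts defined above, computed on integer
# class masks (0 = no caries, 1 = primary, 2 = moderate, 3 = advanced).
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes=4, eps=1e-7):
    """y_true, y_pred: integer masks with identical shape, e.g. (512, 512)."""
    results = {}
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        iou = tp / (tp + fp + fn + eps)                    # Jaccard index
        recall = tp / (tp + fn + eps)                      # sensitivity
        precision = tp / (tp + fp + eps)
        f1 = 2 * precision * recall / (precision + recall + eps)
        results[c] = {"IoU": iou, "Recall": recall, "Precision": precision, "F1": f1}
    return results
```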

Furthermore, as shown by the scores and in the screenshots shown in Figures 6–8, the model can recognize whether a shadow represents caries. In addition, it can accurately detect advanced caries in both adult and pediatric bitewings. Misclassification occurs mostly between primary and moderate caries. Although it can classify the lesions correctly many times, it can occasionally misclassify one as the other or does not capture the depth of the lesion accurately. As a standalone model, the model with ResNext101 as its encoder has the best performance on both datasets.

Figure 6. (Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. CoRR, abs/1412.6980.) The difference between the performances of different optimizers is shown in comparison with the Adam optimizer, which was used in this study. As per Kingma et al., a multilayer neural network was trained on the MNIST dataset with the AdaGrad, RMSProp, SGDNesterov, AdaDelta, and Adam optimizers each time, and the values of the loss function (training cost) after each iteration were plotted. From Figure 6, we can see that the Adam optimizer reaches convergence much earlier than other commonly used optimizers. This is due to its adaptive learning rate, combining momentum and RMSprop techniques, bias correction, computational efficiency, widespread adoption and support, and robustness to hyperparameter selection. Adam's adaptive learning rate adjusts parameters based on historical gradients, enabling efficient convergence. By combining momentum and RMSprop, Adam achieves superior performance. Bias correction addresses initialization biases. Adam's computational efficiency and wide support make it convenient for large-scale applications. Additionally, Adam is robust to hyperparameter selection, performing well with default settings. Thus, we chose Adam as our optimizer.

Discussion

In the field of dentistry, the ability to identify caries in the early stages, as well as to follow the progression of such caries, are crucial factors in implementing the most appropriate prevention and treatment methods. When the diagnosis of dental caries is not properly performed, the potential lesions may progress to reach the pulp, causing extreme pain. They could subsequently progress to a stage requiring extensive procedures and clinical time, which could have been avoided if the dental caries had been discovered in advance, accurately, and appropriately.

One of the most important challenges with the traditional caries detection approach is that it is performed by dentists without any technical aid from available advanced technological resources. Therefore, detection could potentially become lengthy and inconsistent in the long run. To elaborate on this, more than one criterion exists to define caries stages, the level of experience varies and affects the ability of accurate detection, and the quality of the radiographic images, including technical errors, brightness, shadow, and contrast, also has an effect. Additionally, the level of fatigue that the dental practitioner might experience in one day could significantly affect caries detection during dental examinations.

One of the available solutions is to capitalize on automated assistance systems. These systems can provide the desired consistency and agility. Therefore, this potential solution was the focus of this study. In this study, bitewing radiographic images were used as the input to the model because they are primarily used to detect the presence of, or monitor the progression of, interproximal caries. This study has a range of strengths and limitations, which are discussed comprehensively in this section.

The model presented provides a unique contribution to the domain of caries detection by involving the integration of multiple modalities of deep learning architectures to detect caries in both primary and permanent teeth. The combination of those two architectural frameworks is an innovative methodology that has received limited attention in prior research endeavors. The other novelty in our research is the unique dataset used, which represents the demographic and clinical diversity of our patient population. One of the strengths is the detailed labeling of lesion staging. The caries presented in the dataset were classified into three types: primary, moderate, and advanced. Furthermore, the model is a semantic segmentation model, which means that the model can detect the size, depth, location, and type of caries, as shown in Figures 7 and 8 in the results section. An additional strength is the comparison between detection accuracy in adult and pediatric bitewings through the utilization of statistical analysis tools, as listed in Tables 2–5.

The OpenCV, Keras, and segmentation_models libraries were chosen to perform our research. OpenCV enables us to open images, resize them, or introduce augmentation in images very easily through library functions. Keras was chosen for the pretrained models, as the segmentation_models library uses Keras/TensorFlow backbones for the U-Net architecture and any other library would have been incompatible with it.

Figure 7. Advanced caries were more accurately detected than primary caries on adult bitewings (P < 0.05).

The Segmentation_models library was chosen because it provides us with the U-Net architecture without having to design it from scratch. The study utilized augmentation techniques to enhance dataset diversity and performance in a model for caries detection. By rotating, resizing, and flipping the x-ray images, the model was able to simulate variations in image resolution and magnification, addressing differences in x-ray settings. Adjustments to brightness and contrast were made to simulate exposure and lighting conditions, ensuring the model could generalize well to unseen data. Rotation reduced overfitting by exposing the model to different angles of teeth, making it more robust in recognizing caries from various perspectives. Scaling reduced overfitting by ensuring the model was not overly sensitive to specific image sizes, encouraging the model to recognize caries regardless of image resolution. Flipping mitigated overfitting by learning features invariant to left-right or up-down reflections, enhancing the model's generalization ability. This approach reduced the risk of overfitting and allowed the model to focus on key features in radiographic images.
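The preprocessing and augmentation steps described above (resizing to 512 × 512, contrast adjustment, rotation, scaling, and flipping) can be sketched with OpenCV as follows; the specific angle, scale, and brightness ranges are illustrative assumptions, since the paper does not report them.

```python
# Illustrative OpenCV preprocessing and augmentation for bitewing images and their
# masks; the parameter ranges below are assumptions, not the values used in the study.
import random
import cv2

def preprocess(image):
    # Resize to the 512 x 512 training size and apply a simple contrast boost.
    image = cv2.resize(image, (512, 512), interpolation=cv2.INTER_AREA)
    return cv2.convertScaleAbs(image, alpha=1.2, beta=0)

def augment(image, mask):
    # Horizontal flip: caries features should be invariant to left/right reflection.
    if random.random() < 0.5:
        image, mask = cv2.flip(image, 1), cv2.flip(mask, 1)
    # Small rotation and scaling to simulate different projection angles/magnification.
    h, w = image.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), random.uniform(-10, 10),
                                  random.uniform(0.9, 1.1))
    image = cv2.warpAffine(image, rot, (w, h), flags=cv2.INTER_LINEAR)
    mask = cv2.warpAffine(mask, rot, (w, h), flags=cv2.INTER_NEAREST)  # keep labels discrete
    # Brightness/contrast jitter to simulate exposure differences.
    image = cv2.convertScaleAbs(image, alpha=random.uniform(0.9, 1.1),
                                beta=random.uniform(-15, 15))
    return image, mask
```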

Figure 8. Advanced caries were more accurately detected than primary caries on pediatric bitewings (P < 0.05).

Table 2. Model performance on adult bitewings.

Encoder               Mean F1   Mean IoU    Mean IoU for No Caries   Mean IoU for Advanced Caries   Mean IoU for Moderate Caries   Mean IoU for Primary Caries
Resnet50              0.517     0.4555288   0.981                    0.44                           0.22                           0.19
ResNext101            0.552     0.4760605   0.972                    0.50                           0.18                           0.22
Inception-ResNet-v2   0.458     0.4338291   0.98                     0.34                           0.22                           0.19
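Table 2 reports the three encoders separately; the final result combines them through ensemble learning (Figure 5). The paper does not state how the predictions are fused, so the sketch below shows one common choice, averaging the per-pixel class probabilities of the three U-Nets before taking the arg-max; the averaging strategy itself is an assumption.

```python
# One common ensembling strategy for semantic segmentation (assumed here, not
# specified in the paper): average the softmax maps of the three U-Nets per pixel.
import numpy as np

def ensemble_predict(models, images):
    """models: trained Keras models; images: batch shaped (N, 512, 512, 3)."""
    probs = [model.predict(images) for model in models]   # each (N, 512, 512, 4)
    mean_probs = np.mean(probs, axis=0)                   # averaged class probabilities
    return np.argmax(mean_probs, axis=-1)                 # (N, 512, 512) class indices
```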



By contrast, one common limitation imposed upon obtaining accurate results using deep learning in the field of medicine, in general, is the small size of datasets owing to privacy concerns of the patients. However, in this study, this limitation was addressed by using transfer learning. This technique enables the use of previously learned information to retrieve insights from newly collected data. Consequently, the time and resources required to achieve the goals of the model are significantly reduced. Another limitation is the imbalance in classes. This is because moderate and primary caries are more prevalent in adult radiographs, whereas advanced caries are more prevalent in pediatric radiographs owing to negligence and delayed detection. In addition, the size of the labels is another cause of class imbalance. We observed that advanced caries would cover more pixel points than the other labels. In other words, advanced caries are considerably easier to learn and detect, and this cannot be avoided. Nevertheless, one approach that was used in this study to address this limitation was to use a combination of focal and Dice loss, which is a common technique to address such a problem in the literature. Focal loss is used for multiclass classification, wherein some classes are harder to detect than others.

Furthermore, the U-Net architecture was used to perform semantic segmentation. The U-Net architecture is a semantic segmentation architecture with two paths: one that contracts and one that expands. The contracting path of the CNN follows a standard architecture. To improve the model's accuracy, ensemble learning techniques were employed, which operate on the principle that a weak learner predicts poorly when alone; however, when combined with other weak learners, they create a strong learner. The model's weakness is related to the small dataset size and the similarity between classes. In this study, three pretrained encoders were used: ResNet50, ResNeXt101, and Inception-ResNet-v2. First, ResNet50 is a variant of the ResNet model, which has 48 convolution layers, along with one max pooling and one average pooling layer. It has 3.8 × 10^9 floating-point operations. Second, ResNeXt repeats the building block that aggregates a set of transformations with the same topology. In comparison to ResNet, it adds a new dimension, cardinality (the size of the set of transformations) C, as an essential factor in addition to the dimensions of depth and width. Finally, Inception-ResNet-v2 is a CNN that is trained on more than a million images from the ImageNet database. The 164-layer network can classify photos into 1000 object categories, including keyboards, mice, pencils, and a variety of animals.

Dental caries is a common oral health problem that affects people of all ages. Early detection of caries is crucial for preventing the progression of the disease, which can lead to pain, infection, and tooth loss. Traditional methods of caries detection rely on visual and tactile examinations, which can be subjective and prone to errors. Recently, deep learning has emerged as a promising tool for caries detection. This approach can potentially improve the accuracy and speed of caries detection, leading to earlier interventions and better outcomes for patients. One of the clinical implications of using deep learning in caries detection is the potential to reduce the need for radiographs, thus reducing radiation exposure and costs.

Table 3. Statistical results summary for model performance on adult bitewings.

Mean F1   Mean IoU   Mean IoU for No Caries   Mean IoU for Advanced Caries   Mean IoU for Moderate Caries   Mean IoU for Primary Caries
0.543     0.478      0.982                    0.51                           0.19                           0.231

Table 4. Model performance on pediatric bitewings.

Encoder               Mean F1   Mean IoU   Mean IoU for No Caries   Mean IoU for Advanced Caries   Mean IoU for Moderate Caries   Mean IoU for Primary Caries
Resnet50              0.44      0.36       0.98                     0.22                           0.19                           0.05
ResNext101            0.44      0.39       0.981                    0.30                           0.16                           0.11
Inception-ResNet-v2   0.444     0.37       0.978                    0.246                          0.16                           0.09

Table 5. Statistical results summary for model performance on pediatric bitewings.

Mean F1   Mean IoU   Mean IoU for No Caries   Mean IoU for Advanced Caries   Mean IoU for Moderate Caries   Mean IoU for Primary Caries
0.44      0.377      0.979                    0.255                          0.17                           0.08



Another clinical implication of using deep learning in caries detection is the potential to improve patient outcomes by allowing less invasive treatments, such as fluoride application or sealants, which can prevent the need for more extensive restorative treatments like fillings or crowns. By detecting caries at an early stage, deep learning algorithms can help prevent the progression of the disease and improve patient outcomes.16

The number of included images was limited to 554 adult bitewings and 217 pediatric bitewings. Increasing this number will provide a better estimation of the accuracy of deep learning in caries detection and will reduce the risk of bias. Moreover, separating the primary from the permanent teeth during the deep learning process can limit the applicability of this approach to pediatric patients at the mixed dentition stage.

After comparing the two aforementioned results, we could postulate that the deep learning model is capable of predicting the size, shape, and location of caries, although it is highly dependent on the amount of data used to train it and the consistency of labeling. For future research on the same subject, it is recommended to balance the number of collected labels, which could be accomplished by collecting more data. Using a substantially larger amount of data would significantly increase detection accuracy, particularly for pediatric radiographs. Furthermore, with respect to this research study, another method that may be introduced in future work is a new label, root caries, to reduce the confusion of the model that may potentially occur because of unlabeled pixels that are similar to currently present labels.

As the use of AI in caries detection continues to evolve, more studies are needed to validate its efficacy and accuracy compared to traditional methods. Future studies should also explore the use of AI in different populations and settings, and AI algorithms should be integrated with, and continue to improve alongside, current clinical practice guidelines to ensure appropriate use and interpretation of results. This will require collaboration between dental professionals and AI experts to develop standardized protocols and guidelines. Future recommendations should also include guidelines for the responsible use of AI in dentistry, including informed consent and protection of patient data. Strategies for reducing the cost of AI technology and ensuring its availability should be tackled. By addressing these recommendations, AI can potentially revolutionize caries detection and improve oral health outcomes for patients. It would be appropriate to consider collecting a dataset of the mixed dentition stage (6–12 years old), labeling the primary and permanent teeth in each image without separating them, and comparing the accuracy of caries detection with that obtained when the machine is trained on each dentition separately. The result would be more clinically relevant in terms of practicality.

Conclusion

When the effectiveness of deep learning methods in dental caries detection was evaluated by applying deep learning technology to bitewing x-ray radiographs, the following conclusions were drawn:

• The model's score results show that capitalizing on machine learning for dental caries detection helps recognize whether a shadow represents caries in a faster manner than traditional methods.
• Using the model increased the accuracy of detecting advanced caries in both adult and pediatric bitewing radiographs.
• Misclassifications mostly occur between primary and moderate caries. Although the model showed a high ability to classify the lesions correctly, it can misclassify one as the other or does not capture the depth of the lesion accurately at this stage.

Clinical significance

Advancements in the fields of AI and machine learning are unprecedented. Therefore, applying deep learning to dental practices, such as dental caries detection, could potentially increase the efficiency of dentists while also resulting in a society with considerably better dental health. This study has proven the ability of deep learning to be used as an assistive automated diagnostic tool. If the limitations are addressed, which can be achieved if not limited by time, deep learning can accurately detect caries type, size, and location, providing consistency and reducing the workload on dentists.

Acknowledgments: The authors received no financial support for the research, authorship, or publication of this article. The authors would like to express their appreciation and gratitude to the Faculty of Dentistry, King Abdulaziz University for their continual support throughout the study. The authors would like to thank Editage for their language editing services, and Mariam Hussien for her amazing work with deep learning. "All authors gave their final approval and agree to be accountable for all aspects of the work."

Author contributions: Conceptualization, AAA, NH; methodology, AAA, NH, LS, and AA; software, AAA and NH; validation, AAA, NH, LS, and AA; formal analysis, AAA and NH; investigation, AAA, NH, LS, and AA; resources, LS and AA; data curation, AAA, NH, LS, and AA; writing—original draft preparation, AAA, NH, LS, and AA; writing—AAA, NH; supervision, AAA, NH; project administration, AAA. All authors have read and agreed to the published version of the manuscript.

Declaration of conflicting interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Ethical approval: The study was conducted in accordance with the guidelines approved by the Ethics Review Committee of King Abdulaziz University, Department of Pediatric Dentistry (Proposal No. 032-02-22, 28/02/2022).

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.

Guarantor: AAA

ORCID iD: Amr Ahmed Azhari https://orcid.org/0000-0002-8749-4714

References

1. Selwitz R, Ismail A and Pitts N. Dental caries. Lancet 2007; 369: 51–59.
2. (a) Keyes PH. Research in dental caries. J Am Dent Assoc 1968; 76: 1357–1373. (b) Bacterial Diseases of the Mouth | Boundless Microbiology [Internet]. Courses.lumenlearning.com. 2021 [cited 1 November 2021]. Available from: https://courses.lumenlearning.com/boundless-microbiology/chapter/bacterial-diseases-of-the-mouth/.
3. Selwitz R, Ismail A and Pitts N. Dental caries. Lancet 2007; 369: 51–59.
4. Lee J, Kim D, Jeong S, et al. Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm. J Dent 2018; 77: 106–111.
5. Mao Y, Chen T, Chou H, et al. Caries and restoration detection using bitewing film based on transfer learning with CNNs. Sensors 2021; 21: 4613.
6. Lee JH, Kim DH, Jeong SN, et al. Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm. J Dent 2018 Oct 1; 77: 106–111. (https://www.sciencedirect.com/science/article/pii/S0300571218302252).
7. Bengio Y, Courville A and Vincent P. Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 2013; 35: 1798–1828. arXiv:1206.5538. doi:10.1109/tpami.2013.50. PMID 23787338.
8. Lee JH, Kim DH, Jeong SN, et al. Detection and diagnosis of dental caries using a deep learning-based convolutional neural network algorithm. J Dent 2018 Oct 1; 77: 106–111. (https://www.sciencedirect.com/science/article/pii/S0300571218302252).
9. Schwendicke F, Rossi J, Göstemeyer G, et al. Cost-effectiveness of artificial intelligence for proximal caries detection. J Dent Res 2021; 100: 369–376.
10. Mao YC, Chen TY, Chou HS, et al. Caries and restoration detection using bitewing film based on transfer learning with CNNs. Sensors 2021; 21: 4613.
11. Cantu A, Gehrung S, Krois J, et al. Detecting caries lesions of different radiographic extension on bitewings using deep learning. J Dent 2020; 100: 103425.
12. Lian L, Zhu T, Zhu F, et al. Deep learning for caries detection and classification. Diagnostics 2021; 11: 1672.
13. Moran M, Faria M, Giraldi G, et al. Classification of approximal caries in bitewing radiographs using convolutional neural networks. Sensors 2021; 21: 5192.
14. Singh P and Sehgal P. GV Black dental caries classification and preparation technique using optimal CNN-LSTM classifier. Multimed Tools Appl 2021; 80: 5255–5272.
15. Lee S, Oh SI, Jo J, et al. Deep learning for early dental caries detection in bitewing radiographs. Sci Rep 2021; 11: 16807.
16. Karobari MI, et al. Evaluation of the diagnostic and prognostic accuracy of artificial intelligence in endodontic dentistry: a comprehensive review of literature. Comput Math Methods Med 2023; 2023: 7049360.
17. Bayrakdar IS, Orhan K, Akarsu S, et al. Deep-learning approach for caries detection and segmentation on dental bitewing radiographs. Oral Radiol 2022; 38: 468–479.
18. Casalegno F, Newton T, Daher R, et al. Caries detection with near-infrared transillumination using deep learning. J Dent Res 2019; 98: 1227–1233.
