Borneo Mobile App
Article
Automated Real-Time Identification of Medicinal Plants
Species in Natural Environment Using Deep Learning
Models—A Case Study from Borneo Region
Owais A. Malik 1, * , Nazrul Ismail 1 , Burhan R. Hussein 1 and Umar Yahya 2
1 School of Digital Science, Universiti Brunei Darussalam, Jln Tungku Link, Gadong BE1410, Brunei;
[email protected] (N.I.); [email protected] (B.R.H.)
2 Department of Computer Science and Information Technology, Islamic University in Uganda,
Kampala P.O. Box 7689, Uganda; [email protected]
* Correspondence: [email protected]
Abstract: The identification of plant species is fundamental for the effective study and management
of biodiversity. In a manual identification process, different characteristics of plants are measured
as identification keys which are examined sequentially and adaptively to identify plant species.
However, the manual process is laborious and time-consuming. Recently, technological development
has called for more efficient methods to meet species’ identification requirements, such as developing
digital-image-processing and pattern-recognition techniques. Despite several existing studies, there
are still challenges in automating the identification of plant species accurately. This study proposed
designing and developing an automated real-time plant species identification system of medicinal
plants found across the Borneo region. The system is composed of a computer vision system that
is used for training and testing a deep learning model, a knowledge base that acts as a dynamic
database for storing plant images, together with auxiliary data, and a front-end mobile application as
a user interface to the identification and feedback system. For the plant species identification task, an
EfficientNet-B1-based deep learning model was adapted and trained/tested on a combined public
and private plant species dataset. The proposed model achieved 87% and 84% Top-1 accuracies
on a test set for the private and public datasets, respectively, which is more than a 10% accuracy
improvement compared to the baseline model. During real-time system testing on the actual samples,
using our mobile application, the accuracy slightly dropped to 78.5% (Top-1) and 82.6% (Top-5),
which may be related to training data and testing conditions variability. A unique feature of the study
is the provision of crowdsourcing feedback and geo-mapping of the species in the Borneo region,
with the help of the mobile application. Nevertheless, the proposed system showed a promising
direction toward real-time plant species identification system.
Keywords: deep learning; medicinal plants; species identification; computer vision; real-time system;
mobile application
Citation: Malik, O.A.; Ismail, N.; Hussein, B.R.; Yahya, U. Automated Real-Time Identification of
Medicinal Plants Species in Natural Environment Using Deep Learning Models—A Case Study from
Borneo Region. Plants 2022, 11, 1952. https://doi.org/10.3390/plants11151952
Academic Editors: Ana Barradas, Jorge Marques da Silva, Pedro Mariano, Tae-Hyuk Ahn
and Georgios Koubouris
Received: 15 June 2022
Accepted: 22 July 2022
Published: 27 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Plant species’ identification is a challenging task that has a key role in effectively
studying biodiversity and investigating unknown species. The manual identification of
plant species is a time-consuming process and requires a lot of expertise in the field.
Automated identification systems based on computer vision and machine learning techniques
provide an alternative and assistive approach for this task. These systems are useful, but
their accuracy varies due to the diversity of the species. A recent survey highlighted the
growing application of machine learning (ML) and deep learning (DL) for plants’
identification through leaves [1]. Integrated with Mobile applications, ML and DL techniques
are increasingly being applied to distinguish between diseased and healthy plants [2–7],
identification and classification of herbs and medicinal plants [3–10], classification of both
generic plants and specific plant species [11–19], identification of crop-specific diseases [20],
and general identification of plants to guide field tours [21].
Convolutional neural networks (CNNs) and the various deep CNN models have
been reported to be the most commonly used methods in the automation process of plant-
classification tasks [1]. An automated diagnosis of the 10 most common tomato leaf diseases,
using a mobile application, was conducted in Reference [2], using the MobileNet model
of CNN, and an accuracy of up to 89% was reported. Similar to Reference [2], where a
single crop (tomato) was used, researchers in Reference [7] trained and deployed Residual
Neural Networks (ResNets), a deep version of CNN, in a custom-built mobile application
to classify wheat disease in the wild, reporting classification accuracy of up to 96%. ResNet
and Xception Networks, in combination with the YOLO object detection framework, have
also been used to detect early blight tomato disease, with an accuracy of over 99% reported
in Reference [3]. DL Neural Networks have also been deployed through an Android-based
mobile application to detect different common diseases in terrestrial plants found in the
Philippines, with an accuracy of 80% [4]. The B4 and B5 variants of the EfficientNet DL model are
reported in Reference [5] to have achieved an accuracy of over 99% in the classification of
38 different diseases in various plants. However, the classification was performed in offline
mode, and it was not deployed and tested over a mobile application. More recently, transfer
learning has been successfully deployed in a mobile application to enable the detection of
tomato leaf diseases, reporting detection accuracy of over 96%, using the EfficientNet-B0
DL model [6]. The results of these studies demonstrate the efficacy of deploying ML and
DL techniques in mobile applications to detect, classify, and identify plant diseases by
using plant leaves.
Medicinal plants and herbs have continued to be used for the traditional management
of various illnesses in many societies since time immemorial [22]. With advancements
in artificial intelligence (AI) and different Information and Communication Technologies
(ICTs), the need to automatically identify medicinal plants from the thousands of plant
species can only continue to grow. Researchers in Indonesia [8] utilized local binary patterns
to extract the leaf texture of 30 different medicinal plants and then applied probabilistic
Neural Networks to automatically classify the herbal leaves, achieving a classification
accuracy of just over 56%. In Reference [9], the researchers utilized a support vector
machine (SVM) and DL Neural Networks to automatically classify 20 different herbs found
in Malaysia, using a mobile application. The mobile application reported spending only
2 s for processing the input leaf image and returning the classification results, with a
classification accuracy of 93%. Similarly, in Reference [10], a fusion of fuzzy local binary
pattern and fuzzy color histogram, using product decision rules, was performed to enable
the automatic identification of 51 medicinal plant species commonly found in Indonesia.
Probabilistic Neural Networks were used to classify color histograms, reportedly achieving
an accuracy of just over 74%. The promising results achieved in these studies clearly
highlight the plausibility of utilizing ML and DL models in mobile applications to automate
the classification of medicinal plants.
It has been reported that there could be over 450,000 different plants, with one-third
of them facing extinction [23]. The easy and automatic identification of plant species
is, therefore, a crucial step toward their preservation. The scientific research community
continues to make efforts toward the realization of this step. In Reference [11],
the researchers built a joint classifier by using Backpropagation Neural Networks and
weighted K-NN and deployed it in an Android-based mobile application to enable automatic
classification of 220 plant species (angiosperms) found in China. The joint classifier
reportedly achieved a classification accuracy of nearly 93%. K-NN was similarly
utilized in Reference [12] to identify 32 different plant species common in Mauritius,
using leaf-shape features and a color histogram; accuracy of just over 87% was reportedly
achieved. In Reference [13], a custom-built Android-based mobile application was
able to identify a tree from 126 tree species common in the French Mediterranean area,
using tree leaves, with an accuracy of up to 90%. Researchers achieved this through
segmenting tree leaf images to form feature space and then used histogram intersection to
predict the class. Similarly, a custom-built mobile application with a back-end classifier,
using SIFT features with the Bag of Words model and SVM, was able to classify 20 different
plant species common in Sri Lanka with an accuracy of 96.48% [14]. Furthermore,
CNN models, including VGG19, MobileNet, and MobileNetV2, deployed in a mobile
application were able to identify 33 different types of leaves common in East Hokkaido
in Japan, with MobileNetV2 achieving an accuracy of over 99% [15]. In Reference [16],
the researchers also deployed a CNN model in an Android-based mobile application
to classify natural images of leaves belonging to 129 different species crowdsourced
from all over the world. Single image classification took 2.5 s, obtaining a Top-3 test
error of above 60%. Similarly, in Reference [17], the researchers presented a mobile
application that was able to classify plant leaves belonging to five different plant species
common in India, based on leaf color and shape. To perform classification by using leaf
shapes, the extracted morphological features of the leaves were categorized by using
the Sobel and Otsu methods, while the color-based classification was performed by
using the dominant-color method. Last but not least, in Reference [18], the results
of a mobile application for leaf classification utilizing a CNN were presented. This
study experimented with different CNN architectures’ performances in classifying
15 leaf species of plants common in California and North Carolina. Deep CNNs are
reported to have achieved the highest classification accuracy of 81.6%, with the mobile
application completing the classification task in 2.5 s after spending about 5.2 s loading
the query leaf image in the NN classifier from the gallery. The results reported in the
abovementioned works from the literature demonstrate the promising efficiency of
mobile application classifiers that were developed by using the ML and DL models to
facilitate the automatic identification and classification of plant species. Not only is this
a crucial step toward the conservation of species risking extinction, but such mobile
applications could eventually serve as field guides during tours to various plantations
in the wild. In Reference [16], the researchers demonstrated how a mobile application
utilizing leaf morphological features and angle code histogram was able to serve as a
tour guide in a wild field with six different plant species, in the USA, with an error rate
between 17% and 53%. While this error rate appears to be high, it is a promising result
to begin with for a field tour in the open wild.
The current study aimed to develop a mobile application to enable the automatic
identification of medicinal plants and plants predominantly found in the Borneo region in
real time. The present study builds on the strengths of promising methods established in
related works from the literature, while also addressing the weaknesses (gaps) identified
in the same. Most of the reviewed studies either utilized their own generated small-sized
datasets for training and assessing their ML and DL models or entirely (i.e., training and
testing) used some of the open existing image datasets (such as ImageCLEF, Flavia, ICL
Plantae, PlantVillage, CVIP100, Swedish, etc.), potentially resulting in overfitted and
underfitted models, respectively. Additionally, many of the reviewed studies considered
clean leaf images captured against a white background, a condition that is unlikely in a
real-world setting in which leaf images are often captured with a non-clean background.
Moreover, studies that reported excellent (over 95%) classification accuracies mainly in-
volved classifying a single plant (e.g., tomato), thereby limiting their application to a wide
range of species. Additionally, the feedback of the mobile application end-user on observed
classification results was not taken into account to further enrich the classifier’s knowledge
base for future classification. The development of the mobile application proposed in the
current study is, therefore, a valuable addition to the existing efforts toward the preser-
vation of the herbs and medicinal plants native to the Borneo region, a region primarily
believed to be the most species-rich area in the world [24], yet with the immense threat of
extinction to some of its plant species [25,26]. The current study experimented with and
optimized different EfficientNet deep learning models, as transfer learning has mainly
been singled out to produce the best classification results for real-time multiclass image
classification tasks [1,5]. A unique feature of the study is the provision of crowdsourcing
feedback and geo-mapping of the species in the Borneo region with the help of the mobile
application. This is important for the region since several local plant species (with their
local names and benefits) are unknown to the experts. The system provides an adaptive
learning approach where the models can be updated based on the newly collected data
and people’s feedback.
Our contributions in this study are highlighted as follows:
• We have proposed a machine vision system that is capable of automating the
identification of medicinal plant species in real time.
• We have developed an end-to-end computer vision system with a convolutional neural
network (CNN) model to identify medicinal plant species when given an image.
• The system works in real time and can accurately identify different plant species
from a picture taken with a mobile camera or an existing image uploaded from
a device.
• The system provides a feedback mechanism and a knowledge base as a means
of continuous lifelong learning of the models to produce a robust plant species
identification system.
2. Materials and Methods
2.1. Proposed System
The overall flow of the proposed system is presented in Figure 1. The system is
composed of three main components: (1) a computer vision and deep-learning-based plant
species classifier, (2) a knowledge base as a central repository for plant information together
with auxiliary and feedback data, and (3) a mobile front-end that provides a user interface
through which the end-user interacts with the system and views the classification results.
Details for each component of the system are explained in the subsequent subsections.
2.2.1. Datasets
PlantCLEF 2015
PlantCLEF 2015 [27] is a plant identification challenge dataset that aims to build an
image-based plant identification system and evaluates methods and systems at a very
large scale that adapts to real-world conditions. The dataset was constructed through
a participatory community platform in 2011, consisting of thousands of collaborators.
PlantCLEF 2015 consists of curated images from many different contributors, cameras,
areas, periods of the year, and individual plants. More precisely, the PlantCLEF 2015 dataset
comprises 113,205 pictures belonging to 41,794 observations of 1000 species of trees, herbs,
and ferns living in Western European regions. Each image corresponds to one of the seven
types of views in the meta-data (entire plant, fruit, leaf, flower, stem, branch, and leaf scan)
and is associated with a scientific name.
In this study, the PlantCLEF 2015 dataset was used as an auxiliary dataset to create our
plant identification model to improve our classification result. We have further extracted
images relevant to our application, that is, the images containing only leaf-related infor-
mation (e.g., leaf, leaf-scan, and the entire plant). Thus, the extracted dataset consisted of
23,708 images. The hold-out set is built by partitioning 90% of the dataset for training and
testing, while the validation data are set to be roughly 10% of the whole dataset (please see
Table 1 for more detailed statistics).
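The roughly 10% hold-out described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `holdout_split`, the fixed seed, and the choice to draw the hold-out separately within each class are all assumptions made here, and the toy labels stand in for the real image annotations.

```python
import random
from collections import defaultdict


def holdout_split(labels, val_fraction=0.10, seed=0):
    """Return (remaining_idx, val_idx) for a per-class random hold-out."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    remaining_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out about val_fraction of each class (assumption: stratified).
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        remaining_idx.extend(indices[n_val:])
    return sorted(remaining_idx), sorted(val_idx)


# Toy labels standing in for image annotations (not the real dataset).
labels = ["species_a"] * 50 + ["species_b"] * 30 + ["species_c"] * 20
remaining_idx, val_idx = holdout_split(labels)
print(len(remaining_idx), len(val_idx))
```

With this rounding rule, classes with very few images may contribute no validation sample at all, which is one reason a strongly imbalanced dataset needs care at the splitting stage.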
Train 1691
Test 157
Validation 249
Total 2097
To counter the imbalanced distribution of samples from our datasets, we employed
two cost-sensitive learning methods to train our model: computing the class weights for
each class and focal loss [29]. To summarize, we trained our models by using transfer
learning, using ImageNet weights with AutoAugment optimal augmentation policy and
cost-sensitive learning methods applied. Figure 3 highlights the imbalanced distribution for
both datasets with a ratio.
Class Weighted Function
The classical way of training neural networks by using backpropagation involves
updating the model weights with respect to errors being made by the model. This
method fails when we have imbalanced training samples where examples from each
class are treated the same, meaning that, for imbalanced datasets, the model is prone
to performing well only for the majority class; that is, it is biased toward the majority
class samples.
The backpropagation algorithm can be updated to consider the misclassification of
each class by using a cost-sensitive loss function that weights the error of each class in
inverse proportion to the number of its samples in the training set. The effect of adding this
class-weighted function is that the neural network learns the minority classes, because the
model is penalized more when misclassification of the minority class occurs. We
computed the class weights for our dataset as follows:
from collections import Counter

def class_weights(y_train, alpha):
    """Return a dict of class weights."""
    # Number of samples per class in y_train.
    counter = Counter(y_train)
    if alpha > 0:
        # Add a fraction alpha of the largest class size to every count.
        p = max(counter.values()) * alpha
        for cls in counter:
            counter[cls] += p
    majority = max(counter.values())
    # Weight of a class = majority count / count of that class.
    return {cls: majority / count for cls, count in counter.items()}
The α parameter represents the smoothing parameter, which balances the class
weights between the majority and minority classes. It determines how much the model
is penalized when misclassification of the minority class occurs. Setting α = 1 equates to
applying equal class weightage with respect to the number of species present in the dataset, and
setting α > 0 equates to making the minority class weights higher. We defined α to be 0.4 for
both the PlantCLEF 2015 and UBD Botanical datasets and trained by using the multiclass
cross-entropy loss.
Focal Loss
The focal loss was introduced in Reference [29] for the object detection task, which
deals with the sparse set of foreground examples present in the datasets and prevents the
vast number of easy negatives from overwhelming the detector during training. The loss
function was reshaped to down-weight easy examples and, thus, focused training on hard
negatives by adding a modulating factor $(1 - p_t)^{\gamma}$ to the cross-entropy loss, with tunable
focusing parameter γ ≥ 0 (Equation (1)). In Reference [29], the authors experimented with
γ ∈ [0, 5], where γ = 2 worked best in their experiment.
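In its standard form without a class-balancing term, the focal loss of Reference [29] is $\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$, where $p_t$ is the probability the model assigns to the true class. The sketch below is a plain-Python illustration of that formula for the multiclass case; the function name, the clipping constant `eps`, and the averaging over samples are assumptions made here, not details reported in the study.

```python
import math


def focal_loss(probs, targets, gamma=2.0, eps=1e-12):
    """Mean focal loss over samples.

    probs:   list of per-sample probability lists (each row sums to 1).
    targets: list of true class indices, one per sample.
    """
    total = 0.0
    for row, target in zip(probs, targets):
        # Probability of the true class, clipped to avoid log(0).
        p_t = min(max(row[target], eps), 1.0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(targets)


easy = [[0.9, 0.1]]   # confident and correct for class 0
hard = [[0.1, 0.9]]   # confident and wrong for class 0
for name, sample in (("easy", easy), ("hard", hard)):
    ce = focal_loss(sample, [0], gamma=0.0)   # gamma = 0 is plain cross-entropy
    fl = focal_loss(sample, [0], gamma=2.0)
    print(name, round(ce, 4), round(fl, 4))
```

With γ = 2, the loss of the easy sample is scaled by (1 − 0.9)² = 0.01, whereas the hard sample keeps a factor of (1 − 0.1)² = 0.81, which is the down-weighting of easy examples described above.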
Method                      Top-1 Acc. (%)   Top-5 Acc. (%)   Sensitivity (%)   Specificity (%)
Baseline                    73.5             79.4             43.2              53.5
Class weighted (α = 0.2)    81.9             87.4             61.5              64.6
Class weighted (α = 0.4)    83.2             92.4             79.5              77.6
Focal Loss (γ = 2.0)        83.8             92.2             76.5              74.6
Focal Loss (γ = 5.0)        84.0             89.4             76.5              74.6
Figure 4. Learning rate range test with the exponential moving average for PlantCLEF 2015 (left)
and UBD Botanical datasets (right).
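A learning rate range test of the kind named in the Figure 4 caption is commonly run by increasing the learning rate exponentially over a short training run and smoothing the recorded losses with an exponential moving average. The sketch below illustrates only those two ingredients; the function names, the smoothing factor `beta = 0.98`, the bias correction, and the learning rate bounds are assumptions made here and are not values reported in the study.

```python
def lr_schedule(lr_min, lr_max, num_steps):
    """Learning rates growing exponentially from lr_min to lr_max."""
    ratio = (lr_max / lr_min) ** (1.0 / (num_steps - 1))
    return [lr_min * ratio ** i for i in range(num_steps)]


def ema_smooth(values, beta=0.98):
    """Bias-corrected exponential moving average of a loss sequence."""
    smoothed, avg = [], 0.0
    for step, value in enumerate(values, start=1):
        avg = beta * avg + (1.0 - beta) * value
        # Divide by (1 - beta^step) so early values are not biased toward zero.
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


rates = lr_schedule(1e-5, 1e-1, 5)
# Hypothetical raw losses recorded at each learning rate (illustrative only).
raw_losses = [2.30, 2.10, 1.40, 1.60, 4.00]
smooth_losses = ema_smooth(raw_losses)
for lr, loss in zip(rates, smooth_losses):
    print(f"{lr:.0e}  {loss:.3f}")
```

The smoothed curve is then inspected for the region where the loss falls fastest, and a learning rate is chosen from that region.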
2.3. Mobile Application Details
We developed a mobile application dedicated to the production of our collected
botanical dataset through image-based plant identification. The application currently
supports Android (API level 16 and above), allowing images of a plant via gallery upload
or captured by the camera to be sent to the web server to retrieve a list of predicted species
made by our model. Figure 5 shows a graphical user interface of our mobile application
system. Below are the details of the architecture of the mobile application system.
system.
Figure5.
Figure
Figure 5.Screenshot
5. Screenshotof
ofGUI
of GUImobile
GUI mobileapplication.
mobile application.
application.
Figure 6. Training and validation loss curves for the deep learning model.
Table 4. Performance evaluation on UBD Botanical + PlantCLEF test set.
Method                      Top-1 Accuracy (%)   Top-5 Accuracy (%)   Sensitivity (%)   Specificity (%)
Baseline                    63.4                 72.4                 43.2              53.5
Class weighted (α = 0.2)    83.5                 87.4                 61.5              64.6
Class weighted (α = 0.4)    85.5                 92.4                 73.5              77.6
Focal Loss (γ = 2.0)        83.5                 92.4                 71.5              77.6
Focal Loss (γ = 5.0)        87.5                 86.4                 70.5              74.6
The baseline model showed roughly 80% in Top-5 accuracy. However, it performs
poorly in classifying the true-positive and true-negative samples as visible from the low
sensitivity and specificity rate. The baseline model was found to be biased toward the
majority classes; hence, it had a high accuracy rate (80%) and a low rate of sensitivity or
recall (43.2%). Both cost-sensitive learning methods, including the class weighted and focal
loss, showed an improvement in the classification performance for the imbalanced dataset,
as indicated by the significant increase in the sensitivity and specificity values. With the
class weighted function applied (α = 0.4), the model is penalized more for minority-class
errors, and its sensitivity improved from 43.2% to 79.5%, while using
focal loss has the effect of having the model recognize classes that do not correspond to its
labels. We later performed an evaluation on the test samples of the merged dataset and
ran a total of three runs. The results shown in Table 4 are based on a total of 251 species
with 2537 images.
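The four reported metrics can be computed from model scores as sketched below. How sensitivity and specificity were aggregated over the species is not stated in the text, so the macro-average (an unweighted mean of the per-class one-vs-rest values) used here is an assumption, as are the function names and the toy scores.

```python
def top_k_accuracy(scores, targets, k):
    """Fraction of samples whose true class is among the k highest scores."""
    hits = 0
    for row, target in zip(scores, targets):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += target in ranked[:k]
    return hits / len(targets)


def macro_sensitivity_specificity(predictions, targets, num_classes):
    """Unweighted mean of one-vs-rest sensitivity and specificity per class."""
    sens, spec = [], []
    pairs = list(zip(predictions, targets))
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in pairs)
        fn = sum(p != c and t == c for p, t in pairs)
        tn = sum(p != c and t != c for p, t in pairs)
        fp = sum(p == c and t != c for p, t in pairs)
        if tp + fn:
            sens.append(tp / (tp + fn))
        if tn + fp:
            spec.append(tn / (tn + fp))
    return sum(sens) / len(sens), sum(spec) / len(spec)


# Toy scores for 4 samples over 3 classes (illustrative only).
scores = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6]]
targets = [0, 1, 1, 2]
predictions = [max(range(3), key=lambda c: row[c]) for row in scores]

top1 = top_k_accuracy(scores, targets, k=1)
top2 = top_k_accuracy(scores, targets, k=2)
sensitivity, specificity = macro_sensitivity_specificity(predictions, targets, 3)
print(top1, top2, round(sensitivity, 3), round(specificity, 3))
```

In this toy case, class 1 is never predicted, so its sensitivity is zero and the macro-averaged sensitivity drops well below what the Top-2 accuracy alone would suggest, which mirrors the gap between accuracy and sensitivity noted for the baseline model.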
The baseline model appeared to show no improvement, despite the addition of more samples
from our collected dataset. This is mainly due to the imbalanced nature of the
dataset, as shown in the distribution graph in Figure 3. After applying the cost-sensitive
learning methods and adding more external samples, a slight incremental improvement in
performance was found. In the next section, we outline the test performed on real-life
plant observations and report its performance.
Method                      Top-1 Accuracy (%)   Top-5 Accuracy (%)   Sensitivity (%)   Specificity (%)
Focal Loss (γ = 2.0)        78.5                 82.6                 75.2              77.7
Class weighted (α = 0.4)    62.8                 70.2                 66.1              68.6
Some examples of the real-time testing of the developed mobile application are
shown in Figure 7. In the real-time classification system, the image taken through the
mobile camera is uploaded to the cloud, and the classification results are sent back
to the mobile device. The corresponding information, including the plant’s scientific
name, description, medical usage, precautions, local family name, common name, and
origin, is retrieved from the online database and displayed to the user. Since the system
also facilitates the geo-mapping of the species, the longitude and latitude information,
where the user took the image, can also be saved in the knowledge base. Moreover, a
confirmation option is provided for the users to give feedback to improve the system
further, and the domain expert verifies this at the end. This option helps us continuously
improve the classification model for plant species identification. Figure 8 depicts an
example of the system when used in the offline mode. It also shows a scenario where
a plant image can be classified into multiple species with different probability values.
The class with the highest probability value is shown first, while the rest are shown
in decreasing order of probability values. Similar to the real-time system, a feedback
mechanism is provided to report the misclassification of the plant species.
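The ordering of the displayed results can be sketched in a few lines. This is only an illustration of sorting candidate species by predicted probability; the function name `rank_predictions`, the cut-off `k`, and the placeholder species names are hypothetical and do not come from the application.

```python
def rank_predictions(probabilities, species_names, k=5):
    """Return up to k (species, probability) pairs, highest probability first."""
    pairs = sorted(
        zip(species_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return pairs[:k]


# Placeholder names and probabilities (illustrative only).
names = ["species_A", "species_B", "species_C", "species_D"]
probs = [0.10, 0.60, 0.25, 0.05]
for species, prob in rank_predictions(probs, names, k=3):
    print(f"{species}: {prob:.0%}")
```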
In contrast to several previous promising studies in which either open-access
datasets [2,3,5] or localized primary datasets [4,20] were exclusively utilized, the deep
learning models trained and tested in the current study used over 25,000 images, consisting
of both secondary open-access datasets, as well as a primary dataset of images collected
from a local botanical garden in Brunei with medicinal plants native to Borneo region.
Therefore, this dataset diversity increases confidence in the reproducibility of the obtained
results under real-time conditions and in the likely generalizability of the trained models.
Additionally, the average real-time in-the-wild classification accuracy of 80% obtained in
the current study for both Top-1 and Top-5 is lower than the classification accuracies
of 88.4% and 99.9% previously reported in References [2,3], respectively. It should be noted,
however, that the latter studies dealt with diseased-leaf detection for only one type
of crop (tomatoes), whereas the present study covers over 250 medicinal plant species.
Furthermore, building on previous related works [2,4,6,17,20,21] that aimed to
train and deploy deep learning plant identification models in mobile applications, the
current work has introduced a new feature that lets the mobile application user confirm,
and thus verify, the classification results, as shown in Figure 5. This
feature aids the continuous improvement of the cloud-hosted
knowledge base, particularly by domain experts and native users of medicinal plants,
thereby facilitating incremental learning of the deployed deep learning model.
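The Top-1 and Top-5 accuracies compared in this discussion follow the standard definition: a sample counts as correct if its true label is among the k highest-ranked predictions. The sketch below shows that computation on made-up labels; the function name `top_k_accuracy` and the toy data are illustrative assumptions and do not reproduce the evaluation code or results of this study.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the first k ranked predictions."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]  # each inner list is ordered most probable first
    )
    return hits / len(true_labels)


# Toy data (hypothetical): four images, three candidate classes each.
ranked = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["c", "a", "b"],
    ["a", "c", "b"],
]
truth = ["a", "a", "b", "b"]

print("Top-1:", top_k_accuracy(ranked, truth, k=1))
print("Top-2:", top_k_accuracy(ranked, truth, k=2))
```

By construction, Top-k accuracy can never decrease as k grows, so a Top-5 figure is always at least as high as the corresponding Top-1 figure on the same samples.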
Author Contributions: Conceptualization, O.A.M.; methodology, O.A.M., N.I. and B.R.H.; soft-
ware, N.I. and B.R.H.; validation, N.I., O.A.M., B.R.H. and U.Y.; formal analysis, N.I., B.R.H.
and O.A.M.; investigation, N.I. and O.A.M.; resources, N.I. and O.A.M.; data curation, N.I. and
O.A.M.; writing—original draft preparation, U.Y., N.I., B.R.H. and O.A.M.; writing—review and
editing, U.Y., N.I., O.A.M. and B.R.H.; visualization, N.I. and B.R.H.; supervision, O.A.M.; project
administration, O.A.M.; funding acquisition, O.A.M. All authors have read and agreed to the
published version of the manuscript.
Funding: This work was supported by Universiti Brunei Darussalam under research grant number
[UBD/RSCH/1.4/FICBF(b)/2018/011].
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data that support the findings of this study are available
upon request.
Acknowledgments: We express our gratitude to Ferry Slik from Universiti Brunei Darussalam for
providing access to the UBD Botanical Garden data.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Sachar, S.; Kumar, A. Survey of feature extraction and classification techniques to identify plant through leaves. Expert Syst. Appl.
2021, 167, 114181. [CrossRef]
2. Elhassouny, A.; Smarandache, F. Smart mobile application to recognize tomato leaf diseases using Convolutional Neural Networks.
In Proceedings of the International Conference of Computer Science and Renewable Energies (ICCSRE 2019), Agadir, Morocco,
22–24 July 2019; IEEE: Piscataway, NJ, USA, 2019. [CrossRef]
3. Chakravarthy, A.S.; Raman, S. Early Blight Identification in Tomato Leaves using Deep Learning. In Proceedings of the 2020
International Conference on Contemporary Computing and Applications (IC3A), Lucknow, India, 5–7 February 2020; IEEE:
Piscataway, NJ, USA, 2020; pp. 154–158. [CrossRef]
4. Valdoria, J.C.; Caballeo, A.R.; Fernandez, B.I.D.; Condino, J.M.M. iDahon: An Android Based Terrestrial Plant Disease Detection
Mobile Application through Digital Image Processing Using Deep Learning Neural Network Algorithm. In Proceedings of the
4th International Conference on Information Technology, Bangkok, Thailand, 24–25 October 2019; IEEE: Piscataway, NJ, USA, 2019;
pp. 94–98. [CrossRef]
5. Dyrmann, M.; Karstoft, H.; Midtiby, H.S. Plant species classification using deep convolutional neural network. Biosyst. Eng. 2016,
151, 72–80. [CrossRef]
6. Bir, P.; Kumar, R.; Singh, G. Transfer Learning based Tomato Leaf Disease Detection for mobile applications. In Proceedings of the
International Conference on Computing, Power and Communication Technologies (GUCON), Greater Noida, India, 2–4 October
2020; IEEE: Piscataway, NJ, USA, 2020; pp. 34–39. [CrossRef]
7. Picon, A.; Alvarez-Gila, A.; Seitz, M.; Ortiz-Barredo, A.; Echazarra, J.; Johannes, A. Deep convolutional neural networks for
mobile capture device-based crop disease classification in the wild. Comput. Electron. Agric. 2019, 161, 280–290. [CrossRef]
8. Prasvita, D.S.; Herdiyeni, Y. MedLeaf: Mobile Application for Medicinal Plant Identification Based on Leaf Image. Int. J. Adv. Sci.
Eng. Inf. Technol. 2013, 3, 103. [CrossRef]
9. Muneer, A.; Fati, S.M. Efficient and Automated Herbs Classification Approach Based on Shape and Texture Features using Deep
Learning. IEEE Access 2020, 8, 196747. [CrossRef]
10. Herdiyeni, Y.; Wahyuni, N.K.S. Mobile application for Indonesian medicinal plants identification using fuzzy local binary pattern
and fuzzy color histogram. In Proceedings of the International Conference on Advanced Computer Science and Information
Systems (ICACSIS), Depok, Indonesia, 1–2 December 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 301–306.
11. Cheng, Q.; Zhao, H.; Wang, C.; Du, H. An Android Application for Plant Identification. In Proceedings of the 4th Information
Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 14–16 December 2018; IEEE: Piscataway, NJ,
USA, 2018; pp. 60–64. [CrossRef]
12. Munisami, T.; Ramsurn, M.; Kishnah, S.; Pudaruth, S. Plant Leaf Recognition Using Shape Features and Colour Histogram with
K-nearest Neighbour Classifiers. Procedia Comput. Sci. 2015, 58, 740–747. [CrossRef]
13. Zhao, Z.-Q.; Ma, L.-H.; Cheung, Y.; Wu, X.; Tang, Y.; Chen, C. ApLeaf: An efficient android-based plant leaf identification system.
Neurocomputing 2015, 151, 1112–1119. [CrossRef]
14. Priyankara, H.A.C.; Withanage, D.K. Computer assisted plant identification system for Android. In Proceedings of the Moratuwa
Engineering Research Conference (MERCon), Moratuwa, Sri Lanka, 7–8 April 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 148–153.
[CrossRef]
15. Akiyama, T.; Kobayashi, Y.; Sasaki, Y.; Sasaki, K.; Kawaguchi, T.; Kishigami, J. Mobile Leaf Identification System using CNN
applied to plants in Hokkaido. In Proceedings of the 8th Global Conference on Consumer Electronics (GCCE), Osaka, Japan,
15–18 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 324–325. [CrossRef]
16. Jassmann, T.J.; Tashakkori, R.; Parry, R.M. Leaf classification utilizing a convolutional neural network. In Proceedings of the
SoutheastCon 2015, Fort Lauderdale, FL, USA, 9–12 April 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–3. [CrossRef]
17. Zaid, M.; Akhtar, S.; Patekar, S.A.; Sohani, M.G. A Mobile Application for Quick Classification of Plant Leaf Based on Color and
Shape. Int. J. Mod. Trends Eng. Res. 2015, 2, 2393–8161.
18. Jassmann, T.J. Mobile Leaf Classification Application Utilizing A CNN. Master’s Thesis, Appalachian State University, Boone,
NC, USA, 2015.
19. Hussein, B.R.; Malik, O.A.; Ong, W.-H.; Slik, J.W.F. Automated Classification of Tropical Plant Species Data Based on Machine
Learning Techniques and Leaf Trait Measurements. In Computational Science and Technology; Springer: Singapore, 2020; pp. 85–94.
[CrossRef]
20. Ngugi, L.C.; Abelwahab, M.; Abo-Zahhad, M. Tomato leaf segmentation algorithms for mobile phone applications using deep
learning. Comput. Electron. Agric. 2020, 178, 105788. [CrossRef]
21. Knight, D.; Painter, J.; Potter, M. Automatic Plant Leaf Classification for a Mobile Field Guide—An Android Application; Stanford
University: Stanford, CA, USA, 2010.
22. Petrovska, B.B. Historical review of medicinal plants’ usage. Pharmacogn. Rev. 2012, 6, 1–5. [CrossRef] [PubMed]
23. Pimm, S.L.; Joppa, L.N. How many plant species are there, where are they, and at what rate are they going extinct? Ann. MO Bot.
Gard. 2015, 100, 170–176. [CrossRef]
24. Barthlott, W. Borneo—The Most Species-Rich Area in the World! 2005. Available online: https://phys.org/news/2005-05-borneo-species-rich-area-world.html (accessed on 19 January 2021).
25. Saw, L.G.; Chua, L.S.L.; Suhaida, M.; Yong, W.S.Y.; Hamidah, M. Conservation of some rare and endangered plants from
Peninsular Malaysia. Kew Bull. 2010, 65, 681–689. [CrossRef]
26. Rautner, M.; Hardiono, M. Borneo: Treasure Island at Risk; WWF: Gland, Switzerland, 2005; pp. 1–80.
27. Joly, A.; Goëau, H.; Glotin, H.; Spampinato, C.; Bonnet, P.; Vellinga, W.P.; Planqué, R.; Rauber, A.; Palazzo, S.; Fisher, B.; et al.
LifeCLEF 2015: Multimedia life species identification challenges. In Proceedings of the 6th International Conference of the
CLEF Association (CLEF’15), Toulouse, France, 8–11 September 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 462–483.
[CrossRef]
28. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of
the Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009;
pp. 248–255. [CrossRef]
29. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal Loss for Dense Object Detection (RetinaNet). IEEE Trans. Pattern Anal.
Mach. Intell. 2017, 42, 318–327. [CrossRef] [PubMed]
30. Cubuk, E.D.; Zoph, B.; Mane, D.; Vasudevan, V.; Le, Q.V. AutoAugment: Learning Augmentation Policies from Data. arXiv 2019,
arXiv:1805.09501.
31. Smith, L.N. Cyclical Learning Rates for Training Neural Networks. In Proceedings of the Winter Conference on Applications of
Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 464–472.
32. Loshchilov, I.; Hutter, F. Fixing Weight Decay Regularization in Adam. arXiv 2017, arXiv:1711.05101.