Article
Deep-Learning-Based Automatic Mineral Grain Segmentation
and Recognition
Ghazanfar Latif 1,*, Kévin Bouchard 1, Julien Maitre 1, Arnaud Back 2 and Léo Paul Bédard 2

1 Department of Computer Sciences and Mathematics, Université du Québec à Chicoutimi, 555 Boulevard de
l’Université, Chicoutimi, QC G7H 2B1, Canada; [email protected] (K.B.);
[email protected] (J.M.)
2 LabMaTer, Sciences de la Terre, Université du Québec à Chicoutimi, 555 Boulevard de l’Université,
Chicoutimi, QC G7H 2B1, Canada; [email protected] (A.B.); [email protected] (L.P.B.)
* Correspondence: [email protected]

Abstract: A multitude of applications in engineering, ore processing, mineral exploration, and environmental science require grain recognition and the counting of minerals. Typically, this task is performed manually with the drawback of monopolizing both time and resources. Moreover, it requires highly trained personnel with a wealth of knowledge and equipment, such as scanning electron microscopes and optical microscopes. Advances in machine learning and deep learning make it possible to envision the automation of many complex tasks in various fields of science at an accuracy equal to human performance, thereby avoiding placing human resources into tedious and repetitive tasks, improving time efficiency, and lowering costs. Here, we develop deep-learning algorithms to automate the recognition of minerals directly from the grains captured from optical microscopes. Building upon our previous work and applying state-of-the-art technology, we modify a superpixel segmentation method to prepare data for the deep-learning algorithms. We compare two residual network architectures (ResNet 1 and ResNet 2) for the classification and identification processes. We achieve a validation accuracy of 90.5% using the ResNet 2 architecture with 47 layers. Our approach produces an effective application of deep learning to automate mineral recognition and counting from grains while also achieving a better recognition rate than reported thus far in the literature for this process and other well-known, deep-learning-based models, including AlexNet, GoogleNet, and LeNet.

Keywords: grain segmentation; deep learning; convolutional neural networks; ResNet; mineral recognition

Citation: Latif, G.; Bouchard, K.; Maitre, J.; Back, A.; Bédard, L.P. Deep-Learning-Based Automatic Mineral Grain Segmentation and Recognition. Minerals 2022, 12, 455. https://doi.org/10.3390/min12040455

Academic Editor: Amin Beiranvand Pour

Received: 21 February 2022; Accepted: 2 April 2022; Published: 7 April 2022

1. Introduction
The advent of machine learning and automated classification has demonstrated the potential of technology in many fields, such as medical/health, legal, transportation, and mining [1–6]. For example, in exploration geology and mining, the process of identifying economic minerals has always been done manually, where a specialized and trained individual (a mineralogist) is required to identify mineral grains, such as gold, diamond indicator minerals, or sulfides, to discover new deposits [7–9]. This manual process has many limitations, including errors in identification and mineralogist fatigue, and it is time-consuming and, hence, costly [10]. Moreover, trained mineralogists are able to count around 60 grains per minute without distractions to provide grain percentage rather than the more useful area percentage [11]. With new advances in technology, mineral grain identification and counting can now be performed using optical microscopy and scanning electron microscopy (SEM). However, even with SEM technology, the process remains expensive and time consuming. A scanning electron microscope costs between USD 0.5 and USD 2 million and requires highly qualified personnel to operate.

Nevertheless, the process of identifying and counting mineral grains in sands or
sediments is a crucial step for many mineral exploration and engineering projects, environ-
mental studies, and mining (extractive metallurgy); for example, minerals can be economic
(e.g., ore, building materials) or toxic (e.g., acid mine drainage production or release of toxic
elements such as lead or arsenic) [12,13]. The use of certain sands in building materials can
be a major problem, and identifying such grains is crucial in engineering projects [14,15].
In glacial sediments (tills) and soils, the abundance of certain mineral grains can
indicate the proximity of a potential deposit; in diamond exploration, for instance, certain
minerals, such as chromium-bearing pyrope or diopside, are used to confirm the presence
of proximal diamond deposits [16,17].
Machine learning offers an alternative to manual identification. Recent advances in
deep learning for image-based tasks offer the possibility of automating, at least partially,
grain identification and counting, saving time and money. Moreover, as opposed to relying
on SEM [18], the deep-learning-based approach can potentially be carried out in the field,
in remote areas where mineral potential is high. Such a method would allow a more rapid
identification of economic minerals or toxic minerals to allow effective environmental
surveys [19]. This automated approach could work in real time to sort minerals moving along a conveyor [20]. A robot with specialized tools and equipment could be used to capture images of grains as it explores the terrain [21]. If images are tagged with a location,
real-time processing is not obligatory, thus, simplifying the challenge of embedding a
deep-learning model in a remote and potentially smaller computer.
In this paper, we propose an automated machine-learning approach to classify grains
from a sample using optical microscopy that builds upon our previous work published
in [22]. With this approach, the task of mineral identification requires minimal human inter-
vention. The images of grains are collected using inexpensive photomicrographic systems
or through the use of robotic machines or automated microscopes. The images (photomi-
crographs) can then be processed to isolate the grain images within the complete image
and, thus, classify and count these grains. Our approach uses an improved superpixel
method that segments the grains quickly and automatically. To solve this task using deep learning, the segmentation must be very accurate for the model to automatically learn the features
representing each class. Then, the segmented grains can be used as an input into the trained
deep-learning model. Although deep learning frequently outperforms classical machine
learning, it is only recently that mineral identification has been investigated with deep learn-
ing. With the new segmentation and state-of-the-art deep-learning models, we can achieve
better results than observed in published classical machine-learning-based approaches.

2. Literature Review
Currently, there are mainly two distinct methods for grain recognition: traditional
engineering devices [23,24] and computational methods [25].

2.1. Traditional, Device-Based Methods


Traditional methods for classifying and counting mineral grains rely on the use of
SEM or optical microscopes. The use of an optical microscope is the most common method
for estimating mineral abundance in sediments or milled rock, although this requires
highly trained personnel to sort the mineral grains. Mineral sorting is possible using the
specific polarized transmitted and reflected light properties of minerals and the morpho-
logical properties of the grains. Advances in the use of optical microscopes have been
successfully applied to mineral grain analyses, although the main limitations discussed
above remain [26–29]. Significant improvement of this method will require a technical
breakthrough. Automated SEM provides an alternative means of counting minerals [11,30],
and the SEM-based approaches include QEMSCAN, TIMA-X, and MLA [31]. SEM uses
a focused electron beam to scan the material and generate an image of the grains. The
interaction of the electrons with atoms on the grain surface provides additional informa-
tion captured by the various sensors (e.g., X-ray fluorescence) to determine the chemical
composition of the mineral. SEM output includes the chemical composition with grain size,
shape, and proportion. Grain counting can be performed using an electron microprobe [32];
however, this method is time consuming [33].
In [34], the authors presented an image processing workflow for characterizing pore-
and grain-size distributions in porous geological samples using SEM images and X-ray
microcomputed tomography (µCT). Their samples included the Buff Berea, Berea, Nugget,
Bentheimer, and Castlegate sandstones and the carbonate Indiana Limestone. The 2D size distributions produced from the SEM appeared biased toward smaller sizes. In [35], the authors
developed a grain count technique using a laser particle counter sensor (Wenglor) to count
stainless-steel beads and sand grains of different size classes. They compared the count
with that obtained using high-speed cameras. They found that the Wenglor can count grain
sizes between 210 ± 3 µm and 495 ± 10 µm and that only grains passing through the center
of the beam were counted. In [36], the authors used a less expensive light microscope
able to produce images of grain shape profiles sufficient in quality for identification and
counting. Their key finding was that roundness, sphericity, circularity, ModRatio, and
aspect ratio were the key shape parameters for differentiating grains.

2.2. Computer Vision-Based Computational Methods


Computational or machine-learning methods are increasingly applied in a multitude
of spheres, including automated driving and navigation, automated image recognition,
automated medical diagnosis, and agricultural processes [37–39]. The ability to apply
machine-learning tools to a vast suite of applications also extends to the environmental and
geological sciences.
The integration of machine learning to automate the process of mineral grain recog-
nition was first explored by Maitre et al. [22]. The authors used linear iterative clustering
segmentation to generate superpixels, thereby, isolating individual grains. The applied
feature extraction method, using a series of classifiers, produced an 89% recognition rate.
In [25], mineral recognition relied on cluster analysis with a k-means algorithm that divided the data set into categories according to similarity, computed as a distance, e.g., the Euclidean distance. Baklanova and Shvets extracted the colors and textures of grains using a stereoscopic binocular microscope. However, the authors did not compare the clusters found against labeled clusters belonging to particular mineral species. In fact, their work was used only to classify rocks and not minerals and, thus, is only applicable to
petrography. Other methods of mineral classification, although limited to copper minerals,
have produced an acceptable, approximate 75% accuracy using laser-induced breakdown
spectroscopy (LIBS) analyzers [40]. In [41], the authors classified heavy minerals collected
from rivers. Using 3067 grains in 22 classes, they achieved 98.8% accuracy using 26 decision
attributes and a random forest algorithm.

3. Materials and Methods


Our approach consisted of four main stages (Figure 1). The first stage involved data
collection followed by preprocessing the original mosaic and SEM images to remove noise
and outlier objects. In the third stage, the grains were segmented by utilizing the contours
and superpixel-based techniques. We selected five classes for recognition on the basis of
classes with the greatest number of grains. In the final stage, we input the segmented grains
into various convolutional neural network (CNN) models.
Figure 1. Workflow of the grain recognition method used in this paper.

3.1. Data Set Acquisition
We collected 10 kg of till grains from the field, and sediments were sieved to less than 1 mm. The samples were then processed with a fluidized bed to obtain a superconcentrate of heavy minerals (approximately 100 mg) containing approximately 2 million grains smaller than 50 µm. The superconcentrate was sprinkled onto carbon tape to provide a black backdrop for the images. Images were then obtained using a camera mounted onto a binocular microscope, and we created a photomosaic. To acquire the groundtruthed data, i.e., mineral grain identities, we acquired a backscattered image of the grains using SEM with X-ray fluorescence [42]. The groundtruthed data were the mineral map and referenced with the RGB mosaic. The end result, after using the motorized conventional microscope and 6-megapixel camera, was an approximate 2 GB mosaic image (34,674 × 33,720 pixels) to be used as the data set for the machine-learning algorithm. We acquired 238 fields of view with a 10% overlap between adjacent fields in the images. Figure 2 shows a sample of the grains and the corresponding annotated SEM image.

Figure 2. Sample segment of the original image (a) and the SEM-based groundtruth image (b).

3.2. Data Preprocessing
The original image background contained outlier grains that are not part of the SEM annotated image; therefore, preprocessing, using various morphological operations, served to remove outlier particles. An outlier grain is a phantom image of a grain lying outside of the field of view. To reduce processing time, we cropped the original image to include only 1/3 of the original image by discarding 12,000 border pixels on all sides that did not contain grains. This new image was further divided into 5608 × 5608 equally sized subimages. We considered only five classes for classification because of unbalanced data and a low number of instances for some of the discarded classes.

The groundtruthed image was converted into a binary image, and morphological operations were applied, i.e., dilation, filling holes, and erosion, to remove the outlier grains, the background, and other noise. The largest filled segment of the SEM-based, labeled image was extracted by discarding all outlier grains and other noise. The erosion and dilation were based on the kernel size: erosion reduces the size of objects in the input image, and, similarly, dilation increases their size on the basis of the kernel size. We applied a kernel size of 7 × 7. The erosion and dilation for the binary image were calculated using Equations (1) and (2), respectively, where A represents the original binary image, and B represents the kernel. In Equation (1), Bz is the translation of B by vector z. Similarly, in Equation (2), Ab is the translation of A by vector b.

A ⊖ B = {z ∈ E | Bz ⊆ A}, (1)

A ⊕ B = ⋃b∈B Ab. (2)

Figure 3 shows the outcome of the different preprocessing steps and the mapping of the SEM ground truth image and the original image based on the processed SEM binary image.

Figure 3. Outcome of the preprocessing steps. Images show the original SEM image (a), the outliers removed using the morphological and contours-based method from the binary converted image (b), the preprocessed outcome SEM image after removing outliers (c), and the preprocessed outcome of the original image after removing outliers (d).

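As a minimal sketch of the preprocessing just described, the binarization and the morphological operations of Equations (1) and (2) can be written with OpenCV as follows (an illustration under our assumptions, not the authors' published code; the closing operation stands in for hole filling):

import cv2
import numpy as np

def clean_groundtruth(sem_gray: np.ndarray) -> np.ndarray:
    # Binarize the groundtruth image (Otsu selects the threshold).
    _, binary = cv2.threshold(sem_gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = np.ones((7, 7), np.uint8)  # 7 x 7 kernel, as in the text
    mask = cv2.dilate(binary, kernel)                       # Equation (2)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill holes
    mask = cv2.erode(mask, kernel)                          # Equation (1)
    return mask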
3.3. Grain Segmentation
We used superpixel segmentation to separate mineral grain data (see Algorithm 1). The image was first converted to binary, and morphological operations, erosion and dilation, were applied to the image to separate the grains from each other. To convert the image into binary, the image threshold was calculated using Otsu's method [43]. Using the resulting binary image, we calculated the total number of external, closed contours to represent the possible grains in the image. Contours are closed curves that are calculated using the edges of objects with the same values or pixel intensities. The contour count C then serves as a seed for the superpixel segmentation method rather than using a fixed number K as a seed. We applied Equation (3) to calculate the superpixel center grid interval S of approximately equal-sized superpixels for an input image of size N.

S = √(N/C). (3)

The superpixel segmentation method relies on oversegmenting the image while simultaneously decreasing the complexity of the image processing tasks. We applied a simple linear iterative clustering (SLIC) method to produce high-quality segmentation in a timely manner [44]. The method performs local k-means clustering of the image pixels using color similarity and proximity in the subimages. The method also uses the five-dimensional space provided by the labxy image plane, where l, a, and b are the pixel vector colors provided by the CIELAB color space, and the x and y values are the coordinates of the pixels, which represent the spatial distances. To merge the color proximity and spatial proximity distances, we normalized the distances using Equations (4) and (5). To use the labxy space to cluster the pixels, we required the distance measure D of Equation (6), which considers approximately equal-sized superpixels.

Dc = √((lm − ln)² + (am − an)² + (bm − bn)²). (4)

Ds = √((xm − xn)² + (ym − yn)²). (5)

D = √((Dc/Nc)² + (Ds/Ns)²). (6)
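A short NumPy sketch makes the seed interval of Equation (3) and the distance measure of Equations (4)–(6) concrete (our illustration, not the authors' code):

import numpy as np

def grid_interval(n_pixels: int, contour_count: int) -> float:
    # Equation (3): spacing S of superpixel centers for ~C superpixels.
    return np.sqrt(n_pixels / contour_count)

def slic_distance(m, n, nc: float, ns: float) -> float:
    # m and n are (l, a, b, x, y) points in the labxy space.
    lm, am, bm, xm, ym = m
    ln, an, bn, xn, yn = n
    dc = np.sqrt((lm - ln)**2 + (am - an)**2 + (bm - bn)**2)  # Equation (4)
    ds = np.sqrt((xm - xn)**2 + (ym - yn)**2)                 # Equation (5)
    return np.sqrt((dc / nc)**2 + (ds / ns)**2)               # Equation (6)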
The segmentation provided the xy coordinates of each superpixel. The method was further enhanced by increasing the contrast of the images to allow the discrimination of the grain borders. In Maitre et al. [22], the superpixel method was applied using a fixed-size input seed value for the superpixels. This approach worked well for the color feature-based method with classical machine-learning methods; however, this method did not rely on deep learning. Thus, we proposed to automate the calculation of the seed values in the segmentation method to prepare the data for deep-learning networks. The comparisons of the superpixel boundaries and the outcomes for the segmented grains for both methods are presented in Figures 4 and 5, respectively.

Figure 4. Visual comparison of the detected superpixel boundaries of grains using the method of Maitre et al. (a) and our approach (b) [22].

Figure 5. Samples of the outcome of the segmented grains using the method of Maitre et al. [22] (a) and our approach (b).
Algorithm 1: Segmentation and Annotation of Grains


Input: Grains mosaic image M with BSE groundtruthing image G and classes n = 5
Output: Segmented grains S with their annotation A
read M, read G
B ← binary (grayscale (M), Otsu)
Be ← erosion (B, ones (15, 15))
Bc ← find external contours (Be , chain approx simple)
Bcount ← length (Bc )
GrainsApprox ← Bcount × 2.5
Mhq ← histogram equalization (M)
S0 ← superpixel (Mhq , GrainsApprox, compactness = 20, sigma = 1)
D ← unique colors ( G )
c ← 0
for g in S0
    Sc ← g
    if (∑i=0..n ∑j=0..m [g(i, j) ≠ [0, 0, 0]]) ≥ (i × j)/1.5 and ≥ 25
        for d in D
            Md ← count (g(i, j) = d)
        end for
        Ac,1, Ac,2 ← Max1(Md), Max2(Md)
    else
        Ac,1, Ac,2 ← 0
    c ← c + 1
end for
Select n classes with maximum grain count
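For readers who prefer runnable code, a minimal Python sketch of the segmentation steps in Algorithm 1 could look as follows (assuming OpenCV and scikit-image; the constants are those of Algorithm 1, everything else is illustrative):

import cv2
import numpy as np
from skimage.exposure import equalize_hist
from skimage.segmentation import slic

def segment_mosaic(mosaic_bgr: np.ndarray) -> np.ndarray:
    # Binary image via Otsu, then a 15 x 15 erosion, as in Algorithm 1.
    gray = cv2.cvtColor(mosaic_bgr, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    eroded = cv2.erode(binary, np.ones((15, 15), np.uint8))
    # External contours approximate the number of grains (the seed C).
    contours, _ = cv2.findContours(eroded, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    grains_approx = int(len(contours) * 2.5)
    # Contrast enhancement, then SLIC seeded from the contour count.
    rgb = cv2.cvtColor(mosaic_bgr, cv2.COLOR_BGR2RGB)
    return slic(equalize_hist(rgb), n_segments=grains_approx,
                compactness=20, sigma=1)

Seeding SLIC from the contour count rather than a fixed K is the key difference from the fixed-seed approach of [22].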

3.4. Grain Class Annotation


We selected five main classes on the basis of the number of segmented grains for each
class and the group distribution of visually similar, rock-forming minerals (Table 1). We
selected six types of individual grain that were further mapped to five classes, including
the background class. These segmented images were labeled by mapping the original
subimages to the SEM-based subimages using the superpixel-based method. The bounding-
box method was then applied to extract the grains that had a rectangular format. The
grains with a height:width ratio greater than 1.75 were discarded. A total of 21,091 images
were segmented.
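A minimal sketch of the bounding-box extraction and the ratio filter follows (illustrative only; labels denotes the superpixel label image from the segmentation stage, and the ratio test is applied symmetrically here):

import numpy as np

def extract_grains(rgb: np.ndarray, labels: np.ndarray, max_ratio: float = 1.75):
    grains = []
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        h = ys.max() - ys.min() + 1
        w = xs.max() - xs.min() + 1
        if max(h, w) / min(h, w) > max_ratio:  # discard elongated boxes
            continue
        grains.append(rgb[ys.min():ys.max() + 1, xs.min():xs.max() + 1])
    return grains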

Table 1. Summary of the selected grain classes.

Class Label   Primary Grain Type   Secondary Grain Type     Number of Grains
C1            Albite               None                     6879 images
              Quartz               None
              Quartz               Albite
              Albite               Quartz
              Albite               Any class > 256 pixels
              Quartz               Any class > 256 pixels
C2            Augite               None                     3295 images
              Tschermakite         Any class > 256 pixels
              Tschermakite         Augite
              Augite               Tschermakite
              Augite               Any class > 256 pixels
C3            Magnetite            Any class > 256 pixels   3823 images
              Magnetite            None
C4            Hypersthene          Any class > 256 pixels   988 images
              Hypersthene          None
C5            Background           -                        6106 images

The final data set consisted of 21,091 images divided into five classes. Albite grain and quartz grain images were merged into one class because they are visually similar, rock-forming minerals. The sample images of albite grain and quartz grain are shown in Figure 6, which clearly indicates their visual similarity. Augite grain and tschermakite grain images were also merged into one class due to their visual similarity, as the samples shown in Figure 7 illustrate. The background class contained images which were either entirely black or contained very small grains (the total number of nonblack pixels was less than 256) or contained noise in the background. Figure 8 shows sample images of the background class. For the experiments, the images of these five classes were divided into 20% for testing, and the remaining 80% was divided again into an 80%/20% training/validation split.

Albite, None    Albite, Quartz    Quartz, Albite    Quartz, None

Figure 6. Sample images of the albite and quartz mineral grains.

Augite, None    Augite, Tschermakite    Tschermakite, Augite    Tschermakite, None

Figure 7. Sample images of the augite and tschermakite mineral grains.
Figure 8. Sample images of the background class.

3.5. ResNet Models for Grain Recognition
With the growing difficulties in the functions of computer vision and artificial intelligence, deep neural network models are becoming increasingly complex. Such strong models demand more data for learning to prevent overfitting. Recent deep-learning methods have been successfully applied to artificial intelligence [45,46]. Interest in convolutional neural networks (CNN) began in 2012 with AlexNet, which was based on LeNet. New CNN-based models have since been developed, including GoogleNet and residual neural networks (ResNet) [47–49]. CNN's major advantage is its ability to learn the critical features best representing the data without any human intervention.

ResNet overcomes model complexity and the vanishing gradient problems to produce satisfactory accuracies by training deeper networks [50]. Each ResNet block comprises four layers. The weight layer is expressed as (Zn+1 = Wn+1 Xn + Yn+1). The ReLU layer, a nonlinear layer, is expressed as (Xn+1 = H(Zn+1)), and a third layer is a weight layer (Zn+2 = Wn+2 Xn+1 + Yn+2). Xn is the input to the three layers combined, and F(Xn) is produced in the output. All these variables are matrices, and the subscripts are used to denote the layer numbers. In ResNet, a skip or shortcut link is used to bypass the three layers and pass Xn to an adder. Thus, the fourth layer, ReLU, is applied to produce Xn+2 = H(Zn+2 + Xn). With this skip, the residual F(Xn) = Zn+2 is added to Xn before passing through the second ReLU layer to generate Xn+2.

A skip, or shortcut, connection is a term used to refer to the Xn input to the adder. Because X is passed from one layer to another, the shortcut connection permits the residual network to set F(X) = 0, thus allowing a simple task to be performed by X. If this shortcut connection is absent, then the network needs to learn that the weights layer is equivalent to the identity matrix multiplied by X, which adds more complexity to the task. In cases where X is not required to pass through layers, the network generates F(X) normally, as is achieved when backpropagation is used. In this case, it is easier to train F(X) to be the residual D(X) − X, which results in the desired output of D(X) when added to X using the shortcut connection. Because the shortcut connection does not require weights, the gradient values remain unchanged, thus overcoming the vanishing gradient problem.

Building a sequence of ResNet blocks produces a ResNet architecture with deeper networks, low training errors, and excellent accuracies. The ResNet blocks might require pooling layers when convolution or weight layers generate F(X) matrices of a different size than the original X matrix. The pooling adds X to F(X), which resizes X to match the size of the F(X) matrix. This can be achieved by adding (W·X) to F(X). W, in this case, is a matrix zero-padded in both the rows and columns missing from the original X.

3.5.1. ResNet Version 1
We used two ResNet architectures, referred to as "ResNet 1" and "ResNet 2". Figure 9 details the design of the ResNet 1 architecture at the block level. No overfitting is present in the ResNet architecture because no additional parameters are introduced. This implies that ResNet is an efficient deep-learning network even for hundreds of network layers. In ResNet 1, a convolutional layer splits the feature map into two at the beginning, and the filter size is doubled to map the convolutional layer, batch layer, and ReLU layer to 32 × 32 × 16, 16 × 16 × 32, and 8 × 8 × 64, respectively, on the basis of the i and j values, where i represents how many times the filter size must be doubled, and j represents the number of ResNet block iterations on the basis of N. The deep-network performance is enhanced by adjusting the input layer using the batch normalization block. ResNet 1 has an input image dimension of 48 × 48 × 3, with each layer in the architecture consisting of a convolutional layer, a batch normalization layer, and a rectified linear unit (ReLU).

Figure 9. ResNet version 1 architecture for mineral recognition.

3.5.2. ResNet Version 2
The ResNet 2 architecture at the block level is detailed in Figure 10, and the filter size for each step is calculated using the flowchart in Figure 11. As for ResNet 1, the feature maps are initially split into two, and the filter maps are doubled. A bottleneck connection is introduced in ResNet 2 with the filter size calculated as shown in Figure 11. In addition, the block size of the skip connection is tripled. The three layers that exist within a residual function block are the convolutional layers sized [1 × 1], [3 × 3], and [1 × 1], in which the block is repeated. The increase and decrease of input dimensions are performed using the 1 × 1 layers, and the 3 × 3 layer is the bottleneck with reduced dimensions. The stages of ResNet 2 include a convolutional layer 32 × 32 × 16 in step 1, which produces an output of size 32 × 32 × 64. Step 2 produces a 16 × 16 × 128 output, and step 3 produces an 8 × 8 × 256 output size. These ResNet 2 outputs are based on the i and j values, where i represents how many times the filter size must be doubled, and j represents the number of ResNet block iterations on the basis of N.

Figure 10. ResNet version 2 architecture for mineral recognition.

Figure 11. ResNet version 2 calculation of filter size for each step.

Note that, in both ResNet 1 and ResNet 2, after the initial concatenation of the blocks in the sequence weights → batch normalization → ReLU, the concatenated sequenced block is repeated. The main differences between the two architectures are the following. The sequence that follows the initial weight, batch normalization, and activation block differs between the architectures: for ResNet 1, the following sequence is convolutional block → batch normalization block → activation block, whereas, in ResNet 2, the sequence is batch normalization block → activation block → convolutional block. Postactivation is thus supported in ResNet 1, and preactivation is supported in ResNet 2. In ResNet 1, the second ReLU nonlinearity is added after adding F(X) to X. In ResNet 2, the last ReLU nonlinearity is deleted, thus allowing the output of the addition of the residual mapping and identity mapping to be passed with no changes to the consecutive block. In addition, the gradient value at the output layer is passed back during backpropagation, as is the input layer, thus overcoming the vanishing gradient problem in deep-learning networks that have hundreds or thousands of layers, thereby improving their performance and limiting/reducing the associated training errors.

For both ResNet models, we used experimentation to fine-tune the hyperparameters. The final hyperparameter settings were the ReLU activation function, a learning rate of 0.001, 50 epochs, and a batch size of 20. These hyperparameters produced the experimental results discussed in Section 4.
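The two block orderings can be contrasted in a short TensorFlow/Keras sketch (our illustration of the post- versus preactivation orderings; the paper does not publish its implementation, and the two-convolution residual function below simplifies ResNet 2's 1 × 1/3 × 3/1 × 1 bottleneck):

import tensorflow as tf
from tensorflow.keras import layers

def block_v1(x, filters):
    # ResNet 1 style: conv -> batch norm -> ReLU (postactivation).
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.add([y, x])    # add F(X) to X ...
    return layers.ReLU()(y)   # ... then apply the second ReLU

def block_v2(x, filters):
    # ResNet 2 style: batch norm -> ReLU -> conv (preactivation).
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.add([y, x])  # no ReLU after the addition

Removing the ReLU after the addition in block_v2 is what lets the identity signal and its gradient pass through the block unchanged.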

4. Experimental Results
The experimental setup included the use of a high-performance computing machine holding 256 GB of memory with a graphical processing unit (GPU), an Nvidia Tesla V100 with 5120 CUDA cores. We applied Python 3.8 for the programming of all phases, including
the preprocessing, classification, and identification. The data set was split so that 80% was
used for training and the remaining 20% was available for testing. Note, however, that the
80% training portion was actually divided again into an 80% training and 20% validation
split. We tested variable epoch sizes, and the ideal epoch size was chosen to ensure that the
system avoided over- and underfitting. We tested various parameter settings for ResNet
1 and ResNet 2 to obtain the optimal results and evaluated their performance against the
better-known deep-learning approaches of LeNet, AlexNet, and GoogleNet.
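The nested split can be reproduced, for example, with scikit-learn (a sketch under the stated proportions; the stratification and random seed are our assumptions, not stated in the paper):

from sklearn.model_selection import train_test_split

def split_dataset(X, y, seed=0):
    # 80% for training/validation, 20% held out for testing.
    X_tv, X_test, y_tv, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed)
    # The 80% portion is split again 80%/20% into training/validation.
    X_train, X_val, y_train, y_val = train_test_split(
        X_tv, y_tv, test_size=0.20, stratify=y_tv, random_state=seed)
    return X_train, X_val, X_test, y_train, y_val, y_test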
ResNet 1 and ResNet 2 achieved higher validation accuracies than LeNet, AlexNet,
and GoogleNet (Table 2). The validation accuracy of ResNet 2 was slightly higher than
for ResNet 1. We obtained these scores by applying the segmentation methods presented
in [22]. In the latter paper, the authors achieved a global accuracy of 89% using a random forest (RF) classifier; however, their segmented data were not effective when deep-learning algorithms were applied.

Table 2. Results using the mineral segmentation of [22] with different CNN models.

CNN Model      Training Loss   Validation Loss   Training Accuracy (%)   Validation Accuracy (%)
LeNet          1.1329          1.5627            66.67                   39.29
AlexNet        0.3917          2.8706            88.89                   39.88
GoogleNet      0.9911          1.4571            83.33                   43.37
ResNet 1 (32)  1.0715          1.3784            72.29                   45.61
ResNet 2 (47)  1.0263          1.3269            76.94                   49.23

We used superpixel segmentation combined with the proposed ResNet architectures to produce much higher validation accuracies than those achieved in [22]. LeNet, AlexNet, and GoogleNet produced validation accuracies ranging from 74.4% to 86.3%, with the highest accuracy achieved by AlexNet, as shown in Table 3. However, the proposed ResNet 1 and ResNet 2 achieved higher validation accuracies of 89.8% and 90.6%, respectively. Notice that, compared to the highest validation accuracy achieved in [22] with deep learning, which was 49%, our proposed method represents a relative increase of 84.69%, which is a significant improvement by all measures. The highest achieved validation accuracy of 90.5%, produced by the ResNet 2 architecture with 47 layers, sets a new benchmark for researchers in the field of grain recognition. It is also an improvement of 1.69% over the accuracy achieved in [22] using an RF classifier.
We varied the number of layers for ResNet 1 and ResNet 2 to determine the best
parameters for achieving the highest accuracy. The best accuracy for ResNet 1 was achieved
using 74 layers (Table 4); however, although there was a slight improvement going from 32
to 74 layers, training times increased markedly for 74 layers. Hence, ResNet 1 with 32 layers
was the chosen architecture for this application. For ResNet 2, we found the highest
validation accuracy using 47 layers, accompanied by a reasonable training time. Although
the training time almost doubled between 29 and 47 layers, the increased validation
accuracy justified using the 47 layers for this application.

Table 3. Results using the proposed mineral segmentation with different CNN models.

CNN Model      Training Loss   Validation Loss   Training Accuracy (%)   Validation Accuracy (%)
LeNet          1.063           0.6374            61.60                   74.43
AlexNet        0.3425          0.3847            90.00                   86.30
GoogleNet      0.7875          0.626             72.40                   76.23
ResNet 1 (32)  0.3418          0.3668            90.40                   89.80
ResNet 2 (47)  0.3523          0.3621            90.40                   90.56

Table 4. Comparison of results using the proposed mineral segmentation with different ResNet models and varying numbers of layers.

Model     # of Layers   Training Loss   Validation Loss   Training Accuracy (%)   Validation Accuracy (%)   Training Time (h)   Validation Time (h)
ResNet 1  20            0.3219          0.3579            90.76                   89.77                     75.00               0.18
ResNet 1  32            0.3418          0.3668            90.40                   89.80                     133.76              0.30
ResNet 1  74            0.3586          0.3771            90.62                   89.88                     278.83              0.55
ResNet 2  29            0.3491          0.3770            90.38                   89.86                     173.74              0.29
ResNet 2  47            0.3523          0.3621            90.40                   90.56                     291.30              0.55
ResNet 2  110           0.3738          0.3895            90.07                   90.05                     671.26              0.96

We compared the various ResNet-model–layer combinations in terms of training accuracy (Figure 12), validation accuracy (Figure 13), training loss (Figure 14), and validation loss (Figure 15). A consistent pattern emerged of ResNet 1 (32 layers) and ResNet 2 (47 layers) being the best models of the series.

Figure 12. Training accuracy curves for the various ResNet models.

Figure 13. Validation accuracy curves for the various ResNet models.

Figure 14. Training loss curves for the various ResNet models.

Figure 15. Validation loss curves for the various ResNet models.

The confusion matrices in Figure 16 show the comparison of each class's accuracy for the best proposed model (ResNet version 2 with 47 layers). The left confusion matrix shows the percentage accuracies for each class, and the right confusion matrix shows the correctly classified grain images for each class. The results in the confusion matrices indicate that the classes C1 and C5 achieved higher accuracies as they had more grain images for the training.

Figure 16. Confusion matrix for the best ResNet version 2 (47 layers) model. Confusion matrix of
percentage accuracies for each class (left) and confusion matrix of correctly classified images (right).
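The two views in Figure 16 (row-normalized percentages and raw counts) can be reproduced from predicted labels in a few lines of Python. This is a hedged sketch only; the function and variable names are illustrative assumptions, not the study's code:

from sklearn.metrics import confusion_matrix

def confusion_views(y_true, y_pred, labels):
    # Raw counts (right panel): rows are true classes, columns are predictions.
    counts = confusion_matrix(y_true, y_pred, labels=labels)
    # Row-normalized percentages (left panel): each true class sums to 100%.
    percents = 100.0 * counts / counts.sum(axis=1, keepdims=True)
    return percents, counts

# Hypothetical usage, e.g. for six classes labeled 0..5:
# percents, counts = confusion_views(y_val, y_val_pred, labels=list(range(6)))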

When we compared our ResNet 2 model (47 layers) with techniques published in the
recent literature, using each published method on our grain data set, we observed that
the superpixel-based grain segmentation and the ResNet 2 (47 layers) clearly outperformed
the existing techniques and achieved the highest accuracy values (Table 5).

Table 5. Comparison of the proposed method with existing methods.

Reference                     Methodology                                              Accuracy (%)
This paper                    Modified superpixel grains with ResNet 2 (47 layers)    90.56
Maitre et al. (2019) [22]     Superpixel color features with random forests           89.00
Maitre et al. (2019) [22]     Superpixel segmented grains with CNN                    49.23
Sinaice et al. (2021) [50]    Neighborhood component analysis and cubic SVM           65.75
Sinaice et al. (2021) [50]    Neighborhood component analysis and quadratic SVM       39.72

5. Discussion and Conclusions


We presented two improved residual network architectures to automate the detection
and counting of individual mineral grains. These algorithms, ResNet 1 and ResNet 2,
are modified versions of ResNet. We adopted the superpixel segmentation method and
applied preprocessing techniques to provide the seed for the segmentation method, which
made the data more appropriate for deep-learning algorithms. The ResNet 2 architecture
with 47 layers produced the highest validation accuracy of 90.5%. To our knowledge,
this is the highest reported accuracy achieved using deep-learning networks for this
particular application.
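As an illustration of the superpixel seeding step, the following sketch uses the generic SLIC algorithm from scikit-image. The study modifies the superpixel method, so this stands in only as a simplified approximation under that caveat, and the file name is hypothetical:

from skimage import io, segmentation

image = io.imread("grain_mosaic.png")  # hypothetical sand-grain image
# Over-segment the image into superpixels; each labeled region approximates
# one grain and can be cropped into a small image for the CNN classifier.
labels = segmentation.slic(image, n_segments=500, compactness=10, start_label=1)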
Few papers explore the use of machine-learning techniques and deep-learning
algorithms for the automatic recognition, classification, and counting of grain minerals;
however, the existing approaches offer benchmarks against which we can compare our
results. Our ResNet 1 and ResNet 2 outperformed the deep-learning algorithms LeNet,
AlexNet, and GoogleNet in this automatic grain detection and counting application.
Despite these very encouraging results, improvements must be made before our
deep-learning techniques can be applied in the field. The data set must be enhanced to
eliminate problems of mislabeling, unbalanced data, and fusion. Moreover, the developed
approach is limited by:
a. The scarcity of mineral data sets. A key contribution of this work is the development
of such a data set, because data sets are not readily available for grain mineral classification.
b. Unbalanced data for different classes. In the developed data set, an unequal number
of images was available for each class (a possible mitigation is sketched after this list).
c. The need for high-performance GPUs during training. We had access to a GPU system;
even so, the training step required a considerable amount of time.
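One common mitigation for the class imbalance in (b) is to weight the loss by inverse class frequency. The sketch below is a hedged illustration only; the paper does not state its balancing strategy, and the class counts are invented:

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Invented per-class counts for illustration (0, 1, 2 stand for grain classes).
y = np.array([0] * 900 + [1] * 150 + [2] * 700)
classes = np.unique(y)
weights = compute_class_weight("balanced", classes=classes, y=y)
class_weight = dict(zip(classes, weights))
# e.g., in Keras: model.fit(..., class_weight=class_weight)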
Future work will include developing data sets for grain mineral recognition and
enhancing new and current methods to achieve a higher recognition rate with more
mineral classes. These advances will include applying various image fusion and
registration techniques to greatly improve the mapping of the original images onto the
labeled images. We will also explore other segmentation techniques that may enhance
accuracy, including region-growing methods, fuzzy C-means, and deep-learning-based
segmentation.

Author Contributions: G.L.: Planning, methodology, analysis, experiments, initial draft writing.
K.B.: Supervision, methodology, original draft writing and revision. J.M.: Methodology, review and
editing. A.B.: Data collection. L.P.B.: Funding procurement, supervision, review and editing. All
authors have read and agreed to the published version of the manuscript.
Funding: This research is funded by a Fonds de Recherche du Québec—Nature et Technologies (FRQ-
NT) grant to L.P.B. (Programme de recherche en partenariat sur le développement durable du secteur
minier-II, grant number: 2020-MN-283346) with contributions from IOS Services Géoscientifiques Inc.
Data Availability Statement: The data presented in this study can be requested from
[email protected].
Acknowledgments: We are thankful to IOS Services Géoscientifiques Inc. for providing
technical support.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Jung, D.; Choi, Y. Systematic Review of Machine Learning Applications in Mining: Exploration, Exploitation, and Reclamation.
Minerals 2021, 11, 148. [CrossRef]
2. Sengupta, S.; Dave, V. Predicting applicable law sections from judicial case reports using legislative text analysis with machine
learning. J. Comput. Soc. Sci. 2021, 1–14. [CrossRef]
3. Zantalis, F.; Koulouras, G.; Karabetsos, S.; Kandris, D. A Review of Machine Learning and IoT in Smart Transportation.
Future Internet 2019, 11, 94. [CrossRef]
4. Latif, G.; Shankar, A.; Alghazo, J.; Kalyanasundaram, V.; Boopathi, C.S.; Jaffar, M.A. I-CARES: Advancing health diagnosis and
medication through IoT. Wirel. Netw. 2019, 26, 2375–2389. [CrossRef]
5. Ali, D.; Frimpong, S. Artificial intelligence, machine learning and process automation: Existing knowledge frontier and way
forward for mining sector. Artif. Intell. Rev. 2020, 53, 6025–6042. [CrossRef]
6. Chow, B.H.Y.; Reyes-Aldasoro, C.C. Automatic Gemstone Classification Using Computer Vision. Minerals 2021, 12, 60. [CrossRef]
7. Girard, R.; Tremblay, J.; Néron, A.; Longuépée, H. Automated Gold Grain Counting. Part 1: Why Counts Matter! Minerals 2021,
11, 337. [CrossRef]
8. Boivin, J.-F.; Bédard, L.P.; Longuépée, H. Counting a pot of gold: A till golden standard (AuGC-1). J. Geochem. Explor. 2021,
229, 106821. [CrossRef]
9. Plouffe, A.; McClenaghan, M.B.; Paulen, R.C.; McMartin, I.; Campbell, J.E.; Spirito, W.A. Processing of glacial sediments for
the recovery of indicator minerals: Protocols used at the Geological Survey of Canada. Geochem. Explor. Environ. Anal. 2013,
13, 303–316. [CrossRef]
10. Xu, C.S.; Hayworth, K.J.; Lu, Z.; Grob, P.; Hassan, A.M.; García-Cerdán, J.G.; Niyogi, K.K.; Nogales, E.; Weinberg, R.J.; Hess, H.F.
Enhanced FIB-SEM systems for large-volume 3D imaging. eLife 2017, 6, e25916. [CrossRef]
11. Nie, J.; Peng, W. Automated SEM–EDS heavy mineral analysis reveals no provenance shift between glacial loess and interglacial
paleosol on the Chinese Loess Plateau. Aeolian Res. 2014, 13, 71–75. [CrossRef]
12. Akcil, A.; Koldas, S. Acid Mine Drainage (AMD): Causes, treatment and case studies. J. Clean. Prod. 2006, 14, 1139–1145.
[CrossRef]
13. Hudson-Edwards, K.A. Sources, mineralogy, chemistry and fate of heavy metal-bearing particles in mining-affected river systems.
Miner. Mag. 2003, 67, 205–217. [CrossRef]
14. Hobbs, D.W. 4 Structural Effects and Implications and Repair. In Alkali-Silica Reaction in Concrete; Thomas Telford Publishing:
London, UK, 1988; pp. 73–87. [CrossRef]
15. Lawrence, P.; Cyr, M.; Ringot, E. Mineral admixtures in mortars effect of type, amount and fineness of fine constituents on
compressive strength. Cem. Concr. Res. 2005, 35, 1092–1105. [CrossRef]
16. Erlich, E.I.; Hausel, W.D. Diamond Deposits: Origin, Exploration, and History of Discovery; SME: Littleton, CO, USA, 2003.
17. Towie, N.J.; Seet, L.H. Diamond laboratory techniques. J. Geochem. Explor. 1995, 53, 205–212. [CrossRef]
18. Chen, Z.; Liu, X.; Yang, J.; Little, E.C.; Zhou, Y. Deep learning-based method for SEM image segmentation in mineral
characterization, an example from Duvernay Shale samples in Western Canada Sedimentary Basin. Comput. Geosci. 2020, 138, 104450.
[CrossRef]
19. Hyder, Z.; Siau, K.; Nah, F. Artificial Intelligence, Machine Learning, and Autonomous Technologies in Mining Industry.
J. Database Manag. 2019, 30, 67–79. [CrossRef]
20. Dalm, M.; Buxton, M.W.; van Ruitenbeek, F.; Voncken, J.H. Application of near-infrared spectroscopy to sensor based sorting of a
porphyry copper ore. Miner. Eng. 2014, 58, 7–16. [CrossRef]
21. McCoy, J.; Auret, L. Machine learning applications in minerals processing: A review. Miner. Eng. 2018, 132, 95–109. [CrossRef]
22. Maitre, J.; Bouchard, K.; Bedard, L. Mineral grains recognition using computer vision and machine learning. Comput. Geosci. 2019,
130, 84–93. [CrossRef]
23. Makvandi, S.; Pagé, P.; Tremblay, J.; Girard, R. Exploration for Platinum-Group Minerals in Till: A New Approach to the Recovery,
Counting, Mineral Identification and Chemical Characterization. Minerals 2021, 11, 264. [CrossRef]
24. Kim, C.S. Characterization and speciation of mercury-bearing mine wastes using X-ray absorption spectroscopy. Sci. Total Environ.
2000, 261, 157–168. [CrossRef]
25. Baklanova, O.; Shvets, O. Cluster analysis methods for recognition of mineral rocks in the mining industry. In Proceedings of the
2014 4th International Conference on Image Processing Theory, Tools and Applications (IPTA), Paris, France, 14–17 October 2014.
[CrossRef]
26. Iglesias, J.C.A.; Gomes, O.D.F.M.; Paciornik, S. Automatic recognition of hematite grains under polarized reflected light
microscopy through image analysis. Miner. Eng. 2011, 24, 1264–1270. [CrossRef]
27. Gomes, O.D.F.M.; Iglesias, J.C.A.; Paciornik, S.; Vieira, M.B. Classification of hematite types in iron ores through circularly
polarized light microscopy and image analysis. Miner. Eng. 2013, 52, 191–197. [CrossRef]
28. Figueroa, G.; Moeller, K.; Buhot, M.; Gloy, G.; Haberla, D. Advanced Discrimination of Hematite and Magnetite by Automated
Mineralogy. In Proceedings of the 10th International Congress for Applied Mineralogy (ICAM), Trondheim, Norway,
1–5 August 2011; Springer: Berlin/Heidelberg, Germany, 2012; pp. 197–204.
29. Iglesias, J.C.; Santos, R.B.M.; Paciornik, S. Deep learning discrimination of quartz and resin in optical microscopy images of
minerals. Miner. Eng. 2019, 138, 79–85. [CrossRef]
30. Philander, C.; Rozendaal, A. The application of a novel geometallurgical template model to characterise the Namakwa Sands
heavy mineral deposit, West Coast of South Africa. Miner. Eng. 2013, 52, 82–94. [CrossRef]
31. Sylvester, P.J. Use of the Mineral Liberation Analyzer (MLA) for Mineralogical Studies of Sediments and Sedimentary Rocks; Mineralogical
Association of Canada: Quebec City, QC, USA, 2012; Volume 1, pp. 1–16.
32. Goldstein, J. Practical Scanning Electron Microscopy: Electron and Ion Microprobe Analysis; Springer Science & Business Media:
Berlin/Heidelberg, Germany, 2012.
33. Potts, P.J.; Bowles, J.F.; Reed, S.J.; Cave, R. Microprobe Techniques in the Earth Sciences; Springer Science & Business Media:
Berlin/Heidelberg, Germany, 2012.
34. Safari, H.; Balcom, B.J.; Afrough, A. Characterization of pore and grain size distributions in porous geological samples—An
image processing workflow. Comput. Geosci. 2021, 156, 104895. [CrossRef]
35. Duarte-Campos, L.; Wijnberg, K.M.; Gálvez, L.O.; Hulscher, S.J. Laser particle counter validation for aeolian sand transport
measurements using a high-speed camera. Aeolian Res. 2017, 25, 37–44. [CrossRef]
36. Cox, M.R.; Budhu, M. A practical approach to grain shape quantification. Eng. Geol. 2008, 96, 1–16. [CrossRef]
37. Latif, G.; Iskandar, D.A.; Alghazo, J.; Butt, M.M. Brain MR Image Classification for Glioma Tumor detection using Deep
Convolutional Neural Network Features. Curr. Med. Imaging 2021, 17, 56–63. [CrossRef]
38. Alghazo, J.; Latif, G.; Elhassan, A.; Alzubaidi, L.; Al-Hmouz, A.; Al-Hmouz, R. An Online Numeral Recognition System Using
Improved Structural Features—A Unified Method for Handwritten Arabic and Persian Numerals. J. Telecommun. Electron.
Comput. Eng. 2017, 9, 33–40.
39. Wang, Y.; Balmos, A.D.; Layton, A.W.; Noel, S.; Ault, A.; Krogmeier, J.V.; Buckmaster, D.R. An Open-Source Infrastructure for
Real-Time Automatic Agricultural Machine Data Processing; American Society of Agricultural and Biological Engineers: St. Joseph,
MI, USA, 2017.
40. Wójcik, M.; Brinkmann, P.; Zdunek, R.; Riebe, D.; Beitz, T.; Merk, S.; Cieślik, K.; Mory, D.; Antończak, A. Classification of Copper
Minerals by Handheld Laser-Induced Breakdown Spectroscopy and Nonnegative Tensor Factorisation. Sensors 2020, 20, 5152.
[CrossRef] [PubMed]
41. Hao, H.; Guo, R.; Gu, Q.; Hu, X. Machine learning application to automatically classify heavy minerals in river sand by using
SEM/EDS data. Miner. Eng. 2019, 143, 105899. [CrossRef]
42. Vos, K.; Vandenberghe, N.; Elsen, J. Surface textural analysis of quartz grains by scanning electron microscopy (SEM): From
sample preparation to environmental interpretation. Earth-Sci. Rev. 2014, 128, 93–104. [CrossRef]
43. Sundaresan, V.; Zamboni, G.; Le Heron, C.; Rothwell, P.M.; Husain, M.; Battaglini, M.; De Stefano, N.; Jenkinson, M.; Griffanti, L.
Automated lesion segmentation with BIANCA: Impact of population-level features, classification algorithm and locally adaptive
thresholding. NeuroImage 2019, 202, 116056. [CrossRef] [PubMed]
44. Li, Z.; Chen, J. Superpixel Segmentation using Linear Spectral Clustering. In Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1356–1363.
45. Hechler, E.; Oberhofer, M.; Schaeck, T. Deploying AI in the Enterprise IT Approaches for Design, DevOps, Governance, Change
Management, Blockchain, and Quantum Computing; Springer: Berkeley, CA, USA, 2020.
46. Alghmgham, D.A.; Latif, G.; Alghazo, J.; Alzubaidi, L. Autonomous Traffic Sign (ATSR) Detection and Recognition using Deep
CNN. Procedia Comput. Sci. 2019, 163, 266–274. [CrossRef]
47. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2012,
60, 84–90. [CrossRef]
48. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with
Convolutions. arXiv 2014, arXiv:1409.4842v1.
49. Wu, Z.; Shen, C.; Van Den Hengel, A. Wider or deeper: Revisiting the resnet model for visual recognition. Pattern Recognit. 2019,
90, 119–133. [CrossRef]
50. Sinaice, B.; Owada, N.; Saadat, M.; Toriya, H.; Inagaki, F.; Bagai, Z.; Kawamura, Y. Coupling NCA Dimensionality Reduction with
Machine Learning in Multispectral Rock Classification Problems. Minerals 2021, 11, 846. [CrossRef]
