Malaria Detection Using Image Processing and Machine Learning

Suman Kunwar*, Manchana Shrestha**, Rojesh Man Shikhrakar***

* Universitat Autònoma de Barcelona, Email: [email protected]
** Pokhara University, Email: [email protected]
*** Jawaharlal Nehru Technological University, Email: [email protected]

Abstract- Malaria is a mosquito-borne blood disease caused by parasites of the genus Plasmodium. The conventional diagnostic tool for malaria is the examination of a patient's stained blood cells under a microscope. The blood to be tested is placed on a slide and observed under a microscope to count the number of infected RBCs. An expert technician examines the slide with intense visual and mental concentration, which is a tiresome and time-consuming process. In this paper, we construct a new image processing system for the detection and quantification of Plasmodium parasites in blood smear slides. We then develop a machine learning algorithm to learn, detect and determine the types of infected cells according to their features.

Index Terms- Malaria, Plasmodium, Edge Mask, RBC, Watershed Segmentation, Support Vector Machines

I. INTRODUCTION
Malaria is a mosquito-borne, life-threatening disease caused by the Plasmodium parasite. Globally, an estimated 3.2 billion people are at high risk (>1 in 1000 chance of getting malaria in a year). According to the World Malaria Report 2016, there were 212 million new cases of malaria worldwide in 2015 (range 148–304 million). The WHO African Region accounted for most global cases of malaria (90%), followed by the South-East Asia Region (7%) and the Eastern Mediterranean Region (2%) [1]. Visual detection and recognition of Plasmodium in RBCs is possible via a chemical staining process [2]. Staining lightly colors the RBCs but highlights Plasmodium, WBCs and platelets. Detecting Plasmodium therefore requires detecting the stained objects. However, the stained objects must be analyzed further to determine whether they are parasites, in order to prevent false diagnosis. Several methods exist for malaria detection.
The malaria parasite (MP) in a blood sample can be identified using image segmentation and feature extraction with a minimum distance classifier [3]. Image segmentation is carried out through image acquisition, image preprocessing, image smoothing, thresholding and dilation. Feature extraction uses two phases in the architectural model, 1) a training phase and 2) a recognition phase, which help to recognize the MP.
In this work, we focus on 1) automated detection and quantification of malaria infection, 2) a strategy to determine infected images using machine learning, and 3) a discussion of how to improve the predictive value for the detection of infected cells.

II. RELATED WORK
The traditional method of detecting malaria uses a microscope, which is time-consuming, difficult, and requires considerable expertise from the laboratory technician. People who are bitten by a female Anopheles mosquito infected with P. falciparum are most at risk of dying from malaria. Most research has found that a person with little or no immunity to malaria, such as a young child, a pregnant woman, or a traveler coming from an area with no malaria, is most likely to become sick or die. Poor people living in rural areas who lack access to health care are at greater risk of the disease [7].
Most methodologies for detecting malaria rely on two criteria: (i) images acquired under well-controlled conditions; (ii) the availability of proper microscope equipment. Both criteria are difficult to meet in malaria-endemic areas, where this type of equipment is scarce or non-existent in health care facilities. L. Rosado and his team therefore proposed a different methodology for image processing of malaria-infected thick blood smears, using images acquired exclusively with low-cost and accessible tools such as smartphones. The methodology was divided into three main blocks: Optical Circle Detection, WBC Detection and Trophozoites Detection. It used two different smartphones, the HTC One S and the LG Nexus 5, with image resolutions ranging from 1456 × 2592 to 1944 × 2592 pixels [8]. The method proposed by L. Rosado and his team represents only one component of a mobile-based framework for malaria parasite detection. It does not identify and count all possible species-stage combinations of MP that can potentially infect humans.
N. A. Khan and his team proposed a computer vision based approach to identify the MP from light microscopy images. The research deals with the challenges involved in the automatic detection of malaria parasite tissue and is pixel based. They used K-means clustering (an unsupervised approach) for the segmentation in order to identify malaria parasite tissue [9]. The purpose of K-means clustering is to identify clusters of items with the same target category. Predictions for new data items are made by assigning each item to the cluster whose center is nearest and assuming it is of the same type as that cluster. They used a chemical process that includes permeabilization, fixation, mounting and staining, which is somewhat more difficult to carry out in rural areas because of insufficient materials and expertise [9].
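The nearest-center rule behind K-means, as summarized above, can be illustrated with a minimal Python sketch. This is not the implementation used in [9]; the pixel values, the initial centers and the function names `nearest_center` and `kmeans` are hypothetical and chosen only for illustration.

```python
def nearest_center(point, centers):
    """Index of the cluster center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
    )


def kmeans(points, centers, iterations=10):
    """Plain K-means iterations starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: group every point with its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its members
        # (a center with no members keeps its previous position).
        centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else old
            for members, old in zip(clusters, centers)
        ]
    return centers


# Hypothetical RGB pixels: pale background versus purple-stained material.
pixels = [(230, 220, 225), (228, 222, 230), (120, 60, 160), (110, 50, 150)]
centers = kmeans(pixels, centers=[(255, 255, 255), (100, 50, 150)])

# A new pixel is given the label of the nearest cluster center.
label = nearest_center((115, 55, 155), centers)
```

Here the new pixel falls into the second (purple) cluster, which is the kind of pixel-level decision a K-means based segmentation relies on.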

 
III. METHODOLOGY
Our data consist of heterogeneous datasets with a lot of variability in the images. It may be easy to develop an algorithm that works for a particular image, but it is a much harder challenge to find an algorithm that works across the board on heterogeneous datasets. There are a couple of approaches we can take: we can try an algorithm that is robust to the heterogeneity, or we can bring the data into a more homogeneous state.

A. Data Collection
Digital images are the main data source for this research. The images used in this research were taken from the Vector-Borne Disease Research and Training Center (VBDRTC), Hetauda, Epidemiology and Disease Control Division, Teku, Kathmandu, Nepal. These images are high-resolution and highly magnified. Images of RBCs taken from thin smears are further processed and analyzed for MP detection.

B. Proposed Architecture
This research requires malaria-infected images that have little noise and are free of artifacts [4] [5] [6]. The method used for counting malaria-infected RBCs is shown in Fig. 1 a).

Fig. 1 a): Methodology of the image processing system for detection and quantification of Plasmodium parasites.

Instead of finding how many cells are infected, we use machine learning to find what type of infection is present. A machine learning algorithm takes examples of inputs and outputs associated with a task and produces a program that can automatically differentiate them. In this research we do not discuss machine learning in detail, but we show the architecture we used.

Fig. 1 b): Machine learning algorithm to learn, detect and determine the types of infected cells according to their features.

1) RBC's Image Database:
Image acquisition is the first step in any image processing pipeline. We therefore maintain an RBC image database that holds images of malaria-infected and non-infected RBCs. Some of the images that were tested for MP are shown below.

 
Before the images are taken, the samples go through a chemical process in which the pH is maintained so that the MP is clearly visible. Figure 1 is an input image that has two malaria parasites and some impurities. Figure 2 is an input image that has a malaria parasite but also contains impurities. Figure 3 has one MP and shallow purple colored impurities. Figure 4 is an input image that contains one MP and impurities.

2) Global Thresholding
Segmentation is the process of separating an image into regions with respect to objects. We try to segment regions by identifying common properties. The simplest property that pixels in a region can share is intensity, so a natural way to segment such regions is through thresholding, the separation of light and dark regions. Thresholding creates a binary image from a grey-level one by setting all pixels below some threshold T to zero and all pixels above that threshold to one.
Any point (x,y) for which f(x,y) > T is called an object point; otherwise the point is called a background point. Mathematically, the operation can be expressed as

$$ g(x,y) = \begin{cases} 1 & \text{if } f(x,y) > T \\ 0 & \text{if } f(x,y) \le T \end{cases} $$

where g(x,y) is the thresholded image, the pixels labeled 1 denote the object and the pixels labeled 0 denote the background.

3) Enhancement
Image enhancement is the process of adjusting a digital image so that the result is more suitable than the original for further processing. In this research, the input image is enhanced after it has been thresholded. Thresholding the input image gave the result shown in Figure 5.
From the result, we can see that the background and the RBCs with parasites have been separated into white and black. For better segmentation, we take the complement of this result.

4) Morphological Operations
Morphology is a tool for extracting image components that are useful in the representation and description of region shape. Morphological operations apply a structuring element to an input image, creating an output image of the same size. The value of each pixel in the output image is based on a comparison of the corresponding pixel in the input image with its neighbors.
Erosion and dilation are the fundamental steps of morphological processing. Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on object boundaries. Figure 7 shows that the pixels around the boundaries have been removed; removing boundary pixels reduces noise. Figure 6 shows that pixels around the boundaries have been added, so dilation helps in detecting the object clearly.

5) Segmentation
Segmentation subdivides an image into its constituent regions or objects. The level of detail to which the subdivision is carried depends on the problem being solved. Segmentation accuracy determines the eventual success or failure of computerized analysis procedures. For this reason, considerable care should be taken to improve the probability of accurate segmentation. Most segmentation algorithms are based on one of two basic properties of intensity values: discontinuity and similarity [12]. During the research, we found that two types of segmentation had to be implemented to get the result: watershed segmentation and color-based segmentation.
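The preprocessing steps described in 2) to 4) can be sketched in a few lines of Python. This is a minimal illustration and not the MATLAB code used in this research; the 5 × 5 grey-level image, the threshold value 128, the 3 × 3 structuring element and all function names are assumptions made for the example.

```python
def threshold(gray, t):
    """Global thresholding: 1 where the grey level exceeds t, otherwise 0."""
    return [[1 if v > t else 0 for v in row] for row in gray]


def complement(binary):
    """Swap object and background labels of a binary image."""
    return [[1 - v for v in row] for row in binary]


def _neighbourhood(binary, r, c):
    """Values in the 3x3 neighbourhood of (r, c); positions outside the image are skipped."""
    rows, cols = len(binary), len(binary[0])
    return [
        binary[i][j]
        for i in range(max(r - 1, 0), min(r + 2, rows))
        for j in range(max(c - 1, 0), min(c + 2, cols))
    ]


def erode(binary):
    """Erosion with a 3x3 structuring element: a pixel stays 1 only if all neighbours are 1."""
    return [
        [min(_neighbourhood(binary, r, c)) for c in range(len(binary[0]))]
        for r in range(len(binary))
    ]


def dilate(binary):
    """Dilation with a 3x3 structuring element: a pixel becomes 1 if any neighbour is 1."""
    return [
        [max(_neighbourhood(binary, r, c)) for c in range(len(binary[0]))]
        for r in range(len(binary))
    ]


# Hypothetical grey-level patch: a dark stained cell on a bright background.
gray = [
    [200, 200, 200, 200, 200],
    [200,  90,  80,  95, 200],
    [200,  85,  60,  90, 200],
    [200,  92,  88,  91, 200],
    [200, 200, 200, 200, 200],
]

binary = threshold(gray, 128)   # bright background -> 1, dark cell -> 0
mask = complement(binary)       # cell -> 1, background -> 0
eroded = erode(mask)            # boundary pixels of the cell are removed
dilated = dilate(mask)          # pixels are added around the cell boundary
```

In this toy case erosion shrinks the 3 × 3 cell to its single center pixel, while dilation grows it to fill the whole patch, which mirrors the behaviour described for Figure 7 and Figure 6.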

Figure 5: After thresholding

a) Watershed Segmentation
Watershed segmentation is a relatively recent approach that tends to separate touching objects. This research applies watershed segmentation so that overlapping RBCs are separated, which helps in counting the RBCs.
In watershed segmentation, we treat a 2D grayscale image as a topological surface in which the location is given by the x,y image coordinates and the height at that location corresponds to the image intensity or grayscale value. A catchment basin is the connected region for which any rainfall drains to the same low point or minimum. In terms of a digital image, a catchment basin consists of a group of connected pixels. The lines that divide one catchment basin from another are called watershed ridges, watershed lines or simply watersheds. [13]
Rain that falls on the landscape naturally drains downwards, under the action of gravity, to its nearest minimum point. We imagine the landscape being gradually flooded from below, with the water entering through the local minima. As the water level increases, we construct dams that prevent the water in one catchment basin from spreading into adjacent catchment basins. The construction process stops when the water level reaches the height of the highest peak. The dams built in this way are the watersheds, which partition the landscape into distinct regions, each containing a catchment basin. [13] Figure-8 represents watersheds and catchment basins.

Fig. 8: Watersheds and Catchment Basin

We could use the original input image for the segmentation, but the gradient image is more suitable because the gradient magnitude is usually high along object edges and low elsewhere, so the watershed ridges lie along the object edges. After the morphological operations, the distance transform image is used. The distance transform is an operator that transforms the values of the pixels in an image according to their distance from the boundary of the object: the farther a pixel is from the boundary, the higher the value returned. This results in a transformed image in which the area near and within the boundary is dark, whereas the area near the center of the object is white. Here the distance is calculated as the Euclidean distance. Figure-9 shows the distance transform image after the morphological operations.

The Euclidean distance, i.e. the distance from every pixel to the nearest nonzero pixel, is calculated, and Figure-9 shows the distance-transformed image of Figure-6. Since the catchment basins are not shown properly in it, the distance transform of the complement of Figure-6 is computed and negated, as shown in Figure-10. Watershed segmentation is then performed, as shown in Figure-11. Noise and small-scale structures within the objects generally result in many small local catchment basins, which produce over-segmentation. [13] To overcome this, minima that are too shallow are removed and the watershed is applied to the result. The final watershed segmentation is shown in Figure-12.

b) Color based segmentation
Color-based segmentation is an approach that separates a defined color from other colored objects. The blood samples have gone through the chemical process, and the MP mostly appears purple. So, we segment the image with respect to the RBC cells, as shown in Figure-13.

Figure 13: Color based segmentation

The CIE has defined a system that classifies color according to the HVS (the human visual system). Using this system we can specify any color in terms of its CIE coordinates and hence be confident that a CIE-defined color will match another with the same CIE definition. The CIE has measured the sensitivities of the three broad bands in the eye by matching spectral colors to specific mixtures of three colored lights. The spectral power distribution (SPD) of a color is cascaded with these sensitivity functions to produce three tri-stimulus values. These tri-stimulus values uniquely represent a color; however, since the illuminant, lighting and viewing geometry affect the measurements, these are all carefully defined. The three CIE tri-stimulus values are the building blocks from which many color specifications are made. CIELab has no associated two-dimensional chromaticity diagram and no correlate of saturation. CIELhs can therefore not
be defined. [14]

CIE L*a*b* is an attempt to linearize the perceptibility of unit vector color differences. It is non-linear, and the conversions are still reversible. Coloring information is referred to the color of the white point of the system, subscript n. The non-linear relationships for L*, a* and b* are the same as for CIELUV and are intended to mimic the logarithmic response of the eye.

Hue is an angle in four quadrants, and there is no saturation term in this system.
When determining CIE L*a*b* values for CRT-displayed colors, it is usual to use the CRT's white point as the reference white. [14]

6) Calculate and find perimeter of parasite and mean perimeter of RBC cells.
After finishing the most important part of our process, we now have to find the mean perimeter of the RBCs. From the research it was found that most of the parasites have a perimeter larger than the mean perimeter of the RBCs. With the MATLAB function regionprops, we can easily measure the perimeter of the objects in the image. regionprops measures properties of image regions. The syntax is

Stats = regionprops(L,properties);

which measures a set of properties for each labeled region in the label matrix L. Positive integer elements of L correspond to different regions. The return value, Stats, is a structure array of length max(L(:)). The fields of the structure array denote different measurements for each region, as specified by properties. We have used the 'Area' and 'Perimeter' properties in our algorithm. 'Area' is a scalar giving the actual number of pixels in the region. 'Perimeter' is a scalar that specifies the distance around the boundary of the region. regionprops computes the perimeter by calculating the distance between each adjoining pair of pixels around the border of the region. [15]
After finding the mean perimeter of the RBCs, we find the perimeter of each MP.
Basically, malaria is transmitted by female mosquitoes of the genus Anopheles. This mosquito is itself a parasite, the females visiting humans for occasional meals of blood. During feeding, infected mosquitoes pass on the malaria parasite from their salivary glands. The mosquito is described as a vector. An immature form of Plasmodium is injected into the blood of humans by the mosquito. This form disappears from the bloodstream as it enters various cells of the body, particularly the liver. Here it multiplies to produce large numbers of a form which can infect other liver cells. Each parasite in a red blood cell undergoes further division. The red blood cell bursts and the released parasites can enter other red blood cells. As a result of this extensive division, millions of these parasites can be present in the blood. Some of the parasites transform into male and female forms of the parasite. [10] The malaria parasite infecting human blood cells is illustrated in Figure-14.

Figure 14: Malaria Transmission Cycle [11]

RBCs are normally circular in shape. When RBCs are infected by the MP, they burst and their shape changes.

7) Compare whether the parasite is greater than mean value of RBC cells
This step is part of the feature extraction process. To distinguish infected from non-infected cells, we compare the parasite's perimeter with the mean perimeter of the RBCs. If the perimeter of the parasite is greater than the mean perimeter of the RBCs, we draw a circle around the parasite. If the perimeter of the parasite is equal to or less than the mean perimeter of the RBCs, we do nothing.

8) Plot circle around infected RBC and calculate the infected cells.
Parasites whose perimeter is greater than the mean perimeter of the RBCs are marked with a red circle, as shown in Figure-15. The number of infected RBCs is then counted.

Figure 15: Final output.
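Steps 6) to 8) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MATLAB code used in this research: regions are labeled with 4-connectivity, and the perimeter is approximated by counting the pixel edges that face outside the region, which is simpler than the method regionprops uses. The two small binary masks and the names `label_regions`, `perimeter` and `flag_infected` are hypothetical.

```python
from collections import deque

_STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # 4-connected neighbours


def label_regions(binary):
    """Return the 4-connected regions of a binary image as sets of (row, col)."""
    rows, cols = len(binary), len(binary[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and (r, c) not in seen:
                region, queue = set(), deque([(r, c)])
                seen.add((r, c))
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    region.add((y, x))
                    for dy, dx in _STEPS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                regions.append(region)
    return regions


def perimeter(region):
    """Approximate perimeter: number of pixel edges that face outside the region."""
    return sum(
        (y + dy, x + dx) not in region
        for (y, x) in region
        for dy, dx in _STEPS
    )


def flag_infected(rbc_mask, parasite_mask):
    """Flag parasite regions whose perimeter exceeds the mean RBC perimeter."""
    rbc_perimeters = [perimeter(r) for r in label_regions(rbc_mask)]
    if not rbc_perimeters:
        raise ValueError("no RBC regions found")
    mean_rbc = sum(rbc_perimeters) / len(rbc_perimeters)
    flagged = [reg for reg in label_regions(parasite_mask) if perimeter(reg) > mean_rbc]
    return mean_rbc, flagged


# Hypothetical masks: one from the RBC segmentation, one from the
# color-based segmentation of stained objects.
rbc_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
parasite_mask = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0, 0],
]

mean_rbc, flagged = flag_infected(rbc_mask, parasite_mask)
infected_count = len(flagged)  # regions that would be circled in red
```

With these toy masks the mean RBC perimeter is 6, the large stained region (perimeter 12) is flagged, and the single-pixel region (perimeter 4) is ignored, so one infected cell is counted.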

 
If the proposed methodology is able to find at least one RBC infected by the parasite, the research will be successful, because a person is considered to be malaria infected if at least one of multiple parasites can be detected.
With the help of a doctor and a lab technician, we can validate whether the objects marked with red circles are infected RBCs or not.

IV. RESULTS & DISCUSSION
During the research, more than 40 images were tested; some of the tested images are discussed below.
We tested the non-infected RBCs in Figure 16. Figure 17 is the binarized output image after thresholding. Figure 18 shows the segmented RBCs after watershed segmentation. Figure 19 shows the color-based segmentation used for the parasite's perimeter. The mean perimeter of the RBCs is calculated as 318.0261, and no parasite with a perimeter greater than the mean perimeter of the RBCs is found. So, the number of infected cells found is 0.

We also tested the RBC image in Figure 21. Figure 22 is the binarized output image after thresholding.

In Figure 21 there are artifacts that are not marked in the final output, Figure 25, which is correct. The number of infected cells reported is 1, which is not correct, because there is an MP adjacent to the red circle in Figure 25 that is not detected. The mean perimeter is 121.041. Objects with a perimeter greater than the mean perimeter of the RBCs are detected as parasites and marked with a circle.

 
In the same way, Figure 26 was tested, giving the output in Figure 30. The mean perimeter of the RBCs is 119.7608. The output gives 1 infected RBC, which is correct.
As with input image 1 (Figure 16), input image 4 (Figure 31) was tested. The mean perimeter of the RBCs is 118.0082. Parasites with a perimeter greater than the mean value are circled. The number of infected cells found is 1, which is correct.
From the above results, we found that parasites present in the RBCs have been detected. Some errors remain: not every infected cell is counted, and impurities with a perimeter greater than the mean are counted as infected cells. Despite these errors, if at least one parasite is found in the RBCs, the person is said to be malaria infected.

Remaining works
Around 20 infected and 20 non-infected RBC images have been tested so far. Validation and verification still have to be done. Some errors have been found, and we need to determine which stages of the MP are being detected and which are not. We also have to find the error percentage.

V. CONCLUSION
We have presented an approach and developed an algorithm for automated malaria detection and quantification of malaria infection. We have also developed a strategy for training with machine learning that is adaptable to the detection of malaria caused by other types of parasite, and discussed how to increase the predictive value of the results.

ACKNOWLEDGMENT
We would like to thank the Vector-Borne Disease Research and Training Center (VBDRTC), Hetauda, Epidemiology and Disease Control Division, Teku, Kathmandu, Nepal, for giving permission to use the data and materials needed to complete this research.

REFERENCES
[1] WHO, "Fact sheet: World Malaria Report 2016," in World Health Organization, World Health Organization, 2016. [Online]. Available: http://www.who.int/malaria/media/world-malaria-report-2016/en. Accessed: Jan. 25, 2017.
[2] Bloland PB (2001) Drug resistance in Malaria, WHO/CDS/CSR/DRS/2001.4. World Health Organization, Switzerland, 2001.
[3] D. Ghate, C. Jadhav, and N. U. Rani, "Automatic Detection of Malaria Parasite from Blood Images." [Online]. Available: http://ijact.org/volume4issue1/IJ0410050.pdf. Accessed: Jan. 25, 2017.
[4] Ahirwar A., Pattnaik S., Acharya B., Advanced Image Analysis Based System for Automatic Detection and Classification of Malarial Parasite in Blood Images. International Journal of Information Technology and Knowledge Management. Jan-June 2012, Volume 5, No 1, pp. 59-64.
[5] Amit Kumar, Prof. Arun Choudhary, Prof. P. U. Tembhare, Prof. C. R. Pote. Enhanced Identification of Malarial Infected Objects using Otsu Algorithm from Thin Smear Digital Images. International Journal of Latest Research in Science and Technology Vol.1, Issue 2: Page No. 159-163, July-August (2012).
[6] Nattkemper, T. W., Schubert, W., Hermann, T., & Ritter, H. (2004). A Hybrid System for Cell Detection in Digital Micrographs. In B. Tilg (Ed.), Biomedical Engineering, Proc. (Vol. 417). Innsbruck, Austria: ACTA Press.
[7] CDC, "Frequently asked questions (FAQs)," CDC, 2016. [Online]. Available: https://www.cdc.gov/malaria/about/faqs.html. Accessed: Jan. 25, 2017.
[8] L. Rosado, J. M. C. da Costa, D. Elias, and J. S. Cardoso, "Automated detection of malaria parasites on thick blood smears via mobile devices," 2016.
[9] N. A. Khan, H. Pervaz, A. Latif, and A. Musharaff, "Unsupervised identification of malaria parasites using computer vision," 2017.
[10] D. Taylor, N. Green and G. Stout, "Biological Science," 3rd ed., The Press Syndicate of the University of Cambridge, 2002.
[11] "malaria symptoms sore throat | Best Health Line", Best Health Line, 2017. [Online]. Available: http://besthealthline.com/malaria-symptoms-sore-throat/. [Accessed: 29-Mar-2017].
[12] Gonzalez, R. and Woods, R. (2009). Digital image processing. 3rd ed. India: Dorling Kindersley India Pvt. Ltd.
[13] Solomon, C. and Breckon, T. (2013). Fundamentals of digital image processing. Hoboken, N.J.: Wiley, [Sec. 10.10].
[14] Colour Space Conversions. (1998). Adrian Ford ([email protected], defunct) and Alan Roberts ([email protected]), pp. 5, 10-11.
[15] Mathworks.com. (2017). Measure properties of image regions - MATLAB regionprops - MathWorks United Kingdom. [online] Available at: https://www.mathworks.com/help/images/ref/regionprops.html [Accessed 27 Jun. 2017].
[16] World Health Organization, Malaria Report 2013.
