On-Line Devanagari Handwritten Character Recognition Using Moments Features

Shalaka Prasad Deore1,3(B) and Albert Pravin2


1 Sathyabama Institute of Science and Technology,
Chennai, India
[email protected]
2 Department of Computer Science and Engineering,
Sathyabama Institute of Science and Technology, Chennai, India
3 Department of Computer Engineering, M.E.S. College of Engineering,
S. P. Pune University, Pune, Maharashtra, India

Abstract. Handwritten character recognition is receiving considerable
attention because of its numerous applications, such as educational software,
on-line signature verification, bank cheque processing, postal code
recognition and electronic libraries. Comparatively little work has been
reported on Devanagari handwritten character recognition (HWDCR), so there is
large scope for research in this area. In this paper we propose a HWDCR system
that recognizes handwritten characters of Devanagari, the most popular script
in India. A handwritten character is entered using a pen tablet, and its
on-line features, such as the sequence of (x, y) coordinates, stroke and
pressure information, are extracted and passed to a classifier. We use an
MLP-BP neural network classifier. The proposed HWDCR system achieves an
average recognition accuracy of 90% using on-line data.

Keywords: Devanagari handwritten character recognition · On-line features ·
MLP-BP Neural Network · Classifier

1 Introduction
Character recognition is becoming increasingly important in the modern world,
because it helps people carry out their work with less effort. Handwritten
character recognition (HCR) is one of the most challenging and demanding
research areas in image processing. The goal of this research is to facilitate
automation in order to minimize human effort. HCR is mainly divided into two
classes: on-line HCR and off-line HCR. In off-line HCR, the image of the
written character is captured from paper by an optical scanner. In on-line
HCR, the input is captured from the movement of the pen tip by a digitizer.
Handwritten character recognition is very complex for a number of reasons
[20]. Firstly, different characters can have similar shapes. Secondly, a
person's handwriting differs under various circumstances. Thirdly, noise is
introduced while the data are collected. In addition, stroke primitives vary
widely across handwriting styles. English has only 26 characters, whereas
Indian scripts consist of vowels, consonants and compound characters, so
recognizing Indian languages is more challenging than recognizing English. The
proposed work addresses the Devanagari script, which is used for languages
such as Marathi, Hindi, Sanskrit and Nepali. Devanagari consists of 12 vowels,
36 basic consonants and 10 numerals. The challenge is therefore to create a
software system that recognizes handwritten Devanagari characters of different
sizes, writing styles and shapes.
Figure 1 shows samples of handwritten Devanagari characters. Even for the same
character, the samples vary because of different handwriting styles and
different numbers of strokes. Some characters also have similar shapes.

Fig. 1. Handwriting samples by five writers

Section 2 reviews the literature and Sect. 3 presents the proposed work.
Section 4 describes data collection, followed by preprocessing in Sect. 5.
Section 6 describes the feature extraction methods, and Sect. 7 the classifier
used for recognition. Section 8 presents the results, followed by the
conclusion in Sect. 9.

© Springer Nature Singapore Pte Ltd. 2019
K. C. Santosh and R. S. Hegadi (Eds.): RTIP2R 2018, CCIS 1037, pp. 37–48, 2019.
https://doi.org/10.1007/978-981-13-9187-3_4

2 Literature Review
Digitizers are mostly electromagnetic-electrostatic tablets, which send the
coordinates of the pen tip to the computer at regular intervals. On-line
handwriting recognition provides more discriminative information than off-line
recognition, which improves accuracy. On-line HCR has several advantages over
off-line HCR: it works in real time, it is adaptive, and, most importantly, it
requires very little preprocessing. Operations such as smoothing, de-skewing
and counting strokes and loops are more efficient with a pen tablet than with
scanned images [12]. When the user writes a character on the pen tablet, the
image can also be stored in .bmp format in a specified folder and used later
as an off-line image. Such images contain little noise and therefore need very
little preprocessing. Few attempts have been made to recognize handwritten
Devanagari characters. Shelke and Apte proposed a fuzzy-based classification
scheme for recognizing handwritten Devanagari characters [1]. Wang et al. [2]
presented a topic language model adaptation approach that focuses on the
specific topic of a Chinese text. In the first stage, a bi-gram model is used
to recognize the topic of the text image at the character level, and the text
topic is then matched adaptively. The model reduces the error rate by 11.94%.
The authors of [3] explore 2-D moment feature extraction techniques for
handwritten Devanagari characters. Experiments were conducted with individual
moment methods and with combinations of them, and the best results were
obtained by combining all of them (central + geometric + complex + Zernike).
Omer et al. [4] presented a new handwritten Arabic character dataset and,
using stroke information collected from real-time samples, implemented an
on-line Arabic handwriting recognition system for isolated characters.
Jayashree et al. [5] suggested a pattern matching method in which unique
features are extracted and a pattern is created from them for classification.
Venkatesha et al. [6] proposed an orthogonal-moment-based approach for
verifying on-line handwritten signatures with reduced computational resources.
In [7], a handwritten Gurmukhi character recognition system is implemented in
two stages: strokes are recognized first, and the character is then assessed
from the recognized strokes with good accuracy. More and Rege used geometric
and Zernike moments to recognize off-line Devanagari numerals [8].
The authors of [9, 11] explain moment theory and how efficiently moments can
be used for feature extraction. In [10], stroke-based information is used to
recognize on-line Devanagari characters. Agui and Takahashi used moments as
features to recognize off-line Katakana characters with a three-layer
feed-forward neural network [12]. Santosh et al. [13–15] presented an
established and validated approach that combines strokes and the spatial
relationships between them in a stroke clustering framework. Isolated off-line
character recognition using Radon features is reported in [16]. The authors of
[17] presented a dynamic time warping method that handles dissimilar shapes
and different sizes while addressing multi-class similarities and distortions.
Santosh et al. [18] addressed handwritten Nepali characters with a structural
approach that is free of stroke number and stroke order. In [19], an ensemble
model is presented for recognizing handwritten Devanagari characters; it
combines the well-known SVM, k-NN and NN classifiers and achieves good
accuracy.

3 Proposed Work
We implemented a system that recognizes the 12 vowels of the handwritten
Devanagari script. In India, the Devanagari script is widely used, especially
in Gujarat, Maharashtra, North India, Madhya Pradesh and West Bengal. It is
read and written from left to right. The input character is captured using a
pen tablet, and its x and y coordinates and pressure information are recorded
while it is written. The captured on-line information is stored in a text file
and passed to the feature extraction phase, which computes features of the
characters using various non-orthogonal moments. A neural network is then
applied to classify the characters. Figure 2 depicts the architecture of the
proposed character recognition system.

Fig. 2. Architecture of proposed Devanagari handwritten character recognition System

4 Dataset Creation

On-line handwritten data are entered with a digitizing pen tablet and a
special stylus. The pen movements give the (x, y) pixel coordinates of the
character image. These coordinates, along with the pressure information, are
stored in a text file; only the x and y coordinates are used by our on-line
HCR system. Figure 3 shows the GUI of the developed application and Fig. 4
shows the contents of a text file. No restriction was imposed on the content
or style of writing; the only constraint is that characters must be isolated.
We collected 100 samples of each of the 12 characters from different persons,
which gives our own on-line data set of 1200 characters.
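
As an illustration of how such a recording could be read back, the sketch
below parses a text trace into coordinate lists. The three-column,
whitespace-separated layout "x y pressure" is an assumption made for this
example only; the layout of the actual files is not specified here, and the
name parse_trace is hypothetical.

```python
def parse_trace(text):
    """Parse lines of 'x y pressure' into a list of (x, y, pressure) tuples.

    The whitespace-separated three-column layout is an assumption; the
    actual layout of the recorded text file is not given in the paper.
    """
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        x, y, pressure = (float(f) for f in fields)
        samples.append((x, y, pressure))
    return samples


# Hypothetical recording of four pen samples.
recording = """\
120 85 310
122 88 342
125 93 355
129 99 330
"""
trace = parse_trace(recording)
xs = [s[0] for s in trace]  # only x and y are used for recognition
ys = [s[1] for s in trace]
```

Keeping the pressure column in the parsed tuples mirrors the description
above: it is stored, but only xs and ys are passed on.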

Fig. 3. GUI of HWDCR

Fig. 4. On-line data of handwritten character

5 Preprocessing
Preprocessing is an essential phase of handwriting recognition. It removes
noise introduced by software and device constraints and thus improves the
overall recognition rate. When the user writes a character, its on-line
information (i.e. the x and y coordinates) is captured as a time-ordered
sequence, and N samples are stored in a text file. Because the on-line data
used for feature calculation contain very little noise, not much preprocessing
is required. We perform only sample normalization and a thinning operation on
the image. Figure 5(a) shows an original input character and Fig. 5(b) shows
the same character after processing.
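
The text does not define sample normalization further. One common
interpretation in on-line recognition is to resample every pen trace to a
fixed number of points spaced evenly along its length, so that writing speed
does not change the number of samples. The sketch below illustrates that
interpretation only; it may differ from the procedure actually used, and the
function name resample is hypothetical.

```python
import math


def resample(points, n):
    """Resample a polyline to n points evenly spaced along its arc length."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two input points and n >= 2")
    # Cumulative arc length at every input point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0:
        return [points[0]] * n  # pen did not move
    out = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        x0, y0 = points[seg]
        x1, y1 = points[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# Hypothetical L-shaped stroke resampled to three points.
corner = resample([(0, 0), (4, 0), (4, 4)], 3)
```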

Fig. 5. (a) Original input character (b) Same processed character



6 Feature Extraction
In handwriting recognition, it is important to identify the correct feature
set. Feature extraction is a very important phase for efficient data
representation and for further processing. A set of characteristics is
extracted to differentiate one class from another, which reduces complexity
and improves the precision of the algorithm. Moments are shape descriptors
used to characterize the shape and size of an image. They provide important
properties of the image, such as area, centroid and orientation. The main
advantage of moment features is their ability to provide invariant measures of
shape. We implemented the recognition system using 1-D moments as the feature
extraction method on the on-line data set.

6.1 Geometric Moment (1-D)


Geometric moments (1-D) are basically projections of the image function onto
the monomials $x^p$. The pth-order geometric moment, $m_p$, is stated below:

$$ m_p = \sum_{i=1}^{N} X[i]^p \tag{1} $$

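Using on-line x coordinate information we calculated 4 moment features as:

$$ m_1 = \frac{1}{N} \sum_{i=1}^{N} X[i] \tag{2} $$

$$ m_2 = \left( \frac{1}{N-1} \sum_{i=1}^{N} (X[i] - m_1)^2 \right)^{0.5} \tag{3} $$

$$ m_3 = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{X[i] - m_1}{m_2} \right)^3 \tag{4} $$

$$ m_4 = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{X[i] - m_1}{m_2} \right)^4 \tag{5} $$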
where i = 1..N, X[i] denotes the array of x pixel coordinates, N denotes the
number of samples and p denotes the order of the moment. Similarly, we
calculated another 4 moment features using the y coordinate information.
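
A minimal sketch of Eqs. (2) to (5), assuming that m3 and m4 are the usual
standardized third and fourth moments (each deviation is divided by m2 before
being raised to the power). The function name moment_features and the sample
coordinates are hypothetical.

```python
import math


def moment_features(values):
    """Return (m1, m2, m3, m4) for a 1-D coordinate sequence.

    m1: mean; m2: standard deviation with an N - 1 denominator;
    m3, m4: standardized third and fourth moments averaged over N.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    m1 = sum(values) / n
    m2 = math.sqrt(sum((v - m1) ** 2 for v in values) / (n - 1))
    if m2 == 0:
        raise ValueError("all samples are identical; m3 and m4 are undefined")
    m3 = sum(((v - m1) / m2) ** 3 for v in values) / n
    m4 = sum(((v - m1) / m2) ** 4 for v in values) / n
    return m1, m2, m3, m4


# Hypothetical pen trace: x and y coordinates sampled over time.
xs = [10, 12, 15, 19, 24, 30]
ys = [40, 38, 37, 37, 39, 42]
features = moment_features(xs) + moment_features(ys)  # 4 + 4 features
```

Applying the same function to the x and y sequences gives the eight features
described above.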

6.2 Complex Moments (1-D)


The notion of complex moments was recently introduced as a simple and straight-
forward way to derive moment invariants. Complex moments are invariant to
rotation. The 1-D complex moment of the pth order is stated below:

$$ c_p = \sum_{i=1}^{N} \left( X[i] + \mathrm{i}\, Y[i] \right)^p \tag{6} $$

where the summation index i = 1..N, the factor i multiplying Y[i] is the
imaginary unit, X[i] and Y[i] denote the arrays of x and y pixel coordinates,
N denotes the number of samples and p denotes the order of the moment. We
calculated 4 complex moment features.
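
Eq. (6) can be sketched with Python's built-in complex type. The text does
not say which four features were taken from the complex moments, so using the
magnitudes of orders 1 to 4 is an assumption made for this example, and the
function names are hypothetical.

```python
def complex_moment(xs, ys, p):
    """Sum of (x + i*y)**p over all samples, following Eq. (6)."""
    return sum(complex(x, y) ** p for x, y in zip(xs, ys))


def complex_moment_features(xs, ys, orders=(1, 2, 3, 4)):
    """Magnitudes of the complex moments for the given orders.

    Orders 1..4 and the use of the magnitude are assumptions; the paper
    only states that four complex-moment features were computed.
    """
    return [abs(complex_moment(xs, ys, p)) for p in orders]


# Hypothetical trace and the same trace rotated by 90 degrees about the origin.
xs = [1.0, 2.0, 4.0]
ys = [0.5, 1.5, 1.0]
rotated_xs = [-y for y in ys]
rotated_ys = [x for x in xs]
before = complex_moment_features(xs, ys)
after = complex_moment_features(rotated_xs, rotated_ys)
```

A rotation about the origin multiplies every term of order p by the same
unit-magnitude factor, so the magnitudes in before and after agree, which is
the rotation invariance mentioned above.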

6.3 Central and Hu’s Moments (1-D)


Geometric moments are not invariant to translation, rotation or change in
size. To obtain invariance to translation we calculated central moments, and
to handle rotation and size we computed Hu's moments, which are invariant to
translation, rotation and size of the image. The 1-D central moment of the
pq-th order is expressed as:

$$ \mu_{pq} = \sum_{i=1}^{N} (X[i] - \bar{X})^p \, (Y[i] - \bar{Y})^q \tag{7} $$

where i = 1..N, X[i] and Y[i] denote the arrays of x and y pixel coordinates,
N denotes the number of samples, and p and q denote the orders of the moment
in x and y. With $m_{pq} = \sum_{i=1}^{N} X[i]^p \, Y[i]^q$, the centroid is

$$ \bar{X} = \frac{m_{10}}{m_{00}} \quad \text{and} \quad \bar{Y} = \frac{m_{01}}{m_{00}} \tag{8} $$

To handle variation in the size of the character image, the central moments
are normalized using the expression below:

$$ \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}} \quad \text{and} \quad \gamma = \frac{p+q}{2} + 1 \tag{9} $$

for p, q = 0, 1, ... and (p + q) = 2, 3, ...
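From the normalized central moments, Hu's moments are derived as follows [12]:

$$ \phi_1 = \eta_{20} + \eta_{02} \tag{10} $$

$$ \phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \tag{11} $$

$$ \phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \tag{12} $$

$$ \phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2 \tag{13} $$

$$ \phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]
 + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{14} $$

$$ \phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]
 + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03}) \tag{15} $$

$$ \phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right]
 + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] \tag{16} $$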
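
The chain from central moments to Hu's moments can be sketched for the first
two invariants. The sketch treats every pen sample as a unit-mass point, so
that m00 = N; that weighting, the function names and the sample trace are
assumptions of this example rather than details taken from the paper.

```python
def central_moment(xs, ys, p, q):
    """mu_pq of Eq. (7), treating every pen sample as a unit-mass point."""
    n = len(xs)
    x_bar = sum(xs) / n  # m10 / m00, where m00 = N for unit masses
    y_bar = sum(ys) / n  # m01 / m00
    return sum((x - x_bar) ** p * (y - y_bar) ** q for x, y in zip(xs, ys))


def normalized_moment(xs, ys, p, q):
    """eta_pq of Eq. (9)."""
    gamma = (p + q) / 2 + 1
    return central_moment(xs, ys, p, q) / central_moment(xs, ys, 0, 0) ** gamma


def hu_phi1_phi2(xs, ys):
    """First two Hu moments, Eqs. (10) and (11)."""
    eta20 = normalized_moment(xs, ys, 2, 0)
    eta02 = normalized_moment(xs, ys, 0, 2)
    eta11 = normalized_moment(xs, ys, 1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


# Hypothetical trace, then the same trace rotated by 90 degrees and shifted.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 0.0, 2.0]
moved_xs = [-y + 5.0 for y in ys]
moved_ys = [x - 3.0 for x in xs]
original = hu_phi1_phi2(xs, ys)
transformed = hu_phi1_phi2(moved_xs, moved_ys)
```

The values in original and transformed agree, which illustrates the
translation and rotation invariance of these two moments on a point set.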

7 Character Classification and Recognition


Character recognition is performed with a multilayer neural network, as
depicted in Fig. 6. A feed-forward network is acyclic: information flows in
one direction from the input nodes to the output nodes via the hidden nodes.
We used the back-propagation algorithm to train the classifier. The mean
squared error computed at the output is propagated back to the hidden layer,
and the weights are adjusted to obtain the desired output. The network
consists of one hidden layer and uses the sigmoid activation function. We have
samples of the 12 characters from 100 different people. For each character, 80
samples are used for training and 20 for testing. Results differ for different
features. The character features are normalized before being passed to the
neural network, using the following equation:

$$ \sum_{i=1}^{N} \left[ \frac{\mathrm{feature}(i) - \mathrm{mean}(i)}{\sigma(i)} \right] \tag{17} $$
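
One plausible reading of Eq. (17) is a per-feature z-score: each feature i is
shifted by its mean and divided by its standard deviation. The sketch below
follows that reading and assumes that the statistics are estimated on the
training samples only; neither detail is stated in the text, and the function
names are hypothetical.

```python
import statistics


def fit_scaler(rows):
    """Per-feature mean and standard deviation over the training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # A constant feature has zero deviation; use 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds


def transform(row, means, stds):
    """Apply (feature(i) - mean(i)) / sigma(i) to every feature of one row."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


# Hypothetical training set: three samples with two features each.
train = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
means, stds = fit_scaler(train)
scaled = [transform(row, means, stds) for row in train]
```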

Fig. 6. Architecture of MLP neural network
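
The classifier described in this section can be sketched as a small
pure-Python network with one sigmoid hidden layer trained by back-propagation
on squared error. The layer sizes, learning rate, weight initialization and
toy data below are all assumptions chosen for illustration; the paper does not
report these settings, and the class name TinyMLP is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid units, trained by back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        o = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, o

    def train_step(self, x, target, lr=0.5):
        h, o = self.forward(x)
        # Output deltas for E = 0.5 * sum((o - t)^2) with sigmoid outputs.
        d_out = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(o, target)]
        # Hidden deltas: propagate the output deltas back through w2.
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * self.w2[k][j]
                                       for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v

    def loss(self, data):
        """Mean squared error over a list of (input, target) pairs."""
        return sum(sum((ok - tk) ** 2 for ok, tk in zip(self.forward(x)[1], t))
                   for x, t in data) / len(data)

    def predict(self, x):
        o = self.forward(x)[1]
        return o.index(max(o))


# Hypothetical two-class toy problem with one-hot targets.
data = [([0.0, 0.0], [1.0, 0.0]), ([0.1, 0.2], [1.0, 0.0]),
        ([1.0, 1.0], [0.0, 1.0]), ([0.9, 0.8], [0.0, 1.0])]
net = TinyMLP(n_in=2, n_hidden=3, n_out=2)
initial_loss = net.loss(data)
for _ in range(2000):
    for x, t in data:
        net.train_step(x, t)
final_loss = net.loss(data)
```

For the 12 vowel classes, the same structure would use 12 output units with
one-hot targets and the normalized moment features as inputs.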

8 Experiments and Results


8.1 Individual Moment-Based Results

The data used for the present work were collected from different individuals.
Class 1 consists of samples of one character written by 100 different people,
class 2 consists of samples of another character, and so on. In total there
are 12 classes for the 12 Devanagari characters, giving 1200 samples of basic
Devanagari characters (vowels only) for the experiments. Table 1 shows the
recognition rate of each class from 1 to 12, obtained with the different
moment features computed from on-line information.

Table 1. Devanagari character recognition results using online features

Character class           Geometric    Central      Hu's         Complex
                          moment (%)   moment (%)   moment (%)   moment (%)
1                         45           70           60           40
2                         70           70           25            5
3                         80           65           55           10
4                         85           90           60           35
5                         70           55           45            5
6                         55           65           50            5
7                         60           70           50           35
8                         70           85           35            5
9                         65           75           65           15
10                        50           80           40           30
11                        35           65           15           10
12                        45           60           40           15
Average recognition rate  61           71           45           18

Fig. 7. On-line character recognition result

From Table 1 we can observe that the central moment (CM) gives the best
recognition results among the moments, because CM is invariant to translation.
We also observed that the features calculated using the complex moment (COM)
vary widely, so its recognition results are very poor.
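
The averages in the last row of Table 1 can be reproduced from the per-class
rates. The short check below uses only the values printed in the table; the
variable names are arbitrary.

```python
# Per-class recognition rates (%) copied from Table 1, classes 1 to 12.
table1 = {
    "Geometric": [45, 70, 80, 85, 70, 55, 60, 70, 65, 50, 35, 45],
    "Central":   [70, 70, 65, 90, 55, 65, 70, 85, 75, 80, 65, 60],
    "Hu":        [60, 25, 55, 60, 45, 50, 50, 35, 65, 40, 15, 40],
    "Complex":   [40, 5, 10, 35, 5, 5, 35, 5, 15, 30, 10, 15],
}

averages = {name: sum(rates) / len(rates) for name, rates in table1.items()}
# Round half up to whole percentages, as reported in the table.
rounded = {name: int(avg + 0.5) for name, avg in averages.items()}
best = max(averages, key=averages.get)
```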

Figure 7 plots the recognition rates of the different on-line moments listed
in Table 1 for the 12 character classes. From Fig. 7 we can observe that the
curves of all features are non-linear, so we combined pairs of different
features and passed the combined features to the neural network. Table 2 shows
that combining features improves the recognition result: the overall average
recognition rate increases compared to using the features individually. When
all four features are combined, the recognition rate of each individual class
increases and the average rate reaches 90%.

Table 2. Average character recognition result of 1-D combined moments

Online moments (combined)                          Average recognition rate
CM & GM                                            84%
CM & COM                                           73%
GM & COM                                           70%
All four 1-D moments combined (GM, CM, HuM, COM)   90%
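
The text does not state how two feature sets are combined before
classification. A simple and common choice is to concatenate the feature
vectors of one character into a single input vector, which is what the sketch
below assumes. The function name and the feature values are hypothetical.

```python
def combine_features(*groups):
    """Concatenate several per-character feature lists into one vector."""
    combined = []
    for group in groups:
        combined.extend(group)
    return combined


# Hypothetical feature values for one character sample.
gm = [17.2, 6.4, 0.31, 1.9]  # geometric-moment features
cm = [0.0, 41.5, 12.8]       # central-moment features
combined = combine_features(cm, gm)  # input vector for the "CM & GM" case
```

With concatenation, the number of input nodes of the network grows with the
number of combined moment types.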

8.2 Comparison of Results


The method discussed in this paper achieves a recognition rate of 90% with
on-line moments. These results were obtained from 1200 samples of Devanagari
characters. Recent work on recognizing handwritten Gujarati script using a
neural network [5] and on on-line signature verification using Zernike moments
[6] reports accuracies of 71.66% and 80%, respectively. In [10], the
classification rate for unconstrained on-line Devanagari characters is 86.5%.
The system for stroke number and order free recognition of handwritten Nepali
characters using a structural approach [18] achieved overall accuracies of
85.87% and 88.59% on original and preprocessed samples, respectively.

9 Conclusion
We implemented an HCR system that recognizes handwritten Devanagari characters
using on-line data and moments as features. We collected 100 samples of each
of the 12 characters from different persons, so our data set consists of 1200
characters in total. Of the 100 samples per character, 80 were used for
training the neural network and 20 for testing. We observed that: (i) when
each moment is applied separately to on-line features, the central moment (CM)
gives the highest recognition rate, 71%, and the complex moment (COM) gives
the lowest, 18%; (ii) combining any two moments improves the recognition rate;
(iii) combining all 1-D moments raises the recognition rate to 90%; (iv) 1-D
moments can also be used as features for an on-line system that is not
influenced by stroke sequence. In future work we plan to recognize the
character sets of “Devanagari Consonants” and “Consonant-Vowel combination
(Compound Character)” using moments, and to check the results with different
types of features.

References
1. Shelke, S., Apte, S.: A fuzzy based classification scheme for unconstrained hand-
written Devanagari character recognition. In: International Conference on Commu-
nication, Information and Computing Technology (ICCICT), pp. 1–6. IEEE Press,
Mumbai (2015). https://doi.org/10.1109/ICCICT.2015.7045738
2. Wang, Y., Ding, X.: Topic language model adaption for recognition of homologous
on-line handwritten Chinese text image. IEEE Signal Process. Lett. 21, 550–553
(2014). https://doi.org/10.1109/LSP.2014.2308572
3. Deore, S., Ragha, L.: Moment based online and online handwritten character recog-
nition. CiiT Int. J. Biometrics Bioinf. 3, 111–115 (2011). BB032011004
4. Omer, M., Ma, S.: Online Arabic handwriting character recognition using matching
algorithm. In: 2nd International Conference on Computer and Automation Engi-
neering (ICCAE), pp. 259–262. IEEE Press, Singapore (2010) . https://doi.org/
10.1109/ICCAE.2010.5451492
5. Prasad, J., Kulkarni, U., Prasad, R.: Offline handwritten character recognition of
Gujarati script using pattern matching. In: 3rd International Conference on Anti-
counterfeiting, Security, and Identification in Communication, pp. 611–615, IEEE
Press, Hong Kong (2009). https://doi.org/10.1109/ICASID.2009.5276999
6. Radhika, K., Venkatesha, M., Shekar, G.: On-line signature authentication using
Zernike moments. In: 3rd International conference on Biometrics: Theory, applica-
tions and systems, pp. 109–112. IEEE Press, Washington (2009). https://doi.org/
10.1109/BTAS.2009.5339022
7. Sharma, A., Kumar, R.: On-line handwritten Gurmukhi character recognition
using elastic matching. In: Congress on Image and Signal Processing (CISP), pp.
391–396. IEEE Press, Sanya (2008). https://doi.org/10.1109/CISP.2008.297
8. More, V., Rege, P.: Devnagari handwritten numeral identification based on Zernike
moments. In: IEEE Region 10 Conference (TENCON), pp. 1–6. IEEE Press, Hyder-
abad (2008). https://doi.org/10.1109/TENCON.2008.4766863
9. Shu, H., Luo, L.: Moment-based approaches in imaging part 1 basic features. IEEE
Eng. Med. Biol. Mag. 26, 70–74 (2007)
10. Connell, S.D., Sinha, R., Jain, A.: Recognition of unconstrained on-line Devanagari
characters. In: 15th International Conference on Pattern Recognition, pp. 368–371.
IEEE Press, Barcelona (2000). https://doi.org/10.1109/ICPR.2000.906089
11. Liao, S., Pawlak, M.: On image analysis by Moments. IEEE Trans. Pattern Anal.
Mach. Intell. 18, 254–266 (1996). https://doi.org/10.1109/34.485554
12. Agui, T., Takahashi, H., Nagahashi, H.: Recognition of handwritten katakana in a
frame using moment invariants based on neural network. In: IEEE International
Joint Conference on Neural Networks, pp. 659–664. IEEE Press, Singapore (1991).
https://doi.org/10.1109/IJCNN.1991.170475
13. Santosh, K., Nattee, C., Lamiroy, B.: Relative positioning of stroke based cluster-
ing: a new approach to on-line handwritten Devangari character recognition. Int. J.
Image Graph. (IJIG) 12, 1–24 (2012). https://doi.org/10.1142/S0219467812500167
14. Santosh, K., Iwata, E.: Stroke-Based Cursive Character Recognition. IntechOpen
(2012). https://doi.org/10.5772/51471
15. Santosh, K., Nattee, C., Lamiroy, B.: Spatial similarity based stroke number and
order free clustering. In: 12th International Conference on Frontiers in Handwriting
Recognition (ICFHR), pp. 652–657. IEEE Press, Kolkata (2010). https://doi.org/
10.1109/ICFHR.2010.107
16. Santosh, K.: Character recognition based on DTW-radon. In: 11th International
Conference on Document Analysis and Recognition (ICDAR), pp. 264–268. IEEE
Press, Beijing (2011). https://doi.org/10.1109/ICDAR.2011.61
17. Santosh, K., Wendling, L.: Character recognition based on non-linear multi-
projection profiles measure. Front. Comput. Sci. 9, 678–690 (2015). https://doi.
org/10.1007/s11704-015-3400-2
18. Santosh, K.C., Nattee, C.: Stroke number and order free handwriting recogni-
tion for Nepali. In: Yang, Q., Webb, G. (eds.) PRICAI 2006. LNCS (LNAI), vol.
4099, pp. 990–994. Springer, Heidelberg (2006). https://doi.org/10.1007/978-3-
540-36668-3 120
19. Deore, S.P., Pravin, A.: Ensembling: model of histogram of oriented gradient based
handwritten Devanagari character recognition system. Traitement du Signal 34,
7–20 (2017). https://doi.org/10.3166/ts.34.7-20
20. Jagtap, A.B., Hegadi, R.S.: Offline handwritten signature recognition based on
upper and lower envelope using eigen values. In: World Congress on Computing
and Communication Technologies (WCCCT), pp. 223–226. IEEE (2017)
