© 2021 JETIR July 2021, Volume 8, Issue 7 www.jetir.org (ISSN-2349-5162)
AUTOMATED CHEQUE PROCESSING SYSTEM
JN SUDARSHAN, MANIKANTA YA
Department of Information Science, RNS Institute Of Technology, Bengaluru, India
[email protected] [email protected]
KALPAJ PISE, AKSHATA S BAYYAR
Department of Information Science, RNS Institute Of Technology, Bengaluru, India
[email protected] [email protected]

ABSTRACT: Despite swift advancements in digital technology, financial institutions like banks still rely on conventional manual processing of bank cheques. The process is cumbersome and takes a couple of days for the actual transfer of money, which involves verification by intermediaries. This leads to high time and costs. In this paper, we propose an automated system which extracts relevant details of a bank cheque like Payee Name, Amount, Date and Bank Name using Optical Character Recognition and Deep Learning, and verifies the signature on the cheque against the existing signature stored in the database using feature extraction and principal component analysis. The signature for a new user is stored using its hash value for security purposes. The proposed system uses a modified convolutional neural network to extract the handwritten content on the cheque leaf, where the IAM dataset is used to train the model and get optimized results. This system will facilitate the process and lead to a reduction in time and costs. The efficiency and performance are measured on a self-generated data set of bank cheques.

Keywords: Machine Learning, Optical Character Recognition (OCR), Deep Learning, CNN, Image segmentation, Image feature extraction, Support Vector Machine (SVM), Connected component analysis (CCA)

I. INTRODUCTION
With digital innovations happening across the world in all sectors, it becomes crucial for every industry to automate its processes to attain better performance and efficiency. The banking and financial industry plays a crucial role, since a large number of transactions occur in the form of bank cheques. Automated analysis of bank cheques is an important field to explore, since bank cheques are currently processed by human intervention. In manual verification, important parameters like payee name, date, signature and amount of the cheque are checked by the employees of the bank. This makes the entire process time consuming and often prone to error, which further increases the total costs and the resources needed for executing the process. The automated system of processing cheques includes extracting and recognizing handwritten and typed information such as payee name, amount, account number, date, cheque number and signature. Since the cheque id and account number are present on a bank cheque in a magnetic format, they can be extracted using optical character recognition. In most cases, payee name and date are handwritten, which makes it an extremely formidable task to extract the information. The signature is extracted and verified against the original signature of the account holder stored in the database of the drawer bank. If the signature does not match, the system displays an error message. The primary goal of the system is to develop an accurate model which reads the handwritten text, like the name of the payee, date and amount in the cheque, with the least possible error, and then processes the extracted data and performs the transaction in encrypted format, which maintains security. The amount is hashed using SHA 256, which stores the hashed value of function(Amount), and is eventually stored in the database. When performing certain transactions, the hashed value in the database is decrypted and is added to or subtracted from the new value in the transaction, and the result is hashed again and stored in the database. In this way, if an intruder manages to break into the database, they still cannot modify the contents of the database.

II. EXISTING SYSTEM
Many models and systems have been developed and published by developers and authors in the past decade. A few of the models were successful and others did not meet the criteria necessary to qualify as successful; one of the models is described below. This method is a new digit recognizer that can automatically retrieve the account number and courtesy amount written on a cheque instead of their being input manually by customers. Past research introduced different solutions to read the information on the cheque, including the courtesy amount (numeric), legal amount (textual), signature and particularly written language such as English, French or Korean, for cheque processing automation or auxiliary verification. In Malaysia, research was conducted to develop a Bank Cheque Recognition System using a neural network; however, the researchers were not satisfied with the performance obtained. To the best of our knowledge, none of the research and
JETIR2107218 Journal of Emerging Technologies and Innovative Research (JETIR) www.jetir.org b712
implementation have been done on digital recognition in Malaysia, especially in improving performance in the banking domain. The solution aims to automate the cheque deposit process in the country, which will be beneficial to both bank staff and customers. The digit recognizer will be incorporated into the cheque deposit machine. The customer is required to insert cheques into the machine, and the machine will scan the image of the cheque and then read the courtesy amount and bank account number from the captured image.

III. PROPOSED SYSTEM:
The proposed method takes the cheque as input, extracts the amount, account number and date from the image, recognizes the digits, constructs the complete numbers and stores them in an Excel file. Later these can be entered into a database for further transactions. The proposed method has three modules:
1. Pre-processing and digit extraction.
2. Training the classifier and testing of the data set.
3. Post-processing of the recognized digits.

A. Pre-processing and digit extraction from cheque images:
In the first stage of the proposed method, cheque images are scanned and taken as input to the pre-processing stage. Areas of the amount block, date block and account number are assessed and identified. Then the region of interest (ROI) is defined by creating a binary pixel mask. A Convolutional Neural Network approach is used for extracting the remaining portion of the cheque leaf. The mask is applied to the cheque image to extract the account number, date and amount fields. The output image is then converted into greyscale, and pre-processing is applied to smooth the image. Each segmented date, amount and account number image is further segmented into individual digits using connected component analysis (CCA). The projection profile method works well if the spacing between lines is proper. If the spacing is uneven, segmentation using the projection profile method fails. Therefore, CCA is used to extract the characters/digits properly. However, if the digits are touching each other, CCA extracts the whole connected component as one component.

B. Training and Testing using Convolutional Neural Network (CNN):
Once the digits are extracted and segmented into individual digits, they are fed to a trained CNN classifier to recognize the digits. To train the CNN classifier, a systematic approach is followed. To recognize the digits extracted from the cheque images, we first need to train the classifier. A CNN is used as the classifier, which takes sample digit images and learns from them. Once the classifier is trained, testing can be done. The image size and channels are specified for the input layer. The size and channel considered for experimentation are 28 × 28 × 1. The first two parameters represent rows and columns, and the third represents the channel: 1 for greyscale and 3 for a colour image. Filter size and number of filters are specified in the convolutional layer; the first parameter is the filter size. The second parameter is the number of filters, which is the number of neurons that connect to the same region of the input. This argument decides the number of feature maps. Padding is usually specified as a key-value pair of a padding string and value, to add padding to the input feature map. CNN layers are followed by a down-sampling operation that reduces the spatial size of the feature map and removes redundant spatial information. We have used 2 × 2 max pooling for down-sampling. The fully connected layer has 10 outputs, corresponding to the 10 digits. The next layer in the CNN is softmax, which uses an activation function to normalize the output of the fully connected layer; the output of this layer is positive numbers that sum to one.

C. Post-processing
Post-processing deals with combining the individual digits into a complete number for account number, date and amount. Each field contains the collection of digits recognized by the classifier, stored in the form of a string. Using a simple mathematical formula which is used to reverse the number, the complete number is constructed.

IV. Literature Review:
The paper [1] presents a method for signature verification which is based on Perception and Probability. This means the system first roughly determines to which class a signature belongs and then finally decides whether the signature can be accepted or not. Perception presents the class to which a signature “possibly” belongs, and the pattern classification based on state transition determines whether it belongs to that class at all. In addition, it defines an accurate closeness function. The authors propose a system which combines the spatial features of sum graph and HMM and classifies them separately by a PNN knowledge-based classifier.

The paper [2] notes that the cheque is a payment instrument that requires high-cost processing in banks because it involves significant manual work. The cheque remains an important non-cash payment instrument, even though Bank Negara Malaysia has imposed a processing fee of RM0.50 per cheque since 2015. The authors propose a digit recognizer so that manual input of the payee’s account number and cheque amount by the customer is no longer needed, simplifying the process at the cheque deposit machine.

The method proposed in [3] verifies a cheque by identifying and examining the account holder’s signature. The signature extraction goes through image acquisition, gray scale image translation and binary image extraction, after which the signature is localized and segmented. The implementation involves image extraction; the extracted image is divided into characters, which are then localized. The localized data is compared with the data already collected in the given database. This method is implemented in offline mode, thereby allowing portability. This paper gives
effective sign algorithm and also provides security by manual checking.

The system proposed in [4] detects handwritten digits from a scanned input image using a neural network technique. This methodology of handwriting recognition is effective and fast compared to the earlier image pixel comparison methodology, which is comparatively very slow. In the initial phase, handwriting samples are collected from different people using a form designed for handwritten digit input.

The paper [5] tackles the problem of general, unconstrained text recognition. The authors present a novel, data and computation efficient neural network architecture that can be trained end-to-end on variable-sized images using variable-sized line level transcriptions. They conducted an extensive set of experiments on seven public benchmark datasets covering a wide range of text recognition sub-tasks, and demonstrated state-of-the-art performance on each one of them using the same architecture and with minimal change in hyper-parameters.

The paper [6] discusses the important results reported so far in preprocessing, extraction, recognition and verification of handwritten fields on bank cheques and highlights the positive directions of research to date. The paper has a comprehensive bibliography of many references as a support for researchers working in the field of automatic bank cheque processing.

The paper [7] reports the design and development of a system for the courtesy amount and date of Malaysian bank cheques. The system successfully implemented the detection and extraction module, but the recognition results were not very satisfactory. Possible causes of failure are discussed to point out improvements that can be made and pitfalls that should be avoided in future work.

The paper [8] presents some novel ideas underlying a cheque reading system developed at the authors' institution. They focused on the reading of legal amounts and post-processing of the recognition results. For the legal amount recognition, an approach based on Hidden Markov Models was proposed. The HMM (Hidden Markov Model) turned out to be very useful because no segmentation of the legal amount is required, neither into characters nor into literal words.

V. METHODOLOGY
The image acquisition of a bank cheque is crucial for the CTS (Cheque Truncation System). Generally, flatbed scanners are used to acquire such images. Due to orientation differences and irregularities present in the scanned images, the acquired images cannot be used directly for image processing operations and require a pre-processing step.

Image pre-processing: Image preprocessing is a technique applied to scanned cheque images. Since a scanned image obtained from the scanner cannot be used directly, it needs pre-processing, which involves two primary operations, i.e., rotation and removal of unnecessary background information. In the first step, the scanned image is rotated with respect to the ‘Date Box’ (a common feature present at the same position on every bank cheque), and then the background noise and extra information are removed. The efficiency of the parameter identification improves considerably with the removal of extra background information.

Rotation: As the scanned images may vary in orientation, we used the date box present in all standard cheque leaflets and utilized the relatively invariant nature of its position. To rotate the image, the point of rotation and the degree of rotation must be determined. The primary component essential for rotation was contour extraction, which let us determine the position values of the date box; we used the date box as our anchor for any operation related to length mapping. We used the midpoint of the image as the rotation point, and the date box to determine the angle necessary for the rotation.

Removing background noise: An original image contains a lot of extra information which is to be removed. For removing background information, the date box present in the standard cheque template is used.

A. Segmentation of Cheque
To use only the desired part of the image for various operations, it is a good idea to perform image segmentation. For this, the image is separated into each pattern locally in the optimum way. Thus, we have performed segmentation to ensure that the tool uses only the required information of the image for its processing. As illustrated in figure 1, we have separated each key parameter of the cheque leaflet using segmentation for correct identification and verification. Besides the contour extraction used to determine the position values of the date box, as explained in the previous section, the regions of interest (ROI) are identified on the standard dimensions of the cheque leaf. To remove the extra information from the desired region, we have created a standard template using pixel values for bank cheques. After the segmentation of the bank cheque image, we have separated each segment of key parameters for OCR to determine the patterns for verification.

Figure.1 Segmentation of cheque

B. Handwritten Text extraction from Cheque
Handwritten text extraction has been a challenging problem in Deep Learning and Natural Language Processing. Handwritten text can be recognized using various Machine Learning algorithms, including Convolutional Neural Networks and Support Vector Machines, where we extract features and then work upon them. We have used Neural Networks for this
application. We extracted the sentences using OpenCV to crop the handwritten text from certain parts of the cheque, and then split the sentences into words based on the spaces between them. Each word is then passed into the model, which comprises 5 layers of Neural Network followed by 3 layers of Recurrent Neural Network, and the digital text is eventually retrieved using Connectionist Temporal Classification.
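
The final decoding step can be illustrated with a minimal sketch of greedy (best-path) CTC decoding, assuming the recurrent layers emit one score vector per time step. The function name `ctc_greedy_decode`, the convention that class 0 is the CTC blank, and the toy alphabet are assumptions made for illustration only; the system described here may use a different decoder, such as beam search.

```python
import numpy as np

BLANK = 0  # assumed convention: class index 0 is the CTC blank symbol


def ctc_greedy_decode(scores, alphabet):
    """Decode a (time_steps, num_classes) score matrix into text.

    Class k (k >= 1) is assumed to map to alphabet[k - 1].
    Greedy CTC decoding takes the best class at each time step,
    merges consecutive repeats, then drops blanks.
    """
    best_path = np.argmax(scores, axis=1)
    chars = []
    previous = None
    for index in best_path:
        # Emit a character only when the class changes and is not blank.
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


# Toy example: one-hot scores for the path [a, a, blank, a, b, b, blank].
toy_path = [1, 1, 0, 1, 2, 2, 0]
toy_scores = np.eye(3)[toy_path]
decoded = ctc_greedy_decode(toy_scores, "ab")
```

The blank between the two runs of `a` is what lets the decoder keep a genuinely repeated letter, while the repeated frames inside each run are merged into a single character.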
C. Signature feature extraction and verification
In the system design shown in figure 2, features are extracted from the signature and then compared with the stored features to verify whether the signature belongs to the concerned person. The system is used to prevent counterfeit incidents in banking institutions. Features are extracted using the PCA (Principal Component Analysis) method. The extracted features are then matched with those stored in the database. If the signature is verified, execution proceeds; otherwise execution stops.
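
As a rough illustration of PCA-based matching, the sketch below assumes that signatures are flattened grayscale images of equal size and that matching is a Euclidean distance threshold in the PCA feature space. The helper names (`fit_pca`, `project`, `is_match`), the number of components and the threshold value are hypothetical; the paper does not specify them.

```python
import numpy as np


def fit_pca(samples, n_components):
    """Learn a PCA basis from (n_samples, n_pixels) flattened signatures."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are orthonormal principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(image_vec, mean, components):
    """Map one flattened signature to its PCA feature vector."""
    return components @ (image_vec - mean)


def is_match(candidate, reference, threshold):
    """Accept when the feature vectors are within the distance threshold."""
    return float(np.linalg.norm(candidate - reference)) <= threshold


# Toy data standing in for enrolled signature images (assumed 4x4 pixels).
rng = np.random.default_rng(0)
enrolled = rng.normal(size=(6, 16))
mean, components = fit_pca(enrolled, n_components=3)

stored_features = project(enrolled[0], mean, components)
same_signature = project(enrolled[0], mean, components)
# A signature shifted strongly along the first principal direction.
altered_signature = project(enrolled[0] + 10.0 * components[0], mean, components)

accepted = is_match(same_signature, stored_features, threshold=1.0)
rejected = not is_match(altered_signature, stored_features, threshold=1.0)
```

In practice the threshold would have to be tuned on genuine and forged samples, since it trades false rejections against false acceptances.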
Figure. 2 signature feature extraction

D. CNN model for amount identification
After extracting the image segments, a CNN model is used for handwritten numeric digit recognition of the courtesy amount as well as to convert the legal amount into a string. We have used the Deep Learning Toolbox (a MATLAB toolbox) for the CNN implementation, with two convolution layers of six and twelve filters as shown in figure 3. In the max pooling process, we have used a 2x2 window with a stride of one, over one thousand epochs. After that, the system takes the numeric output for the courtesy amount of the bank cheque and converts it into a string to compare with the legal amount string.

Figure 3 CNN architecture for handwritten digit recognition of courtesy amount

VI. SYSTEM DESIGN
As illustrated in figure 4, the cheque is fed to the system, which performs OCR at the first stage on the whole data as a pre-processing step. Currently we have made templates for each bank, such as HDFC, SBI, PNB, OBC and BOI, so that data can be extracted using OpenCV to crop out the required parts effectively and efficiently. After getting the results, we perform string matching and manipulation on the extracted result to select the specific bank template.
System design is the process of defining the architecture, components, modules, interfaces, and data for a system to satisfy specified requirements. Systems design could be seen as the application of systems theory to product development. Object-oriented analysis and design methods are becoming the most widely used methods for computer systems design. Systems design is therefore the process of defining and developing systems to satisfy specified requirements of the user. The UML has become the standard language in object oriented analysis and
design.

Fig.4 Architecture design

Image Acquisition: This step deals with acquiring the video using any video capturing device such as a Handy-cam, mobile camera, USB camera, etc.
Preprocessing: Pre-processing is applied to the image to reduce noise. Common pre-processing methods include Smooth, Dilate, Erode, Median, Open and Close.
Handwriting Recognition: After preprocessing, handwriting recognition is used to find the amount written on the cheque. It recognizes the digits and stores them in an array, which we verify against the database to obtain the account balance and process the transaction.
Signature Recognition: This is the main step in the processing system. Signature verification is a technique used by banks, intelligence agencies and high-profile institutions to validate the identity of an individual. Signature verification is often used to compare signatures in bank offices and other branch capture. An image of a signature or a direct signature is fed into the signature verification software and compared to the signature image on file. This step is important in the processing of a cheque.
Post processing: Post-processing deals with combining the individual digits into a complete number for account number, date and amount. Each field contains the collection of digits recognized by the classifier, stored in the form of a string. Using a simple mathematical formula which is used to reverse the number, the complete number is constructed.

VII. RESULT ANALYSIS
The system has been tested following the procedure described above, in terms of performance and accuracy, using MATLAB software with the Deep Learning Toolbox and Image Processing Toolbox. We used 200 bank cheque images in our training. Many of these images were from the IDRBT bank cheque dataset and a few cheque images were scanned manually. Training and testing of the developed system were conducted with these cheque leaflets, using the various key parameter segments obtained from each cheque leaflet. For handwritten digit recognition using CNN, the system achieved an accuracy with a mean value of 95.14%. Similarly, the learning progress of the training networks for character recognition was observed over different epochs and iterations. The experimental results are presented below. After extracting the numbers from the courtesy amount images, the numbers were converted into words using the IPV system. Once we acquired the result in words for the courtesy amount, we compared it with the legal amount using the algorithm mentioned above, and if the courtesy amount and legal amount strings matched, the cheque proceeded to signature verification.

Fig 5. Segmented regions of SBI bank cheque: a Original scanned cheque, b cropped cheque leaflet, c gray-scale cheque image, d IFS code, e account number, f cheque number, g legal amount, h courtesy amount, and i signature

VIII. CONCLUSION
We have developed a model to verify bank cheques using OCR, CNN, SIFT and SVM. We have used the OCR method to identify the machine-printed characters with desirable accuracy and efficiency, whereas we have used a CNN to give precise output for the handwritten digits written on the cheque leaflet. We have proposed and implemented an algorithm to convert numbers into words to verify the cheque, since a mismatch is one of the major reasons a cheque bounces and the monetary transaction halts. To achieve this, we applied the OCR technique to recognize the machine-printed digits and achieved 95% accurate matching. Then, a separate database is used to train the network and, after achieving a desired level of accuracy in training, we used different data sets to test the trained model for matching the numbers to the words. We achieved an accuracy of 99.14% for the digit recognition, which is an improvement over the previous accuracy of 99.05% for
the CNN used for digit recognition. Similarly, for character recognition using CNN, we were able to achieve an accuracy of up to 99.94%. For signature recognition, SIFT and SVM classifier models are used to determine the relevancy for verification of the signature, and achieved 98.1% accuracy.

IX. REFERENCES
[1] Sook Chin Chiew, Xin Yuan Law, Ren Zhang Tan, XinYing Chew, Khai Wah Khaw, “Digital Recognition by Deep Learning Techniques: A Proposed Digit Recognizer to Automate Cheque Deposition”, In Amity Journal of Computational Sciences (AJCS), 2019
[2] Ganga Holi and Divya K. Jain, “Convolutional Neural Network Approach for Extraction and Recognition of Digits from Bank Cheque Images”, In International Conference on Intelligent Sustainable Systems, 2019
[3] Mukesh Jha, Madhur Kabra, Sahil Jobanputra, and Prof. Rupali Sawant, “Automation of Cheque Transaction using Deep Learning and Optical Character Recognition”, In Second International Conference on Smart Systems and Inventive Technology (ICSSIT 2019)
[4] Saleem Ulla Shariff, Maheboob Hussain, Mohammed Farhaan Shariff, “Automated bank cheque verification using image processing and deep learning methods”, Springer Science+Business Media, LLC, part of Springer Nature, 2020
[5] Victor Carbune, Pedro Gonnet, Thomas Deselaers, Henry A. Rowley, Alexander Daryin, Marcos Calvo, Li-Lun Wang, Daniel Keysers, Sandro Feuz, Philippe Gervais, “Fast multi-language LSTM-based online handwriting recognition”, International Journal on Document Analysis and Recognition (IJDAR) (2020) 23:89–102
[6] Girish C. J, Mrs. Geetha G. P, “Design of Bank Cheque Validation System”, International Journal of Engineering Research & Technology (IJERT) 05, May-2015
[7] Mohit Mehta, Member, IACSIT, Rupesh Sanchati and Ajay Marchya, “Automatic Cheque Processing System”, International Journal of Computer and Electrical Engineering, 2018
[8] Sebastian Salazar-Colores, Eduardo Cabal-Yepez, “A Fast Image Dehazing Algorithm Using Morphological Reconstruction”, IEEE Transactions on Image Processing, Vol. 28, No. 5, May 2019
[9] Beresneva A, Epishkina A, Shingalova D, “Handwritten signature attributes for its verification”, In IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), pp 1477–1480, 2018