
DigiDoc

Transforming Handwritten Prescriptions into Digital Clarity


DigiDoc: Digital Prescription Solution

Introduction
Understanding a doctor's handwriting is no easy task. When important documents such as medical prescriptions are
interpreted wrongly, the consequences can be fatal: the FDA has estimated that around 7,000 lives have been lost due
to misinterpretation of doctors' handwriting. DigiDoc is a terminal/web based application that digitizes a doctor's
prescription and produces an image/PDF file with clear fonts for better readability.

Contents of this file


● Installation and Quick Start
● User Manual
● Underlying Algorithms
● Code Architecture
● Testing

Installation and Quick Start


Python version 2.7

Dependencies installation
python get-pip.py
apt-get -y install python-opencv
virtualenv venv
source venv/bin/activate
pip install matplotlib
pip install numpy
pip install pyimagesearch
pip install reportlab
apt-get install git
git clone https://github.com/tmbdev/ocropy.git
cd ocropy
python setup.py install

The entire application comprises two subparts:


● User Interface - PHP, JavaScript, CSS
● Python Scripts - preprocessing, segmentation, detection and prediction

User Interface
The index.php file present inside the UI folder runs on a local server (localhost). An image has to be uploaded
in jpg/jpeg format.
The final processed image can be downloaded in text/pdf format once the processing is done.
A link to the history is provided on the main page, and the various prescriptions (sorted date wise) can be
downloaded in text/pdf formats.
The history can also be accessed directly by running history/history.php on a local server.

Directories present in UI
● uploads - contains all the uploaded images.
● edited_image - contains the final processed image.
● text_files - contains the prescriptions in text format.

Python Scripts
The smart_ocr.py script present in the folder runs automatically once the image has been uploaded.
It runs in the background via the command line, launched by the upload.php script in the UI folder. smart_ocr.py imports
all the other Python scripts related to scanning, perspective orientation, etc.
Underlying Algorithms
1. Image Reorientation
In order to get good results on segmentation and then character recognition, we first
preprocess the image. The goal is to remove the background from the raw image and make it
sharper: remove the background noise and then generate a "Bird's Eye View" of the image. This
is done in 3 steps.

a. Detect the edges.


i. Image converted to Grayscale
ii. Used Gaussian Blur to remove the noise
iii. Perform Canny Edge Detection
b. Use the edges to get the contour of the image
i. Find contours in the edged image
ii. Assumption - the largest contour in the image with exactly 4 points is the desired document
iii. Sort the contours by area and take the largest one
iv. If the selected contour has four points, then we have found the document.

c. Apply the perspective transform to get the top down view of the document present in the
image.
i. Used the four point transform function from pyimagesearch library
ii. Warped image is converted to grayscale
iii. Adaptive thresholding is applied to give it a scanned look, which increases the
sharpness of the image.
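
Below is a minimal sketch of this reorientation pipeline, assuming OpenCV 2.x (the apt python-opencv package) and the four point transform helper from pyimagesearch mentioned above. The import path, threshold values and fallback behaviour are illustrative assumptions; smart_ocr.py may differ.

import cv2
# four_point_transform is the pyimagesearch helper referenced above;
# its exact import path here is an assumption.
from pyimagesearch.transform import four_point_transform

def reorient(path):
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)              # a.ii remove noise
    edged = cv2.Canny(blurred, 75, 200)                       # a.iii detect edges

    # b. find contours, sort by area, keep the largest one with exactly 4 points
    contours, _ = cv2.findContours(edged.copy(), cv2.RETR_LIST,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contours = sorted(contours, key=cv2.contourArea, reverse=True)
    page = None
    for c in contours:
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        if len(approx) == 4:                                   # found the document
            page = approx.reshape(4, 2)
            break
    if page is None:
        return gray                                            # fall back to the raw grayscale

    # c. top-down view plus adaptive thresholding for a "scanned" look
    warped = four_point_transform(gray, page)
    scanned = cv2.adaptiveThreshold(warped, 255,
                                    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY, 21, 10)
    return scanned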
2. Segmentation
The next major task is to extract the text segments from the image. For this we rely on the
fact that the density of pixels near the text areas is much higher than on the rest of the page. So
rather than using any readymade tool, we designed our own 'histogram' algorithm, which is highly
tunable depending on the image.

The principle of the histogram is as follows -


● Add all the pixel values in a particular row (or column) and repeat this for all rows (or
columns)
● The rows (or columns) with text will show high peaks of pixel values, as shown in the image
below.
● Next we narrow the area down to the rows (or columns) with high values, where we estimate the
text to be, and then apply the histogram again along the other axis.
Fig: Histogram to detect text

The image above shows a graph, which is even harder to handle than regular text on paper
because the graph itself has a high density of random lines. We first added all pixels in each
column and plotted the histogram at the bottom. Then we narrowed the area down to the regions with
high peaks, eliminating narrow peaks. For the selected area we then added all pixels in each row
(over the selected region), giving us the complete coordinates of all text lines. Even in
such a noisy image we could separate out the text, which serves as a proof of concept for our algorithm.
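
A minimal sketch of this row/column projection idea, assuming the page has already been binarized so that text pixels are dark on a light background; the threshold and minimum-run values are illustrative knobs, not the project's actual settings.

import numpy as np

def projection(binary, axis):
    """Pixel-density profile: axis=1 gives one value per row, axis=0 one per column."""
    return (255 - binary).sum(axis=axis)

def dense_runs(profile, min_ratio=0.05, min_len=3):
    """Return (start, end) index ranges whose density exceeds a tunable threshold.

    min_ratio and min_len are the 'tunable capacity' knobs mentioned above:
    weak or narrow peaks are discarded.
    """
    thresh = min_ratio * profile.max()
    mask = profile > thresh
    runs, start = [], None
    for i, on in enumerate(mask):
        if on and start is None:
            start = i
        elif not on and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(mask) - start >= min_len:
        runs.append((start, len(mask)))
    return runs

# Usage: rows first, then columns within each selected row band.
# row_bands = dense_runs(projection(scanned, axis=1))
# for r0, r1 in row_bands:
#     col_bands = dense_runs(projection(scanned[r0:r1], axis=0))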

3. Text and Word Segmentation


When we apply this to a scanned prescription, some regions of the paper may be colored, which
increases the average pixel density and makes extraction difficult. So we separate out areas based
on the background color of the page and apply normalization individually (based on the top x-percentile).

Fig: Left side - before normalization, Right side - after partitioning and individual normalization

As shown in the picture above, we separate the grey colored parts from the non-colored parts and apply
normalization individually.
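
A sketch of this per-region normalization, assuming each region is a grayscale array; the 95th percentile is an illustrative choice for the "top x-percentile", not the value the project uses.

import numpy as np

def normalize_region(region, percentile=95):
    """Rescale a grayscale region so its bright background maps back to white.

    Colored regions have darker backgrounds; scaling by the top percentile
    brings them in line with the uncolored parts of the page.
    """
    region = region.astype(np.float32)
    top = np.percentile(region, percentile)
    if top <= 0:
        return region.astype(np.uint8)
    scaled = np.clip(region * (255.0 / top), 0, 255)
    return scaled.astype(np.uint8)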

3.1. Stage 1 : Finding basic text regions in page


Next we separate out the lines of text based on histograms.

Fig:

histograms along rows and corresponding generated lines


Now we select the areas containing this text and apply the histogram along columns. This gives
us the complete broad text boxes.

Fig: Sample text region

Fig: Histogram to find text region specifically

Output of the first stage:-


Fig: original grayscale image Fig: 1st stage output

3.2. Stage 2: Separate Lines within text region

Next we find lines within these text boxes, i.e. we check whether there are multiple
lines within the same text box. So we normalize and apply the histogram within each text box.

Fig: Sample text region with multiple lines Fig : segmented lines
Output of Stage 2:-
Fig: Stage 1 output Fig: stage 2 output (with segmented sublines)

3.3 Stage 3 (Final segmentation): Finding individual words in page

Now that we have the individual lines precisely, we extract individual words. We use
histograms to break the text lines at the most logical points to extract words (a sketch follows the
figures below). An example is shown in 3.4 (proof of concept).
Fig:

Stage 2 output with line accuracy Fig: stage 3 - Final words segmented

Note - The random boxes will be eliminated by OCR
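
A sketch of this stage-3 word split, reusing the projection and dense_runs helpers from the earlier histogram sketch; the gap threshold is an illustrative assumption.

def words_in_line(binary_line, min_gap=8):
    """Split one segmented text line into word boxes.

    A word boundary is declared wherever the column histogram stays near zero
    for at least min_gap consecutive columns (the 'most logical point' above).
    """
    profile = projection(binary_line, axis=0)
    bands = dense_runs(profile, min_ratio=0.02, min_len=2)
    words, current = [], None
    for start, end in bands:
        if current is None:
            current = [start, end]
        elif start - current[1] < min_gap:      # small gap: same word
            current[1] = end
        else:                                   # large gap: new word
            words.append(tuple(current))
            current = [start, end]
    if current is not None:
        words.append(tuple(current))
    return words                                # list of (col_start, col_end)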

3.4. Proof of Concept - Robustness of Our Model


Doctors save our lives regardless of circumstance, even when we suffer due to our own faults. So
the least we can do for them is to forgive their handwriting faults … and hence make it our
responsibility to adjust to their handwriting.

1. We know doctors generally write with improper alignment (slanted lines), so we separate lines
regardless of it.

Fig: Despite improper line alignment, the histograms are able to separate the two lines.

2. We know doctors love to scribble with intermixed words, so we separate out logical words
regardless of it … separating the inseparable …

Fig: Regardless of the level of scribbling, we separate out each word

3. And finally, we go beyond the page color … we go beyond language …

Fig: Regardless of page color, our normalization works. Our algorithm is unaffected by language

So our algorithm is highly robust.


4. Information Extraction
Now we have the coordinates of each box. We pass each word to our OCR (explained in a later section) and
store the OCR result along with its coordinates. We then use our knowledge of the prescription format to
make sense of the OCR data, as shown in the image.
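
A sketch of how the layout assumptions listed at the end of this document could map a recognized word box to a field; the field names and the exact thresholds are illustrative, not the project's actual data structures.

def classify_box(x, y, page_width, page_height):
    """Assign an OCR'd word to a prescription field using the fixed layout.

    Per the assumptions: remarks/advice sit in the top 30% of the page; in the
    remaining 70%, the left 70% holds medicine names and the right 30% doses.
    """
    if y < 0.30 * page_height:
        return "remark"
    if x < 0.70 * page_width:
        return "medicine_name"
    return "dose"

# results = [(text, classify_box(x, y, W, H)) for (text, x, y) in ocr_words]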

5. Offline Handwriting Recognition


This is the bottleneck in the pipeline, as the accuracy of prediction is contingent on the quality of
handwriting recognition, and the state of the art in handwriting recognition still struggles with handwriting
that is even slightly complex.

For this project, we used ocropy (https://github.com/tmbdev/ocropy), an open-source project built on the idea of
LSTMs for handwriting recognition.

We realised that for the recognition to be effective, the LSTM should be trained on the handwriting
of a particular doctor and then used to consistently digitize that doctor's handwriting.
Fig: Simple RNN

Fig: LSTM chain

These were the steps involved in Optical Character Recognition:

1. We first train a model on the handwriting of the doctor, using an image of annotated text
written by the doctor. Sample image we used:
2. Then we save the model and use it for OCR on the images passed from the segmentation
algorithm.
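
A sketch of how these two steps can be driven from Python with ocropy's command-line tools; the flags, file layout and model filename below are assumptions based on ocropy's typical usage, and the project's wrapper may invoke them differently.

import glob
import subprocess

# 1. Train an LSTM model on line images that have matching .gt.txt ground-truth
#    files (paths and the output model name are illustrative).
train_lines = glob.glob("training_lines/*.bin.png")
subprocess.check_call(["ocropus-rtrain", "-o", "doctor_model"] + train_lines)

# 2. Use the saved model to recognize the word/line images produced by segmentation.
#    ocropy writes periodic checkpoints; the exact checkpoint name will vary.
segments = glob.glob("segments/*.bin.png")
subprocess.check_call(["ocropus-rpred", "-m", "doctor_model.pyrnn.gz"] + segments)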

Example segment sent for extraction:

Result extracted:

iike he mahbe hea

6. Medicine name matching


We extracted medicine names, diseases and symptoms from online websites and built a
corpus against which the OCR output is matched to obtain the final prediction.
The matching algorithm uses the following steps (see the sketch below):
● Process the text output of the OCR to remove noise characters.
● Generate candidate sequences using consecutive n-grams from the output.
● We use up to 4-grams for better accuracy.
● For each n in the n-grams, apply fuzzy pattern matching against the entire
corpus.
● After each iteration, sort the corpus by the pattern-matching score,
then discard the bottom 50% of the corpus.
● This is repeated up to 4-grams; at the end, the word with the highest score is returned.

When tested on our corpus of ~14000 words, we can accurately match the medicine
provided the input captures at least 2-3 bigrams.
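
A minimal sketch of this matching loop, assuming a plain Python list as the corpus and difflib's similarity ratio as the fuzzy score in place of whichever matcher the project actually uses.

import difflib
import re

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def match_medicine(ocr_text, corpus):
    """Return the corpus entry that best matches the noisy OCR output."""
    tokens = re.sub(r"[^a-z ]", "", ocr_text.lower()).split()   # strip noise characters
    candidates = {entry: 0.0 for entry in corpus}
    for n in range(1, 5):                                        # up to 4-grams
        for gram in ngrams(tokens, n):
            for entry in candidates:
                score = difflib.SequenceMatcher(None, gram, entry.lower()).ratio()
                candidates[entry] = max(candidates[entry], score)
        # keep only the top half of the corpus after each n-gram pass
        ranked = sorted(candidates, key=candidates.get, reverse=True)
        candidates = {e: candidates[e] for e in ranked[: max(1, len(ranked) // 2)]}
    return max(candidates, key=candidates.get)

# Example: match_medicine("amoxcilin 500", ["Amoxicillin", "Paracetamol", "Ibuprofen"])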

Flowchart of the Application


User Interface
The entire application takes the form of a website. The input format is an image file (JPG/JPEG) and the
output is produced as PDF and text. The same page allows the user to visit the history of previously processed
images.

Once the image is uploaded and sent for further processing, a progress bar is shown.
Once the preprocessing is done on the uploaded image (grayscale conversion, top-view transform), the main
algorithm, which comprises segmentation, OCR detection and prediction, runs on the preprocessed image. The
final page shows both the uploaded and the final processed image. It also offers the option to download the
file in pdf/text format.
Our application also provides access to the history section. The various prescriptions are sorted date wise, and
both the raw image and the final processed text/pdf file can be downloaded. One can also navigate back to the main
page of the application using the back button.
Assumptions:
Because doctors use a vast number of different layouts, we fix one layout, and our application is built
with that layout in mind.

Following are the assumptions for the input image.


● Image resolution should be equal to or greater than 1024*720 for higher accuracy.
● Clinic Name and Doctor’s details should be present in printed format along with some other details like
Patient name tag, age tag, sex tag.
● Remark and Advice should be present in top 30% of the page.
● Medicine details are present in the remaining 70% of the page. Left 70% contains medicine names and
right 30% contains medicine doses.
● The header contains the clinic's name.
● The footer has details like phone number and address.
● Both header and footer are in printed format.
● No diagrams/symbols should be present in the handwritten areas.
