DD CEN/TS 14567:2004
Postal services —
Automated processing
of mail items — Address
block locator
National foreword
This Draft for Development is the official English language version of
CEN/TS 14567:2004.
Comments arising from the use of this Draft for Development are requested so
that UK experience can be reported to the European organization responsible
for its conversion to a European standard. A review of this publication will be
initiated 2 years after its publication by the European organization so that a
decision can be taken on its status at the end of its 3-year life. Notification of
the start of the review period will be made in an announcement in the
appropriate issue of Update Standards.
According to the replies received by the end of the review period, the
responsible BSI Committee will decide whether to support the conversion into
a European Standard, to extend the life of the Technical Specification or to
withdraw it. Comments should be sent in writing to the Secretary of BSI
Technical Committee SVS/4, Postal services, at 389 Chiswick High Road,
London W4 4AL, giving the document reference and clause number and
proposing, where possible, an appropriate revision of the text.
Cross-references
The British Standards which implement international or European
publications referred to in this document may be found in the BSI Catalogue
under the section entitled “International Standards Correspondence Index”, or
by using the “Search” facility of the BSI Electronic Catalogue or of
British Standards Online.
This publication does not purport to include all the necessary provisions of a
contract. Users are responsible for its correct application.
Summary of pages
This document comprises a front cover, an inside front cover,
the CEN/TS title page, pages 2 to 41 and a back cover.
The BSI copyright notice displayed in this document indicates when the
document was last issued.
ICS 03.240
English version
This Technical Specification (CEN/TS) was approved by CEN on 3 February 2003 for provisional application.
The period of validity of this CEN/TS is limited initially to three years. After two years the members of CEN will be requested to submit their
comments, particularly on the question whether the CEN/TS can be converted into a European Standard.
CEN members are required to announce the existence of this CEN/TS in the same way as for an EN and to make the CEN/TS available
promptly at national level in an appropriate form. It is permissible to keep conflicting national standards in force (in parallel to the CEN/TS)
until the final decision about the possible conversion of the CEN/TS into an EN is reached.
CEN members are the national standards bodies of Austria, Belgium, Cyprus, Czech Republic, Denmark, Estonia, Finland, France,
Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Slovakia,
Slovenia, Spain, Sweden, Switzerland and United Kingdom.
© 2004 CEN All rights of exploitation in any form and by any means reserved worldwide for CEN national Members.
Ref. No. CEN/TS 14567:2004: E
CEN/TS 14567:2004 (E)
Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviations
5 Address block locator without information encodation
5.1 General
5.2 Size and layout
5.3 Placement
5.4 Print quality
6 Information-based address block locators
6.1 General
6.2 String of characters
6.3 Address block locators based on bar codes
6.4 Address block locators based on two-dimensional symbols
6.5 Information content
Annex A (informative) Possible algorithm for locating ABLs based on concentric squares
Annex B (informative) Possible algorithm for locating ABLs based on character strings
Bibliography
Foreword
This document (CEN/TS 14567:2004) has been prepared by Technical Committee CEN/TC 331, "Postal services",
the secretariat of which is held by NEN.
This document has been prepared under a mandate given to CEN by the European Commission and the European
Free Trade Association.
Annexes A and B are informative.
According to the CEN/CENELEC Internal Regulations, the national standards organizations of the following
countries are bound to announce this CEN Technical Specification: Austria, Belgium, Cyprus, Czech Republic,
Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Latvia, Lithuania,
Luxembourg, Malta, Netherlands, Norway, Poland, Portugal, Slovakia, Slovenia, Spain, Sweden, Switzerland and
United Kingdom.
Introduction
The reliability, speed and cost of mail processing and delivery are the most important aspects of the Quality of
Service which is requested by postal service users. Postal operators’ performance in these respects is highly
dependent upon the level of automation achieved in the mail sorting process.
The automatic reading of addresses is one of the techniques which help to speed up and reduce the costs of mail
sorting. The first step in address reading is the determination of the location of the address block. Address block
location is the process by which an address reading machine automatically locates one or more potential address
blocks within the electronic image of a postal item before trying to read them.
The reliability and ease of address block location directly affect the performance and cost of address reading
systems. However, address block location may be difficult, especially when addresses are not placed in a
pre-specified location and/or when they are printed on, or surrounded by, noisy backgrounds. Noise may consist of
text, pictures, logos, drawings, textures, and all sorts of patterns that can be mistaken for the relevant address
block. This difficulty is particularly obvious for plastic-wrapped items for which the address is printed on a label
which is affixed on a background (see Figure 1).
To overcome the difficulty posed by noisy backgrounds, address reading machines need to be able to filter out
non-address material in electronic images of postal items. Cost/performance trade-offs generally lead to address
reading machines which are not able to reliably locate addresses in all situations.
Noise is also detrimental to video-coding operations because it takes longer for human operators to find the
address in a cluttered display (ball-trap effect) than it would take for an address appearing over a homogeneous
background. Modern video-coding systems may therefore also be equipped with address block location modules in
order to facilitate the task of human operators and to fit more than one address onto a single display.
Multi-Line Optical Character Recognition (MLOCR) and video-coding systems are designed to locate address
blocks through their typical features, such as their location relative to the borders of the postal item, their alignment
and the number and syntax of lines. However, these features are not sufficient to achieve reliable location of
address blocks on all items.
One possible approach to resolution of this problem is to impose constraints on the physical placement of
addresses on postal items and on the appearance of the non-address zones of the item. However, this approach is
limited in practice because mailers require a considerable degree of freedom in the location of addresses and in
the visual appearance of postal items.
Address Block Locators (ABLs) provide an alternative solution. An address block locator is a specific feature or
mark, added to an item, which can be easily and reliably detected by image processing software and which is
unlikely to occur on an item, other than in association with an address block. Since an ABL can be easily detected,
placing one in the vicinity of an address block makes it possible to locate the block whatever its position and
background. The use of ABLs, particularly on items with a busy background, may improve automation system
performance, thereby allowing constraints on address presentation and position to be relaxed.
The European Committee for Standardization (CEN) draws attention to the fact that it is claimed that compliance
with this document may involve the use of a patent concerning ABL. CEN takes no position concerning the
evidence, validity and scope of this patent right. The French Post Office states that the ABL standard is in part
covered by a patent called «Marque de repérage et procédé de localisation d’une information par ajout de cette
marque», laid down in France on 10/03/1995 for LA POSTE, number 9502827, published on 13/09/1996, number
2 731 535 and delivered on 25/04/1997.
The French Post Office commits itself to grant any user of the ABL standard, a license for using this patent in the
countries where the patent has been laid down. To date, the French Post Office has the right to grant a license in
France only. This license will be negotiated in reasonable conditions.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights
other than those identified above. CEN shall not be held responsible for identifying any or all such patent rights.
be easily printable, using normally available equipment, by mail producers: in many cases address block
locators will have to be printed at the same time as the address. To limit costs and simplify operations, printing
should require no special or additional equipment beyond that already used for address printing;
be pre-printable: for addresses printed on labels, it should be possible to use labels on which an ABL has been
pre-printed. The ABL then assists in locating the label, which in turn carries the address;
allow relaxation of constraints on address presentation: machine readability of addresses imposes a number of
constraints on address presentation. Some of these constraints (e.g. fixing the address location on the postal
item face) are designed so that addresses can be located more easily. By easing address block location, ABLs
allow relaxation of such constraints on address presentation. For example, when using an address block
locator, it could be possible to place the address in any location on the postal item;
be small in size: for reasons of costs and, above all, of aesthetics and saving space, ABLs should be as small
as possible;
preferably be usable for information encoding: since ABLs necessarily use some space on the postal item, that
space can desirably also be utilised for encoding information (identifiers, routing codes, proof of payment,
non-delivery instructions, etc.) which may be useful to the mailer, the postal operator and/or the recipient;
give some freedom to mailers, enabling them to choose the most convenient locator (and information content)
for any given situation. This implies that there should be not just one ABL, but a small set of consistent and
compatibly designed ABLs;
not be subject to risk of improper use: ABL specifications should be easy for non-specialists to understand,
easy to implement without special equipment, and should guarantee proper use in most cases;
be distinctive: ABLs should have characteristics which make it unlikely that they will occur other than in
association with an address block and which make them different from marks defined by other standards or for
other purposes. The finding of an ABL should be a reliable indication that an address block is nearby;
be easily and reliably locatable, by image processing software, using tractable algorithms: it would in principle
be possible to design image processing software that would be able to locate almost any type of ABL.
However, to contain costs, it is important that detection systems can be based on simple algorithms which do
not require excessive computing power. Moreover, if an ABL is present, there should be a high probability
(95 % to 99 %) of its being detected. Similarly, there should be a low probability (maximum 5 %) of the
detection system falsely detecting an ABL where no ABL has been printed;
together with the algorithms used to locate them in images, be in the public domain: ABLs should be based on
patterns that have not been patented and that can be printed by mailers and used by postal operators and
their equipment suppliers without having to pay any fee;
be robust to skew: addresses are often skewed, particularly on flats, and it would be unrealistic to require
mailers to eliminate skew in the addresses they print. Also, mail sorting systems do not perfectly register items,
so an address may appear skewed to the image capture device, even when it is perfectly aligned on the item.
ABLs should therefore be detectable even when they are skewed within some acceptable limits;
support detection of address orientation: skewed addresses, especially addresses printed on labels, should
remain readable. For this, the orientation of the address must be determined and compensated for. It is
therefore desirable that ABLs support an easy computation of (probable) address skew;
be compatible with other marks: postal items may have other marks (e.g. for encoding of proof of payment
data, for encoding sender or addressee information, post code boxes, facing marks, etc.) in addition to an
ABL. There should be no conflict between address block locators and such other marks.
These requirements can best be met by a standard (range of) address block locators. This standard specifies such
a range of ABLs. It is structured under two main headings:
5. Address block locator without information encodation: specifies the characteristics of an address block locator
in the form of a special printed symbol;
6. Information-based address block locators: specifies the characteristics of address block locators in the form of
information-containing character strings, linear bar codes and two-dimensional symbols.
Annex A provides a possible algorithm for locating ABLs based on concentric squares; annex B provides one for
locating ABLs based on character strings. Both annexes are informative.
1 Scope
This Technical Specification defines a set of physical marks called Address Block Locators (ABLs). ABLs are
marks, printed in the vicinity of addresses on postal items, that are intended to facilitate automatic recognition of
address location and processing of the addresses on mail sorting and video-coding equipment.
The Technical Specification describes two families of ABLs which may be printed on all types of postal items,
including letters, flats and parcels.
In the first family, address block locators take the form of pictograms which bear no other information than being a
landmark for the address block. One such pictogram is defined herein for use in association with the delivery
address block. It may be printed at the same time as the address or pre-printed on an envelope, an insert, or a
label, with the address being printed, on the same physical support, at a later stage.
The second family covers address block locators which contain an encoded specification of the address block type
and location and which can also be used for encoding other data, not directly related to address block location.
Such data may include addressee or postal item identifiers, routing data, non-delivery instructions, a return address
and references or other data which are relevant for either the mailer or the addressee. It may also include address
checking data which may be used to verify correct interpretation of the printed address by the OCR system. In this
family, three types of ABL are defined: one based on a pattern of alphanumeric characters; one on a linear bar
code and one based on two-dimensional symbologies. These locators can be applied to the delivery address block
and to forwarding or return address blocks. They will normally be printed within the same process as the address
itself.
The Technical Specification is intended to be used by:
postal operators, in the specification of requirements for mail presentation and for the acquisition of mail
processing systems.
Adherence to the Technical Specification is voluntary. However:
mailers should be aware that adoption of those aspects of the Technical Specification which are supported by
their postal operator(s) should result in faster, more reliable, processing of their postal items, particularly where
the addresses on these are printed on, or surrounded by, busy backgrounds;
suppliers should be aware that support for the Technical Specification should enhance the performance of
their systems, increasing their attractiveness to postal operators;
postal operators should be aware that adoption of the Technical Specification, and promotion of its use by
mailers, should result in faster, more reliable and more efficient processing and a reduction in video-coding
volumes.
The set of marks defined as ABLs in this Technical Specification is not intended to be exhaustive. In particular,
individual postal operators may develop systems which use other marks, including facing marks and digital postage
marks, to facilitate address block location. Where this is the case, it is recommended that such other marks be
supported in addition to, rather than instead of, the ABLs defined herein.
2 Normative references
This Technical Specification incorporates by dated or undated reference, provisions from other publications. These
normative references are cited at the appropriate places in the text, and the publications are listed hereafter. For
dated references, subsequent amendments to or revisions of any of these publications apply to this Technical
Specification only when incorporated in it by amendment or revision. For undated references the latest edition of
the publication referred to applies (including amendments).
EN 13619:2002, Postal services - Mail item processing - Optical characteristics for processing letters.
EN ISO/IEC 15416, Information technology - Automatic identification and data capture techniques - Bar code print
quality test specification - Linear symbols (ISO/IEC 15416:2000).
prEN ISO/IEC 15417, Information technology - Automatic identification and data capture techniques - Bar code
symbology specification - Code 128 (ISO/IEC 15417:2000).
prEN ISO/IEC 15418 1), Information technology - EAN/UCC Application Identifiers and Fact Data Identifiers and
Maintenance (ISO/IEC 15418:1999).
prEN ISO/IEC 15434, Information technology - Transfer syntax for high capacity ADC media (ISO/IEC
15434:1999).
EN ISO/IEC 16022, Information technology - International symbology specification - Data matrix (ISO/IEC
16022:2000).
ISO/IEC 15438, Information technology - Automatic identification and data capture techniques - Bar code
symbology specifications - PDF417.
3.1
autodiscrimination
distinctive feature of certain symbologies which allows discrimination between them
3.2
bar code aspect
ratio of height to length of a bar code
3.3
code 128
bar code symbology defined in prEN ISO/IEC 15417
3.4
data element separator
special character used to separate multiple data constructs in prEN ISO/IEC 15434 formats which use
prEN ISO/IEC 15418 data identifiers to specify the content and structure of individual data constructs
1) prEN ISO/IEC 15418 relies on and cannot be used without reference to ANSI MH10.8.2 (see Bibliography).
3.5
data identifier
alphanumeric prefix to a data structure that defines the content, format and intended interpretation of the data
NOTE Data identifiers are specified in prEN ISO/IEC 15418 and ANSI MH10.8.2 [5].
3.6
Data Matrix
two-dimensional symbology specified in EN ISO/IEC 16022
3.7
format
part of the content of a prEN ISO/IEC 15434 compliant two-dimensional symbol which conforms to a specified
data formatting and identification standard
NOTE prEN ISO/IEC 15434 supports a variety of data formatting and identification standards, including
prEN ISO/IEC 15418, ISO 9735 [3], ISO/IEC 8824 [1] and 8825 [2]; the particular standard applied to a given format is specified
in the format indicator at the beginning of the format. Only prEN ISO/IEC 15418 formats are supported by this Technical
Specification.
3.8
format indicator
code, in a prEN ISO/IEC 15434 compliant two-dimensional symbol, which specifies the overall structure of a format
3.9
id-tag
machine-readable mark, placed on an individual postal item by a postal operator, which can be used for the
purpose of identifying the item so that, in subsequent processing, the item may be recognised and linked to the
associated computer-based information
NOTE It is important to note that an id-tag, unlike possible other forms of item identification, is optimised for postal
processing use. It is allocated by a postal operator and has no significance outside of the postal system.
3.10
narrow element
minimum width element (bar or space) in linear bar coding symbologies such as code 128
3.11
origin post
postal services provider which accepts an item for processing and delivery, and which has primary contractual
responsibility for ensuring that such delivery takes place in accordance with its published service standards
3.12
point
unit of measurement, often used to measure type size
NOTE The point is 1/72 of an inch, or approximately 0,35 mm.
3.13
self-discrimination
see autodiscrimination
3.14
symbology
mechanism whereby data values may be represented in the form of variations in optical or other characteristics of a
physical support that may be 'read' using a suitable detector
NOTE The set of alphanumeric printed characters is one example. Others include the patterns of bars defined in Code 39
and Code 128.
3.15
two-dimensional symbology
symbology in which the variations in optical or other characteristics which are used for representing data have
two-dimensional significance
NOTE The set of alphanumeric printed characters is one example. Others include Data Matrix and PDF417 symbologies.
Contrast these with the patterns of bars defined in Code 39 and Code 128, which have only one-dimensional significance.
3.16
x-dimension
see narrow element
5.3 Placement
The locator should be sited in a specific position relative to the address. The placement of the pictogram is shown
in Figure 4. Note that the leftmost characters of the address may fall within the clear zone defined in 5.2 and that
the ABL may partly fall within the clear zone, around the address, which is defined in section 4.1.2 of
EN 13619:2002.
a fixed-pitch font with a pitch of between 2,1 and 2,8 mm and a font size of between 10 and 12 points;
NOTE This should result in a character height of between 2 and 4 mm.
spacing, between the characters in each six character group, of between 15 % and 30 % of the pitch;
NOTE This results in each character group being between 11,97 mm (6*2,1 – 0,30*2,1) and 16,38 mm (6*2,8 – 0,15*2,8)
in length.
word spacing (between each group of six characters) of between 6 and 8 mm.
NOTE This results in four character groups occupying between 65,9 (4*11,97 + 3*6) and 89,5 (4*16,38 + 3*8) mm in
length. Similarly, five groups will occupy between 83,9 and 113,9 mm; six will occupy 101,8 to 138,3 mm.
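The arithmetic in the two NOTEs above can be reproduced with a short C sketch. It assumes, as the bracketed expressions in the NOTEs suggest, that a group spans six pitches less one inter-character gap, and that n groups are separated by n - 1 word spaces. The function names are illustrative only and are not part of this Technical Specification.

```c
#include <assert.h>

/* Length in mm of one six-character group: six pitches minus one
   inter-character gap, the gap being a fraction of the pitch
   (0.15 to 0.30 according to the text above). */
double group_length_mm(double pitch_mm, double gap_fraction)
{
    assert(pitch_mm > 0.0 && gap_fraction >= 0.0 && gap_fraction < 1.0);
    return 6.0 * pitch_mm - gap_fraction * pitch_mm;
}

/* Total length in mm of `groups` character groups separated by
   (groups - 1) word spaces of word_space_mm each. */
double abl_length_mm(int groups, double group_mm, double word_space_mm)
{
    assert(groups >= 1);
    return groups * group_mm + (groups - 1) * word_space_mm;
}
```

For instance, `abl_length_mm(4, group_length_mm(2.1, 0.30), 6.0)` gives the lower bound for four groups, and `abl_length_mm(6, group_length_mm(2.8, 0.15), 8.0)` the upper bound for six.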
2) Value A may be supported by some postal operators. In this case, the detected pattern should form the last line of an address-like block of
text.
6.3.3 Symbology
Bar codes which are used as address block locators shall be printed using:
any other symbology, supported for customer bar coding purposes by the origin post.
as part of the address block, in accordance with section 4.1.6 of EN 13619:2002, or
outside, to the left of and rotated 90 ° anticlockwise relative to the address block and separated from it by a
quiet zone of at least 5 mm in width.
Bar code printing should be compliant with the origin post’s published specifications regarding the customer bar
coding of postal items. Where no such specifications have been published, the following shall apply 3):
for bar codes printed as part of the address block, bar height shall be at least 3 mm and shall be comparable
with the height of capital letters in the OCR part of the address block; normal symbology rules regarding bar
code aspect may be dispensed with;
for bar codes printed outside of the address block, bar height shall be at least 5 mm; normal symbology rules
regarding bar code aspect shall be complied with;
the narrow element shall be an integral multiple of the width of the smallest element which can be printed by
the printer used to print the bar code; this multiple shall not be less than two and is recommended to be at
least three;
there shall be a quiet zone of at least 2 mm above and below the bar code and of at least 5 mm before and
after it;
3) These rules assume use of the default symbology, code 128. Other rules would apply in case of use of other symbologies.
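The default dimensional rules listed above lend themselves to a simple automated check. The following C sketch covers only the numeric limits stated in the text (bar height, narrow element multiple, quiet zones). It does not check bar code aspect or comparability with capital letter height. The structure and function names are hypothetical, and expressing the narrow element as a whole number of printer dots is an assumption made so that it is always an integral multiple of the smallest printable element.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a printed code 128 ABL (illustrative names). */
struct barcode_print {
    bool   inside_address_block;   /* true: printed as part of the address block */
    double bar_height_mm;
    int    narrow_element_dots;    /* narrow element width in printer dots */
    double quiet_above_below_mm;
    double quiet_before_after_mm;
};

/* Returns true if the numeric default limits quoted in the text are met. */
bool meets_default_print_rules(const struct barcode_print *b)
{
    assert(b != NULL);
    /* At least 3 mm inside the address block, at least 5 mm outside it. */
    double min_height_mm = b->inside_address_block ? 3.0 : 5.0;
    return b->bar_height_mm >= min_height_mm
        && b->narrow_element_dots >= 2       /* at least three is recommended */
        && b->quiet_above_below_mm >= 2.0
        && b->quiet_before_after_mm >= 5.0;
}
```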
and orientation of the address block, relative to the ABL. If this character is A, B or C, an address-like
block of text should be found above (A), below (B) or below and rotated 90 ° clockwise relative to (C) the
bar code.
use format indicator 06 for the first format contained within them;
use prEN ISO/IEC 15418 compliant data identifiers to specify the content and format of the data represented in
this first format;
shall include, within this first format, an OCR data locator which is compliant with the definition of
prEN ISO/IEC 15418 data identifier value 55U (see 6.5). Subject to overall symbol size and content limitations,
this format may include additional data constructs, separated from the OCR data locator by means of the data
element separator, character GS.
6.4.3 Symbology
Two-dimensional symbols which are used as address block locators shall be printed using:
4) UPU standard S28 also supports digital postage marks which are printed in the franking area of a postal item. These may
not be used as address block locators.
Since a postal item may carry multiple two-dimensional symbols which are not address block locators, the data
content shall be examined to determine whether a detected bar code serves as an ABL. Discrimination may be
achieved by using four features of the locator:
1) use of Data Matrix or PDF417 symbology;
2) conformity with prEN ISO/IEC 15434, with a first format having format indicator 06;
3) inclusion, in the first format, of an OCR data locator compliant with data identifier value 55U;
4) proximity to a block of text with address-like characteristics: the 5th character of the OCR data locator
specifies the position and orientation of the address block, relative to the two-dimensional symbol. If the
fifth character of a detected pattern is B, C or R, an address-like block of text should be found below
(B), below and rotated 90 degrees clockwise (C) or to the right (R) of the two-dimensional symbol.
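As an illustration of the fourth feature, the following C sketch reads the position code from the fifth character of a decoded locator string. It assumes that the string begins with the data identifier 55U, as the examples in 6.5 appear to do. The enumeration and function names are hypothetical. Value L is left unmapped because its meaning is not stated in this part of the text.

```c
#include <stddef.h>
#include <string.h>

/* Where the address-like block of text is expected, relative to the ABL. */
enum abl_position {
    ABL_POS_UNKNOWN = 0,
    ABL_POS_ABOVE,           /* code A */
    ABL_POS_BELOW,           /* code B */
    ABL_POS_BELOW_ROTATED,   /* code C: below, rotated 90 degrees clockwise */
    ABL_POS_RIGHT            /* code R */
};

/* Maps the fifth character of an OCR data locator (assumed to start
   with "55U") to an expected address block position. */
enum abl_position abl_position_from_locator(const char *locator)
{
    if (locator == NULL || strlen(locator) < 5 ||
        strncmp(locator, "55U", 3) != 0)
        return ABL_POS_UNKNOWN;

    switch (locator[4]) {
    case 'A': return ABL_POS_ABOVE;
    case 'B': return ABL_POS_BELOW;
    case 'C': return ABL_POS_BELOW_ROTATED;
    case 'R': return ABL_POS_RIGHT;
    default:  return ABL_POS_UNKNOWN;
    }
}
```

A reader would then search for address-like text only in the indicated region, and treat the symbol as a non-ABL mark if the result is `ABL_POS_UNKNOWN`.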
NOTE 2 In postal applications, only codes A, B, C and R are expected to be used; value L is defined for completeness.
Moreover, due to limitations implied by UPU standards, certain values are expected to be used only in association with
particular symbologies:
Character String B5
Two-dimensional symbol B, C, R
In character string-based ABLs which do not include supplementary data (see below), padding data comprises a
single U character followed by repetition of components 2 to 4, the whole being repeated sufficient times, with the
last repetition being truncated, to ensure that the resulting ABL is 24 characters in length.
EXAMPLE 55U1B$ $U1B$$ U1B$$U 1B$$U1
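The padding rule can be sketched in C as follows. The sketch assumes that component 1 is the three-character data identifier (55U) and that components 2 to 4 are the characters which follow it, which is how the example above reads once the group spacing is ignored. The function name, the buffer convention and the return codes are illustrative only.

```c
#include <assert.h>
#include <string.h>

#define ABL_LENGTH 24   /* four groups of six characters */

/* Pads `head` (components 1 to 4, e.g. "55U1B$$") to ABL_LENGTH characters
   by repeating 'U' followed by components 2 to 4, truncating the last
   repetition. `out` must hold ABL_LENGTH + 1 bytes.
   Returns 0 on success, -1 if `head` is too short or too long. */
int abl_pad(const char *head, char out[ABL_LENGTH + 1])
{
    assert(head != NULL && out != NULL);
    size_t head_len = strlen(head);
    if (head_len <= 3 || head_len > ABL_LENGTH)
        return -1;

    const char *tail = head + 3;             /* components 2 to 4 */
    size_t unit_len = 1 + (head_len - 3);    /* 'U' plus components 2 to 4 */

    memcpy(out, head, head_len);
    for (size_t i = head_len; i < ABL_LENGTH; i++) {
        size_t k = (i - head_len) % unit_len;
        out[i] = (k == 0) ? 'U' : tail[k - 1];
    }
    out[ABL_LENGTH] = '\0';
    return 0;
}
```

Called with `"55U1B$$"`, the function fills the buffer with the 24 characters of the example, which are then printed as four groups of six.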
NOTE 2 The value U can be used, as specified in 6.5.6, only in association with OCR encoding of the OCR data locator.
EXAMPLE 55U1B$$VJ4KBE432523097DIV1 is an ABL with associated organisation identifier: in fact, 'DIV1' in the
organisation with Belgian (BE) VAT number (J4K) 432523097. For use in a character string-based ABL, this would need to
be padded to form groups of six characters. It is already 26 characters, so the minimum possible padded length is 30, as
follows:
NOTE 3 In bar codes and two-dimensional symbols, additional data can alternatively be expressed as a completely
separate data construct, with its own data identifier. In a bar code, this would be separated from the ABL data value by a
plus (+) separator character; in a two-dimensional symbol, it would be separated from the ABL data value by the data
element separator, character GS.
Annex A
(informative)
Possible algorithm for locating ABLs based on concentric squares
The text below is a C language procedure which implements an algorithm that looks for a set of concentric squares
in a binary image. The image is represented as a series of unsigned char values, each representing a pixel (0 for
white pixels, 255 for black ones). The resolution of the image is assumed to be 8 pixels/mm. The memory zone that
stores the image is pointed to by the unsigned char pointer variable pimage. The x and y dimensions of the image
are given as the integer variables Xmax and Ymax.
The algorithm requires only one horizontal and one vertical scan of the image. Each scan is based on the parsing
of an image line or column with respect to a grammar implemented as a finite state automaton. Each state in the
automaton corresponds to a certain position in the locator (white space above locator, first traversal of the large
square, etc.).
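As an illustration of this parsing step, the following minimal sketch scans a single image line with a simplified automaton. It is not the procedure given later in this annex: the function name scan_line is hypothetical, and the three thresholds are placeholder values chosen for the example, since the values used by the original software are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder thresholds in pixels (assumptions made for this sketch). */
#define SK_WHITESPACE 8 /* minimum white space on each side of the locator */
#define SK_MINWIDTH   2 /* minimum run length inside the locator */
#define SK_MAXWIDTH   6 /* maximum run length inside the locator */

/* Scans one line of n pixels (0 = white, anything else = black).
 * State 0 is the leading white space, states 1 to 11 are the alternating
 * black/white runs across the three squares (odd states are black runs),
 * and state 12 is the trailing white space.
 * Returns the index of the first black pixel of the first pattern found,
 * or -1 if the line contains none. */
int scan_line(const unsigned char *line, size_t n)
{
    int state = 0;
    size_t length = 0, start = 0, i;

    assert(line != NULL);
    for (i = 0; i < n; i++) {
        int black = (line[i] != 0);
        int expect_black = (state % 2 == 1);

        if (black == expect_black) {
            length++;                      /* current run continues */
        } else if (state == 0) {
            if (length >= SK_WHITESPACE) { state = 1; start = i; length = 1; }
            else length = 0;               /* white space too short */
        } else if (state <= 11 && length >= SK_MINWIDTH && length <= SK_MAXWIDTH) {
            state++;                       /* run accepted, next run begins */
            length = 1;
        } else if (black && length >= SK_WHITESPACE) {
            state = 1; start = i; length = 1; /* rejected white run can open a new pattern */
        } else {
            state = 0;                     /* run rejected, restart */
            length = black ? 0 : 1;
        }
        if (state == 12 && length >= SK_WHITESPACE)
            return (int)start;
    }
    return -1;
}
```

The same function can be applied to a column by first copying the column into a temporary buffer. A full implementation would record every hit on the line rather than return at the first one.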
After the two scans have taken place, a certain number of candidate positions for locators are obtained in each
scan direction (hits). The candidate locators are situated in places where two possible locator hits are detected,
one in each scan direction, and where the hit centres are close enough.
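The combination of the two hit lists can be sketched as below. This is a simplified illustration under stated assumptions: the Hit structure, the function name match_hits and the tolerance parameter tol are hypothetical, and the sketch does not prevent one hit from contributing to several candidates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: centre of one hit, in pixels. */
typedef struct { int x, y; } Hit;

/* Pairs every vertical hit with every horizontal hit whose centre lies
 * within tol pixels on both axes. The midpoint of each accepted pair is
 * written to out (capacity max_out). Returns the number of candidates
 * stored. */
int match_hits(const Hit *v, int nv, const Hit *h, int nh,
               int tol, Hit *out, int max_out)
{
    int i, j, n = 0;

    assert(v != NULL && h != NULL && out != NULL);
    for (i = 0; i < nv; i++) {
        for (j = 0; j < nh; j++) {
            if (abs(v[i].x - h[j].x) <= tol &&
                abs(v[i].y - h[j].y) <= tol && n < max_out) {
                out[n].x = (v[i].x + h[j].x) / 2;
                out[n].y = (v[i].y + h[j].y) / 2;
                n++;
            }
        }
    }
    return n;
}
```

A suitable tolerance depends on the image resolution and on the size of the locator, so it is left as a parameter here.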
The figures below illustrate the algorithm on a typical image. Figure 1 shows the original image with two locators.
Figure 2 and Figure 3 show respectively the vertical and horizontal hits in the image. Finally, Figure 4 shows the
detected locators after filtering.
COPYRIGHT NOTICE: This software is provided by La Poste (France). It can be used for address block location
applications only. The software may be freely distributed and/or modified. A condition of such distribution is that this
copyright notice is always included in the comments heading the software. This software is licensed “as is” without
any other warranty of any kind. La Poste expressly disclaims all warranties or conditions including, but not limited
to, implied warranties or conditions of merchantability and fitness for particular purpose and those arising by statute
or otherwise in law or from course of dealing or usage of trade. In no event shall La Poste be liable for any direct,
indirect, special, incidental, consequential or other damages arising out of this software even if La Poste has been
advised of the possibility of such damages in advance.
#include <math.h>
void SearchForSquares(unsigned char *pimage, int Xmax, int Ymax)
{
    int i, j;
    int state, start, length;
    unsigned char *p, pixel;
    // Vertical scan
            pixel = *p;
                    state = 12; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 12: // State 12: white space below locator
                if (length == WHITESPACE) state = 13;
                else if (pixel == 0) length++;
                else { state = 0; length = 0; start = j+1; }
                break;
            case 13: // State 13: Locator detected in the column
                state = 0;
                length = 0;
                start = j+1;
                break;
            }
        }
    }
    // Horizontal scan
            pixel = *p;
            switch (state) {
            case 0: // State 0: white space left to locator
                if (length > WHITESPACE) { length = WHITESPACE; start++; }
                if (pixel == 0) length++;
                else if (length == WHITESPACE) { state = 1; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 1: // State 1: first traversal of a large black square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 2; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 2: // State 2: white space between large and medium square
                if (pixel == 0) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 3; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 3: // State 3: first traversal of a medium black square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 4; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 4: // State 4: white space between medium and small square
                if (pixel == 0) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 5; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 5: // State 5: first traversal of a small square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 6; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 6: // State 6: white space inside small square
                if (pixel == 0) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 7; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 7: // State 7: second traversal of a small square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 8; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 8: // State 8: white space between small and medium square
                if (pixel == 0) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 9; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 9: // State 9: second traversal of a medium square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 10; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 10: // State 10: white space between medium and large square
                if (pixel == 0) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 11; length = 1; }
                else { state = 0; length = 0; start = j+1; }
                break;
            case 11: // State 11: second traversal of a large square
                if (pixel == 255) length++;
                else if (length >= MINWIDTH && length <= MAXWIDTH) {
                    state = 12; length = 1; }
                else { state = 0; length = 1; start = j; }
                break;
            case 12: // State 12: white space right to the locator
                if (length == WHITESPACE) state = 13;
                else if (pixel == 0) length++;
                else { state = 0; length = 0; start = j+1; }
                break;
            case 13: // State 13: Locator detected in the column
                // Reinitialisation of the automaton
                state = 0;
                length = 0;
                start = j+1;
                break;
            }
        }
    }
    // Checking compatibility between vertical and horizontal hits
    nbhit = 0;
Annex B
(informative)
Possible algorithm for locating ABLs based on character strings
This informative annex provides one possible algorithm for locating ABLs based on character strings. The
philosophy of the algorithm is the following. First, mathematical morphology operators are applied to the whole
image to transform words of text into connected sets of black pixels. Second, these sets are filtered out so as to
keep only those whose size is compatible with the range of allowed sizes for the groups of characters in the
locator. Finally, only the groups made of four or more sets with a sufficient alignment can correspond to the locator.
In practice, only the locator is likely to survive the three steps because the same pattern almost never occurs in
ordinary text.
The text below is a C language procedure which implements the algorithm. The image is represented as a series of
unsigned char values, each representing a pixel (0 for white pixels, 255 for black ones). The resolution of the image
is assumed to be 8 pixels/mm. The memory zone that stores the image is pointed to by the unsigned char pointer
variable pimage. The x and y dimensions of the image are given as the integer variables Xmax and Ymax.
The algorithm is based on a smearing of the image. The smearing method is described below as function
Smearing. It is designed so as to group characters forming word-like groups into a single connected component.
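The general idea of smearing can be illustrated with a generic horizontal run-length smearing pass, sketched below. This is not the Smearing function of this annex, which also works on columns and uses its own size thresholds: the name smear_rows and the parameter max_gap are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fills, in place, every run of white pixels (0) that lies between two
 * black pixels on the same row and is at most max_gap pixels long, so that
 * neighbouring characters merge into one connected set of black pixels.
 * img holds lx * ly pixels stored row by row. */
void smear_rows(unsigned char *img, int lx, int ly, int max_gap)
{
    int x, y, k;

    assert(img != NULL && lx > 0 && ly > 0);
    for (y = 0; y < ly; y++) {
        unsigned char *row = img + (size_t)y * (size_t)lx;
        int last_black = -1; /* column of the most recent black pixel */

        for (x = 0; x < lx; x++) {
            if (row[x] == 0)
                continue;
            if (last_black >= 0 && x - last_black - 1 <= max_gap) {
                for (k = last_black + 1; k < x; k++)
                    row[k] = 255; /* close the short gap */
            }
            last_black = x;
        }
    }
}
```

With max_gap chosen between the typical spacing of characters and the typical spacing of words, characters of one word merge while separate words stay apart.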
After smearing, the connected components of the smeared image are determined through the function
LabelComponents. The source code for this function has been omitted as there exist many public domain methods
for labelling connected components.
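For readers who need a starting point, one possible labelling method is sketched below. It is a stand-in and not the omitted LabelComponents function: the name label_components, the Box structure and the choice of 4-connectivity are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Bounding box of one component, inclusive limits (hypothetical type). */
typedef struct { int xs, ys, xe, ye; } Box;

/* Finds the 4-connected regions of black pixels (255) in img, which holds
 * lx * ly pixels stored row by row. Visited pixels are overwritten with
 * the value 1. Up to max_boxes bounding boxes are stored in boxes.
 * Returns the number of components found, or -1 if memory runs out. */
int label_components(unsigned char *img, int lx, int ly,
                     Box *boxes, int max_boxes)
{
    int total = lx * ly, count = 0, seed;
    int *stack;

    assert(img != NULL && boxes != NULL && lx > 0 && ly > 0);
    stack = malloc((size_t)total * sizeof *stack);
    if (stack == NULL)
        return -1;

    for (seed = 0; seed < total; seed++) {
        int top = 0;
        Box b;

        if (img[seed] != 255)
            continue;
        b.xs = b.xe = seed % lx;
        b.ys = b.ye = seed / lx;
        img[seed] = 1;            /* mark before pushing: one push per pixel */
        stack[top++] = seed;
        while (top > 0) {
            int p = stack[--top], x = p % lx, y = p / lx;

            if (x < b.xs) b.xs = x;
            if (x > b.xe) b.xe = x;
            if (y < b.ys) b.ys = y;
            if (y > b.ye) b.ye = y;
            if (x > 0      && img[p - 1]  == 255) { img[p - 1]  = 1; stack[top++] = p - 1; }
            if (x < lx - 1 && img[p + 1]  == 255) { img[p + 1]  = 1; stack[top++] = p + 1; }
            if (y > 0      && img[p - lx] == 255) { img[p - lx] = 1; stack[top++] = p - lx; }
            if (y < ly - 1 && img[p + lx] == 255) { img[p + lx] = 1; stack[top++] = p + lx; }
        }
        if (count < max_boxes)
            boxes[count] = b;
        count++;
    }
    free(stack);
    return count;
}
```

The returned count can exceed max_boxes, in which case only the first max_boxes bounding boxes have been stored.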
The resulting connected components enter a step-by-step filtering process with the following steps:
filtering of components that do not form a line of four or more neighbour components;
filtering of lines of four or more components based on the size of the line.
After all these steps, some components remain. This is represented by the table retain. All surviving components
have a retain value of 4 or more.
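The line-forming step can be sketched as follows. This is a simplified illustration and not the filtering code of this annex: the name chain_lengths, the Box structure and the parameters max_gap and max_dy are hypothetical, and the sketch assumes that retain should hold the number of components in the line to which a component belongs.

```c
#include <assert.h>
#include <stdlib.h>

/* Bounding box of one component, inclusive limits (hypothetical type). */
typedef struct { int xs, ys, xe, ye; } Box;

/* Twice the vertical centre of a box, which avoids fractions. */
static int vcentre2(const Box *b) { return b->ys + b->ye; }

/* For each of the n boxes, finds the nearest neighbour on the right and on
 * the left whose horizontal gap is in 1..max_gap pixels and whose vertical
 * centre differs by at most max_dy pixels. retain[i] then receives the
 * number of boxes reached by following these links in both directions,
 * counting box i itself. left, right and retain must each hold n ints. */
void chain_lengths(const Box *b, int n, int max_gap, int max_dy,
                   int *left, int *right, int *retain)
{
    int i, j;

    assert(b != NULL && left != NULL && right != NULL && retain != NULL);
    for (i = 0; i < n; i++) {
        left[i] = right[i] = -1;
        for (j = 0; j < n; j++) {
            int gap_r = b[j].xs - b[i].xe; /* j lies to the right of i */
            int gap_l = b[i].xs - b[j].xe; /* j lies to the left of i */

            if (j == i || abs(vcentre2(&b[j]) - vcentre2(&b[i])) > 2 * max_dy)
                continue;
            if (gap_r > 0 && gap_r <= max_gap &&
                (right[i] < 0 || b[j].xs < b[right[i]].xs))
                right[i] = j;
            if (gap_l > 0 && gap_l <= max_gap &&
                (left[i] < 0 || b[j].xe > b[left[i]].xe))
                left[i] = j;
        }
    }
    for (i = 0; i < n; i++) {
        retain[i] = 1;
        for (j = right[i]; j >= 0; j = right[j]) retain[i]++;
        for (j = left[i]; j >= 0; j = left[j]) retain[i]++;
    }
}
```

Under these assumptions, a caller would keep only the components whose retain value is 4 or more and then check the overall size of each remaining line.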
These steps are illustrated below on a sample image. Figure B.1 shows the original image. Figure B.2 shows the
smeared image. Figure B.3 shows the image after size filtering. Figure B.4 shows the image after neighbour filtering.
Figure B.5 shows the image after line forming and line size filtering.
Figure B.5 — Image after number of neighbours and line size filtering
COPYRIGHT NOTICE: This software is provided by La Poste (France). It can be used for address block location
applications only. The software may be freely distributed and/or modified. A condition of such distribution is that this
copyright notice is always included in the comments heading the software. This software is licensed “as is” without
any other warranty of any kind. La Poste expressly disclaims all warranties or conditions including, but not limited
to, implied warranties or conditions of merchantability and fitness for particular purpose and those arising by statute
or otherwise in law or from course of dealing or usage of trade. In no event shall La Poste be liable for any direct,
indirect, special, incidental, consequential or other damages arising out of this software even if La Poste has been
advised of the possibility of such damages in advance.
#include <stdio.h>
#include <math.h>
#include <malloc.h>
#include <errno.h>
typedef struct {
    int xs; // Lower x limit
    int ys; // Lower y limit
    int xe; // Upper x limit
    int ye; // Upper y limit
} Component; // Data structure for connected Components
// Internal functions
void Smearing(unsigned char *tb_in, unsigned char *tb_out, int lx, int ly);
int LabelComponents(unsigned char *bin, int lx, int ly,
                    int fond, int objet_ori, int objet_label);
/***********************************************************************************/
int __declspec(dllexport) LocateStringLocator(unsigned char *pimage, int Xmax, int Ymax)
{
    unsigned char *pimage_aux;
    int i, j, k;
    int xmin, xmax, ymin, ymax, surface;
    /* Initialisations */
            retain[i] = 1;
        }
    }
            right[j] = i;
            left[i] = j;
        }
    }
}
void Smearing(unsigned char *tb_in, unsigned char *tb_out, int lx, int ly)
{
    for (x = 0; x < (lx - 4); x += 4) {
        p_tb1 = tb_out + x;
        p_tb2 = tb_out + x;
        p_tb3 = tb_out + x + 1;
        p_tb4 = tb_out + x + 1;
        p_tb5 = tb_out + x + 2;
        p_tb6 = tb_out + x + 2;
        p_tb7 = tb_out + x + 3;
        p_tb8 = tb_out + x + 3;
        p_tb4 += lx;
        p_tb6 += lx;
        p_tb8 += lx;
        if ((*p_tb1) == 0 && (*p_tb3) == 0 && (*p_tb5) == 0 && (*p_tb7) == 0) {
            p_tb1 = p_tb2;
            p_tb3 = p_tb4;
            p_tb5 = p_tb6;
            p_tb7 = p_tb8;
        } else if ((*p_tb2) == 0 && (*p_tb4) == 0 && (*p_tb6) == 0 && (*p_tb8) == 0) {
            if ((p_tb2 - p_tb1) >= size2) {
                while (p_tb1 != p_tb2) {
                    *p_tb1 = 0;
                    p_tb1 += lx;
                }
                while (p_tb3 != p_tb4) {
                    *p_tb3 = 0;
                    p_tb3 += lx;
                }
                while (p_tb5 != p_tb6) {
                    *p_tb5 = 0;
                    p_tb5 += lx;
                }
                while (p_tb7 != p_tb8) {
                    *p_tb7 = 0;
                    p_tb7 += lx;
                }
            } else {
                p_tb1 = p_tb2;
                p_tb3 = p_tb4;
                p_tb5 = p_tb6;
                p_tb7 = p_tb8;
            }
        }
    }
}
            p_tb1 = p_tb0;
            while ((*p_tb1) != 0 && p_tb1 <= p_fin) p_tb1++;
            p_tb2 = p_tb1;
            while ((*p_tb2) == 0 && p_tb2 <= p_fin) p_tb2++;
            if ((*p_tb2) != 0) {
                if ((p_tb1 - p_tb0) >= size2 && (p_tb1 - p_tb0) <= size3) {
                    p_tb3 = p_tb0;
                    while (p_tb3 != p_tb1) *p_tb3++ = 255;
                } else {
                    p_tb3 = p_tb0;
                    while (p_tb3 != p_tb1) *p_tb3++ = 0;
                }
                p_tb0 = p_tb2;
                p_tb1 = p_tb2;
                while ((*p_tb1) != 0 && p_tb1 <= p_fin) p_tb1++;
                p_tb2 = p_tb1;
                while ((*p_tb2) == 0 && p_tb2 <= p_fin) p_tb2++;
                p_tb2--;
            }
            else {
                p_tb1 = p_tb2;
                while ((*p_tb1) != 0 && p_tb1 <= p_fin) p_tb1++;
                p_tb2 = p_tb1;
                while ((*p_tb2) == 0 && p_tb2 <= p_fin) p_tb2++;
                p_tb2--;
            }
        }
        p_tb2++;
    }
    if ((*(p_tb2 - 1)) == 0) {
        if ((p_tb1 - p_tb0) >= size2 && (p_tb1 - p_tb0) <= size3) {
            p_tb3 = p_tb0;
            while (p_tb3 != p_tb1) *p_tb3++ = 255;
        }
        else {
            p_tb3 = p_tb0;
            while (p_tb3 != p_tb1) *p_tb3++ = 0;
        }
    }
}
    for (x = 0; x < (lx - 4); x += 4) {
        p_tb1 = tb_out + x;
        p_tb2 = tb_out + x;
Bibliography
This annex provides full reference and sourcing information for all standards and other reference sources which are
quoted in the above text. For references which mention specific version numbers or dates, subsequent
amendments to, or revisions of, any of these publications may not be relevant. However, users of this Technical
Specification are encouraged to investigate the existence and applicability of more recent editions. For references
without date or version number, the latest edition of the document referred to applies.
It should be stressed that only referenced documents are listed here.
ANSI standards
[4] ANSI/AIM BC13 ITS/97/002, Aztec Code.
[5] ANSI MH10.8.2 6), Data identifier and application identifier standard.
NOTE ANSI standards can be obtained from the American National Standards Institute: 11 West 42nd Street, New York,
New York 10036, U.S.A. Tel: +1.212.642.4900; Fax: +1.212.398.0023; WWW: web.ansi.org.
Others
[6] UPU S28, Communication of postal information using two-dimensional symbols.
[7] UPU S40, Human and OCR Data Capture - Error Detection - Algorithm for the Generation and Checking of
an Error Detection Code.
NOTE UPU standards can be obtained from the Universal Postal Union: Case postale 13, 3000 Berne 15, Switzerland.
Tel: +41 31 350 31 11; Fax: +41 31 350 31 10; WWW: www.upu.int
6) As at the date of approval of this Technical Specification, the published version of ANSI MH10.8.2 was still that dated 1995.
This Technical Specification is based on an updated 2001 version, available for trial use from
http://www.autoid.org/ansi_mh10sc8_wg2.htm. ANSI standards may be obtained from the American National Standards
Institute: 11 West 42nd Street, New York, New York 10036, U.S.A. Tel: +1.212.642.4900; Fax: +1.212.398.0023;
WWW: web.ansi.org
DD CEN/TS
14567:2004
BSI — British Standards Institution
BSI is the independent national body responsible for preparing
British Standards. It presents the UK view on standards in Europe and at the
international level. It is incorporated by Royal Charter.
Revisions
British Standards are updated by amendment or revision. Users of
British Standards should make sure that they possess the latest amendments or
editions.
It is the constant aim of BSI to improve the quality of our products and services.
We would be grateful if anyone finding an inaccuracy or ambiguity while using
this British Standard would inform the Secretary of the technical committee
responsible, the identity of which can be found on the inside front cover.
Tel: +44 (0)20 8996 9000. Fax: +44 (0)20 8996 7400.
BSI offers members an individual updating service called PLUS which ensures
that subscribers automatically receive the latest editions of standards.
Buying standards
Orders for all BSI, international and foreign standards publications should be
addressed to Customer Services. Tel: +44 (0)20 8996 9001.
Fax: +44 (0)20 8996 7001. Email: [email protected]. Standards are also
available from the BSI website at http://www.bsi-global.com.
In response to orders for international standards, it is BSI policy to supply the
BSI implementation of those that have been published as British Standards,
unless otherwise requested.
Information on standards
BSI provides a wide range of information on national, European and
international standards through its Library and its Technical Help to Exporters
Service. Various BSI electronic information services are also available which give
details on all its products and services. Contact the Information Centre.
Tel: +44 (0)20 8996 7111. Fax: +44 (0)20 8996 7048. Email: [email protected].
Subscribing members of BSI are kept up to date with standards developments
and receive substantial discounts on the purchase price of standards. For details
of these and other benefits contact Membership Administration.
Tel: +44 (0)20 8996 7002. Fax: +44 (0)20 8996 7001.
Email: [email protected].
Information regarding online access to British Standards via British Standards
Online can be found at http://www.bsi-global.com/bsonline.
Further information about BSI is available on the BSI website at
http://www.bsi-global.com.
Copyright
Copyright subsists in all BSI publications. BSI also holds the copyright, in the
UK, of the publications of the international standardization bodies. Except as
permitted under the Copyright, Designs and Patents Act 1988 no extract may be
reproduced, stored in a retrieval system or transmitted in any form or by any
means – electronic, photocopying, recording or otherwise – without prior written
permission from BSI.
This does not preclude the free use, in the course of implementing the standard,
of necessary details such as symbols, and size, type or grade designations. If these
details are to be used for any other purpose than implementation then the prior
written permission of BSI must be obtained.
Details and advice can be obtained from the Copyright & Licensing Manager.
Tel: +44 (0)20 8996 7070. Fax: +44 (0)20 8996 7553.
Email: [email protected].
BSI
389 Chiswick High Road
London
W4 4AL