Invoice Capture Center 5.2/6.0/7.0
Technical Outline: Vendor Recognition
Gisela Hammann, SAP Solutions Group
January 2013
Abstract
Vendor data is printed on the invoice mostly in the form of address data, with additional data
such as the telephone number, the company's internet address, bank data, etc. For the SAP system,
the vendor ID, i.e. the vendor number in the SAP system of the invoice-receiving company, is the
relevant identification for further invoice processing. That vendor ID is usually not printed on
the invoice. Invoice Capture Center therefore implements an algorithm that determines the vendor
ID from the data of the invoice-issuing party printed on the invoice. Highlights:
– How does the algorithm work?
– What are the prerequisites for optimal vendor recognition?
– Which variants of vendor determination are implemented?
This document describes the functionality available since ICC 5.2 SP4. Please note that
future versions may be subject to technical changes.
Contents
Invoice Capture Center: Vendor Recognition
SnapMatch Algorithm: How does it work?
SAP vendor master data download
Mapping Algorithm
Additional Check by Application Logic
Vendor Data transferred to VIM
Customizing Options
Parameters
Training of Vendor ID with ART
Data Quality of SAP download Data
Additional Measures for Enhancing Vendor Recognition
w w w. o p e n t e x t . c o m
For more information about Open Text products and services, visit www.opentext.com. Open Text is a publicly traded company on both NASDAQ (OTEX) and the TSX (OTC).
Copyright © 2009 by Open Text Corporation. Open Text and The Content Experts are trademarks or registered trademarks of Open Text Corporation. This list is not exhaustive. All other
trademarks or registered trademarks are the property of their respective owners. All rights reserved. SKU#_EN
Invoice Capture Center: Vendor Recognition
Vendor data is printed on the invoice mostly in the form of address data, with
additional data such as the telephone number, the company's internet address,
bank data, etc. How much vendor data is printed on the invoice differs from
country to country: in some countries the vendor's company name is mostly
printed as a logo, or the sender address or bank data is not printed on
invoices at all.
For the SAP system, the vendor ID, i.e. the vendor number in the SAP system of
the invoice-receiving company, is the relevant identification for further invoice
processing. That vendor ID is usually not printed on the invoice.
An algorithm has therefore been implemented to determine the vendor ID from the
data of the invoice-issuing party printed on the invoice. Because of the
country-specific variations mentioned above, the algorithm uses not only vendor
address data but all invoice data that may contribute to vendor determination.
Vendor determination as implemented in Invoice Capture Center can reach a
recognition rate of up to 95% if all surrounding parameters are well adjusted.
This document aims to give a better understanding of the algorithm and to
prepare the customer to align those parameters.
SnapMatch Algorithm: How does it work?
Unlike the preceding version 3.0, ICC 5.2/6.0 uses a new data mapping algorithm
for vendor determination, developed on the basis of the generic SnapMatch
algorithm.
Vendor determination uses an extract of the SAP vendor master data, downloaded
from the SAP system of the invoice-receiving party.
The SAP reports for the download are delivered with VIM.
SAP vendor master data download
The data listed below is downloaded with the vendor download
report /OPT/IR_DL_VENDOR_TO_STG_TABLE.
The resulting table contains one line per vendor/company code/system. If the
vendor has more than one bank account, additional lines are generated for this
vendor. The same applies if table LFAS holds additional VAT registration numbers
for this vendor.
Additional communication data is not considered.
The SnapMatch algorithm searches for this data in the OCR full-text result of the
invoice and tries to find matching data according to the priority settings
described below.
The following vendor data is downloaded and imported into the ICC database. With
ICC 6.0 SP3, the number of database columns for VAT IDs has been increased to 6,
in alignment with the VIM localization for India.
Table field    Source            Remark
SYSTEM         T000              Logical System Name
COMPANYCODE    VF_KRED           Company Code
VENDORID       VF_KRED or LFA1   Vendor Number; LFA1 for vendors with no relationship to a company code
COMPANY        LFA1              Name 1
COMPANY1       LFA1              Name 2
STREET         LFA1              Street
STATE          VF_KRED           Region (State, Province, County)
ZIP            LFA1              Postal code
CITYNAME       LFA1              City
POBOX          LFA1              P.O. Box
ZIP1           LFA1              P.O. Box postal code
COUNTRY        LFA1              Country key
PHONE          ADRC              Telephone no
FAX            VF_KRED           Fax Number
BANKCOUNTRY    LFBK              Bank country key
BANKNUMBER     LFBK              Bank number
BANKACCOUNT    LFBK              Bank account number
BANKNAME       BNKA              Name of bank
ESRNR          VF_KRED           POR subscriber number
VATID          LFA1/LFAS         VAT registration number (international VAT number), STCEG
VATID1         LFA1              VAT/Tax number 1; might be STCD1, STCD2, STCD3, STCD4 or STENR (e.g. national VAT number)
VATID2         LFA1              VAT/Tax number 2; might be STCD1, STCD2, STCD3, STCD4 or STENR
VATID3         LFA1              STCD3
VATID4         LFA1              STCD4
VATID5         LFA1              STENR
EMAIL          ADR6              Internet mail (SMTP) address
WWW            LFA1              Uniform resource locator
IBAN           TIBAN             IBAN (International Bank Account Number)
SWIFT          BNKA              SWIFT Code for International Payments
BLACKLIST      -                 Handling Tax check (no standard field; reserved for customer-specific extension)
RESERVE1       -                 Reserved (used for Prio 1 item mapping)
RESERVE2       -                 Reserved (used for Prio 2 item mapping)
CUSTOM1        -                 Reserved for customer fields (not used for mapping)
CUSTOM2        -                 Reserved for customer fields (not used for mapping)
For vendors with country "India" in a VIM system with India localization, the
VAT ID data originates as follows (download program for India):
Table field    Source        Remark
VATID          J_1IMOVEND    J_1IPANNO
VATID1         J_1IMOVEND    J_1ISERN
VATID2         LFA1          STCD2
VATID3         J_1IMOVEND    J_1ILSTNO
VATID4         J_1IMOVEND    J_1ICSTNO
VATID5         J_1IMOVEND    J_1IEXCD
Mapping Algorithm
The generic mapping algorithm has been optimized for invoice vendor determination
on large samples of SAP vendor master download data from different countries and
companies. It delivers very good results for the large majority of vendors. For a
few vendors, however, customer-specific enhancements may be necessary to handle
specific issues.
A filter is applied to the mapping results. It assigns a priority to each column,
and a vendor result is accepted only if at least a minimum number of columns
whose priority number does not exceed a given threshold have been matched on the
document.
Unambiguous items have a high priority, whereas ambiguous items have lower
priorities. To identify a vendor, either a few high-priority items or a larger
number of low-priority items are needed. If there is more than one competing hit,
the hit supported by the highest number of high-priority items wins.
What follows is a detailed list of the priorities (valid in this form since
ICC 6.0 SP3):
Priority 1 (1 item needed): VATID, RESERVE1
Priority 2 (1 item needed): EMAIL, FAX, IBAN, PHONE, VATID1 … VATID5,
WWW
Priority 3 (2 items needed): BANKACCOUNT, COMPANY, STREET, SWIFT,
RESERVE2
Priority 4 (5 items needed): BANKNAME, BANKNUMBER, CITYNAME,
COMPANY1, POBOX, ZIP, ZIP1
Priority 5 (8 items needed): STATE
Other columns are not significant for vendor identification.
The algorithm matches priority 1 and priority 2 items exactly. An exact match
requires identical content but allows for different writing styles in the
database and in the OCR data, e.g. blanks in a VAT ID.
Lower-priority items do not need an exact match; a tolerant (fuzzy) match, which
takes OCR errors into account, is sufficient.
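The two matching modes can be illustrated roughly as follows. This is only a minimal sketch and not the SnapMatch implementation: the normalization rule (keep letters and digits, ignore case), the use of the similarity ratio from Python's difflib module, and the threshold of 0.85 are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Keep only letters and digits and upper-case them (assumed rule)."""
    return re.sub(r"[^0-9A-Za-z]", "", value).upper()


def exact_match(db_value: str, ocr_text: str) -> bool:
    """True if the normalized database value occurs in the normalized OCR text.

    Blanks or punctuation inside a VAT ID therefore do not prevent a match.
    """
    needle = normalize(db_value)
    return bool(needle) and needle in normalize(ocr_text)


def fuzzy_match(db_value: str, ocr_token: str, threshold: float = 0.85) -> bool:
    """True if both strings are similar enough to tolerate a few OCR errors.

    The threshold is a hypothetical value, not an ICC setting.
    """
    a, b = normalize(db_value), normalize(ocr_token)
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

With these definitions, a VAT ID stored as "DE 123 456 789" is found in OCR text containing "DE123456789", while a street name with a single misread character still passes the fuzzy comparison. Stripping all separators from the full text can create matches across token boundaries, so a real implementation would need a stricter tokenization.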
A vendor result is therefore accepted only if, for example, one of the following
holds:
– At least one priority 1 or 2 column matches 100%.
– At least two columns with priority 1 to 3 have been found.
– At least five columns with priority 1 to 4 have been found.
Otherwise the vendor ID field remains empty.
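The acceptance filter can be sketched as follows. The column priorities and the number of items needed are taken from the priority list above; the function names, the data structures and the exact tie-break between competing hits (lexicographic comparison of the per-priority counts) are assumptions. RESERVE2 is placed in priority 3 as in the priority list, although other passages of this document refer to it as a priority 2 item.

```python
from collections import Counter

# Priority of each column, following the priority list above.
PRIORITY = {
    "VATID": 1, "RESERVE1": 1,
    "EMAIL": 2, "FAX": 2, "IBAN": 2, "PHONE": 2, "WWW": 2,
    "VATID1": 2, "VATID2": 2, "VATID3": 2, "VATID4": 2, "VATID5": 2,
    "BANKACCOUNT": 3, "COMPANY": 3, "STREET": 3, "SWIFT": 3, "RESERVE2": 3,
    "BANKNAME": 4, "BANKNUMBER": 4, "CITYNAME": 4, "COMPANY1": 4,
    "POBOX": 4, "ZIP": 4, "ZIP1": 4,
    "STATE": 5,
}

# Minimum number of matched columns whose priority number is <= the key.
ITEMS_NEEDED = {1: 1, 2: 1, 3: 2, 4: 5, 5: 8}


def priority_counts(matched_columns):
    """Count matched columns per priority; unknown columns are ignored."""
    return Counter(PRIORITY[c] for c in set(matched_columns) if c in PRIORITY)


def is_accepted(matched_columns):
    """True if any threshold is reached by columns of that priority or better."""
    counts = priority_counts(matched_columns)
    cumulative = 0
    for prio in sorted(ITEMS_NEEDED):
        cumulative += counts[prio]
        if cumulative >= ITEMS_NEEDED[prio]:
            return True
    return False


def best_hit(candidates):
    """Pick the accepted candidate with the most high-priority matches.

    candidates maps a vendor key to the list of matched column names.
    Returns None if no candidate is accepted (vendor ID stays empty).
    """
    accepted = {k: v for k, v in candidates.items() if is_accepted(v)}
    if not accepted:
        return None

    def rank(key):
        counts = priority_counts(accepted[key])
        return tuple(counts[p] for p in sorted(ITEMS_NEEDED))

    return max(accepted, key=rank)
```

For example, a single matched VATID or IBAN is sufficient, COMPANY alone is not, and COMPANY together with STREET is.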
Furthermore, vendor results that do not reach a minimum confidence value are
removed, and the vendor ID field remains empty.
In a second step, if several results share the same triplet of vendor number,
company code and system, only one of them is kept.
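These two post-processing steps can be sketched as follows. The dictionary keys, the numeric confidence scale and the choice to keep the result with the highest confidence are assumptions; the document only states that one result per triplet is kept.

```python
def filter_results(results, min_confidence):
    """Drop low-confidence results, then keep one result per triplet.

    Each result is a dict with the hypothetical keys "vendor_id",
    "company_code", "system" and "confidence".
    """
    best = {}
    for result in results:
        if result["confidence"] < min_confidence:
            continue  # below the minimum confidence: removed
        key = (result["vendor_id"], result["company_code"], result["system"])
        current = best.get(key)
        # Assumption: of several results for one triplet, keep the most confident.
        if current is None or result["confidence"] > current["confidence"]:
            best[key] = result
    return list(best.values())
```

Several results for the same triplet can occur because the download generates additional lines per vendor, for example one per bank account.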
Additional Check by Application Logic
The data mapping result is checked by additional application logic. Depending on
the customizing settings, the following checks may take place:
o The vendor ID is checked against the company code.
o If the mapping result is uncertain, e.g. because little vendor data is
found on the invoice, PO invoices may additionally be checked to see
whether the determined vendor ID matches the vendor ID used in
the PO.
Vendor Data transferred to VIM
Depending on the country application, found and confirmed data is transferred to
VIM. For example, the VAT ID and the vendor name are generally transferred by
default, and transfer of the “ship to” and “remit to” addresses is standard for
the USA.
Bank data is currently transferred for some countries (e.g. Germany). The settings
for the transfer of vendor data can be adjusted in the ICC Customizing Client. For
further details, please refer to the ICC Customizing Guide, chapter “Exporting
Additional Vendor or Recipient Data”.
Customizing Options
The mapping algorithm itself is not customizable. Its parameters cannot be
modified by customizing; the only exception is adding custom items in the
RESERVE1 column (for items with mapping priority 1) and the RESERVE2 column (for
items with mapping priority 2). Up to two custom items can be added, and they are
used for mapping automatically.
For the use of the custom fields, please refer to the customizing guide; they can
be used to include customer-specific fields in data mapping. This also requires a
customer-specific periodic report that populates the RESERVE1 and RESERVE2 fields
in staging table /OPT/VIM_STG_LIF (the table to which the standard download
report delivered with VIM writes the extract of SAP vendor master data). The data
of this table is downloaded to the database of the ICC recognition server.
A common use case is a difference in writing style between the SAP master data
and the invoices (e.g. in some European countries the VAT ID is written with or
without a leading country identifier).
– For special cases concerning only a few vendors, the alternative writing
styles should be maintained in the SAP database.
– To handle alternative writing styles for all vendors, a customer-specific
periodic report could be created that populates the RESERVE1 column of the
staging table with the alternative writing style. This column is used by
the mapping algorithm automatically.
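The transformation such a report would apply to each VAT ID can be sketched as follows. The report itself would run on the SAP side; the Python function below, its name and its rule (remove the country identifier if present, otherwise add it) are purely illustrative assumptions.

```python
def alternative_vat_style(vat_id: str, country_prefix: str) -> str:
    """Return the VAT ID in the other writing style.

    If the VAT ID starts with the country identifier, the identifier is
    removed; otherwise it is added. An empty string is returned if either
    input is empty. Both parameters are hypothetical inputs of the sketch.
    """
    compact = vat_id.replace(" ", "").upper()
    prefix = country_prefix.strip().upper()
    if not compact or not prefix:
        return ""
    if compact.startswith(prefix):
        return compact[len(prefix):]
    return prefix + compact
```

The result would be written to the RESERVE1 column of the vendor's line. Note that the VAT country identifier is not always identical to the country key; Greek VAT IDs, for instance, use the prefix EL, so the prefix has to be supplied explicitly.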
Parameters
The following parameters can be adjusted for vendor determination:
Use PO number for vendor ID detection
For PO invoices for which PO download data is available in ICC, the PO
number can be used to determine the vendor ID from this data.
If this feature is selected and delivers a result, that result takes
precedence over the vendor ID from SnapMatch.
Important: this feature cannot be used in environments where ICC
works with more than one SAP system, because PO numbers are unique
only within a single SAP system.
Vendor ID determination via PO number can help in situations where
little priority 1 information is printed on invoices, e.g. in the US.
Ignore Company Code and subsystem at vendor detection
This parameter is typically used by companies that use identical vendor
IDs in all company codes and all subsystems, or in a single-system
environment. It makes it possible to reduce the download data volume
considerably by downloading PO data for only one company code.
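The precedence rule of the first parameter ("Use PO number for vendor ID detection") can be sketched as follows. The function, its arguments and the representation of the PO download data as a dictionary are assumptions for illustration and do not reflect the actual ICC configuration interface.

```python
def determine_vendor_id(po_number, po_vendors, snapmatch_vendor_id,
                        use_po_number=True, sap_system_count=1):
    """Return the vendor ID, preferring the PO-based result if available.

    po_vendors is a hypothetical mapping from PO number to vendor ID,
    standing in for the PO download data held in ICC.
    """
    if use_po_number and sap_system_count > 1:
        # PO numbers are unique only within a single SAP system.
        raise ValueError("PO-based vendor detection requires a single SAP system")
    if use_po_number and po_number:
        vendor_id = po_vendors.get(po_number)
        if vendor_id:
            return vendor_id  # takes precedence over the SnapMatch result
    return snapmatch_vendor_id
```

If the PO number is missing on the invoice or unknown in the PO download data, the SnapMatch result is used unchanged.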
Training of Vendor ID with ART
Training with ART is currently not supported for the vendor ID.
Data Quality of SAP download Data
As the mapping algorithm uses SAP vendor master data for mapping, the data quality
of the SAP vendor master data has a big impact on the recognition rate for the
vendor ID. It is very important that the data with mapping priority 1 or 2,
described above, is maintained completely.
The download frequency can be adjusted in the system and depends on how often
vendors are added to or removed from the SAP database. The download may take
place once a day, once a week or once a month. Appropriate parameter settings
have to be defined during the project implementation phase and have to be
aligned between ICC and VIM.
Additional Measures for Enhancing Vendor Recognition
For top vendors that send in high invoice volumes, it can make sense to ask
them to change their invoice layout so that vendor determination becomes easier
for ICC recognition:
– Print additional data on the invoice that can be used for mapping with
priority 1 or 2 and that helps to determine the vendor (VAT ID,
telephone number, WWW address, etc.).
– Print the vendor number on the invoice with a defined keyword,
e.g. “vendor ID: 1234”. The RESERVE1 field in the database could then be
populated with keyword + vendor number and used for SnapMatch
mapping.
– Avoid invoice layouts with design elements that cover information used by
recognition:
o Keywords should not be printed in colored or shaded fields.
o Barcodes or stamps, if used at all, should not cover
information and should not be placed next to information to be
recognized.
o Vendor data should be printed clearly and completely on the
invoice; the vendor's company name should not be printed as a logo only.
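The keyword approach mentioned above (for example “vendor ID: 1234”) can be illustrated with a small sketch. It shows how a RESERVE1 value could be built from keyword and vendor number and how such a value could be located in OCR text. The function names, the tolerance for varying whitespace and case, and the guard against longer numbers are assumptions; in ICC the lookup itself is done by the SnapMatch mapping.

```python
import re


def reserve1_value(vendor_number: str, keyword: str = "vendor ID:") -> str:
    """Build the string to store in RESERVE1: keyword + vendor number."""
    return f"{keyword} {vendor_number}"


def keyword_pattern(vendor_number: str, keyword: str = "vendor ID:"):
    """Compile a pattern that tolerates varying whitespace and letter case.

    The trailing lookahead prevents "1234" from matching inside "12345".
    """
    parts = [re.escape(p) for p in keyword.split()] + [re.escape(vendor_number)]
    return re.compile(r"\s*".join(parts) + r"(?!\d)", re.IGNORECASE)


def find_vendor_numbers(ocr_text, vendor_numbers, keyword="vendor ID:"):
    """Return the vendor numbers whose keyword string occurs in the OCR text."""
    return [n for n in vendor_numbers
            if keyword_pattern(n, keyword).search(ocr_text)]
```

Because the keyword string is unambiguous, it suits a priority 1 column such as RESERVE1, where a single match is sufficient.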