Combining spaCy models and matchers
In this section, we’ll go through some recipes that cover the entity extraction types you are likely to encounter in your NLP journey. All the examples are ready-to-use, real-world recipes. Let’s start with number-formatted entities.
Extracting an IBAN
An IBAN is an important entity type that occurs frequently in finance and banking text. We’ll learn how to parse it out.
An IBAN is an international number format for bank account numbers. It has the format of a two-letter country code followed by numbers.
How can we create a pattern for an IBAN? We start with two capital letters, followed by two digits. Then, any number of digits can follow. We can express the country code and the next two digits as follows:
{"SHAPE": "XXdd"}
Here, XX corresponds to two capital letters, and dd is two digits. Then, the XXdd pattern matches the first block of the IBAN perfectly. How about the rest of the...
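To see why the XXdd shape corresponds to a first block such as BE71, here is a minimal pure-Python sketch that roughly mimics how a word shape is derived from a token's characters. The function names word_shape and starts_like_iban are hypothetical helpers written for this illustration, not part of spaCy, and the sample value BE71 is an assumed example; in a real pipeline you would rely on the SHAPE token attribute in the matcher pattern itself.

```python
def word_shape(text: str) -> str:
    """Approximate a token's shape: uppercase -> X, lowercase -> x, digit -> d.

    Assumption: runs of the same shape character are capped at four,
    which is how we understand the shape feature to behave.
    """
    shape = []
    last = ""
    run = 0
    for ch in text:
        if ch.isalpha():
            s = "X" if ch.isupper() else "x"
        elif ch.isdigit():
            s = "d"
        else:
            s = ch  # punctuation and other characters are kept as-is
        if s == last:
            run += 1
        else:
            run = 1
            last = s
        if run <= 4:
            shape.append(s)
    return "".join(shape)


def starts_like_iban(token: str) -> bool:
    """Check whether a token looks like the first block of an IBAN."""
    return word_shape(token) == "XXdd"


if __name__ == "__main__":
    for token in ["BE71", "0961", "be71", "GB82"]:
        print(token, word_shape(token), starts_like_iban(token))
```

Only tokens made of two capital letters followed by two digits produce the XXdd shape, so a lowercase country code or an all-digit block is rejected by this check.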