Extracting Semantic Representations with spaCy Pipelines
In this chapter, we will apply what we learned in Chapter 4 to the Airline Travel Information System (ATIS) dataset, a well-known dataset from an airplane ticket reservation system. The data consists of utterances: sentences in which users ask for information. First, we will extract the named entities by creating our own extraction patterns with SpanRuler. Then we will determine the intent of each user utterance with DependencyMatcher patterns. Finally, we will wrap the intent extraction code in our own custom spaCy component and use the Language.pipe() method to process large datasets faster.
In this chapter, we’re going to cover the following main topics:
- Extracting named entities with SpanRuler
- Extracting dependency relations with DependencyMatcher
- Creating a pipeline component using extension attributes
- Running the pipeline with large datasets
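As a rough preview of how these pieces fit together, here is a minimal sketch, not the chapter's actual code. It assumes spaCy 3.3 or later (where the span_ruler component is available) and uses a blank English pipeline so no trained model is needed. The CITY label, the city_collector component, the cities extension attribute, and the two ATIS-style utterances are all made up for illustration. DependencyMatcher is left out here because it requires a pipeline with a dependency parser.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Hypothetical extension attribute, named for this sketch only.
if not Doc.has_extension("cities"):
    Doc.set_extension("cities", default=None)


@Language.component("city_collector")  # hypothetical component name
def city_collector(doc):
    # SpanRuler stores its matches in doc.spans under the key "ruler" by default.
    doc._.cities = [
        span.text for span in doc.spans["ruler"] if span.label_ == "CITY"
    ]
    return doc


# A blank pipeline only tokenizes, which is enough for phrase patterns.
nlp = spacy.blank("en")

ruler = nlp.add_pipe("span_ruler")
ruler.add_patterns(
    [
        {"label": "CITY", "pattern": "boston"},
        {"label": "CITY", "pattern": "denver"},
    ]
)
nlp.add_pipe("city_collector", last=True)

# Illustrative utterances written in the lowercase style of ATIS.
utterances = [
    "i want to fly from boston to denver",
    "show me flights to denver",
]

# nlp.pipe() streams texts through the pipeline in batches.
docs = list(nlp.pipe(utterances))
cities = [doc._.cities for doc in docs]
```

Storing the result in an extension attribute keeps the extracted information attached to each Doc, so any later component or downstream code can read it without re-running the matcher.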