Summary
In this chapter, you learned how to do rule-based matching with linguistic and token-level features. You met the Matcher class, spaCy’s rule-based matcher, and explored it with different token features, such as shape, lemma, text, and entity type.
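As a quick reminder of what a Matcher pattern looks like, here is a minimal sketch. It assumes a blank English pipeline so that it runs without a downloaded model, which is why it uses only the text and shape features (lemma and entity type need a trained pipeline). The pattern name FLIGHT_NUMBER and the sentence are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher

# A blank pipeline only tokenizes, which is enough for text and shape features.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# Hypothetical pattern: the word "flight" followed by a four-digit token.
pattern = [{"LOWER": "flight"}, {"SHAPE": "dddd"}]
matcher.add("FLIGHT_NUMBER", [pattern])

doc = nlp("She booked flight 2047 to Berlin.")

# Each match is a (match_id, start, end) triple; resolve the id to its name.
matches = [
    (nlp.vocab.strings[match_id], doc[start:end].text)
    for match_id, start, end in matcher(doc)
]
print(matches)
```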
Then, you learned about SpanRuler, another versatile class that adds spans to a document based on token and phrase patterns. You also learned how to extract named entities with the SpanRuler class.
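The following sketch recaps the SpanRuler workflow. It assumes a spaCy version that ships the span_ruler component and, again, a blank English pipeline; the labels, patterns, and sentence are illustrative only. Setting annotate_ents to True asks the ruler to write its matches to doc.ents in addition to doc.spans.

```python
import spacy

nlp = spacy.blank("en")

# annotate_ents=True also copies the matched spans into doc.ents.
ruler = nlp.add_pipe("span_ruler", config={"annotate_ents": True})

# Illustrative patterns: one phrase pattern and one token pattern.
ruler.add_patterns([
    {"label": "ORG", "pattern": "Chime"},
    {"label": "GPE", "pattern": [{"LOWER": "san"}, {"LOWER": "francisco"}]},
])

doc = nlp("Chime is a fintech company based in San Francisco.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

Because the pipeline has no statistical entity recognizer, every entity here comes from the rules alone, which makes the ruler's contribution easy to inspect.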
Finally, we brought together what you learned in this chapter with your previous knowledge and worked through several examples that combine linguistic features with rule-based matching. You learned how to extract patterns, entities of specific formats, and entities specific to your domain.
With this chapter, you have completed our coverage of linguistic features. In the next chapter, we’ll use all this knowledge to extract semantic representations from text using spaCy pipelines.