Overview of spaCy
NLP is a subfield of AI that analyzes text, speech, and other forms of human-generated language data. Human language is complicated: even a short paragraph contains references to previous words, pointers to real-world objects, cultural references, and the writer’s or speaker’s personal experiences. Figure 1.1 shows an example sentence. It includes a relative date reference (recently), a phrase that only someone who knows the speaker can resolve (the city where the speaker’s parents live), and a concept that requires general knowledge about the world (a city is a place where human beings live together):
Figure 1.1 – An example of human language, containing many cognitive and cultural aspects
How do we process such a complicated structure using computers? With spaCy, we can model natural language with statistical models and process linguistic features to turn the text...