Summary
In this chapter, we explored how to leverage large language models (LLMs) within spaCy pipelines using the spacy-llm library. We reviewed the basics of LLMs and prompt engineering, emphasizing their role as versatile tools capable of performing a wide range of NLP tasks, from text categorization to summarization. However, we also noted the limitations of LLMs, such as their high computational cost and the tendency to hallucinate. We then demonstrated how to integrate LLMs into spaCy pipelines by defining tasks and models. Specifically, we implemented a summarization task and, subsequently, created a custom task to extract the context from a quote. This process involved creating Jinja templates for prompts and defining methods for generating and parsing responses.
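The pipeline described above can be sketched as a spacy-llm config fragment. This is a minimal illustration, not the chapter's exact configuration: the `max_n_words` parameter and the `spacy.GPT-4.v2` model name are assumptions for the sake of the example, and an OpenAI API key would be required at runtime.

```ini
[nlp]
lang = "en"
pipeline = ["llm"]

[components]

[components.llm]
factory = "llm"

# Built-in summarization task; max_n_words caps the summary length.
[components.llm.task]
@llm_tasks = "spacy.Summarization.v1"
max_n_words = 50

# Model backend (assumed here; any registered @llm_models entry works).
[components.llm.model]
@llm_models = "spacy.GPT-4.v2"
```

Loading this config with `spacy.util.load_config_from_str` (or `assemble` from `spacy_llm.util`) yields a pipeline whose `llm` component writes its result to a custom attribute such as `doc._.summary`.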
In the next chapter, we’ll return to more traditional machine learning and learn how to annotate data and train a pipeline component from scratch using spaCy.