Creating End-to-End spaCy Workflows with Weasel
In this chapter, we will explore how to create end-to-end NLP workflows using spaCy and its companion tool, Weasel. Originally part of spaCy, Weasel is now a standalone library, so you can also use it for projects that are not built with spaCy.
Data Version Control (DVC) provides an ecosystem of tools for versioning data and models and for tracking experiments, which makes collaboration easier. By integrating Weasel with DVC, we can version and track the data and models in our projects, keeping them organized and reliable.
We will start by cloning and running a project template with Weasel, following software engineering best practices to ensure a reproducible and well-structured workflow. We will then adapt this template for a different use case and, finally, explore how to use DVC to track and manage trained models, enabling efficient collaboration. This approach will...