LLM Foundations
It might feel like Large Language Models (LLMs) have dominated the AI landscape for a long time, but in reality, it’s only been a couple of years. The AI craze truly took off when OpenAI released ChatGPT in November 2022, reaching a million users within just a week [1]. This was a remarkable feat, especially considering that the closest comparison is Instagram, which took eight weeks to hit a million downloads [2]. The previous pivotal moment in AI came in 2012, when AlexNet won the ImageNet competition [3], though that breakthrough mostly resonated within academic circles.
In this chapter, we will build on our understanding of NLP concepts and explore what sets LLMs apart from the models we’ve discussed so far. Specifically, we will cover:
- A brief recap of transformer architectures
- The LLM training setup and the role of InstructGPT
- Hands-on exercises to apply these learnings
All the code snippets presented in...