Updated training setup
In the previous section, we touched on the problem of fine-tuned language models drifting off-context after generating only a few tokens, often referred to as the alignment issue. This limits a model's ability to stay on task and hurts performance. While fine-tuned models became better at few-shot and zero-shot tasks (refer to the sections on GPT-2 and GPT-3 in Chapter 4), they didn't always reliably produce the desired results. For instance, a model might handle sentiment analysis well in a few-shot setting yet struggle with a task like translation under the same setup.
To address this limitation, Ouyang et al. proposed InstructGPT7 in early 2022. Although similar in architecture to previous GPT models, InstructGPT was significantly smaller, with just 1.3 billion parameters compared to GPT-3's 175 billion. The key innovation lay in two additional training steps: instruction fine-tuning and reinforcement learning from human feedback (RLHF).
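To make the first of these steps concrete, here is a minimal sketch of how instruction fine-tuning data is typically prepared: each (instruction, response) pair becomes one training sequence, and the loss is computed only on the response tokens, with the prompt tokens masked out. The prompt template, the toy tokenizer, and the function names are illustrative assumptions, not the exact setup used for InstructGPT.

```python
# Sketch of supervised instruction fine-tuning data preparation.
# The template and tokenizer below are hypothetical, for illustration only.

PROMPT_TEMPLATE = "Instruction: {instruction}\nResponse: "
IGNORE_INDEX = -100  # label value commonly used to exclude tokens from the loss

def build_example(instruction, response, tokenize):
    """Return (input_ids, labels) for one supervised fine-tuning example."""
    prompt_ids = tokenize(PROMPT_TEMPLATE.format(instruction=instruction))
    response_ids = tokenize(response)
    input_ids = prompt_ids + response_ids
    # Mask the prompt so the model is trained only to produce the response.
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Toy whitespace "tokenizer" mapping words to integer IDs, for illustration.
vocab = {}
def toy_tokenize(text):
    return [vocab.setdefault(tok, len(vocab)) for tok in text.split()]

ids, labels = build_example("Translate to French: cheese", "fromage", toy_tokenize)
assert len(ids) == len(labels)            # one label per input token
assert IGNORE_INDEX in labels             # prompt tokens are masked
assert labels[-1] == ids[-1]              # response tokens keep their IDs
```

In a real training loop these `labels` would feed a cross-entropy loss that skips the masked positions, so gradient updates reflect only how well the model reproduces the demonstrated response.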