LLMs
The authors of InstructGPT demonstrated that instruction tuning and RLHF can significantly improve the alignment and overall utility of language models after the initial pretraining step. The 1.3-billion-parameter InstructGPT model was over 100x smaller than the 175-billion-parameter GPT-3, yet human evaluators preferred its outputs to those of GPT-3. This was followed by the GPT-3.5 series, the basis for the initial release of ChatGPT, which popularized the term large language models.
Since then, LLMs have grown into a broad field of their own, covering most NLP tasks that, as recently as 2021, required specialized models. GPT-3.5 was succeeded by GPT-4 (reportedly about 1.76 trillion parameters, a figure OpenAI has not confirmed), and then by GPT-4o and o1 (as of the time of writing). Across these releases, the models gained larger input/context windows and multimodal capabilities, including support for audio and image input/output. Other notable proprietary models include Google’s Gemini series and Anthropic’s Claude series, which are typically offered as closed-weight APIs due to proprietary and financial...