LLM Intro

LLMs are AI models trained on vast amounts of text data using neural networks to predict the next word in a sequence. They can be used for tasks like answering questions or assisting with code by demonstrating a broad but imperfect understanding from their training. To create an assistant model, researchers further train an LLM on a dataset of ideal example conversations consisting of questions and responses collected from people. This finetuning process refines the model's abilities while evaluations aim to identify remaining flaws. Continued data collection, monitoring of user interactions, and periodic retraining helps improve the assistant over time as a human-machine collaborative process.


Basic IDEA

LLMs are a type of AI that can mimic human intelligence by processing and generating text. They are trained on massive datasets of text and code, and they can be used for a variety of tasks, such as answering questions or assisting with code.
Large Language Model (LLM)

Just two files: a ~140GB parameter file plus ~500 lines of C code, enough to run the model on a MacBook.

Training them is more involved.
Think of it like compressing the internet:
a chunk of the internet (~10TB of text), 6,000 GPUs for 12 days, ~$2M, ~1e24 FLOPS, yielding the ~140GB parameter file.

*numbers for Llama 2 70B
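The "two files" picture can be sketched in a few lines: the parameters are just a big array of numbers, and inference is a small loop that repeatedly predicts and appends the next token. Here `predict_next` is a toy stand-in for the real forward pass (the ~500 lines of C in a Llama-style runner); all names and values below are illustrative, not the actual implementation.

```python
import random

# Toy sketch of "a parameter file plus a small inference loop".
# predict_next() stands in for the neural network forward pass.
def predict_next(params, context):
    # A real model computes probabilities from `params`; here we pick
    # pseudo-randomly (seeded for determinism) so the sketch runs.
    vocab = ["the", "cat", "sat", "on", "a", "mat"]
    random.seed(len(context))
    return random.choice(vocab)

def generate(params, prompt, n_tokens=5):
    tokens = prompt.split()
    for _ in range(n_tokens):
        tokens.append(predict_next(params, tokens))
    return " ".join(tokens)

params = [0.1] * 1000  # the real file is ~140GB of such numbers
print(generate(params, "the cat"))
```

The point of the sketch is the asymmetry: the generation loop is tiny, and all of the "knowledge" lives in the parameter blob produced by training.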


Neural Network
Predicts the next word in the sequence.

e.g. given a context of 4 words, "cat sat on a", predict the next word: "mat" (97%)
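The "mat (97%)" prediction can be illustrated with a softmax: the network assigns a score (logit) to every word in its vocabulary, and softmax turns scores into probabilities. The scores below are made up to mirror the slide's numbers.

```python
import math

# Toy next-word prediction: given the 4-word context "cat sat on a",
# turn per-word scores into a probability distribution via softmax.
def softmax(logits):
    exps = {w: math.exp(s) for w, s in logits.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

context = ["cat", "sat", "on", "a"]
logits = {"mat": 6.0, "rug": 2.0, "dog": 1.0, "banana": -1.0}
probs = softmax(logits)
print(max(probs, key=probs.get), round(probs["mat"], 2))  # mat 0.97
```

A real LLM does the same thing, but over a vocabulary of tens of thousands of tokens and with logits computed by the network.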


Next word prediction forces the neural
network to learn a lot about the world:
The network “dreams” internet documents:

Java code dream Amazon product dream (?) Wikipedia article dream
How does it work?
Little is known in full detail…
- Billions of parameters are dispersed through the network
- We know how to iteratively adjust them to make it better at prediction.
- We can measure that this works, but we don’t really know how the
billions of parameters collaborate to do it.
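The "iteratively adjust them to make it better at prediction" step is gradient descent. A one-parameter sketch shows the loop; real training runs the same loop over billions of parameters with a cross-entropy loss (the loss and values here are stand-ins, not the real objective).

```python
# Minimal gradient descent: repeatedly nudge a parameter downhill
# on a loss we can measure, without "understanding" the solution.
def loss(p):
    return (p - 3.0) ** 2  # stand-in loss; minimum at p = 3.0

def grad(p):
    return 2 * (p - 3.0)  # analytic gradient of the stand-in loss

p = 0.0  # initial parameter
for step in range(100):
    p -= 0.1 * grad(p)  # learning rate 0.1

print(round(p, 3))  # converges toward 3.0
```

This mirrors the point in the slide: we can verify the loss goes down, but that measurement tells us nothing about how the adjusted parameters collaborate internally.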

They build and maintain some kind of knowledge


database, but it is a bit strange and imperfect:

Recent viral example: “reversal curse”

Q: “Who is Tom Cruise’s mother”?


A: Mary Lee Pfeiffer ✅
Q: “Who is Mary Lee Pfeiffer’s son?”
A: I don’t know ❌
🤔

=> think of LLMs as mostly inscrutable artifacts, and develop correspondingly sophisticated evaluations.
Training the Assistant
(Just swap the dataset, then continue training)

Example conversation, written by people:
<USER>
Can you write a short introduction about the relevance of the
term "monopsony" in economics? Please use examples related
to potential monopsonies in the labour market and cite relevant
research.

<ASSISTANT>
"Monopsony" refers to a market structure where there is only
one buyer for a particular good or service. In economics, this
term is particularly relevant in the labor market, where a
monopsony employer has significant power over the wages and
working conditions of their employees. The presence of a
monopsony can result in lower wages and reduced employment
opportunities for workers, as the employer has little incentive to
increase wages or provide better working conditions…

~100K conversations
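"Swap the dataset, then continue training" means each ideal conversation is serialized into one training string. The sketch below mirrors the `<USER>`/`<ASSISTANT>` delimiters from the slide; real chat models use their own model-specific special tokens, so treat these delimiters as placeholders.

```python
# Hypothetical sketch of preparing finetuning data: each ideal
# conversation becomes one training string the model is trained on,
# using the same next-word-prediction objective as pretraining.
def format_conversation(turns):
    parts = []
    for role, text in turns:
        parts.append(f"<{role.upper()}>\n{text.strip()}")
    return "\n\n".join(parts)

conversation = [
    ("user", 'Can you help me with this code? print("hello world)'),
    ("assistant", 'You forgot the closing quote: print("hello world")'),
]
sample = format_conversation(conversation)
print(sample.startswith("<USER>"))  # True
```

With ~100K such strings, finetuning is literally the pretraining loop continued on a different, much smaller dataset.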
After finetuning you have an Assistant:

<USER>
Can you help me with this code? It seems like there is a bug.
print("hello world)

<ASSISTANT>
It looks like you forgot to close the string passed to the function
print. You have to add a closing quote to properly terminate the
string. Here is the corrected function:

print("hello world")

Let me know if I can help with anything else!


Summary: how to train your ChatGPT

Stage 1: Pretraining (every ~year)
1. Download ~10TB of text.
2. Get a cluster of ~6,000 GPUs.
3. Compress the text into a neural network, pay ~$2M, wait ~12 days.
4. Obtain base model.

Stage 2: Finetuning (every ~week)
1. Write labeling instructions.
2. Hire people (or use scale.ai!), collect 100K high-quality ideal Q&A responses, and/or comparisons.
3. Finetune base model on this data, wait ~1 day.
4. Obtain assistant model.
5. Run a lot of evaluations.
6. Deploy.
7. Monitor, collect misbehaviors, go to step 1.
The second kind of label: comparisons

It is often much easier to compare Answers instead of writing Answers.

Simple example: it’s much easier to spot a good haiku than it is to generate one.
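Comparison labels are useful because they can train a reward model: each answer gets a scalar score, and the probability that the labeler prefers answer A over answer B is modeled with a logistic (Bradley-Terry) form, as in InstructGPT-style reward modeling. The scores below are made up for illustration.

```python
import math

# Sketch of turning pairwise comparisons into training signal:
# a reward model scores each answer, and the preference probability
# follows a logistic function of the score difference.
def preference_prob(score_a, score_b):
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

good_haiku_score = 2.5  # hypothetical reward-model scores
bad_haiku_score = 0.5
p = preference_prob(good_haiku_score, bad_haiku_score)
print(round(p, 2))  # probability the labeler prefers the good haiku
```

Fitting the scores to match human comparisons is cheaper than collecting ideal written answers, which is exactly why spotting a good haiku beats writing one.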
Labeling instructions

[InstructGPT paper]
Increasingly, labeling is a human-machine collaboration…

- LLMs can reference and follow the labeling instructions just as humans can.
- => LLMs can create drafts, for humans to splice together into a final label.
- => LLMs can review and critique labels based on the instructions.
- …

We can expect a lot more “general capability” across all areas of knowledge:

[Sparks of Artificial General Intelligence: Early experiments with GPT-4, Bubeck et al. 2023]
