LLM Intro
~140GB file
Next-word prediction example: "cat sat on a" → "mat" (97%)
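A minimal sketch of what that prediction looks like in code; the tiny vocabulary and hard-coded scores are illustrative stand-ins for what a real network computes with its billions of parameters:

# Toy sketch of next-word prediction (illustrative numbers, not a real model).
import numpy as np

vocab = ["mat", "moon", "rug", "sofa"]         # tiny stand-in vocabulary
context = ["cat", "sat", "on", "a"]            # words seen so far

def model(context):
    # A real LLM produces these scores from billions of parameters;
    # here they are simply hard-coded for illustration.
    return np.array([3.5, 0.1, 0.3, 0.2])

scores = model(context)
probs = np.exp(scores) / np.exp(scores).sum()  # softmax turns scores into probabilities
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.0%}")                  # the highest probability goes to "mat"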
[Examples of network "dreams" (generated samples): Java code dream, Amazon product dream (?), Wikipedia article dream]
How does it work?
Little is known in full detail…
- Billions of parameters are dispersed through the network
- We know how to iteratively adjust them to make it better at prediction.
- We can measure that this works, but we don’t really know how the billions of parameters collaborate to do it.
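A rough sketch of that iterative adjustment, written as a PyTorch-style training loop; the tiny linear model and random batches below are placeholders, not any actual LLM or dataset:

# Sketch of "adjust the parameters to get better at next-word prediction".
import torch
import torch.nn.functional as F

vocab_size = 50_000
model = torch.nn.Linear(1024, vocab_size)               # stand-in for a billions-of-parameters network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def get_batch(batch_size=8):
    # Placeholder for streaming real text; returns random contexts and next-word targets.
    x = torch.randn(batch_size, 1024)
    y = torch.randint(0, vocab_size, (batch_size,))
    return x, y

for step in range(100):                                 # real training runs for far longer
    contexts, next_words = get_batch()
    logits = model(contexts)                            # predicted scores for the next word
    loss = F.cross_entropy(logits, next_words)          # how wrong the predictions are
    optimizer.zero_grad()
    loss.backward()                                     # how each parameter should change
    optimizer.step()                                    # tiny adjustment to every parameter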
<USER>
Can you write a short introduction about the relevance of the
term "monopsony" in economics? Please use examples related
to potential monopsonies in the labour market and cite relevant
research.
<ASSISTANT>
"Monopsony" refers to a market structure where there is only
one buyer for a particular good or service. In economics, this
term is particularly relevant in the labor market, where a
monopsony employer has significant power over the wages and
working conditions of their employees. The presence of a
monopsony can result in lower wages and reduced employment
opportunities for workers, as the employer has little incentive to
increase wages or provide better working conditions…
~100K conversations
After finetuning you have an Assistant:
<USER>
Can you help me with this code? It seems like there is a bug.
print("hello world)
<ASSISTANT>
It looks like you forgot to close the string passed to the function
print. You have to add a closing quote to properly terminate the
string. Here is the corrected function:
print("hello world")
Stage 1: Pretraining (every ~year)
1. Download ~10TB of text.
2. Get a cluster of ~6,000 GPUs.
3. Compress the text into a neural network, pay ~$2M, wait ~12 days.
4. Obtain base model.
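A back-of-the-envelope check on those numbers; the GPU count, duration, and ~$2M figure come from the steps above, while the per-GPU-hour rate is an assumed illustrative price:

# Rough cost arithmetic for the pretraining run described above.
gpus = 6_000                    # step 2
days = 12                       # step 3
usd_per_gpu_hour = 1.0          # assumed illustrative rental rate, not a quoted price

gpu_hours = gpus * days * 24    # ~1.7M GPU-hours
cost = gpu_hours * usd_per_gpu_hour
print(f"{gpu_hours:,} GPU-hours -> ~${cost / 1e6:.1f}M")  # same order as the ~$2M above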
Stage 2: Finetuning (every ~week)
1. Write labeling instructions.
2. Hire people (or use scale.ai!), collect 100K high-quality ideal Q&A responses, and/or comparisons.
3. Finetune base model on this data, wait ~1 day.
4. Obtain assistant model.
5. Run a lot of evaluations.
6. Deploy.
7. Monitor, collect misbehaviors, go to step 1.
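A compressed sketch of step 3; the toy byte-level tokenizer, tiny model, and two-field conversation format are stand-ins, since the actual base model, tokenizer, and data pipeline are not specified here:

# Sketch of finetuning: same next-word-prediction objective as pretraining,
# but the data is now ~100K curated question -> ideal-answer conversations.
import torch
import torch.nn.functional as F

vocab_size = 256
base_model = torch.nn.Sequential(                        # stand-in for the pretrained base model
    torch.nn.Embedding(vocab_size, 64),
    torch.nn.Linear(64, vocab_size),
)

def tokenize(text):
    return torch.tensor(list(text.encode("utf-8")))      # toy byte-level tokenizer

conversations = [
    {"user": "Can you help me with this code? ...",
     "assistant": "It looks like you forgot to close the string ..."},
    # ... ~100K high-quality examples in total
]

optimizer = torch.optim.AdamW(base_model.parameters(), lr=1e-5)
for convo in conversations:
    tokens = tokenize(convo["user"] + convo["assistant"])
    logits = base_model(tokens[:-1])                      # predict each next token of the dialogue
    loss = F.cross_entropy(logits, tokens[1:])            # only the data changed, not the objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()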
The second kind of label: comparisons
Simple example: it’s much easier to spot a good haiku than it is to generate one:
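A sketch of how such comparison labels are commonly turned into a training signal (a pairwise reward-model loss, as in the InstructGPT-style setup); the two scores below are placeholder numbers:

# Comparison label: the labeler says haiku A is better than haiku B.
# A reward model is trained so its score for A ends up higher than for B.
import torch
import torch.nn.functional as F

score_preferred = torch.tensor(1.2, requires_grad=True)  # reward model's score for the chosen haiku
score_rejected = torch.tensor(0.4, requires_grad=True)   # score for the rejected haiku

loss = -F.logsigmoid(score_preferred - score_rejected)   # small when the ranking is already correct
loss.backward()                                          # pushes the scores apart in the right direction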
Labeling instructions
[InstructGPT paper]
Increasingly, labeling is a human-machine collaboration…
- LLMs can reference and follow the labeling instructions just as humans can.
- => LLMs can create drafts, for humans to slice together into a final label.
- => LLMs can review and critique labels based on the instructions.
-…
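A very rough sketch of that collaboration loop; llm_draft, llm_critique, and human_review below are hypothetical helpers, standing in for whatever model calls and labeling tools are actually used:

# Hypothetical human-machine labeling loop.
def llm_draft(prompt, instructions):
    return "draft label for: " + prompt                   # stand-in for an LLM call

def llm_critique(draft, instructions):
    return "critique of: " + draft                        # stand-in for an LLM call

def human_review(draft, critique, instructions):
    return draft                                          # the human edits/approves the final label

def label_example(prompt, instructions):
    draft = llm_draft(prompt, instructions)               # LLM creates a draft
    critique = llm_critique(draft, instructions)          # LLM reviews it against the instructions
    return human_review(draft, critique, instructions)    # human makes the final call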
We can expect a lot more “general capability” across all areas of knowledge:
[Sparks of Artificial General Intelligence: Early experiments with GPT-4, Bubeck et al., 2023]