
Week 6 & 7 Notes

The document appears to discuss various advanced concepts in machine learning and AI, including model architectures, reasoning processes, and techniques like distillation and diffusion. It highlights the importance of specialized models for different domains and the efficiency of smaller models compared to larger ones. Additionally, it touches on embedding techniques for text and image processing.

Uploaded by

cpusingpython
Copyright
© All Rights Reserved

Repo workflow: create repo -> code -> commit -> push
Prompt app (LangChain): tool creation
HTTP
- Request: headers, body
- Response: status code, headers, body

MCP (Model Context Protocol)
- Tools
- Rules

Modalities: text, image, video (e.g. text to image)
Embeddings: a word such as "apple" is mapped to a vector of numbers [ ... ]
DeepSeek model timeline
- 67B (Jan; year illegible): traditional transformer
- v2, 236B: MoE (Mixture of Experts), MLA (Multi-head Latent Attention)
- v3, 671B
- R1: reinforcement learning, reasoning

4 points
- MoE (Mixture of Experts)
- Reinforcement learning for reasoning
- Chain of Thought
- Distillation: run DeepSeek locally (Ollama)
Mixture of Experts
- Specialized expert models per domain (finance, medical, legal knowledge) instead of one general model; Mistral is noted as an example
- Parameters: total = 671B, but only some parameters are activated for a given input
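The "only some parameters are activated" idea can be sketched as top-k routing: a gate scores every expert, and only the k best experts are run. This is a minimal sketch, not how any particular model implements it; the toy experts, the gate scores, and k = 2 are assumptions made up for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalize so the weights of the chosen experts sum to 1.
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    chosen = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in chosen)
    return output, [i for i, _ in chosen]

# Hypothetical toy experts: each "expert" is just a scalar function.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: x * x, lambda x: -x]
gate_scores = [0.1, 2.0, 1.5, -1.0]  # assumed router output for one input

output, active = moe_forward(3.0, experts, gate_scores, k=2)
active_fraction = len(active) / len(experts)  # share of experts actually run
```

Here only 2 of the 4 experts are evaluated, which is the source of the compute saving: the total parameter count stays large while the activated count per input stays small.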
Reinforcement learning
- Agent (makes the decision) takes an Action in the Environment
- Interpreter sends back the new State and a reward or penalty
- Goal: optimize the policy
Model types
- Base model: Transformer architecture, foundational (e.g. GPT-3)
- Base model -> fine-tune on instructions -> instruction-tuned model (chat models, e.g. v3)
- Reasoning fine-tuning -> reasoning model
Example reasoning question: a car covers a distance in a given time. What is its speed? When does it reach?
S = D / T
D = S x T
T = D / S
(S = speed, D = distance, T = time)
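The speed, distance, and time relation can be checked with a few lines of code. The numbers (120 km, 2 hours) and the function names are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def speed_from(distance, time):
    # S = D / T
    return distance / time

def distance_from(speed, time):
    # D = S x T
    return speed * time

def time_from(distance, speed):
    # T = D / S
    return distance / speed

# Hypothetical trip: 120 km covered in 2 hours.
s = speed_from(120, 2)       # km per hour
d = distance_from(s, 2)      # back to km
t = time_from(120, s)        # back to hours
```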

Example: how much carpet does a house need? (carpet area)
Prompt: "Explain step by step"
CoT (Chain of Thought)
- Step 1 -> Step 2 -> ... (e.g. subtract the area that needs no carpet)
- If a step has a mistake: re-evaluate -> better answer
Why it is better
- Mixture of Experts: computation is faster, more efficient
- Reinforcement learning: the model guides itself, self-improvement
- Chain of thought: transparency, better evaluation
- Result: better performance
Distillation
- Problem: models are too big
- Distillation makes them efficient and accessible
- Teacher model: big, powerful, better performing
- Student model: smaller
- Bigger is not better: small and specialized vs. big and general
Example: MNIST data set, digits 0 to 9
- Teacher outputs a probability for every digit: soft labels, sum = 1
- Student has fewer params than the teacher
- Fine-tune the student using the output probabilities from the teacher
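The soft-label idea can be sketched as follows: the teacher's logits are turned into probabilities over the 10 digits, and the student is scored by how far its own probabilities are from them. This is a minimal sketch under assumptions: the logits are invented, the temperature of 2.0 is an arbitrary choice, and KL divergence is one common choice of loss rather than something the notes specify.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature gives a softer (flatter) distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two probability lists of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Soft labels come from the teacher; the student is pushed toward them.
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(soft_targets, student_probs)

# Hypothetical logits for digit classes 0-9 on one image of a "3".
teacher_logits = [0.1, 0.2, 1.0, 6.0, 0.1, 1.5, 0.1, 0.3, 2.0, 0.4]
student_logits = [0.0, 0.0, 0.5, 3.0, 0.0, 0.5, 0.0, 0.0, 1.0, 0.0]

soft_labels = softmax(teacher_logits, temperature=2.0)
loss = distillation_loss(teacher_logits, student_logits)
```

The soft labels carry more information than a single correct digit: here the teacher also signals that a "3" looks somewhat like an "8" or a "5", which is what the student learns from.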
Distilling reasoning
- In place of only the output, the teacher gives its reasoning / thought process (chain of thought)
- Fine-tune (FT) the student on this step-by-step reasoning
- Distillation: efficient, less computation; smaller models specialized for domains; accuracy
- Teacher: powerful, huge model with billions of parameters (general); generates the reasoning behind the answer
- Student: small model with fewer params
Diffusion Model
① Forward pass: add noise to the image, step by step
② Reverse pass: denoise (reverse the steps)
- Train to denoise images, then use the reverse steps to generate
Architecture: U-Net (downsample -> upsample)
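The forward "add noise" pass can be sketched with the closed form used in DDPM-style diffusion, where a noisy sample at step t is a mix of the original image and Gaussian noise. This is a minimal sketch under assumptions: the linear noise schedule, the 1000 steps, and the 4-pixel "image" are illustrative choices, not values from the notes.

```python
import math
import random

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed linear schedule: the noise added per step grows from start to end.
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    # alpha_bar_t = product of (1 - beta_i) for i <= t: how much signal is left.
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps, eps ~ N(0, 1)."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps

rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))
image = [1.0, -1.0, 0.5, 0.0]  # hypothetical 4-pixel "image"

slightly_noisy, _ = add_noise(image, 0, alpha_bars, rng)   # first step
mostly_noise, eps = add_noise(image, 999, alpha_bars, rng) # last step
```

In the common formulation, the denoising network (the U-Net here) is trained to predict `eps` from the noisy sample and the step index, so that the reverse pass can remove the noise step by step.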
Stable Diffusion = VAE + Diffusion
- Standard diffusion: U-Net architecture
- VAE: encoder and decoder; mean and variance, sampling
Text to Image: Embedding
- Tokenize the sentence -> embedding model -> vector
- The embedding captures the meaning of the sentence
- Similar words are placed together
- Example sentence: "Cats and dogs are playing together"
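"Similar words are placed together" can be made concrete with cosine similarity between embedding vectors. The 3-dimensional vectors below are hand-picked, hypothetical values for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # 1.0 means same direction, 0.0 means unrelated (orthogonal) vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "cat" and "dog" point in a similar direction,
# "apple" points elsewhere.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_apple = cosine_similarity(embeddings["cat"], embeddings["apple"])
```

With these made-up vectors, `cat_dog` comes out much higher than `cat_apple`, which is the behaviour the notes describe.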
Text to Image
① Text -> tokenize -> embed using CLIP: the input is captured as a meaningful representation
② Random noise image -> denoise
- Attention: the text embedding gives extra info as K, V; image patches attend to the text
- Tokenize -> encode -> decode
- Uses: classification, generation
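The "text embedding as K, V" step can be sketched as cross-attention: each image patch supplies a query, and the text tokens supply keys and values. This is a minimal sketch under assumptions: the vectors are invented, and real models apply learned projection matrices to produce Q, K, and V, which are left out here.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """queries: one per image patch; keys/values: one per text token."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Scaled dot product between this patch and every text token.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted mix of the text values: the "extra info" for this patch.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical data: 3 image patches, 2 text tokens.
patch_queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_keys = [[4.0, 0.0], [0.0, 4.0]]
text_values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

outputs, weights = cross_attention(patch_queries, text_keys, text_values)
```

The first patch attends mostly to the first text token and the third patch attends to both equally, so each patch picks up the part of the prompt that is relevant to it.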

Image generation tools: DALL-E, Midjourney

Models
- Recognition: encoder
- Generation: GAN, VAE models, LLM, diffusion
- Stable Diffusion = latent diffusion model (VAE + DM)
- Diffusion Transformer: transformer architecture used to generate the image
- CLIP: text embedding