
Multi2sim-M2s Simulation Framework

Multi2Sim is an open-source simulation framework that models heterogeneous CPU-GPU systems. It provides functional emulation and detailed architectural simulation of x86 CPUs and of AMD Evergreen GPUs running OpenCL programs. Unlike full-system simulators or simulators that depend on cross-compilers, it is an application-only simulator that runs x86 binaries directly. It supports common research benchmarks and is intended as a tool for evaluating new architectural proposals.

Uploaded by

Neha Jain
The Multi2Sim Simulation Framework



A CPU-GPU Model
for Heterogeneous Computing
www.multi2sim.org

Rafael Ubal
David R. Kaeli
Northeastern University
Boston, MA
The Multi2Sim Simulation Framework, PACT 2011 Tutorial

Outline
1. Introduction
First Block - The x86 CPU Simulation
2. The x86 CPU Emulation
3. The x86 CPU Architectural Simulation
4. The Memory Hierarchy
5. Benchmarks and Simulations
Second Block - The AMD Evergreen GPU Simulation
6. The OpenCL Programming Model
7. The AMD Evergreen GPU Emulation
8. The AMD Evergreen GPU Architectural Simulation
9. Benchmarks and Simulations
10. Conclusions and Future Work

1. Introduction
Motivation
• Limitations of existing CPU simulators
  - Such as SimpleScalar, Simics, SSMT, M-Sim, SMTSim, M5, ...
  - Full-system vs. application-only simulation.
  - Free, open-source.
  - Architectural simulation accuracy.
  - Alpha/PISA architectures, which require cross-compilers.
  - Integrated system.
• Current simulation needs
  - Based on current processor market.
  - Heterogeneous CPU-GPU environments.
  - Tool for evaluation of new architectural proposals.
  - Simulation of a GPU ISA.
• Existing GPU simulation approaches
  - Barra: NVIDIA Tesla ISA.
  - Ocelot: PTX intermediate language simulator.
  - No architectural simulation.
  - No emulation of AMD ISAs.
  - Not capable of heterogeneous simulation.

1. Introduction
Multi2Sim Background
• Multi2Sim 1.x version series, 2007 (MIPS-based)
• Multi2Sim 2.x version series, 2008 (x86-based)
• Multi2Sim 3.x version series, 2011 (x86 + Evergreen)
• Superscalar pipeline
  - Out-of-order execution, branch prediction, trace cache, etc.
• Multithreading
  - Fine-grain, coarse-grain and simultaneous (SMT).
• Multicore architecture
  - Configurable memory hierarchy, cache coherence, interconnection networks.
• State-of-the-art benchmarks
  - Tested support for common research benchmarks, available for download.
• GPU model
  - Support for OpenCL benchmarks.
  - Model for Evergreen ISA.

1. Introduction
Getting Started
• User-friendly installation and test
    $ tar -xzf multi2sim-3.1.tar.gz
    $ cd multi2sim-3.1
    $ ./configure
    $ make
    $ sudo make install
• Application-only simulator
  Original execution:
    $ ./test-args hola que tal
    arg[0] = 'hola'
    arg[1] = 'que'
    arg[2] = 'tal'
  Simulated execution:
    $ m2s ./test-args hola que tal
    <... Simulator output ...>
    arg[0] = 'hola'
    arg[1] = 'que'
    arg[2] = 'tal'
    <... Simulator statistics ...>

1. Introduction
The IniFile Format
• Example of IniFile
    ; This is a comment.
    [ Section 0 ]
    Color = Red
    Height = 40
    [ OtherSection ]
    Variable = Value
Demo 1
• Multi2Sim uses IniFile for
  - Configuration files.
  - Output statistic files.
  - Standard error output.
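
The IniFile layout can be read with a few lines of code. The following is a minimal sketch, not Multi2Sim's own parser: it assumes comments start with `;`, sections are written `[ Name ]`, and entries are `Variable = Value` lines. The function name `parse_inifile` is hypothetical.

```python
def parse_inifile(text):
    """Parse IniFile text into {section: {variable: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()  # tolerate spaces inside the brackets
            sections[current] = {}
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            sections[current][name.strip()] = value.strip()
    return sections


EXAMPLE = """\
; This is a comment.
[ Section 0 ]
Color = Red
Height = 40
[ OtherSection ]
Variable = Value
"""

config = parse_inifile(EXAMPLE)
print(config["Section 0"]["Height"])
```

Values are kept as strings; a statistics reader would convert them to numbers where needed.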

Block 1
The x86 CPU Simulation

2. The CPU Emulation
Definition
• Emulation (a.k.a. functional simulation)
  - Just mimic original behavior of a program.
  - ... as opposed to timing/detailed/architectural simulation.
• Steps
  1) Program loading.
  2) Simulation loop.

2. The CPU Emulation
Program Loading
• Initialization of a process state
  - Virtual memory map.
  - Value of x86 registers.
[Figure: virtual memory map with regions Text, Initialized data, Heap, mmap region (not initialized) and Stack (program arguments, environment variables); address labels 0x08000000, 0x08xxxxxx, 0x40000000, 0xc0000000. Registers eax, ebx, ecx, esp (top of stack) and eip (initialized instruction pointer).]
1) Parse ELF executable
  - ELF sections.
  - Initialized code and data.
2) Initialize stack
  - Program headers.
  - Arguments.
  - Environment variables.
3) Initialize registers
  - Program entry point → eip
  - Stack pointer → esp

2. The CPU Emulation
Simulation Loop
Demo 2
[Figure: emulation loop. Read instr. at eip (instr. bytes) → Decode instruction (instr. fields) → Instr. is int 0x80? Yes: Emulate system call. No: Emulate x86 instr. → Move eip to next instr.]
• Emulation of x86 instructions
  - Update memory map (if needed).
  - Update x86 registers.
  - Example: add [bp+16], 0x5
• Emulation of Linux system calls
  - Analyze system call code and args.
  - Update memory map.
  - Update eax with return value.
  - Example: read(fd, buf, count);
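
The loop described on this slide can be sketched on a toy instruction set. This is an illustration only, not Multi2Sim code: instructions are Python tuples rather than x86 bytes, and the opcodes `mov` and `add_mem`, the handler `toy_syscalls` and the register subset are all hypothetical.

```python
def toy_syscalls(regs, memory):
    """Hypothetical system call handler: the code is taken from eax."""
    if regs["eax"] == 3:       # treated here as a 'read' that always succeeds
        return regs["edx"]     # pretend that 'count' bytes were read
    return -1                  # any other code is reported as an error


def emulate(program, regs, memory, syscall_handler):
    """Functional simulation loop over a toy instruction list."""
    while regs["eip"] < len(program):
        instr = program[regs["eip"]]        # read instr. at eip
        opcode, *operands = instr           # decode into fields
        if opcode == "int" and operands == [0x80]:
            # emulate the system call and place its return value in eax
            regs["eax"] = syscall_handler(regs, memory)
        elif opcode == "mov":
            reg, value = operands
            regs[reg] = value               # update registers
        elif opcode == "add_mem":
            addr, value = operands
            memory[addr] = memory.get(addr, 0) + value   # update memory map
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
        regs["eip"] += 1                    # move eip to next instr.
    return regs, memory


program = [("mov", "eax", 3), ("mov", "edx", 16), ("int", 0x80),
           ("add_mem", 0x1000, 5)]
regs, memory = emulate(program, {"eip": 0, "eax": 0, "edx": 0}, {}, toy_syscalls)
```

After the run, `eax` holds the value returned by the emulated system call and the memory map holds the result of the memory update.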

3. The CPU Architectural Simulation
Definition
• Architectural simulation (a.k.a. detailed/timing simulation)
  - Provides performance results from executing a program on a configurable CPU model.
  - Main performance metric: execution time. But also structures occupancy, cache hit rates, contention points...
[Figure: the architectural simulator (cycle counter, CPU cores model, memory hierarchy model) asks the CPU functional simulator to "run a new x86 instruction"; the functional simulator answers "this is the instr. that was run".]
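
The split between the two simulators can be sketched as follows. This is a deliberately simplified illustration, assuming an in-order core that handles one instruction at a time with a fixed latency per instruction type; the real model is a superscalar pipeline. All names and latency values are hypothetical.

```python
def functional_simulator(program):
    """Yield the instructions of a toy program in execution order."""
    for instr in program:
        yield instr


def architectural_simulation(program, latencies, default_latency=1):
    """Count cycles for a toy in-order core driven by the functional simulator."""
    cycle = 0
    executed = 0
    for instr in functional_simulator(program):          # "run a new instruction"
        cycle += latencies.get(instr, default_latency)   # charge its latency
        executed += 1
    return {"Cycles": cycle, "Instructions": executed}


stats = architectural_simulation(["add", "load", "add", "mul"],
                                 {"load": 3, "mul": 2})
```

The functional simulator decides what runs; the architectural simulator only decides how long it takes.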

3. The CPU Architectural Simulation
The Superscalar Pipeline
Demo 3
[Figure: superscalar pipeline. Stages: Fetch, Decode, Dispatch, Issue, Writeback, Commit. Structures: Instr. Cache, Fetch Queue, Trace Cache, Trace Queue, Uop Queue, Reorder Buffer, Instruction Queue, Load/Store Queue, Register File, FU, Data Cache.]
• Characteristics
  - Speculative execution.
  - Branch prediction.
  - Out-of-order execution.
  - Trace cache.

3. The CPU Architectural Simulation
Multithreaded Processor Model
[Figure: multithreaded processor model. Three replicated pipelines (Fetch, Decode, Dispatch, Issue, Writeback, Commit, with Instr. Cache, Trace Cache, Data Cache, Register File and FU) and a Shared Functional Unit Pool.]
• Multithreading Paradigms
  - Coarse-grain multithreading: thread switch upon long-latency events.
  - Fine-grain multithreading: thread switch at a cycle granularity.
  - Simultaneous multithreading: multiple-thread issuing of instructions.
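
The difference between the three paradigms comes down to which hardware threads may issue in a given cycle. The sketch below is an assumption-laden illustration, not the simulator's policy code: `select_threads`, the `stalled` flags and the `width` limit are hypothetical.

```python
def select_threads(policy, cycle, current, stalled, num_threads, width=2):
    """Return the hardware threads allowed to issue in this cycle."""
    ready = [t for t in range(num_threads) if not stalled[t]]
    if not ready:
        return []
    if policy == "coarse":
        # stay on the current thread until it hits a long-latency event
        if not stalled[current]:
            return [current]
        return [ready[0]]
    if policy == "fine":
        # rotate among threads every cycle, skipping stalled ones
        for offset in range(num_threads):
            t = (cycle + offset) % num_threads
            if not stalled[t]:
                return [t]
    if policy == "smt":
        # several threads issue in the same cycle, up to the issue width
        return ready[:width]
    raise ValueError(f"unknown policy: {policy}")


stalled = [False, True, False]
coarse = select_threads("coarse", 1, 1, stalled, 3)
fine = select_threads("fine", 1, 0, stalled, 3)
smt = select_threads("smt", 1, 0, stalled, 3)
```

With thread 1 stalled, coarse-grain switches away from it, fine-grain skips it in the rotation, and SMT issues from both remaining threads.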

3. The CPU Architectural Simulation
Multicore Processor Model
[Figure: Core 0 and Core 1, each a complete pipeline (Fetch, Decode, Dispatch, Issue, Writeback, Commit, with Instr. Cache, Trace Cache, Data Cache, Register File and FU), connected to a common Memory Hierarchy.]
• Multicore Processor
  - Multiple independent superscalar pipelines.
  - Communication only through memory hierarchy.
Demo 4
• What can we run on it?
  - Multiple single-threaded programs.
  - One (or more) programs spawning child threads.

3. The CPU Architectural Simulation
Definitions
• Core (c-0, c-1, ...)
  - Hardware component with an independent set of superscalar pipelines.
  - Each core may contain several threads.
Demo 4
• Thread (t-0, t-1, ...)
  - Hardware component with a partially independent set of pipeline stages.
• Context (ctx-0, ctx-1, ...)
  - Software thread with independent value for registers (incl. eip).
  - Can be a sequential program or a spawned child context.
• Node
  - Hardware component running a context.
  - Multicore proc.: c0, c1, ... Multithreaded proc.: t0, t1, ...
    Multicore-multithreaded proc.: c0-t0, c0-t1, ...
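
The node naming scheme can be generated mechanically from the number of cores and threads. This is a small sketch under the naming shown on the slide; the function `node_names` is hypothetical, and naming a single-core, single-threaded processor `t0` is an assumption.

```python
def node_names(num_cores, num_threads):
    """List node names for a processor with the given cores and threads per core."""
    if num_cores > 1 and num_threads > 1:
        # multicore-multithreaded processor: one node per (core, thread) pair
        return [f"c{c}-t{t}" for c in range(num_cores) for t in range(num_threads)]
    if num_cores > 1:
        # multicore processor: one node per core
        return [f"c{c}" for c in range(num_cores)]
    # multithreaded (or single-threaded) processor: one node per thread
    return [f"t{t}" for t in range(num_threads)]


print(node_names(2, 2))
```

Each name identifies one place where a software context can run.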

4. Memory Hierarchy
Configuration
• Configuring memory hierarchy
  - Any number of caches organized in any number of levels.
  - Connected through any number of interconnects.
  - A set of 1 or more caches must connect to an interconnect from "above".
    Only one cache -or main memory- connected "below".
• Memory hierarchy entries
  - Each node has two entries to the memory hierarchy:
    Instruction entry + Data entry
  - Several node entries can converge to the same cache (or main memory).
[Figure: several caches connected above an interconnect, with one cache or main memory connected below.]

4. Memory Hierarchy
Configuration
[Figure: Core 0 with nodes c0-t0 and c0-t1, Core 1 with nodes c1-t0 and c1-t1; each node has a Data L1 and an Instr. L1; one L2 Cache per core; Main Memory below both L2 caches.]
• Example
  - 2-core, 2-threaded processor (4 nodes).
  - Each thread has its own private data and instruction L1 caches.
  - L2 caches: shared among threads, private per core, unified for data/instr.
Demo 5
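
The example hierarchy can be written down as a "what sits below what" table. The sketch below is illustrative only and does not use Multi2Sim's configuration format; the component names (`l1-data-c0-t0`, `l2-0`, `main-memory`) and both functions are hypothetical.

```python
def build_hierarchy(num_cores=2, num_threads=2):
    """Return {component: component_below} for the example hierarchy."""
    below = {}
    for c in range(num_cores):
        l2 = f"l2-{c}"
        below[l2] = "main-memory"              # each L2 sits above main memory
        for t in range(num_threads):
            node = f"c{c}-t{t}"
            for kind in ("data", "instr"):
                l1 = f"l1-{kind}-{node}"
                below[(node, kind)] = l1       # node entry into the hierarchy
                below[l1] = l2                 # private L1 above the core's L2
    return below


def access_path(below, node, kind):
    """List the components visited from a node entry down to main memory."""
    path = []
    component = below[(node, kind)]
    while component is not None:
        path.append(component)
        component = below.get(component)       # main memory has nothing below
    return path


hierarchy = build_hierarchy()
path = access_path(hierarchy, "c1-t0", "data")
```

Two threads of the same core reach the same L2, while their L1 caches stay private.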

5. Benchmarks and Simulations
Supported CPU Benchmarks
• Sequential benchmarks
  - SPEC CPU 2000
  - SPEC CPU 2006
  - MediaBench-I
Demo 6
• Parallel benchmarks
  - SPLASH-2
  - PARSEC 2.1
• Availability on website
  - x86 binaries tested on Multi2Sim.
  - List of execution commands.
  - Data files for free-distribution benchmarks.

Block 2
The AMD Evergreen GPU Simulation

6. The OpenCL Programming Model
Introduction
• GPU
  - Massively parallel device.
  - Originally devoted to graphics computations.
  - Now getting popular for general purpose computations (GPGPU).
  - Single-Program Multiple-Data (SPMD) model.
• Major GPU vendors
  - NVIDIA → CUDA programming language.
  - AMD → OpenCL programming language.

6. The OpenCL Programming Model
Vector Addition Example
OpenCL Host Program: vector_add.c
    int main()
    {
        [ ... ]
        clCreateProgramWithSource(..., "vector_add.cl", ...);
        clCreateKernel(..., "vector_add", ...);
        buf1 = clCreateBuffer(..., CL_MEM_READ_ONLY, size, ...);
        buf2 = clCreateBuffer(..., CL_MEM_READ_ONLY, size, ...);
        buf3 = clCreateBuffer(..., CL_MEM_WRITE_ONLY, size, ...);
        clSetKernelArg(..., 0, buf1, ...);
        clSetKernelArg(..., 1, buf2, ...);
        clSetKernelArg(..., 2, buf3, ...);
        clEnqueueNDRangeKernel(...);
        [ ... ]
    }
OpenCL Device Kernel: vector_add.cl
    __kernel void vector_add(
        __read_only __global int *buf1,
        __read_only __global int *buf2,
        __write_only __global int *buf3)
    {
        int id = get_global_id(0);
        buf3[id] = buf1[id] + buf2[id];
    }
x86 executable binary: vector_add
AMD Evergreen kernel binary: vector_add.bin
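
What the kernel launch computes can be mimicked on the host. The following sketch is not OpenCL and not Multi2Sim code: it runs the kernel body once per work-item, sequentially, using plain Python lists as buffers. The helper `enqueue_nd_range` is a hypothetical stand-in for the launch call.

```python
def vector_add(global_id, buf1, buf2, buf3):
    """Body of the kernel for one work-item."""
    buf3[global_id] = buf1[global_id] + buf2[global_id]


def enqueue_nd_range(kernel, global_size, *args):
    """Run the kernel once per work-item (sequentially, unlike a GPU)."""
    for global_id in range(global_size):
        kernel(global_id, *args)


buf1 = [1, 2, 3, 4]
buf2 = [10, 20, 30, 40]
buf3 = [0] * 4
enqueue_nd_range(vector_add, 4, buf1, buf2, buf3)
print(buf3)
```

Each work-item touches only the element selected by its global id, which is why the work-items can run in any order.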

6. The OpenCL Programming Model
OpenCL Software Entities
[Figure: a kernel (__kernel func() { }) is launched as an ND-Range made of work-groups, and each work-group is made of work-items. The ND-Range has global memory, each work-group has local memory (synchronization allowed at this level), and each work-item has private memory.]
• Properties
  - Host program configures ND-Range and Work-group sizes.
  - Only Work-items in the same Work-group can synchronize and share data.
  - Work-groups in ND-Range can execute in any order.
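
The relation between the three entities can be made concrete for a one-dimensional ND-Range. This is a minimal sketch assuming that work-items are numbered consecutively and that the work-group size divides the ND-Range size; the function names are hypothetical.

```python
def decompose(global_id, work_group_size):
    """Split a 1-D global id into (work-group id, local id)."""
    return divmod(global_id, work_group_size)


def can_synchronize(id_a, id_b, work_group_size):
    """Work-items can synchronize only if they share a work-group."""
    return id_a // work_group_size == id_b // work_group_size


print(decompose(10, 4))
```

With a work-group size of 4, work-items 4 and 7 share a work-group, while work-items 3 and 4 do not.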

7. The Evergreen GPU Emulation
The OpenCL Call Stack
[Figure: native vs. simulated execution.
 Native execution: OpenCL host program → OpenCL function call (e.g., clEnqueueNDRangeKernel) → AMD OpenCL library (libOpenCL.so) → system calls (mainly ioctl) → GPU driver. The host program and library are user-space code; the driver is operating system code.
 Simulated execution: OpenCL host program → OpenCL function call → Multi2Sim OpenCL library (m2s-libopencl.so) → special system call (code 325) → GPU emulator. The host program and library form the simulated program; the emulator is part of Multi2Sim.]
• Comparison
  - OpenCL function calls are forwarded to m2s-libopencl.so.
  - Each function is implemented as a system call 325.
  - Multi2Sim emulates GPU after clEnqueueNDRangeKernel.

7. The Evergreen GPU Emulation
Program Loading
• Initialization of device kernel
  - Global memory map (whole ND-Range).
  - Local memories (each work-group).
  - Register files (each work-item).
[Figure: the OpenCL kernel binary (vector_add.bin) is loaded into an ND-Range with a global memory, work-groups with local memories, and work-items with register files.]

7. The Evergreen GPU Emulation
Evergreen Assembly Code
• Structure
  - Main Control Flow (CF) clause.
  - Secondary Arithmetic-Logic (ALU) and Texture (TEX) clauses.
  - ALU instructions are VLIW.
    00 ALU: ADDR(32) CNT(8) KCACHE0(CB1:0-15)
          0  x: LSHL R3.x, R0.x, 1
             w: LSHL ____, R0.x, (0x3).x
             t: MOV R8.x, 1
          1  x: LSHL R3.x, PV1.x, (0x2).x
             y: LSHR R1.y, PV1.z, (0x2).x
             z: ADD_INT ____, KC0[1].x, PV2.x
             t: LSHR R7.x, KC0[3].x, 1
          2  y: LSHR R2.y, PV3.z, (0x2).x
    01 TEX: ADDR(144) CNT(2)
          3  VFETCH R1.x___, R1.y, fc156 MEGA(4)
             FETCH_TYPE(NO_INDEX_OFFSET)
          4  VFETCH R2.x___, R2.y, fc156 MEGA(4)
             FETCH_TYPE(NO_INDEX_OFFSET)
    02 ALU_PUSH_BEFORE: ADDR(47) CNT(3)
          5  x: LDS_WRITE ____, R1.w, R1.x
          6  x: LDS_WRITE ____, R1.x, R2.x
          7  x: PREDNE_INT ____, R7.x, 0.0f
                UPDATE_EXEC_MASK UPDATE_PRED
    03 JUMP POP_CNT(1) ADDR(13)
    04 MEM_RAT_CACHELESS_STORE_RAW:
          RAT(1)[R1].x___, R0, ARRAY_SIZE(4) MARK VPM
[Annotations: CF instruction counter (00 to 04); secondary clause instruction counter (0 to 7); secondary ALU clause; secondary TEX clause.]

7. The Evergreen GPU Emulation
Simulation Loop
[Figure: emulation loop. Read CF instruction (instr. bytes) → Decode instruction (instr. fields) → Instr. is CF? Yes: Emulate (all work-items). No: Start ALU/TEX clause → Read ALU/TEX instr. (instr. bytes) → Decode instruction (instr. fields) → Emulate (all work-items) → End of clause? No: read the next ALU/TEX instr. Yes: Go up.]
• Execution of CF clause
  - Instructions affecting control flow.
  - Synchronization operations.
  - Writes to global memory.
• Secondary ALU clause
  - Arithmetic-logic operations.
  - Accesses to local memory.
• Secondary TEX clause
  - Reads from global memory.
Demo 7
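
The two-level structure of this loop (a main CF clause that launches secondary ALU and TEX clauses) can be sketched on a toy program. This is an illustration under strong simplifications: there is no real control flow, instructions are plain names, and the dictionary layout and `emulate_kernel` are hypothetical.

```python
def emulate_kernel(cf_program, work_items):
    """Walk a toy CF program; return a trace of (clause kind, instr, n_items)."""
    trace = []
    for cf_instr in cf_program:                  # read and decode a CF instr.
        kind = cf_instr["kind"]
        if kind in ("ALU", "TEX"):               # the CF instr. starts a clause
            for instr in cf_instr["clause"]:     # loop until end of clause
                # each secondary instr. is emulated for all work-items
                trace.append((kind, instr, len(work_items)))
        else:
            # plain CF instr., also emulated for all work-items
            trace.append(("CF", cf_instr["op"], len(work_items)))
    return trace


toy_program = [
    {"kind": "ALU", "clause": ["LSHL", "MOV"]},
    {"kind": "TEX", "clause": ["VFETCH"]},
    {"kind": "CF", "op": "JUMP"},
]
trace = emulate_kernel(toy_program, work_items=[0, 1, 2, 3])
```

The outer loop plays the role of the CF instruction counter and the inner loop that of the secondary clause instruction counter.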

8. The GPU Architectural Simulation
AMD Evergreen GPU Architecture
• The GPU Compute Device
  - Pool of pending work-groups (WGs).
  - Set of compute units (CUs).
  - Dispatcher - maps WGs to CUs.
  - Global memory hierarchy.
[Figure: compute device. Pending work-group pool → Work-group dispatcher → Compute Unit 0, Compute Unit 1, ..., Compute Unit N-1 → Global Memory Hierarchy.]
• Compute Unit
  - Pool of pending wavefronts (WFs).
  - Three execution engines.
  - Local memory.
  - Register file.
[Figure: compute unit. Ready Wavefront Pool, CF Engine, ALU Engine (ALU clause), TEX Engine (TEX clause), Register File, Local Memory, global memory (reads), global memory (writes).]

8. The GPU Architectural Simulation
Execution Engines
[Figure: CF engine pipeline. Fetch (one WF): extract a WF from the Ready Wavefront Pool and read instr. bytes from Instruction Memory (CF clause) into per-WF buffers (WF 0, WF 1, ..., WF N-1) → Decode (round-robin) into CF instr. buffers (1 entry per WF) → Execute (round-robin): execute CF instruction, launch secondary ALU clause, or launch secondary TEX clause → Complete: insert the WF into the Ready Wavefront Pool.]
• Control Flow (CF) Engine
  - 4 stages.
  - Extracts one WF from pool at fetch stage.
  - Places a WF back into pool at complete stage.
  - Secondary clauses can be launched at execute stage.

8. The GPU Architectural Simulation
Execution Engines
[Figure: ALU engine pipeline. Fetch (one WF) from Instruction Memory (ALU clauses) → Decode into the VLIW bundle buffer (1 entry) → Read (each SubWF: SubWF 0, SubWF 1, ...) from Register File and Local Memory → Execute on Stream Core 0, Stream Core 1, ..., Stream Core N-1, each with processing elements x, y, z, w, t and several pipeline stages → Write to Register File and Local Memory.]
• Arithmetic-Logic (ALU) Engine
  - 5 stages.
  - WF is split into SubWFs at the read stage.
  - SubWF size is equal to number of available Stream Cores (SCs).
  - Each SC has 5 pipelined processing elements (x, y, z, w, t).
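
The split of a wavefront into subwavefronts is a simple chunking step. The sketch below assumes a wavefront of 64 work-items and 16 stream cores; these two values are assumptions made for the example and are not stated on the slide. The function name is hypothetical.

```python
def split_wavefront(work_items, num_stream_cores):
    """Split a wavefront into subwavefronts of at most num_stream_cores items."""
    return [work_items[i:i + num_stream_cores]
            for i in range(0, len(work_items), num_stream_cores)]


wavefront = list(range(64))                 # assumed wavefront size
subwavefronts = split_wavefront(wavefront, 16)   # assumed number of stream cores
print(len(subwavefronts))
```

Under these assumptions the read stage would process four subwavefronts per wavefront, one after another.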

8. The GPU Architectural Simulation
Execution Engines
[Figure: TEX engine pipeline. Fetch (one WF) from Instruction Memory (TEX clauses) → Decode into the TEX instr. buffer (1 entry) → Read: address from Register File, request to L1 cache (Global Mem.) → Write: data from L1 cache to Register File.]
• Texture (TEX) Engine
  - 4 stages.
  - Global memory reads are issued at read stage.
  - They complete at write stage.

8. The GPU Architectural Simulation
Summary of Work-items Grouping
OpenCL Prog. Model:
• ND-Range
  - Group of all work-items for one kernel launch.
• Work-group
  - Work-items can perform synchronizations.
  - Work-items share a fast-access local memory.
GPU Architecture:
• Wavefront
  - SIMD execution unit.
• Subwavefront
  - Work-items that can be issued to Stream Cores at a time.
Demo 8

9. Benchmarks and Simulations
Supported GPU Benchmarks
• AMD SDK's OpenCL Benchmarks
  - Matrix computations.
  - Financial benchmarks.
  - Sorting algorithms.
  - etc.
• Features
  - Provided in Multi2Sim site as x86 + Evergreen binaries.
  - Command-line can be tuned for different input sizes.
  - Provide both CPU and GPU implementations, with self-check.
Demo 9

10. Conclusions
Simulation Capabilities
• x86 CPU Simulation
  - ISA-level.
  - No need for full-system simulation.
  - Superscalar/multithreaded/multicore.
  - Memory hierarchies and interconnects.
  - State-of-the-art benchmarks.
• AMD Evergreen GPU Simulation
  - ISA-level.
  - First full architectural simulation framework.
  - Realistic GPU pipeline (based on AMD Radeon 5870).
  - Memory hierarchies and interconnects.

10. Conclusions
Additional Material
• The Multi2Sim Guide
  - Complete documentation.
  - "Getting started" sections, with execution examples.
  - Description of CPU and GPU architectural models.
• The Multi2Sim Forum
  - Discussion forum for Multi2Sim users.
• The Multi2Sim Mailing List
  - Announcements of new versions, updated documentation, etc.

10. Conclusions
Future Work
• Extending support for benchmarks
  - Support for the entire OpenCL specification.
  - Support for the entire Evergreen ISA.
  - Support for the complete AMD SDK suite, and other upcoming benchmarks.
• Focus on heterogeneous architectures
  - Model for AMD Fusion.
  - CPU and GPU working concurrently.
  - Supporting/designing benchmarks with heterogeneous processing.
• Maintenance of CPU simulation
  - Issues reported by Multi2Sim users.
  - Stability and support increases day by day.