Master Thesis Defense

Filipo Novo Mór
Supervisor: Dr. César Augusto Missio Marcon
Co-supervisor: Dr. Andrew Rau-Chaplin
www.filipomor.com

Presentation Outline
• Introduction
• Theoretical Background
• Related Work
• Project Methodology
• Experimental Results
• Conclusions

Introduction
• Some concepts
  • NoC - Network on Chip
  • Tasks
[Figure: a NoC and the application tasks t0 to t5]

Introduction
• The Task Mapping Problem
[Figure: tasks t0 to t5 to be mapped onto the NoC]
• NP-Hard problem!
  • power consumption
  • communication profile
  • execution time

Introduction
• Brute-force algorithms are not feasible for solving NP-Hard problems
• Alternative: use heuristic methods
  • They look for the best solution possible, although there is no guarantee that the global optimum will be found
• Evolutionary Algorithms
  • Differential Evolution (DE)
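To give a sense of why brute force is not feasible, the sketch below counts candidate mappings. The helper `count_mappings` and both placement rules (at most one task per tile, or any number of tasks per tile) are assumptions of this example, not statements taken from the slides.

```python
import math

def count_mappings(num_tasks: int, num_tiles: int, one_task_per_tile: bool = True) -> int:
    """Number of distinct ways to place num_tasks tasks on num_tiles tiles."""
    if one_task_per_tile:
        # Ordered choice of tiles without repetition: N! / (N - T)!
        return math.perm(num_tiles, num_tasks)
    # Each task independently picks any tile: N ** T
    return num_tiles ** num_tasks

# 6 tasks on a 3x3 NoC (9 tiles)
small = count_mappings(6, 9)
# 36 tasks on a 6x6 NoC (36 tiles)
large = count_mappings(36, 36)
print(small, f"{large:.3e}")
```

Even under the more restrictive rule, the count grows factorially with the NoC size, which is what motivates a heuristic search.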

Introduction
• Motivation
  • Previous works
  • Considering the DE features of:
    • Optimization of non-linear problems
    • Simplicity and flexibility of its code
  • Try to find a more efficient task mapping solver using DE

Introduction
• Objective
  • Implement a new elitist strategy on Single Objective DE to efficiently solve the Task Mapping onto NoC Problem

Theoretical Background

• Task Mapping onto NoC Problem
  • Traditional approaches:
    • Partitioning
    • Mapping
[Figure: the tasks of applications App1, App2 and App3 are partitioned into pA1 to pA6, and the partitions are then mapped onto nodes n1 to n6 of the Communication Infrastructure (C.A.M. Marcon, 2005)]

• Task Mapping Algorithms
  • Network Flow Algorithm
  • Shortest Tree Algorithm
  • A* Algorithm
  • Mathematical Inequalities
  • Linear programming
  • Evolutionary Algorithms
  • Genetic programming
  • Simulated Annealing

• Evolutionary Algorithms
[Figure: taxonomy of Searching Techniques. First level: Calculation Based, Evolutionary Algorithms, Enumerative. Below: Single Objective and Multi Objective; Simulated Annealing, Evolutionary Strategies, Genetic Programming, Differential Evolution, Genetic Algorithms.]

[Figure: DE stages: vector initialization, mutation, recombination, selection]

• vector initialization
  • Population is randomly initialized
  • Uniform probability distribution
  • If a preliminary solution is available, the population is built by adding randomly distributed deviations to it
  • Each individual in the population represents a candidate solution

$X_{i,G}, \quad i = \{1, 2, \ldots, NP\}$
[Figure: the Population, holding NP individuals from (solution candidate) 1 to (solution candidate) n]
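The initialization step can be sketched as follows. This is a minimal illustration, assuming that each gene holds the tile index assigned to one task; the function name `init_population` and its parameters are hypothetical.

```python
import random

def init_population(np_size, dim, num_tiles, rng):
    """Return NP individuals; each gene is a tile index drawn uniformly at random."""
    return [[rng.randrange(num_tiles) for _ in range(dim)] for _ in range(np_size)]

rng = random.Random(42)  # fixed seed only to make the example repeatable
population = init_population(np_size=4, dim=5, num_tiles=9, rng=rng)
for individual in population:
    print(individual)
```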

• mutation
  • generates a new mutant vector
  • DE generates a new parameter vector by adding the weighted difference between two population vectors to a third vector

[Figure: plot of the function below (D. Bingham, 2015)]
$f(x) = \sum_{i=1}^{d} x_i^2$

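[Figure: mutation in a two-dimensional search space (axes X1 and X2), showing the vectors $X_{r_1^i}$, $X_{r_2^i}$ and $X_{r_3^i}$, the labels α and δ, and the target vector $V_{i,G}$]
$V_{i,G+1} = X_{r_1^i,G} + F\,(X_{r_2^i,G} - X_{r_3^i,G})$
F: mutation factor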

• mutation
  • the resulting vector is used as a donor in the next step
  • keeps the search moving throughout the solution space
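The mutation rule can be sketched on real-valued vectors as below. Excluding the current index i when drawing r1, r2 and r3 is a common DE convention and an assumption here; the slides do not say how the three vectors are chosen. The function name `mutate` is hypothetical.

```python
import random

def mutate(population, i, f, rng):
    """Build a donor vector: V_i = X_r1 + F * (X_r2 - X_r3)."""
    # Needs at least four individuals so that three distinct ones differ from i.
    candidates = [k for k in range(len(population)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    x1, x2, x3 = population[r1], population[r2], population[r3]
    return [a + f * (b - c) for a, b, c in zip(x1, x2, x3)]

rng = random.Random(0)
population = [[0.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
donor = mutate(population, i=0, f=0.5, rng=rng)
print(donor)
```

In this toy population the three candidates are identical, so the difference term vanishes and the donor equals the base vector.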

• recombination
  • enhances the Population diversity
  • keeps track of good candidate solutions from previous generations

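[Figure: the vectors $V_{i,G+1}$ and $X_{i,G}$, each with D components, are combined into $U_{j,i,G+1}$]
$U_{j,i,G+1} = \begin{cases} V_{j,i,G+1} & \text{if } rand_{j,i} \le CR \\ X_{j,i,G} & \text{if } rand_{j,i} > CR \end{cases}$
$i = 1, 2, \ldots, NP$
$j = 1, 2, \ldots, D$
$V_{i,G+1} \ne X_{i,G}$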
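The recombination rule can be sketched gene by gene as below. The function name `recombine` is hypothetical, and the sketch follows only the rule shown on the slide; many DE implementations additionally force at least one donor gene into the result, which is not done here.

```python
import random

def recombine(target, donor, cr, rng):
    """Take the donor gene when rand_j <= CR, otherwise keep the target gene."""
    return [v if rng.random() <= cr else x for x, v in zip(target, donor)]

rng = random.Random(0)
target = [0, 1, 2, 3, 4]
donor = [5, 6, 7, 8, 9]
trial = recombine(target, donor, cr=0.5, rng=rng)
print(trial)
```

With CR close to 1 the result is mostly donor genes; with CR close to 0 it stays close to the target.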

• selection
  • only the best individuals are kept in the Population

[Figure: selection between $X_{i,G}$ and $U_{j,i,G+1}$ for a slot in the Population]

• DE - complete steps
  A - Population Initialization
  B - Population Evaluation
  C - Select x_{r1,G}, x_{r2,G} and x_{r3,G}
  D - Mutation
  E - Recombination
  F - Evaluate u_{i,G+1}
  G - Is u_{i,G+1} better than x_{i,G}? (yes / no)
  H - Update Population (when the answer is yes)
  I - Select Dominant Solutions from Archive
  The per-individual steps repeat for each individual i in the Population, and the whole loop repeats for n generations.
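The steps above can be put together in a compact single-objective sketch. It is an illustration under stated assumptions, not the thesis implementation: it works on real-valued vectors, minimizes the sum-of-squares function shown earlier, and omits step I because the archive only applies to the multi-objective case. All function and parameter names are hypothetical.

```python
import random

def sphere(x):
    """Sum of squared components (minimum 0 at the origin)."""
    return sum(v * v for v in x)

def differential_evolution(fitness, dim, np_size, generations, f, cr, bounds, seed=0):
    rng = random.Random(seed)
    low, high = bounds
    # A - Population Initialization (uniform random)
    pop = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(np_size)]
    # B - Population Evaluation
    fit = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(np_size):
            # C - Select three distinct individuals other than i
            r1, r2, r3 = rng.sample([k for k in range(np_size) if k != i], 3)
            # D - Mutation
            donor = [pop[r1][j] + f * (pop[r2][j] - pop[r3][j]) for j in range(dim)]
            # E - Recombination
            trial = [donor[j] if rng.random() <= cr else pop[i][j] for j in range(dim)]
            # F - Evaluate the trial vector
            trial_fit = fitness(trial)
            # G / H - Keep the trial only if it is better (minimization)
            if trial_fit < fit[i]:
                pop[i], fit[i] = trial, trial_fit
    best = min(range(np_size), key=lambda k: fit[k])
    return pop[best], fit[best]

best_x, best_f = differential_evolution(
    sphere, dim=5, np_size=20, generations=200, f=0.4, cr=0.5, bounds=(-5.0, 5.0)
)
print(best_f)
```

Because an individual is replaced only by a better trial, the best fitness in the population never gets worse from one generation to the next.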

• Population Evaluation on DE
  • ≅ O(n²)
  • how deep would the impact on the overall performance be?
[Figure: candidate solutions plotted on axes X1 and X2]

[Figure: Dominance Algorithms Execution Time, in milliseconds, versus N (50 to 10000), for M&S, BF Naive and BF Smart]
[Figure: Mishra & Sandeep Dominance Algorithm Execution Time, in milliseconds, versus N (50 to 10000)]
M&S Dominance Algorithm Speedup
N        50   100   500   1000   2000   5000   7500   10000
Speedup  3    5     21    32     63     146    210    287
Tested algorithms:
• Brute Force "Naïve": N², two independent nested loops.
• Brute Force "Smart": N², two dependent nested loops.
• Mishra & Sandeep: heapsort + 1 outer loop with a dynamic variant linked list.
Tested on an i5 CPU with 8 GB RAM, running Kubuntu 14.04. All tests were performed using "nice -20" prioritization.
To generate the data set:
$f_1 = 1 - x^2, \quad x = rand48()$
$f_2 = 1 - x^2, \quad x = rand48()$
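The two brute-force variants can be sketched as below. This is one possible reading of "independent" and "dependent" nested loops, assuming all objectives are minimized; it is not the code that was benchmarked, and the Mishra & Sandeep algorithm is not reproduced. Function names are hypothetical.

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_naive(points):
    """Two independent loops: every point is compared against every other point."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(p)
    return front

def nondominated_smart(points):
    """Dependent loops: the inner loop starts after i and skips known losers."""
    n = len(points)
    dominated = [False] * n
    for i in range(n):
        if dominated[i]:
            continue
        for j in range(i + 1, n):
            if dominated[j]:
                continue
            if dominates(points[i], points[j]):
                dominated[j] = True
            elif dominates(points[j], points[i]):
                dominated[i] = True
                break
    return [p for p, d in zip(points, dominated) if not d]

points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4)]
print(nondominated_naive(points))
print(nondominated_smart(points))
```

Both variants remain quadratic in the worst case; the second one simply avoids comparisons that cannot change the result.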

• Managing the DE archive
  • truncate the archive using the Crowding Distance metric
[Figure: Crowding Distance (Kumar and Kesavan, 2015)]
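Archive truncation by crowding distance can be sketched as follows, using the usual NSGA-II style definition of the metric. Computing the distances once and keeping the least crowded points is a simplification chosen for this example; the slides do not describe the exact truncation procedure. Function names are hypothetical.

```python
def crowding_distances(front):
    """Crowding distance of each point in a front of objective tuples."""
    n = len(front)
    if n == 0:
        return []
    dist = [0.0] * n
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        low, high = front[order[0]][k], front[order[-1]][k]
        # Boundary points are always kept.
        dist[order[0]] = dist[order[-1]] = float("inf")
        if high == low:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            dist[order[pos]] += gap / (high - low)
    return dist

def truncate_archive(front, size):
    """Keep the `size` least crowded points, preserving their original order."""
    dist = crowding_distances(front)
    keep = sorted(range(len(front)), key=lambda i: dist[i], reverse=True)[:size]
    return [front[i] for i in sorted(keep)]

archive = [(0.0, 4.0), (1.0, 3.0), (1.1, 2.9), (2.0, 2.0), (4.0, 0.0)]
print(truncate_archive(archive, size=4))
```

In the example the point (1.1, 2.9) sits in the most crowded region and is the one removed.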

• Simulated Annealing (SA)
[Figure: Simulated Annealing illustration (FCE Frankfurt Consulting Engineers GmbH, 2015)]

• NASA Numerical Aerodynamic Simulation (NAS)
  • CG - Conjugate Gradient, irregular memory access and communication
  • FT - discrete 3D fast Fourier Transform, all-to-all communication
  • IS - Integer Sort, random memory access
  • LU - Lower-Upper Gauss-Seidel solver, large number of short messages
  • MG - Multi-Grid on a sequence of meshes, long- and short-distance communication, memory intensive
These applications were selected because their profiles are based on communication between tasks, which makes them well suited to the purposes of this work.

Related Work

Related Work
• J. R. Ku and S. G. Ku [34]
  • Two phases:
    • Clustered highly communicating tasks into partitions
      • Used the NSGA-II algorithm
    • Mapped these partitions onto NoC processors
      • Tried to keep highly communicating partitions close to each other
      • Used a second version of the NSGA-II algorithm
  • 15% more efficient than the Physical Mapping Algorithm
• C. Deng et al. [41]
  • Changed the classical DE
  • Included a sorting step before chromosome recombination
  • For high-level task graphs, independent of a target hardware architecture

Related Work
• Sen Zhao et al. [45]
  • Proposed a MODE using an adaptive mutation operator
  • The strategy is changed at runtime to try to achieve better solutions on the fly
  • The resulting vector is compared with the whole population, not only with its "parent"
  • Tested using benchmark ZDT functions only
• D. Das, M. Verma and A. Das [58]
  • Hardware/software partitioning problem using DE
  • Objective functions: execution time, area cost and communication cost
  • DE ran 16% faster than PSO
  • The quality of the achieved solutions was not described
• Zhuo Qingqi et al. [51]
  • Solved the Task Mapping problem by combining two evolutionary algorithms (not DE)
  • Parallel approach for searching the solution space
  • MPEG-4 and VOPD (Video Object Plane Decoder) benchmark applications
  • Saves 13% on energy and is 3% more efficient in communication latency

Project Methodology

Project Methodology
[Figure: task mapping step. A task graph with tasks A to F and weighted communication edges is mapped onto a 3x3 NoC whose tiles are numbered 0 to 8.]
individual (chromosomes, one per tile; "-" marks an empty tile):
tile: 0 1 2 3 4 5 6 7 8
task: E A C F B - D - -
resulting task map:
E A C
F B -
D - -
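The decoding of an individual into the task map can be sketched as below, assuming row-major tile numbering as in the figure (tiles 0 to 2 on the first row, 3 to 5 on the second, 6 to 8 on the third). The function name `to_grid` is hypothetical and `None` stands for an empty tile.

```python
def to_grid(individual, width):
    """Split a flat chromosome (one gene per tile) into NoC rows of `width` tiles."""
    return [individual[r:r + width] for r in range(0, len(individual), width)]

individual = ["E", "A", "C", "F", "B", None, "D", None, None]
for row in to_grid(individual, width=3):
    print(" ".join(task or "-" for task in row))
```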

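Project Methodology
Population of four individuals on a 3x3 NoC (tiles 0 to 8; "-" marks an empty tile):
tile:         0 1 2 3 4 5 6 7 8
individual 0: E A C F B - D - -
individual 1: B - - A C - D F E
individual 2: - C - A B E D - F
individual 3: F D A - - - B E C
[Figure: the 3x3 NoC with tiles numbered 0 to 8 and the task graph with tasks A to F]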

Project Methodology
• Data Structures Modelling
Population matrix: one row per individual (Population size, NP) and one column per task (Population Dimension, D):
t0 t1 t2 t3 t4
0  0  3  4  2
1  3  2  4  4
4  2  2  1  0
3  4  0  1  1
[Figure: 3x3 NoC with tiles numbered 0 to 8]
  • D = number of existing tasks
  • Compatible with both SODE and MODE

Project Methodology
• Communication Volume Metric
Manhattan Distance
$M_d = |x_1 - x_2| + |y_1 - y_2|$
[Figure: tasks t0 to t5 and their communication volumes placed on the NoC for two candidate solutions]
Candidate Solution 1
fo3(solution 1) = 10+0+25+20+15 = 70
Candidate Solution 2
fo3(solution 2) = 10+10+10+10+10+10+10+25+20+15 = 130
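A communication cost based on the Manhattan distance can be sketched as below. The cost model (volume multiplied by the number of hops, summed over all edges) is an assumption of this example, since the slides give only the resulting sums. The edge list is the A to F task graph from these slides and the placement is the individual E A C F B - D - - on a 3x3 NoC with row-major tile numbering.

```python
def manhattan(tile_a, tile_b, width):
    """Manhattan distance between two tiles numbered row by row."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)

def communication_cost(edges, placement, width):
    """Sum of volume * hop count over all communicating task pairs (assumed model)."""
    return sum(
        volume * manhattan(placement[src], placement[dst], width)
        for src, dst, volume in edges
    )

edges = [("A", "B", 5), ("A", "C", 5), ("B", "D", 3), ("D", "F", 1),
         ("F", "D", 4), ("C", "E", 5), ("E", "A", 3), ("E", "B", 2)]
placement = {"E": 0, "A": 1, "C": 2, "F": 3, "B": 4, "D": 6}
cost = communication_cost(edges, placement, width=3)
print(cost)
```

Under this model, placing heavily communicating tasks on adjacent tiles directly lowers the cost.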

Project Methodology
• Load Balance Metric
[Figure: tasks t0 to t5 placed on the NoC for Candidate Solution 1 and Candidate Solution 2, with the load values 75, 53, 75, 50, 10 and 65 shown on the slide]
fo2(solution 1) = 29.45
fo2(solution 2) = 54.11
$RMSD = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2}$
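The RMSD formula can be checked with a few lines of code. The load values below are made up for illustration and are not taken from the slides; the function name `rmsd` is hypothetical.

```python
import math

def rmsd(loads):
    """Root mean square deviation of the loads around their mean."""
    mean = sum(loads) / len(loads)
    return math.sqrt(sum((x - mean) ** 2 for x in loads) / len(loads))

balanced = rmsd([5, 5, 5, 5])
uneven = rmsd([2, 4, 4, 4, 5, 5, 7, 9])
print(balanced, uneven)
```

A perfectly balanced mapping gives 0, and the value grows as the load concentrates on fewer tiles.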

Project Methodology
• Modifying DE: rewarding "good" individuals
  • Identify the most communicating tasks
  • Proposal 1: reward individuals that keep the most communicating tasks near each other
  • Proposal 2: try to generate "good" individuals during the mutation or recombination operations

Project Methodology
• Identifying most communicating tasks
[Figure: task graph with tasks A to F and weighted communication edges]
Communication volume per edge:
A, B: 5
A, C: 5
B, D: 3
D, F: 1
F, D: 4
C, E: 5
E, A: 3
E, B: 2
Edges between the same pair of tasks are merged:
A, B: 5
A, C: 5
B, D: 3
D, F: 1+4
C, E: 5
E, A: 3
E, B: 2
For each task, the volume exchanged with its neighbours is summed:
A, B, C, E: 5+5+3 = 13
B, D, A, E: 3+5+2 = 10
D, F, B: 5+3 = 8
C, E, A: 5+5 = 10
E, A, C, B: 3+5+2 = 10
Most communicating tasks: tA, tB, tC and tE
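The bookkeeping above can be reproduced with a short sketch. The function name `task_totals` and the choice of returning the busiest task together with its neighbours are assumptions made for this illustration.

```python
from collections import defaultdict

def task_totals(edges):
    """Merge edges between the same pair of tasks, then sum the volume per task."""
    pair_volume = defaultdict(int)
    for a, b, volume in edges:
        pair_volume[frozenset((a, b))] += volume
    totals = defaultdict(int)
    neighbours = defaultdict(set)
    for pair, volume in pair_volume.items():
        a, b = tuple(pair)
        totals[a] += volume
        totals[b] += volume
        neighbours[a].add(b)
        neighbours[b].add(a)
    return dict(totals), {task: sorted(adj) for task, adj in neighbours.items()}

edges = [("A", "B", 5), ("A", "C", 5), ("B", "D", 3), ("D", "F", 1),
         ("F", "D", 4), ("C", "E", 5), ("E", "A", 3), ("E", "B", 2)]
totals, neighbours = task_totals(edges)
busiest = max(totals, key=totals.get)
group = [busiest] + neighbours[busiest]
print(totals[busiest], group)
```

For this graph the busiest task is A with a total of 13, and the resulting group is A, B, C and E.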

Project Methodology
• Proposal 1
  • Ideal bonus value is 10%
  • Other bonus values tend to stall the evolution (convergence stops)
  • On average, approximately 14% of the solutions in the final Generation had been rewarded

Project Methodology
• Proposal 2
  • Proposal 2 was halted:
    • Convergence stopped after 4 generations on average
    • Too few tasks? Too small a NoC?

Project Methodology
• Validating the DE (Single Objective)
[Figure: test case SO_Proc36_T36_CR0_50_F0_40_Gen1000_Noc6_6_1_Pop20_Test2016061716017308_ft32x1_v2ap01]

Project Methodology
• Validating the DE (Multiple Objective)
  • Function ZDT1
[Figure: ZDT1 (ETH Zürich, 2008)]

Project Methodology
  • Function ZDT2
[Figure: ZDT2 (ETH Zürich, 2008)]
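For reference, the two test functions can be written as below, following their standard definitions for decision variables in [0, 1]. The function names are chosen for this example.

```python
import math

def zdt1(x):
    """ZDT1: convex Pareto front, reached when g = 1."""
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    return f1, g * (1.0 - math.sqrt(f1 / g))

def zdt2(x):
    """ZDT2: non-convex Pareto front, reached when g = 1."""
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)
    return f1, g * (1.0 - (f1 / g) ** 2)

print(zdt1([0.25, 0.0, 0.0]))
print(zdt2([0.5, 0.0, 0.0]))
```

Both functions are minimized in both objectives, and a solver is usually judged by how closely its final front approaches the g = 1 curve.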

Project Methodology
• Hypervolume metric
[Figure: hypervolume metric (Kian Sheng Lim et al., 2013)]
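For two objectives the hypervolume reduces to an area, which can be sketched as below. The example assumes minimization and a user-chosen reference point; the function name `hypervolume_2d` is hypothetical.

```python
def hypervolume_2d(front, reference):
    """Area dominated by a two-objective minimization front, bounded by `reference`."""
    ref_x, ref_y = reference
    # Ignore points that lie beyond the reference point.
    points = sorted(p for p in front if p[0] <= ref_x and p[1] <= ref_y)
    area, best_y = 0.0, ref_y
    for x, y in points:
        if y < best_y:
            # Each improving point adds a new horizontal slab.
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, reference=(4.0, 4.0)))
```

A larger value means the front covers more of the objective space between itself and the reference point.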

Experimental Results

• Single Objective DE
  • NASA NAS applications: IS, CG, FT, MG, LU
  • Each test case was executed at least 30 times
  • Goal: reduce communication volume
Parameters  Range
NP          10 and 20
G           100, 300, 500, 1000, 5000 and 10000
CR          0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9
F           0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9

• Single Objective DE - NASA NAS benchmark

Execution time by benchmark application, NP and G (unit not stated on the slide):
App  NP  G=100  G=300  G=500  G=1000  G=5000  G=10000
CG   10  121    367    615    1233    6078    12194
CG   20  245    735    1226   2462    12124   24049
FT   10  269    812    1336   2684    13596   27099
FT   20  550    1595   2709   5428    26841   53608
IS   10  279    842    1388   2769    13765   27320
IS   20  553    1671   2768   5387    27091   54840
LU   10  123    372    607    1170    6076    11490
LU   20  239    721    1196   2375    11971   24141
MG   10  127    375    629    1265    6079    12402
MG   20  245    742    1253   2474    12173   24000

Average Execution Time by benchmark application
CG    FT     IS     LU    MG
5120  11377  11556  5040  5147

• SODE vs CAFES - NASA NAS benchmark
  • NASA NAS applications: IS, CG, FT, MG, LU
  • Each test case was executed at least 30 times
  • CAFES was set to the best execution parameters found during preparation tests
  • CAFES and SODE used the same formula to calculate the fitness value
  • The comparison focused on the quality of the best candidate solutions
  • The comparison considered the five best candidate solutions of each test case for both algorithms

• SODE vs CAFES - NASA NAS benchmark
SODE vs CAFES, Top 5 Best Solutions - Mean Values
Application  SODE     CAFES
CG           969114   989330
FT           2616473  3020149
IS           1124121  1109178
LU           3858343  2503478
MG           655376   485965
[Figure: SODE vs CAFES, Top 5 Best Solutions - Standard Deviation, for CG, FT, IS, LU and MG. Bar labels as extracted, pairing with the bars not recoverable: 8766, 11537, 9954, 100, 914, 574, 92400, 51726, 14357, 4950]
  • Mean Values: absolute scalar value of the communication volume
  • Standard Deviation: how close the best solutions are to each other
Conclusions

Conclusions
• A new adaptation of the SODE was proposed, rewarding individuals that keep related communicating tasks close to each other
• Tests were executed using the NASA NAS benchmark, showing that our implementation was able to generate feasible solutions.
• Our algorithm was compared to the SA implementation in the CAFES Framework.
  • Our implementation reached better solutions on two of the five benchmark applications and achieved similar results on one. CAFES achieved better solutions on the other two.
• Our implementation proved useful for solving the Task Mapping onto NoC problem, especially for applications whose message exchange profiles are similar to those of the NASA NAS benchmark

2016, August 18th
Thank you!
