Fork/Join Parallelism in Java
Doug Lea
State University of New York at Oswego
http://gee.cs.oswego.edu
Outline
Fork/Join Parallel Decomposition
A Fork/Join Framework
Empirical Results
Parallel Decomposition
Goal: Minimize service times by exploiting parallelism
Approach:
FORK:
  for each part p
    create and start task to process p;
JOIN:
  for each task t
    wait for t to complete;
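The FORK and JOIN steps above can be sketched with plain java.lang.Thread objects. This is only an illustration, not the framework described in these slides: the class name ForkJoinWithThreads, the method sumInParts, and the choice of summing an array are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: sum an array by giving each part to its own thread.
class ForkJoinWithThreads {
    static long sumInParts(int[] data, int parts) {
        long[] partial = new long[parts];               // one result slot per task
        List<Thread> tasks = new ArrayList<>();
        int chunk = (data.length + parts - 1) / parts;  // part size, rounded up

        // FORK: for each part p, create and start a task to process p
        for (int p = 0; p < parts; p++) {
            final int id = p;
            final int lo = Math.min(data.length, p * chunk);
            final int hi = Math.min(data.length, lo + chunk);
            Thread t = new Thread(() -> {
                long s = 0;
                for (int i = lo; i < hi; i++) s += data[i];
                partial[id] = s;
            });
            tasks.add(t);
            t.start();
        }

        // JOIN: for each task t, wait for t to complete
        try {
            for (Thread t : tasks) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Compose: join() makes each task's write to partial[] visible here
        long total = 0;
        for (long s : partial) total += s;
        return total;
    }

    public static void main(String[] args) {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        System.out.println(sumInParts(data, 4)); // 1 + 2 + ... + 1000 = 500500
    }
}
```

One thread per part is the simplest mapping, but thread creation is expensive compared with the small tasks a fork/join program usually produces, which is one reason to run tasks on a pool of workers instead.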
[Diagram: Main, task ... task, and workers]
serve() {
  split;
  fork;
  join;
  compose;
}
[Diagram: Producer and Consumer]
Consumer:
while (!Thread.interrupted()) {
  task = channel.take();
  process(task);
}
Worker Thread Example
interface Channel { // buffer, queue, stream, etc.
  void put(Object x);
  Object take();
}
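A minimal sketch of the worker thread design, assuming java.util.concurrent.BlockingQueue stands in for the Channel interface (its put and take methods play the same roles). The class name WorkerThreadDemo, the method runTasks, and the latch-based shutdown are hypothetical choices for this example only.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

class WorkerThreadDemo {
    // Runs taskCount trivial tasks on a single worker; returns how many ran.
    static int runTasks(int taskCount) {
        BlockingQueue<Runnable> channel = new LinkedBlockingQueue<>(); // assumed Channel stand-in
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskCount);

        // Consumer: the worker loop
        Thread worker = new Thread(() -> {
            try {
                while (!Thread.interrupted()) {
                    Runnable task = channel.take(); // blocks until a task arrives
                    task.run();                     // process(task)
                }
            } catch (InterruptedException e) {
                // interrupted while blocked in take(): treat as a shutdown request
            }
        });
        worker.start();

        try {
            // Producer: hand tasks to the worker through the channel
            for (int i = 0; i < taskCount; i++) {
                channel.put(() -> { processed.incrementAndGet(); done.countDown(); });
            }
            done.await();       // wait until every task has run
            worker.interrupt(); // then ask the worker to stop
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(5)); // prints 5
    }
}
```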
• Several variants
• Shown to scale on stock MP hardware
Leads to very portable application code
Typically, the only platform-dependent parameters are:
• Number of worker threads
• Problem threshold size for using sequential solution
[Diagram labels: worker, fork, dequeue, steal, exec, running, yielding]
Recursive Decomposition
Typical algorithm:
Result solve(Problem problem) {
  if (problem is small)
    return directlySolve(problem);
  else {
    in-parallel {
      Result l = solve(leftHalf(problem));
      Result r = solve(rightHalf(problem));
    }
    return combine(l, r);
  }
}
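The pattern above can be sketched with the fork/join classes that ship in java.util.concurrent (ForkJoinPool and RecursiveTask), which are a stand-in here and not the Task framework these slides describe. The class SumTask, its THRESHOLD value, and the array-sum problem are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical problem: sum the elements of data[lo, hi).
class SumTask extends RecursiveTask<Long> {
    static final int THRESHOLD = 1_000; // assumed cutoff; tune empirically
    final int[] data;
    final int lo, hi;

    SumTask(int[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {          // problem is small: solve directly
            long s = 0;
            for (int i = lo; i < hi; i++) s += data[i];
            return s;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                         // start the left half asynchronously
        long r = right.compute();            // solve the right half in this thread
        long l = left.join();                // wait for the left half
        return l + r;                        // combine(l, r)
    }

    static long sum(int[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    public static void main(String[] args) {
        int[] data = new int[100_000];
        Arrays.fill(data, 2);
        System.out.println(sum(data)); // prints 200000
    }
}
```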
Why?
int seqFib(int n) {
  if (n <= 1)
    return n;
  else
    return seqFib(n-1) + seqFib(n-2);
}
To parallelize:
• Replace function with Task subclass
— Hold arguments/results as instance vars
— Define run() method to do the computation
• Replace recursive calls with fork/join Task mechanics
— Task.coinvoke is convenient here
• But rely on sequential version for small values of n
Threshold value usually an empirical tuning constant
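The steps listed above can be sketched as follows. This version assumes java.util.concurrent.RecursiveAction in place of the Task class, with compute() playing the part of run() and ForkJoinTask.invokeAll playing the part of Task.coinvoke. The class name FibTask and the value of SEQ_THRESHOLD are hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class FibTask extends RecursiveAction {
    static final int SEQ_THRESHOLD = 13; // assumed value; tune empirically
    final int n;   // argument held as an instance variable
    int result;    // result held as an instance variable

    FibTask(int n) { this.n = n; }

    static int seqFib(int n) {
        return (n <= 1) ? n : seqFib(n - 1) + seqFib(n - 2);
    }

    @Override
    protected void compute() {
        if (n <= SEQ_THRESHOLD) {        // small n: rely on the sequential version
            result = seqFib(n);
            return;
        }
        FibTask f1 = new FibTask(n - 1);
        FibTask f2 = new FibTask(n - 2);
        invokeAll(f1, f2);               // fork both subtasks and wait for both
        result = f1.result + f2.result;
    }

    static int fib(int n) {
        FibTask t = new FibTask(n);
        ForkJoinPool.commonPool().invoke(t);
        return t.result;
    }

    public static void main(String[] args) {
        System.out.println(fib(30)); // prints 832040
    }
}
```

Setting the threshold too low floods the pool with tiny tasks; setting it too high leaves workers idle, so the value is best chosen by measurement on the target machine.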
Class Fib
[Diagram: Fib subtasks f(3), f(2), f(1), f(0)]
— The local LIFO rule is the same as, and not much slower than, recursive procedure calls
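A single-threaded sketch of the ordering on one worker's queue. It assumes the usual work-stealing arrangement, in which the owning worker pushes and pops at one end (the local LIFO rule) while a stealing worker takes the oldest task from the other end. The surviving text states only the LIFO rule, so the stealing end, the class name DequeOrderDemo, and the use of java.util.ArrayDeque are assumptions; a real scheduler needs a concurrent deque.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeOrderDemo {
    static String demo() {
        Deque<String> deque = new ArrayDeque<>();

        // The owner forks three tasks; each fork pushes onto the top of its deque.
        deque.push("f(3)");
        deque.push("f(2)");
        deque.push("f(1)");

        // Assumed stealing rule: a thief removes the oldest task from the bottom.
        String stolen = deque.pollLast();

        // Local LIFO rule: the owner pops the most recently forked task,
        // the same order a recursive procedure call would follow.
        String local = deque.pop();

        return "stolen=" + stolen + " local=" + local;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints stolen=f(3) local=f(1)
    }
}
```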
Jacobi example
  Leaf(double[][] A, double[][] B,
       int loRow, int hiRow,
       int loCol, int hiCol) {
    this.A = A; this.B = B;
    this.loRow = loRow; this.hiRow = hiRow;
    this.loCol = loCol; this.hiCol = hiCol;
  }
  public synchronized void run() {
    // Alternate direction each step: even steps read A and write B, odd steps the reverse
    boolean AtoB = (steps++ % 2) == 0;
    double[][] a = (AtoB) ? A : B;
    double[][] b = (AtoB) ? B : A;
    for (int i = loRow; i <= hiRow; ++i) {
      for (int j = loCol; j <= hiCol; ++j) {
        // New value is the average of the four neighbors
        b[i][j] = 0.25 * (a[i-1][j] + a[i][j-1] +
                          a[i+1][j] + a[i][j+1]);
        // Track the largest change seen in this block
        double diff = Math.abs(b[i][j] - a[i][j]);
        maxDiff = Math.max(maxDiff, diff);
      }
    }
  }
}
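To see what one step of run() computes, here is a self-contained sequential sketch of the same update on a tiny grid. The class JacobiSweep, the helper grid(), and the 4x4 boundary values are hypothetical; only the averaging rule and the maximum-difference tracking come from the code above.

```java
import java.util.Arrays;

class JacobiSweep {
    // One Jacobi step over rows loRow..hiRow and columns loCol..hiCol (inclusive):
    // reads a, writes b, and returns the largest change seen in the block.
    static double step(double[][] a, double[][] b,
                       int loRow, int hiRow, int loCol, int hiCol) {
        double maxDiff = 0.0;
        for (int i = loRow; i <= hiRow; ++i) {
            for (int j = loCol; j <= hiCol; ++j) {
                b[i][j] = 0.25 * (a[i-1][j] + a[i][j-1] + a[i+1][j] + a[i][j+1]);
                maxDiff = Math.max(maxDiff, Math.abs(b[i][j] - a[i][j]));
            }
        }
        return maxDiff;
    }

    // Hypothetical 4x4 grid: top edge held at 1.0, everything else 0.0.
    static double[][] grid() {
        double[][] g = new double[4][4];
        Arrays.fill(g[0], 1.0);
        return g;
    }

    public static void main(String[] args) {
        double[][] a = grid(), b = grid();
        System.out.println(step(a, b, 1, 2, 1, 2)); // prints 0.25
        System.out.println(step(b, a, 1, 2, 1, 2)); // the arrays swap roles on the next step
    }
}
```

The block bounds must stay one cell inside the array edges, because each update reads the neighbors at i-1, i+1, j-1, and j+1.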
Driver
class Driver extends Task {
  final Tree root; final int maxSteps;
  Driver(double[][] A, double[][] B,
         int firstRow, int lastRow,
• Fib
• Matrix multiplication
• Integration
• LU decomposition
• Jacobi
• Sorting
Speedups
[Figure: speedup (axis 0 to 30) versus number of threads (1 to 30) for Fib, Micro, Integ, MM, LU, Jacobi, and Sort, plotted against the ideal line.]
Times
[Figure: running time in seconds (axis 0 to 700) for Fib, Micro, Integ, MM, LU, Jacobi, and Sort.]
Task rates
[Figure: tasks per second per thread (axis 0 to 120000) for Fib, Micro, Integrate, MM, LU, Jacobi, and Sort.]
GC Effects: Fib
[Figure: speedup (axis 0 to 30) versus number of threads (1 to 30) for Fib-64m, Fib-4m, and Fib-scaled, plotted against the ideal line.]
Memory bandwidth effects: Sorting
[Figure: speedup (axis 0 to 30) versus number of threads (1 to 30) for sorting Bytes, Shorts, Ints, and Longs, plotted against the ideal line.]
Sync Effects: Jacobi
[Figure: speedup (axis 0 to 30) versus number of threads (1 to 30) for Jacobi with 1step/sync and 10steps/sync, plotted against the ideal line.]
Locality effects
[Figure: proportion of tasks stolen (axis 0 to 0.225) versus number of threads (1 to 30) for Fib, Micro, Integrate, MM, LU, Jacobi, and Sort.]
Other Frameworks
[Figure: running time in seconds (axis 0 to 8) for FJTask, Cilk, Hood, and Filaments on Fib, MM, Sort, LU, Integ, and Jacobi.]