Parallel Programming
Aaron Bloomfield
CS 415
Fall 2005
Predict weather
Predict spread of SARS
Predict path of hurricanes
Predict oil slick propagation
Model growth of bio-plankton/fisheries
Structural simulations
Predict path of forest fires
Model formation of galaxies
Simulate nuclear explosions
Parallel Computers
Programming model types:
Shared memory
Message passing
[Diagram: message-passing machine — processors P0, P1, ..., Pn, each with its own private memory, connected by a communication interconnect]
MPI, the standard message-passing library: http://www.mpi-forum.org
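The slides do not include message-passing code; as a minimal sketch of the model (not the course's own example), here is an MPI program in C in which process 0 sends one integer to process 1. The value 42 and the two-process setup are assumptions for illustration.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process am I? */
    if (rank == 0) {
        value = 42;                        /* lives in process 0's private memory */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("process 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}

All data movement is explicit: the only way process 1 can see the value is through the send/receive pair.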
[Diagram: shared-memory machine — processors P0, ..., Pn, each with a cache, connected by a shared bus to a global shared memory]
OpenMP
OpenMP: portable shared memory parallelism
Higher-level API for writing portable multithreaded applications
Provides a set of compiler directives and library routines for parallel application programmers
API bindings for Fortran, C, and C++
http://www.OpenMP.org
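As an illustration (not from the slides), a minimal OpenMP loop in C: a single directive asks the compiler to split the iterations among threads. The array names and size are assumptions.

#include <stdio.h>

#define N 1000

int main(void) {
    double a[N], b[N];
    for (int i = 0; i < N; i++)
        b[i] = i;

    /* the directive tells the compiler to divide the iterations among threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * b[i];

    printf("a[%d] = %f\n", N - 1, a[N - 1]);
    return 0;
}

Because memory is shared, no explicit communication is needed; the directive is the whole parallelization.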
Approaches
Parallel Algorithms
Parallel languages
Message passing (low-level)
Parallelizing compilers
Parallel Languages
CSP - Hoare's notation for parallelism as a network of sequential processes exchanging messages.
Occam - Real language based on CSP. Used to program the transputer, mainly in Europe.
Object-Oriented
Concurrent Smalltalk
Threads in Java, Ada; thread libraries for use in C/C++
The C/C++ approach uses a library of parallel routines, as sketched below
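A minimal sketch of the library approach using POSIX threads in C; the worker function and the thread count of four are assumptions for illustration.

#include <pthread.h>
#include <stdio.h>

/* function run by each thread */
void *worker(void *arg) {
    int id = *(int *)arg;
    printf("hello from thread %d\n", id);
    return NULL;
}

int main(void) {
    pthread_t threads[4];
    int ids[4];
    for (int i = 0; i < 4; i++) {
        ids[i] = i;
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);  /* wait for each thread to finish */
    return 0;
}

Everything here is an ordinary library call; the language itself knows nothing about parallelism.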
Functional
NESL, Multilisp
Id & Sisal (more dataflow)
Parallelizing Compilers
Automatically transform a sequential program into
a parallel program.
1. Identify loops whose iterations can be executed in parallel.
2. Often done in stages.
Data Dependences
Flow dependence - RAW (Read-After-Write). A "true" dependence: read a value after it has been written into a variable.
Anti-dependence - WAR (Write-After-Read). Write a new value into a variable after the old value has been read.
Output dependence - WAW (Write-After-Write). Write a new value into a variable and then later write another value into the same variable.
Example
1: A = 90;
2: B = A;
3: C = A + D;
4: A = 5;
Flow (RAW) dependences run from statement 1 to statements 2 and 3 (A is read after being written). Anti-dependences (WAR) run from statements 2 and 3 to statement 4 (A is overwritten after being read). An output dependence (WAW) runs from statement 1 to statement 4 (A is written twice).
Dependencies
A parallelizing compiler must identify loops that do not have dependences BETWEEN ITERATIONS of the loop.
Example:
do I = 1, 1000
A(I) = B(I) + C(I)
D(I) = A(I)
end do
Each iteration reads and writes only its own elements (the flow dependence from A(I) to D(I) stays within an iteration), so there are no dependences between iterations and the loop can be parallelized.
Example
Fork one thread for each processor
Each thread executes the loop:
do I = my_lo, my_hi
A(I) = B(I) + C(I)
D(I) = A(I)
end do
Wait for all threads to finish before proceeding.
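A sketch of this fork/join scheme in C with POSIX threads. The my_lo/my_hi partitioning follows the slide; N = 1000, four threads, and the global arrays are assumptions for illustration.

#include <pthread.h>

#define N 1000
#define NTHREADS 4

double A[N + 1], B[N + 1], C[N + 1], D[N + 1];  /* 1-based indexing, as in the loop */

/* each thread runs iterations my_lo..my_hi of the original loop */
void *chunk(void *arg) {
    int t = *(int *)arg;
    int my_lo = 1 + t * (N / NTHREADS);
    int my_hi = my_lo + (N / NTHREADS) - 1;
    for (int i = my_lo; i <= my_hi; i++) {
        A[i] = B[i] + C[i];
        D[i] = A[i];
    }
    return NULL;
}

int main(void) {
    pthread_t th[NTHREADS];
    int ids[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) {   /* fork one thread per processor */
        ids[t] = t;
        pthread_create(&th[t], NULL, chunk, &ids[t]);
    }
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(th[t], NULL);         /* wait for all threads before proceeding */
    return 0;
}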
Another Example
do I = 1, 1000
A(I) = B(I) + C(I)
D(I) = A(I+1)
end do
This loop has a loop-carried anti-dependence (WAR) on A: iteration I reads A(I+1), which iteration I+1 then overwrites, so the iterations cannot simply be run in parallel.
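One standard fix, sketched here in C with OpenMP (the array sizes and the Aold name are assumptions): snapshot A before the loop so every iteration reads the old A(I+1) from the copy. That removes the anti-dependence, and the iterations become independent.

#define N 1000

double A[N + 2], B[N + 2], C[N + 2], D[N + 2];  /* 1-based indexing, as in the slide */

void parallel_version(void) {
    double Aold[N + 2];
    /* sequential semantics: D[i] must see the OLD value of A[i+1] */
    for (int i = 1; i <= N + 1; i++)
        Aold[i] = A[i];

    /* with the snapshot, no iteration reads anything another iteration writes */
    #pragma omp parallel for
    for (int i = 1; i <= N; i++) {
        A[i] = B[i] + C[i];
        D[i] = Aold[i + 1];
    }
}

The extra copy costs memory and bandwidth, which is exactly the kind of trade-off a parallelizing compiler must weigh.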
Parallel Compilers
Two concerns:
Parallelizing code
The compiler will move code around to uncover parallel operations
Data locality
If a parallel operation has to get data from another processor's memory, that's bad
Distributed computing
Take a big task that has natural parallelism
Split it up among many different computers across a network
Examples: SETI@Home, prime number searches, Google Compute, etc.