Parallel processing – systolic arrays
The parallel processing approach diverges from the traditional Von Neumann architecture. One such approach is systolic processing, using systolic arrays.
A systolic array is a network of processors that rhythmically compute and pass data through the system. The name comes from an analogy with the biological heart: just as blood is pumped rhythmically, data flows rhythmically out of memory, passes through many processing elements, and returns to memory. A systolic array is thus an example of pipelining combined with parallel computing. The concept was introduced in the 1970s by H. T. Kung and Charles Leiserson at Carnegie Mellon University (CMU), and it later underpinned the iWarp processor, built by Intel together with CMU in 1990.
In a systolic array, a large number of identical, simple processors or processing elements (PEs) are arranged in a regular structure such as a linear or two-dimensional array. Each PE is connected to its neighboring PEs and has a small amount of private storage.

A host station typically connects the network to the outside world.
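To make the rhythmic data flow concrete, below is a minimal, cycle-accurate simulation of a linear systolic array computing a matrix-vector product y = A·x. It is an illustrative sketch written for this article (the function name and data layout are not from any library): each PE holds one element of x, the rows of A are fed in skewed by one cycle per column, and partial sums pulse one PE to the right on every clock tick.

```python
def systolic_matvec(A, x):
    """y = A @ x on a linear systolic array of len(x) PEs.

    PE j holds x[j] stationary. Partial sums move one PE to the
    right per clock tick; A[i][j] is fed into PE j at cycle i + j
    (a skewed schedule), so row i's finished sum leaves the last
    PE at cycle i + len(x) - 1.
    """
    n, m = len(x), len(A)
    y_reg = [0.0] * n            # the partial sum each PE holds
    out = []
    for t in range(m + n - 1):   # run until the pipeline drains
        new_y = [0.0] * n
        for j in range(n):
            i = t - j            # which row of A reaches PE j now
            a_in = A[i][j] if 0 <= i < m else 0.0
            y_in = y_reg[j - 1] if j > 0 else 0.0
            # Every PE fires on the same clock tick: multiply,
            # accumulate, and pass the sum to the right neighbor.
            new_y[j] = y_in + a_in * x[j]
        y_reg = new_y            # one synchronous register update
        if t >= n - 1:           # a finished y value exits each cycle
            out.append(y_reg[n - 1])
    return out


# Example: a 3x3 system, checked against a direct computation.
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = [1, 0, -1]
assert systolic_matvec(A, x) == [-2.0, -2.0, -2.0]
```

Note that no PE ever reads anything but its own state and its immediate neighbor's register: all long-range movement happens only through repeated local hops, one per clock cycle.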
Characteristics:
- Parallel computing: Many computations are carried out simultaneously. Because the array has a decentralized structure, the work is naturally parallel.
- Pipelinability: The array achieves high speed by pipelining; results emerge at a linear rate as data streams through the PEs.
- Synchronous evaluation: Computation is timed by a global clock with fixed-length cycles; on each tick, every PE computes and then passes its data onward (see the sketch after this list).
- Repeatability: Most arrays repeat a single type of PE, with the same interconnection pattern, throughout the network.
- Spatial locality: Cells communicate only over local interconnections.
- Temporal locality: At least one unit of time delay is required to transmit a signal from one cell to another.
- Modularity and regularity: A systolic array consists of modular processing units with homogeneous interconnections, so the network can be extended indefinitely.
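Several of these characteristics (the global clock, purely local neighbor-to-neighbor communication, and the repetition of one simple PE) appear together in the classic two-dimensional, output-stationary matrix-multiplication array. The following sketch is illustrative Python written for this article, not any library's API: it simulates an n×n grid in which PE (i, j) accumulates C[i][j] while values of A flow rightward and values of B flow downward, each input skewed by one cycle per row or column.

```python
def systolic_matmul(A, B):
    """C = A @ B on an n x n grid of PEs (output stationary).

    PE (i, j) keeps a running accumulator for C[i][j]. On every
    global clock tick it multiplies the A value arriving from the
    left by the B value arriving from above, adds the product to
    its accumulator, and latches both values for its right and
    lower neighbors. Row i of A enters skewed by i cycles; column
    j of B enters skewed by j cycles.
    """
    n = len(A)
    acc = [[0.0] * n for _ in range(n)]      # C[i][j], never moves
    a_reg = [[0.0] * n for _ in range(n)]    # A values moving right
    b_reg = [[0.0] * n for _ in range(n)]    # B values moving down

    for t in range(3 * n - 2):               # fill + compute + drain
        new_a = [[0.0] * n for _ in range(n)]
        new_b = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                # Edge PEs take injected inputs; interior PEs read
                # only their immediate neighbors' latched registers.
                a_in = a_reg[i][j - 1] if j > 0 else (
                    A[i][t - i] if 0 <= t - i < n else 0.0)
                b_in = b_reg[i - 1][j] if i > 0 else (
                    B[t - j][j] if 0 <= t - j < n else 0.0)
                acc[i][j] += a_in * b_in     # the PE's only operation
                new_a[i][j], new_b[i][j] = a_in, b_in
        a_reg, b_reg = new_a, new_b          # one synchronous update
    return acc


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert systolic_matmul(A, B) == [[19.0, 22.0], [43.0, 50.0]]
```

Computing the next state of every register before committing any of them mirrors the fixed-length global clock cycle: all PEs observe the same consistent state and update simultaneously.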
Advantages of systolic arrays –
- High degree of parallelism: The arrays exploit a high degree of parallelism and can sustain very high throughput.
- Compact and efficient: They are highly compact, robust, and efficient.
- Simple, regular data and control flow: Data and control move through the array in a simple, regular pattern.
Disadvantages of systolic arrays –
- Specialized and inflexible: They are highly specialized and inflexible in the range of problems they can solve.
- Difficult to build: They are difficult to design and build.
- Expensive: They are expensive.
Systolic Array Applications

| Application | Examples |
| --- | --- |
| Digital Signal Processing | Image and video processing, speech recognition, data compression |
| Neural Networks | Convolutional Neural Networks, Recurrent Neural Networks, Deep Belief Networks |
| Cryptography | Symmetric-key encryption, hash functions |
| Computer Vision | Object detection and recognition, facial recognition, video analytics |
Features of systolic arrays:
- Array of processing elements: A systolic array consists of a large number of identical processing elements arranged in a regular grid. Each element performs a simple operation on its input data and passes the result to its neighbors.
- Pipelined processing: Systolic arrays operate in a pipelined manner, with data flowing through the array sequentially. This yields high throughput and low latency, since each processing element begins working on the next piece of data as soon as it finishes its current task (a timing sketch follows this list).
- Local communication: Processing elements communicate over local connections, which reduces the amount of data transmitted over long distances. This helps lower latency and raise overall system performance.
- Regular structure: Systolic arrays have a regular, predictable structure, which makes them well suited to hardware implementation and makes algorithms for them easier to design and optimize.
- Scalability: A systolic array can be scaled to larger data sets or more complex algorithms by adding more processing elements, making the architecture adaptable to a wide range of applications.
- High parallelism: The large number of processing elements allows high levels of parallelism, which can significantly improve performance on tasks amenable to parallel processing.
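The pipelining and scalability claims can be made quantitative with a simple timing model that matches the linear array sketched earlier (the numbers come from this model, not from hardware measurements): a pipeline of n PEs needs n - 1 cycles to fill, after which one finished result emerges per clock cycle, so enlarging the array adds fill latency but barely changes sustained throughput.

```python
def pipeline_cycles(n_pes, n_results):
    """Total clock cycles for a linear systolic pipeline: an
    (n_pes - 1)-cycle fill latency, then one result per cycle."""
    return (n_pes - 1) + n_results

for n_pes in (4, 16, 64):
    total = pipeline_cycles(n_pes, 1000)
    print(f"{n_pes:3d} PEs: {total} cycles for 1000 results, "
          f"throughput = {1000 / total:.3f} results/cycle")
# 4 PEs take 1003 cycles; even 64 PEs still sustain ~0.94 results/cycle.
```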