
Parallel Data Processing in Java
In this article, we will learn about parallel data processing in Java. Processing data in parallel can improve performance, particularly for large amounts of data. Java provides built-in ways to split work across multiple threads and make full use of multi-core processors.
Different Approaches
The following are the two approaches to parallel data processing in Java covered in this article:
- Using Java Streams API
- Using Arrays.parallelSort()
Why Parallel Processing?
Parallel data processing is essential in scenarios where:
- Large datasets need to be processed quickly.
- CPU-bound computations need more efficient resource utilization.
- Real-time data analysis requires high throughput.
- Applications need to take advantage of multi-core processors.
Using Java Streams API
The 'Stream' interface, introduced in Java 8, is used to manipulate collections of data in a declarative fashion. A stream pipeline can also be executed in parallel without much additional code: a sequential stream can be declaratively turned into a parallel stream.
A parallel stream splits the elements of the data source into multiple chunks. Each chunk is processed by a separate thread, and the partial results are combined at the end. Because the work is divided between multiple threads running on multiple cores, the CPU resources are used efficiently and kept busy. A sequential stream can be converted into a parallel stream by calling the 'parallel()' method on it.
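The conversion described above can be sketched as follows. This is a minimal illustration, not code from the original article: the class name ParallelStreamSketch and its helper methods are hypothetical, and only standard java.util and java.util.stream calls are used. It computes the same sum of squares sequentially, with parallel(), and with the parallelStream() shortcut available on collections.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this illustration.
class ParallelStreamSketch {

    // Sequential pipeline: elements are processed on the calling thread.
    static int sumOfSquaresSequential(List<Integer> numbers) {
        return numbers.stream()
                      .mapToInt(n -> n * n)
                      .sum();
    }

    // Same pipeline, switched to parallel execution by calling parallel().
    static int sumOfSquaresParallel(List<Integer> numbers) {
        return numbers.stream()
                      .parallel()
                      .mapToInt(n -> n * n)
                      .sum();
    }

    // Collections also offer parallelStream(), which returns a stream
    // that is already marked as parallel.
    static int sumOfSquaresParallelStream(List<Integer> numbers) {
        return numbers.parallelStream()
                      .mapToInt(n -> n * n)
                      .sum();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        System.out.println("Sequential: " + sumOfSquaresSequential(numbers));
        System.out.println("parallel(): " + sumOfSquaresParallel(numbers));
        System.out.println("parallelStream(): " + sumOfSquaresParallelStream(numbers));

        // isParallel() reports whether the pipeline would run in parallel.
        System.out.println("Is parallel? " + numbers.stream().parallel().isParallel());
    }
}
```

All three methods return the same value, because summation does not depend on the order in which the chunks are processed. For a list this small the parallel versions are unlikely to be faster; the benefit tends to show up only with large inputs or expensive per-element work.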
Example
Below is an example of parallel sum calculation using streams:
import java.util.stream.Stream;

public class Demo {
   // Sums the numbers 1..n using a parallel stream
   static long sumInParallel(long n) {
      return Stream.iterate(1L, i -> i + 1)   // infinite stream 1, 2, 3, ...
                   .limit(n)                  // keep the first n elements
                   .parallel()                // switch the pipeline to parallel execution
                   .reduce(0L, Long::sum);    // add all elements together
   }

   public static void main(String[] args) {
      long c = sumInParallel(23);
      System.out.println("Sum, when computed in parallel is " + c);
   }
}
Output
Sum, when computed in parallel is 276
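As a variation on the example above, the same sum can be written with a primitive LongStream range. This sketch is not part of the original example, and the class name RangeSumSketch is hypothetical. The motivation is that Stream.iterate() generates each element from the previous one and boxes every value as a Long, which makes the stream hard to split into chunks, whereas LongStream.rangeClosed() describes a fixed range of primitive values that can be divided evenly between threads.

```java
import java.util.stream.LongStream;

// Hypothetical class name, used only for this illustration.
class RangeSumSketch {

    // Sums the numbers 1..n using a primitive range, which splits
    // cleanly into sub-ranges for parallel execution.
    static long sumInParallel(long n) {
        return LongStream.rangeClosed(1, n)
                         .parallel()
                         .sum();
    }

    public static void main(String[] args) {
        System.out.println("Sum, when computed in parallel is " + sumInParallel(23));
    }
}
```

For n = 23 this prints the same result, 276. Whether it is actually faster on a given machine is best confirmed by measuring with a realistic value of n.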
Using Arrays.parallelSort()
Efficient sorting of large arrays is critical in many applications. Java offers Arrays.parallelSort(), which sorts arrays in parallel with the Fork/Join framework internally.
The following are the steps for parallel sorting of an integer array:
- new Random().ints(10, 1, 100).toArray(): Creates an array of 10 random integers from 1 to 99.
- Arrays.parallelSort(values): Sorts the array in place, using multiple threads when the array is large enough to benefit from it.
- Internally, the array is split into sub-arrays that are sorted individually (with Dual-Pivot Quicksort for primitive types) and then merged.
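The steps above can also be tried on a larger input, where parallel sorting is more likely to pay off. The following is a minimal sketch under stated assumptions: the class name ParallelSortSketch, the helper methods, the array size of one million, and the fixed seed 42 are all illustrative choices and do not come from the original article. It checks the parallel result against Arrays.sort() and also shows the overload of Arrays.parallelSort() that takes a Comparator for object arrays.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical class name, used only for this illustration.
class ParallelSortSketch {

    // Returns a sorted copy of the input, leaving the original untouched.
    static int[] sortedCopy(int[] input) {
        int[] copy = Arrays.copyOf(input, input.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    // Sorts a copy of an object array with a custom ordering (by string length).
    static String[] sortedByLength(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.parallelSort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        // Assumed size and seed; a fixed seed makes the run repeatable.
        int[] large = new Random(42).ints(1_000_000, 0, 1_000_000).toArray();

        // Reference result from the sequential sort.
        int[] expected = Arrays.copyOf(large, large.length);
        Arrays.sort(expected);

        System.out.println("Matches Arrays.sort: "
                + Arrays.equals(expected, sortedCopy(large)));

        String[] words = {"stream", "fork", "join", "parallel", "jvm"};
        System.out.println("By length: " + Arrays.toString(sortedByLength(words)));
    }
}
```

Sorting a copy keeps the original data available for comparison. In real code, Arrays.parallelSort() is usually called directly on the array, as in the example that follows.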
Example
Below is an example of parallel sorting of an integer array:
import java.util.Arrays;
import java.util.Random;

public class ParallelSortExample {
   public static void main(String[] args) {
      // Create an array of 10 random integers from 1 to 99
      int[] values = new Random().ints(10, 1, 100).toArray();
      System.out.println("Before sorting: " + Arrays.toString(values));

      // Sort the array in place
      Arrays.parallelSort(values);
      System.out.println("After parallel sorting: " + Arrays.toString(values));
   }
}
Output
Before sorting: [45, 94, 85, 66, 73, 62, 57, 94, 92, 62]
After parallel sorting: [45, 57, 62, 62, 66, 73, 85, 92, 94, 94]
Conclusion
Parallel data processing in Java can be achieved by using parallel streams for computation and Arrays.parallelSort() for sorting. Use parallel streams for operations such as summation, mapping, and filtering on large data sets, and use Arrays.parallelSort() to sort large arrays efficiently.