
// Creating recipes with optional number of calories:


Recipe recipe0 = new Recipe("Mahi-mahi", optNOC0);
Recipe recipe1 = new Recipe("Loco moco", optNOC1);

// Querying an Optional:
// System.out.println(recipe1.getCalories()
// .getAsInt()); // NoSuchElementException
System.out.println((recipe1.getCalories().isPresent()
? recipe1.getCalories().getAsInt()
: "Unknown calories.")); // Unknown calories.

recipe0.getCalories().ifPresent(s -> System.out.println(s + " calories."));


System.out.println(recipe0.getCalories().orElse(0) + " calories.");
System.out.println(recipe1.getCalories().orElseGet(() -> 0) + " calories.");
// int noc = recipe1.getCalories() // RuntimeException
// .orElseThrow(() -> new RuntimeException("Unknown calories."));
}
}

Output from the program:


Unknown calories.
3500 calories.
3500 calories.
0 calories.

16.7 Terminal Stream Operations


A stream pipeline does not execute until a terminal operation is invoked on it; that
is, a stream pipeline does not start to process the stream elements until a terminal
operation is initiated. A terminal operation is said to be eager as it executes imme-
diately when invoked—as opposed to an intermediate operation which is lazy.
Invoking the terminal operation causes the intermediate operations of the
stream pipeline to be executed. Understandably, a terminal operation is specified
as the last operation in a stream pipeline, and there can be only one such operation
in a stream pipeline. A terminal operation never returns a stream; returning a
stream is always done by an intermediate operation. Once the terminal operation
completes, the stream is consumed and cannot be reused.
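For example, the following sketch (not one of the chapter's example programs) shows that attempting to reuse a stream after its terminal operation has executed fails at runtime:

Stream<CD> cds = CD.cdList.stream();
long numOfCDs = cds.count();          // OK: the terminal operation consumes the stream.
// cds.forEach(System.out::println);  // IllegalStateException: stream has already been
                                      //   operated upon or closed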
Terminal operations can be broadly grouped into three groups:
• Operations with side effects
The Stream API provides two terminal operations, forEach() and forEachOr-
dered(), that are designed to allow side effects on stream elements (p. 948).
These terminal operations do not return a value. They allow a Consumer action,
specified as an argument, to be applied to every element, as they are consumed
from the stream pipeline—for example, to print each element in the stream.

• Searching operations
These operations perform a search operation to determine a match or find an
element as explained below.
All search operations are short-circuit operations; that is, the operation can termi-
nate once the result is determined, whether or not all elements in the stream
have been considered.
Search operations can be further classified into two subgroups:
❍ Matching operations
The three terminal operations anyMatch(), allMatch(), and noneMatch() deter-
mine whether stream elements match a given Predicate specified as an argu-
ment to the method (p. 949). As expected, these operations return a boolean
value to indicate whether the match was successful or not.
❍ Finding operations
The two terminal operations findAny() and findFirst() find any element and
the first element in a stream, respectively, if such an element is available (p.
952). As the stream might be empty and such an element might not exist, these
operations return an Optional.
• Reduction operations
A reduction operation computes a result from combining the stream elements
by successively applying a combining function; that is, the stream elements are
reduced to a result value. Examples of reductions are computing the sum or
average of numeric values in a numeric stream, and accumulating stream ele-
ments into a collection.
We distinguish between two kinds of reductions:
❍ Functional reduction
A terminal operation is a functional reduction on the elements of a stream if it
reduces the elements to a single immutable value which is then returned by the
operation.
The overloaded reduce() method provided by the Stream API can be used to
implement customized functional reductions (p. 955), whereas the terminal
operations count(), min(), and max() implement specialized functional reduc-
tions (p. 953).
Functional reductions on numeric streams are discussed later in this section (p.
972).
❍ Mutable reduction
A terminal operation performs a mutable reduction on the elements of a stream if
it uses a mutable container—for example, a list, a set, or a map—to accumulate
values as it processes the stream elements. The operation returns the mutable
container as the result of the operation.
The Stream API provides two overloaded collect() methods that perform
mutable reduction (p. 964). One overloaded collect() method can be used to
implement customized mutable reductions by specifying the functions (supplier,
accumulator, combiner) required to perform such a reduction. A second
collect() method accepts a Collector that is used to perform a mutable reduc-
tion. A collector encapsulates the functions required for performing a mutable
reduction. The Stream API provides built-in collectors that allow various con-
tainers to be used for performing mutable reductions (p. 978). When a terminal
operation performs a mutable reduction using a specific container, it is said to
collect to this container.
The toArray() method implements a specialized mutable reduction that returns
an array with the accumulated values (p. 971); that is, the method collects to an
array.

Consumer Action on Stream Elements


We have already used both the forEach() and forEachOrdered() terminal operations
to print elements when the pipeline is executed. These operations allow side effects
on stream elements.
The forEach() method is defined for both streams and collections. In the case of col-
lections, the method iterates over all the elements in the collection, whereas it is a
terminal operation on streams.
Since these terminal operations perform an action on each element, the input
stream to the operation must be finite in order for the operation to terminate.
Counterparts to the forEach() and forEachOrdered() methods for the primitive
numeric types are also defined by the numeric stream interfaces.
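As a simple sketch (not part of the example program), the same consumer action can be applied both ways; only the second statement involves a stream pipeline:

CD.cdList.forEach(cd -> System.out.println(cd.title()));          // Iterable.forEach() on the collection
CD.cdList.stream().forEach(cd -> System.out.println(cd.title())); // Stream.forEach() terminal operation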
void forEach(Consumer<? super T> action)
This terminal operation performs an action on each element of this stream.
This method should not be relied upon to produce deterministic results, as the
order in which the elements are processed is not guaranteed.
void forEachOrdered(Consumer<? super T> action)
This terminal operation performs an action on each element of this stream, but
in the encounter order of the stream if the stream has one.

The difference in behavior of the forEach() and forEachOrdered() terminal operations
is that the forEach() method does not guarantee to respect the encounter order,
whereas the forEachOrdered() method always does, if there is one.
Each operation is applied to both an ordered sequential stream and an ordered par-
allel stream to print CD titles with the help of the consumer printStr:
Consumer<String> printStr = str -> System.out.print(str + "|");

CD.cdList.stream().map(CD::title).forEach(printStr); // (1a)
//Java Jive|Java Jam|Lambda Dancing|Keep on Erasing|Hot Generics|

CD.cdList.stream().parallel().map(CD::title).forEach(printStr); // (1b)
//Lambda Dancing|Hot Generics|Keep on Erasing|Java Jam|Java Jive|

The behavior of the forEach() operation is nondeterministic, as seen at (1a) and
(1b). The output from (1a) and (1b) shows that the forEach() operation respects the
encounter order for an ordered sequential stream, but not necessarily for an
ordered parallel stream. Respecting the encounter order for an ordered parallel
stream would incur overhead that would impact performance, and the encounter
order is therefore ignored.
On the other hand, the forEachOrdered() operation always respects the encounter
order in both cases, as seen below from the output at (2a) and (2b). Note, however,
that in the case of the ordered parallel stream, the action can be executed on the
elements in different threads, but it is guaranteed to be applied to the elements in
encounter order.
CD.cdList.stream().map(CD::title).forEachOrdered(printStr); // (2a)
//Java Jive|Java Jam|Lambda Dancing|Keep on Erasing|Hot Generics|

CD.cdList.stream().parallel().map(CD::title).forEachOrdered(printStr); // (2b)
//Java Jive|Java Jam|Lambda Dancing|Keep on Erasing|Hot Generics|

The discussion above also applies when the forEach() and forEachOrdered() termi-
nal operations are invoked on numeric streams. The nondeterministic behavior of
the forEach() terminal operation for int streams is illustrated below. The terminal
operation on the sequential int stream at (3a) seems to respect the encounter order,
but should not be relied upon. The terminal operation on the parallel int stream at
(3b) can give different results for different runs.
IntConsumer printInt = n -> out.print(n + "|");

IntStream.of(2018, 2019, 2020, 2021, 2022).forEach(printInt); // (3a)


//2018|2019|2020|2021|2022|

IntStream.of(2018, 2019, 2020, 2021, 2022).parallel().forEach(printInt); // (3b)


//2020|2019|2018|2021|2022|

Matching Elements
The match operations determine whether any, all, or none of the stream elements
satisfy a given Predicate. These operations are not reductions, as they do not
always consider all elements in the stream in order to return a result.
Analogous match operations are also provided by the numeric stream interfaces.
boolean anyMatch(Predicate<? super T> predicate)
boolean allMatch(Predicate<? super T> predicate)
boolean noneMatch(Predicate<? super T> predicate)
These three terminal operations determine whether any, all, or no elements of
this stream match the specified predicate, respectively.

The methods may not evaluate the predicate on all elements if it is not neces-
sary for determining the result; that is, they are short-circuit operations.
If the stream is empty, the predicate is not evaluated.
The anyMatch() method returns false if the stream is empty.
The allMatch() and noneMatch() methods return true if the stream is empty.
There is no guarantee that these operations will terminate if applied to an infi-
nite stream.
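The empty-stream behavior described above can be illustrated with the following sketch (not part of the example program); the predicate is never evaluated:

boolean anyInEmpty  = Stream.<CD>empty().anyMatch(CD::isJazz);   // false
boolean allInEmpty  = Stream.<CD>empty().allMatch(CD::isJazz);   // true
boolean noneInEmpty = Stream.<CD>empty().noneMatch(CD::isJazz);  // true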

The queries at (1), (2), and (3) below determine whether any, all, or no CDs are jazz
music CDs, respectively. At (1), the execution of the pipeline terminates as soon as
any jazz music CD is found—the value true is returned. At (2), the execution of the
pipeline terminates as soon as a non-jazz music CD is found—the value false is
returned. At (3), the execution of the pipeline terminates as soon as a jazz music
CD is found—the value false is returned.
boolean anyJazzCD = CD.cdList.stream().anyMatch(CD::isJazz); // (1) true
boolean allJazzCds = CD.cdList.stream().allMatch(CD::isJazz); // (2) false
boolean noJazzCds = CD.cdList.stream().noneMatch(CD::isJazz); // (3) false

Given the following predicates:


Predicate<CD> eq2015 = cd -> cd.year().compareTo(Year.of(2015)) == 0;
Predicate<CD> gt2015 = cd -> cd.year().compareTo(Year.of(2015)) > 0;

The query at (4) determines that no CDs were released in 2015. The queries at (5)
and (6) are equivalent. If all CDs were released after 2015, then none were released
in or before 2015 (negation of the predicate gt2015).
boolean noneEQ2015 = CD.cdList.stream().noneMatch(eq2015); // (4) true
boolean allGT2015 = CD.cdList.stream().allMatch(gt2015); // (5) true
boolean noneNotGT2015 = CD.cdList.stream().noneMatch(gt2015.negate()); // (6) true

The code below uses the anyMatch() method on an int stream to determine whether
any year is a leap year.
IntStream yrStream = IntStream.of(2018, 2019, 2020);
IntPredicate isLeapYear = yr -> Year.of(yr).isLeap();
boolean anyLeapYear = yrStream.anyMatch(isLeapYear);
out.println("Any leap year: " + anyLeapYear); // true

Example 16.10 illustrates using the allMatch() operation to determine whether a
square matrix—that is, a two-dimensional array with the same number of rows and
columns—is an identity matrix. In such a matrix, all elements on the main diagonal
have the value 1 and all other elements have the value 0. The methods isIdentity-
MatrixLoops() and isIdentityMatrixStreams() at (1) and (2) implement this test in
different ways.
The method isIdentityMatrixLoops() at (1) uses nested loops. The outer loop pro-
cesses the rows, whereas the inner loop tests that each row has the correct values. The
outer loop is a labeled loop in order to break out of the inner loop if an element in a
row does not have the correct value—effectively achieving short-circuit execution.

The method isIdentityMatrixStreams() at (2) uses nested numeric streams, where
the outer stream processes the rows and the inner stream processes the elements in
a row. The allMatch() method at (4) in the inner stream pipeline determines that all
elements in a row have the correct value. It short-circuits the execution of the inner
stream if that is not the case. The allMatch() method at (3) in the outer stream pipe-
line also short-circuits its execution if its predicate to process a row returns the
value false. The stream-based implementation for the identity matrix test expresses
the logic more clearly and naturally than the loop-based version.

Example 16.10 Identity Matrix Test

import static java.lang.System.out;

import java.util.Arrays;
import java.util.stream.IntStream;

public class IdentityMatrixTest {


public static void main(String[] args) {
// Matrices to test:
int[][] sqMatrix1 = { {1, 0, 0}, {0, 1, 0}, {0, 0, 1} };
int[][] sqMatrix2 = { {1, 1}, {1, 1} };
isIdentityMatrixLoops(sqMatrix1);
isIdentityMatrixLoops(sqMatrix2);
isIdentityMatrixStreams(sqMatrix1);
isIdentityMatrixStreams(sqMatrix2);
}

private static void isIdentityMatrixLoops(int[][] sqMatrix) { // (1)


boolean isCorrectValue = false;
outerLoop:
for (int i = 0; i < sqMatrix.length; ++i) {
for (int j = 0; j < sqMatrix[i].length; ++j) {
isCorrectValue = j == i ? sqMatrix[i][i] == 1
: sqMatrix[i][j] == 0;
if (!isCorrectValue) break outerLoop;
}
}
out.println(Arrays.deepToString(sqMatrix)
+ (isCorrectValue ? " is ": " is not ") + "an identity matrix.");
}

private static void isIdentityMatrixStreams(int[][] sqMatrix) { // (2)


boolean isCorrectValue =
IntStream.range(0, sqMatrix.length)
.allMatch(i -> IntStream.range(0, sqMatrix[i].length) // (3)
.allMatch(j -> j == i // (4)
? sqMatrix[i][i] == 1
: sqMatrix[i][j] == 0));
out.println(Arrays.deepToString(sqMatrix)
+ (isCorrectValue ? " is ": " is not ") + "an identity matrix.");
}
}

Output from the program:


[[1, 0, 0], [0, 1, 0], [0, 0, 1]] is an identity matrix.
[[1, 1], [1, 1]] is not an identity matrix.
[[1, 0, 0], [0, 1, 0], [0, 0, 1]] is an identity matrix.
[[1, 1], [1, 1]] is not an identity matrix.

Finding the First or Any Element


The findFirst() method can be used to find the first element that is available in the
stream. This method respects the encounter order, if the stream has one. It always
produces a stable result; that is, it will produce the same result on identical pipe-
lines based on the same stream source. In contrast, the behavior of the findAny()
method is nondeterministic. Counterparts to these methods are also defined by the
numeric stream interfaces.
Optional<T> findFirst()
This terminal operation returns an Optional describing the first element of this
stream, or an empty Optional if the stream is empty.
This method may return any element if this stream does not have any encoun-
ter order.
It is a short-circuit operation, as it will terminate the execution of the stream
pipeline as soon as the first element is found.
This method throws a NullPointerException if the element selected is null.
Optional<T> findAny()
This terminal operation returns an Optional describing some element of the
stream, or an empty Optional if the stream is empty. This operation has nonde-
terministic behavior.
It is a short-circuit operation, as it will terminate the execution of the stream
pipeline as soon as any element is found.

In the code below, the encounter order of the stream is the positional order of the
elements in the list. The first element returned by the findFirst() method at (1) is
the first element in the CD list.
Optional<CD> firstCD1 = CD.cdList.stream().findFirst(); // (1)
out.println(firstCD1.map(CD::title).orElse("No first CD.")); // (2) Java Jive

Since such an element might not exist—for example, the stream might be empty—
the method returns an Optional<T> object. At (2), the Optional<CD> object returned
by the findFirst() method is mapped to an Optional<String> object that encapsu-
lates the title of the CD. The orElse() method on this Optional<String> object
returns the CD title or the argument string if there is no such CD.
If the encounter order is not of consequence, the findAny() method can be used, as
it is nondeterministic—that is, it does not guarantee the same result on the same
stream source. On the other hand, it provides maximal performance on parallel
streams. At (3) below, the findAny() method is free to return any element from the
parallel stream. It should not come as a surprise if the element returned is not the
first element in the list.
Optional<CD> anyCD2 = CD.cdList.stream().parallel().findAny(); // (3)
out.println(anyCD2.map(CD::title).orElse("No CD.")); // Lambda Dancing

The match methods only determine whether any elements satisfy a Predicate, as
seen at (5) below. Typically, a find terminal operation is used to find the first ele-
ment made available to the terminal operation after processing by the intermediate
operations in the stream pipeline. At (6), the filter() operation will filter the jazz
music CDs from the stream. However, the findAny() operation will return the first
jazz music CD that is filtered and then short-circuit the execution.
boolean anyJazzCD = CD.cdList.stream().anyMatch(CD::isJazz); // (5)
out.println("Any Jazz CD: " + anyJazzCD); // Any Jazz CD: true

Optional<CD> optJazzCD = CD.cdList.stream().filter(CD::isJazz).findAny(); // (6)


optJazzCD.ifPresent(out::println); // <Jaav, "Java Jam", 6, 2017, JAZZ>

The code below uses the findAny() method on an IntStream to find whether any
number is divisible by 7.
IntStream numStream = IntStream.of(50, 55, 65, 70, 75, 77);
OptionalInt intOpt = numStream.filter(n -> n % 7 == 0).findAny();
intOpt.ifPresent(System.out::println); // 70

The find operations are guaranteed to terminate when applied to a finite stream,
even an empty one. However, for an infinite stream in a pipeline, at least one element
must be made available to the find operation in order for the operation to termi-
nate. If the elements of an initial infinite stream are all discarded by the intermedi-
ate operations, the find operation will not terminate, as in the following pipeline:
Stream.generate(() -> 1).filter(n -> n == 0).findAny(); // Never terminates.

Counting Elements
The count() operation performs a functional reduction on the elements of a stream,
as each element contributes to the count which is the single immutable value
returned by the operation. The count() operation reports the number of elements
that are made available to it, which is not necessarily the same as the number of
elements in the initial stream, as elements might be discarded by the intermediate
operations.
The code below finds the total number of CDs in the streams, and how many of
these CDs are jazz music CDs.
long numOfCDS = CD.cdList.stream().count(); // 5
long numOfJazzCDs = CD.cdList.stream().filter(CD::isJazz).count(); // 3

The count() method is also defined for the numeric streams. Below it is used on an
IntStream to find how many numbers between 1 and 100 are divisible by 7.
IntStream numStream = IntStream.rangeClosed(1, 100);
long divBy7 = numStream.filter(n -> n % 7 == 0).count(); // 14

long count()
This terminal operation returns the count of elements in this stream—that is,
the length of this stream.
This operation is a special case of a functional reduction.
The operation does not terminate when applied to an infinite stream.

Finding Min and Max Elements


The min() and max() operations are functional reductions, as they consider all ele-
ments of the stream and return a single value. They should only be applied to a
finite stream, as they will not terminate on an infinite stream. These methods are
also defined by the numeric stream interfaces for the numeric types, but without
the specification of a comparator.
Optional<T> min(Comparator<? super T> cmp)
Optional<T> max(Comparator<? super T> cmp)
These terminal operations return an Optional with the minimum or maximum
element of this stream according to the provided Comparator, respectively, or an
empty Optional if this stream is empty. A NullPointerException is thrown if the
selected element is null.
These operations are a special case of a functional reduction.
These operations do not terminate when applied to an infinite stream.

Both methods return an Optional, as the minimum and maximum elements might
not exist—for example, if the stream is empty. The code below finds the minimum
and maximum elements in a stream of CDs, according to their natural order. The
artist name is the most significant field according to the natural order defined for
CDs (p. 883).
Optional<CD> minCD = CD.cdList.stream().min(Comparator.naturalOrder());
minCD.ifPresent(out::println); // <Funkies, "Lambda Dancing", 10, 2018, POP>
out.println(minCD.map(CD::artist).orElse("No min CD.")); // Funkies

Optional<CD> maxCD = CD.cdList.stream().max(Comparator.naturalOrder());


maxCD.ifPresent(out::println); // <Jaav, "Java Jive", 8, 2017, POP>
out.println(maxCD.map(CD::artist).orElse("No max CD.")); // Jaav

In the code below, the max() method is applied to an IntStream to find the largest
number between 1 and 100 that is divisible by 7.
IntStream iStream = IntStream.rangeClosed(1, 100);
OptionalInt maxNum = iStream.filter(n -> n % 7 == 0).max(); // 98

If one is only interested in the minimum and maximum elements in a collection,
the overloaded methods min() and max() of the java.util.Collections class can be
more convenient to use.
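A minimal sketch, relying on the natural order defined for CDs (p. 883) and on a comparator for the title:

CD minByNaturalOrder = Collections.min(CD.cdList);                            // Natural order
CD maxByTitle = Collections.max(CD.cdList, Comparator.comparing(CD::title));  // By title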

Implementing Functional Reduction: The reduce() Method


A functional reduction combines all elements in a stream to produce a single immu-
table value as its result. The reduction process employs an accumulator that repeat-
edly computes a new partial result based on the current partial result and the
current element in the stream. The stream thus gets shorter by one element. When
all elements have been combined, the last partial result that was computed by the
accumulator is returned as the final result of the reduction process.
The following terminal operations are special cases of functional reduction:
• count(), p. 953.
• min(), p. 954.
• max(), p. 954.
• average(), p. 1000.
• sum(), p. 1001.
The overloaded reduce() method can be used to implement new forms of func-
tional reduction.
Optional<T> reduce(BinaryOperator<T> accumulator)
This terminal operation returns an Optional with the cumulative result of apply-
ing the accumulator on the elements of this stream: e1 ⊕ e2 ⊕ e3 ..., where each
ei is an element of this stream and ⊕ is the accumulator. If the stream is empty,
an empty Optional is returned.
The accumulator must be associative—that is, the result of evaluating an expres-
sion is the same, regardless of how the operands are grouped to evaluate the
expression. For example, the grouping in the expression below allows the sub-
expressions to be evaluated in parallel and their results combined by the accu-
mulator:
ei ⊕ ej ⊕ ek ⊕ el == (ei ⊕ ej) ⊕ (ek ⊕ el)
where ei, ej, ek, and el are operands, and ⊕ is the accumulator. For example,
numeric addition, min, max, and string concatenation are associative opera-
tions, whereas subtraction and division are nonassociative.
The accumulator must also be a non-interfering and stateless function (p. 909).
Note that the method reduces a Stream of type T to a result that is an Optional
of type T.
A counterpart to the single-argument reduce() method is also provided for the
numeric streams.
T reduce(T identity, BinaryOperator<T> accumulator)
This terminal operation returns the cumulative result of applying the accumula-
tor on the elements of this stream: identity ⊕ e1 ⊕ e2 ⊕ e3 ..., where ei is an
element of this stream, and ⊕ is the accumulator. The identity value is the initial
value to accumulate. If the stream is empty, the identity value is returned.
The identity value must be an identity for the accumulator—for all ei, identity
⊕ ei == ei. The accumulator must be associative.
The accumulator must also be a non-interfering and stateless function (p. 909).
Note that the method reduces a Stream of type T to a result of type T.
A counterpart to the two-argument reduce() method is also provided for the
numeric streams.
<U> U reduce(
U identity,
BiFunction<U,? super T,U> accumulator,
BinaryOperator<U> combiner)
This terminal operation returns the cumulative result of applying the accumulator
on the elements of this stream, using the identity value of type U as the initial
value to accumulate. If the stream is empty, the identity value is returned.
The identity value must be an identity for the combiner function. The accumula-
tor and the combiner function must also satisfy the following relationship for
all u and t of type U and T, respectively:
u © (identity ⊕ t) == u ⊕ t
where ⊕ is the accumulator and © is the combiner function.
The combiner function combines two values during stream processing. It may
not be executed for a sequential stream, but for a parallel stream, it will com-
bine cumulative results of segments that are processed concurrently.
Both the accumulator and the combiner must also be non-interfering and stateless
functions (p. 909).
Note that the accumulator has the function type (U, T) -> U, and the combiner
function has the function type (U, U) -> U, where the type parameters U and T
are always the types of the partial result and the stream element, respectively.
This method reduces a Stream of type T to a result of type U.
There is no counterpart to the three-argument reduce() method for the numeric
streams.

The idiom of using a loop for calculating the sum of a finite number of values is
something that is ingrained into all aspiring programmers. A loop-based solution
to calculate the total number of tracks on CDs in a list is shown below, where the
variable sum will hold the result after the execution of the for(:) loop:
int sum = 0; // (1) Initialize the partial result.
for (CD cd : CD.cdList) { // (2) Iterate over the list.
int numOfTracks = cd.noOfTracks(); // (3) Get the current value.
sum = sum + numOfTracks; // (4) Calculate new partial result.
}

Apart from the for(:) loop at (2) to iterate over all elements of the list and read the
number of tracks in each CD at (3), the two necessary steps are:
• Initialization of the variable sum at (1)
• The accumulative operation at (4) that is applied repeatedly to compute a new
partial result in the variable sum, based on its previous value and the number of
tracks in the current CD
The loop-based solution above can be translated to a stream-based solution, as
shown in Figure 16.11. All the code snippets can be found in Example 16.11.

Figure 16.11 Reducing with an Initial Value

// Query: Find the total number of CD tracks.
int totNumOfTracks = CD.cdList                              // (5)
    .stream()                                               // (6)
    .mapToInt(CD::noOfTracks)                               // (7)
    .reduce(0,                                              // (8)
            (sum, numOfTracks) -> sum + numOfTracks);       // (9)

(a) Using the reduce() method with an initial value

[Diagram (b) Stream pipeline: the Stream<CD> is mapped to an IntStream of track counts (8, 6, 10, 8, 10); starting from the initial value 0, the accumulator produces the partial results 8, 14, 24, 32, and the final result 42.]

In Figure 16.11, the stream created at (6) internalizes the iteration over the ele-
ments. The mapToInt() intermediate operation maps each CD to its number of
tracks at (7)—the Stream<CD> is mapped to an IntStream. The reduce() terminal oper-
ation with two arguments computes and returns the total number of tracks:
• Its first argument at (8) is the identity element that provides the initial value for
the operation and is also the default value to return if the stream is empty. In
this case, this value is 0.
• Its second argument at (9) is the accumulator that is implemented as a lambda
expression. It repeatedly computes a new partial sum based on the previous
partial sum and the number of tracks in the current CD, as evident from
Figure 16.11. In this case, the accumulator is an IntBinaryOperator whose func-
tional type is (int, int) -> int. Note that the parameters of the lambda expres-
sion represent the partial sum and the current number of tracks, respectively.
The stream pipeline in Figure 16.11 is an example of a map-reduce transformation on
a sequential stream, as it maps the stream elements first and then reduces them.
Typically, a filter operation is also performed before the map-reduce transformation.
Each of the following calls can replace the reduce() method call in Figure 16.11, as
they are all equivalent:
reduce(0, (sum, noOfTracks) -> Integer.sum(sum, noOfTracks))
reduce(0, Integer::sum) // Method reference
sum() // Special functional reduction, p. 1001.

In Example 16.11, the stream pipeline at (10) prints the actions taken by the accu-
mulator which is now augmented with print statements. The output at (3) shows
that the accumulator actions correspond to those in Figure 16.11.
The single-argument reduce() method only accepts an accumulator. As no explicit
default or initial value can be specified, this method returns an Optional. If the
stream is not empty, it uses the first element as the initial value; otherwise, it
returns an empty Optional. In Example 16.11, the stream pipeline at (13) uses the
single-argument reduce() method to compute the total number of tracks on CDs.
The return value is an OptionalInt that can be queried to extract the encapsulated
int value.
OptionalInt optSumTracks0 = CD.cdList // (13)
.stream()
.mapToInt(CD::noOfTracks)
.reduce(Integer::sum); // (14)
out.println("Total number of tracks: " + optSumTracks0.orElse(0)); // 42

We can again augment the accumulator with print statements as shown at (16) in
Example 16.11. The output at (5) shows that the number of tracks from the first CD
was used as the initial value before the accumulator is applied repeatedly to the
rest of the values.

Example 16.11 Implementing Functional Reductions

import static java.lang.System.out;

import java.util.Comparator;
import java.util.Optional;
import java.util.OptionalInt;
import java.util.function.BinaryOperator;

public final class FunctionalReductions {


public static void main(String[] args) {

// Two-argument reduce() method:


{

out.println("(1) Find total number of tracks (loop-based version):");


int sum = 0; // (1) Initialize the partial result.
for (CD cd : CD.cdList) { // (2) Iterate over the list.
int numOfTracks = cd.noOfTracks(); // (3) Get the next value.
sum = sum + numOfTracks; // (4) Calculate new partial result.
}
out.println("Total number of tracks: " + sum);
}

out.println("(2) Find total number of tracks (stream-based version):");


int totNumOfTracks = CD.cdList // (5)
.stream() // (6)
.mapToInt(CD::noOfTracks) // (7)
.reduce(0, // (8)
(sum, numOfTracks) -> sum + numOfTracks); // (9)
// .reduce(0, (sum, noOfTracks) -> Integer.sum(sum, noOfTracks));
// .reduce(0, Integer::sum);
// .sum();
out.println("Total number of tracks: " + totNumOfTracks);
out.println();

out.println("(3) Find total number of tracks (accumulator logging): ");


int totNumOfTracks1 = CD.cdList // (10)
.stream()
.mapToInt(CD::noOfTracks)
.reduce(0, // (11)
(sum, noOfTracks) -> { // (12)
int newSum = sum + noOfTracks;
out.printf("Accumulator: sum=%2d, noOfTracks=%2d, newSum=%2d%n",
sum, noOfTracks, newSum);
return newSum;
}
);
out.println("Total number of tracks: " + totNumOfTracks1);
out.println();

// One-argument reduce() method:

out.println("(4) Find total number of tracks (stream-based version):");


OptionalInt optSumTracks0 = CD.cdList // (13)
.stream()
.mapToInt(CD::noOfTracks)
.reduce(Integer::sum); // (14)
out.println("Total number of tracks: " + optSumTracks0.orElse(0));
out.println();

out.println("(5) Find total number of tracks (accumulator logging): ");


OptionalInt optSumTracks1 = CD.cdList // (15)
.stream()
.mapToInt(CD::noOfTracks)
.reduce((sum, noOfTracks) -> { // (16)
int newSum = sum + noOfTracks;
out.printf("Accumulator: sum=%2d, noOfTracks=%2d, newSum=%2d%n",
sum, noOfTracks, newSum);
return newSum;
});

out.println("Total number of tracks: " + optSumTracks1.orElse(0));


out.println();

// Three-argument reduce() method:

out.println("(6) Find total number of tracks (accumulator + combiner): ");


Integer sumTracks5 = CD.cdList // (17)
// .stream() // (18a)
.parallelStream() // (18b)
.reduce(Integer.valueOf(0), // (19) Initial value
(sum, cd) -> sum + cd.noOfTracks(), // (20) Accumulator
(sum1, sum2) -> sum1 + sum2); // (21) Combiner
out.println("Total number of tracks: " + sumTracks5);
out.println();

out.println("(7) Find total number of tracks (accumulator + combiner): ");


Integer sumTracks6 = CD.cdList // (22)
// .stream() // (23a)
.parallelStream() // (23b)
.reduce(0,
(sum, cd) -> { // (24) Accumulator
Integer noOfTracks = cd.noOfTracks();
Integer newSum = sum + noOfTracks;
out.printf("Accumulator: sum=%2d, noOfTracks=%2d, "
+ "newSum=%2d%n", sum, noOfTracks, newSum);
return newSum;
},
(sum1, sum2) -> { // (25) Combiner
Integer newSum = sum1 + sum2;
out.printf("Combiner: sum1=%2d, sum2=%2d, newSum=%2d%n",
sum1, sum2, newSum);
return newSum;
}
);
out.println("Total number of tracks: " + sumTracks6);
out.println();

// Compare by CD title.
Comparator<CD> cmpByTitle = Comparator.comparing(CD::title); // (26)
BinaryOperator<CD> maxByTitle =
(cd1, cd2) -> cmpByTitle.compare(cd1, cd2) > 0 ? cd1 : cd2; // (27)

// Query: Find maximum Jazz CD by title:


Optional<CD> optMaxJazzCD = CD.cdList // (28)
.stream()
.filter(CD::isJazz)
.reduce(BinaryOperator.maxBy(cmpByTitle)); // (29a)
// .reduce(maxByTitle); // (29b)
// .max(cmpByTitle); // (29c)
optMaxJazzCD.map(CD::title).ifPresent(out::println);// Keep on Erasing
}
}

Possible output from the program:


(1) Find total number of tracks (loop-based version):
Total number of tracks: 42
(2) Find total number of tracks (stream-based version):
Total number of tracks: 42

(3) Find total number of tracks (accumulator logging):


Accumulator: sum= 0, noOfTracks= 8, newSum= 8
Accumulator: sum= 8, noOfTracks= 6, newSum=14
Accumulator: sum=14, noOfTracks=10, newSum=24
Accumulator: sum=24, noOfTracks= 8, newSum=32
Accumulator: sum=32, noOfTracks=10, newSum=42
Total number of tracks: 42

(4) Find total number of tracks (stream-based version):


Total number of tracks: 42

(5) Find total number of tracks (accumulator logging):


Accumulator: sum= 8, noOfTracks= 6, newSum=14
Accumulator: sum=14, noOfTracks=10, newSum=24
Accumulator: sum=24, noOfTracks= 8, newSum=32
Accumulator: sum=32, noOfTracks=10, newSum=42
Total number of tracks: 42

(6) Find total number of tracks (accumulator + combiner):


Total number of tracks: 42

(7) Find total number of tracks (accumulator + combiner):


Accumulator: sum= 0, noOfTracks=10, newSum=10
Accumulator: sum= 0, noOfTracks=10, newSum=10
Accumulator: sum= 0, noOfTracks= 8, newSum= 8
Combiner: sum1= 8, sum2=10, newSum=18
Combiner: sum1=10, sum2=18, newSum=28
Accumulator: sum= 0, noOfTracks= 6, newSum= 6
Accumulator: sum= 0, noOfTracks= 8, newSum= 8
Combiner: sum1= 8, sum2= 6, newSum=14
Combiner: sum1=14, sum2=28, newSum=42
Total number of tracks: 42

Keep on Erasing

The single-argument and two-argument reduce() methods accept a binary operator
as the accumulator whose arguments and result are of the same type. The three-
argument reduce() method is more flexible, but is only defined for object streams,
not for the numeric streams. The
stream pipeline below computes the total number of tracks on CDs using the three-
argument reduce() method.
Integer sumTracks5 = CD.cdList // (17)
.stream() // (18a)
// .parallelStream() // (18b)
.reduce(Integer.valueOf(0), // (19) Initial value
(sum, cd) -> sum + cd.noOfTracks(), // (20) Accumulator
(sum1, sum2) -> sum1 + sum2); // (21) Combiner

The reduce() method above accepts the following arguments:


• An identity value: Its type is U. In this case, it is an Integer that wraps the value
0. As before, it is used as the initial value. The type of the value returned by the
reduce() method is also U.
• An accumulator: It is a BiFunction<U,T,U>; that is, it is a binary function that
accepts an object of type U and an object of type T and produces a result of type
U. In this case, type U is Integer and type T is CD. The lambda expression imple-
menting the accumulator first reads the number of tracks from the current CD
before the addition operator is applied. Thus the accumulator will calculate the
sum of Integers which are, of course, unboxed and boxed to do the calculation.
As we have seen earlier, the accumulator is repeatedly applied to sum the
tracks on the CDs. Only this time, the mapping of a CD to an Integer is done
when the accumulator is evaluated.
• A combiner: It is a BinaryOperator<U>; that is, it is a binary operator whose argu-
ments and result are of the same type U. In this case, type U is Integer. Thus the
combiner will calculate the sum of Integers which are unboxed and boxed to
do the calculation.
In the code above, the combiner is not executed if the reduce() method is
applied to a sequential stream. However, there is no guarantee that this is
always the case for a sequential stream. If we uncomment (18b) and remove
(18a), the combiner will be executed on the parallel stream.
That the combiner in the three-argument reduce() method is executed for a parallel
stream is illustrated by the stream pipeline at (22) in Example 16.11, which has been
augmented with print statements. There is no output from the combiner when the
stream is sequential. The output at (7) in Example 16.11 shows that the combiner
accumulates the partial sums created by the accumulator when the stream is parallel.

Parallel Functional Reduction


Parallel execution is illustrated in Figure 16.12. Multiple instances of the stream
pipeline are executed in parallel, where each pipeline instance processes a segment
of the stream. In this case, only one CD is allocated to each pipeline instance. Each
pipeline instance thus produces its partial sum, and the combiner is applied in par-
allel on the partial sums to combine them into a final result. No additional synchro-
nization is required to run the reduce() operation in parallel.
Figure 16.12 also illustrates why the initial value must be an identity value. Say we
had specified the initial value to be 3. Then the value 3 would be added multiple
times to the sum during parallel execution. We also see why both the accumulator
and the combiner must be associative, as this allows any two values to be combined
in any order.
When the single-argument and two-argument reduce() methods are applied to a
parallel stream, the accumulator also acts as the combiner. The three-argument
reduce() method can usually be replaced with a map-reduce transformation, making
the combiner redundant.
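For example, the query computed by the three-argument reduce() method at (17) can instead be written as a map-reduce transformation (compare Figure 16.11), sketched here on a parallel stream:

int sumOfTracks = CD.cdList
    .parallelStream()
    .mapToInt(CD::noOfTracks)   // Map each CD to its number of tracks.
    .sum();                     // Specialized functional reduction (p. 1001); no combiner to supply.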

Figure 16.12 Parallel Functional Reduction

// Query: Find total number of CD tracks.
Integer sumTracks5 = CD.cdList                              // (17)
    .parallelStream()                                       // (18b)
    .reduce(Integer.valueOf(0),                             // (19) Initial value
            (sum, cd) -> sum + cd.noOfTracks(),             // (20) Accumulator
            (sum1, sum2) -> sum1 + sum2);                   // (21) Combiner

(a) Using the Stream.reduce() method on a parallel stream

[Diagram (b) Parallel functional reduction: each pipeline instance reduces one segment of the stream (here a single CD) with the accumulator, starting from the identity value 0; the partial sums (8, 6, 10, 8, 10) are then merged pairwise by the combiner into the final result 42.]

We conclude the discussion on implementing functional reductions by implement-
ing the max() method that finds the maximum element in a stream according to a
given comparator. A comparator that compares by the CD title is defined at (26). A
binary operator that finds the maximum of two CDs when compared by title is
defined at (27). It uses the comparator defined at (26). The stream pipeline at (28)
finds the maximum of all jazz music CDs by title. The method calls at (29a), (29b),
and (29c) are equivalent.
Comparator<CD> cmpByTitle = Comparator.comparing(CD::title); // (26)
BinaryOperator<CD> maxByTitle =
(cd1, cd2) -> cmpByTitle.compare(cd1, cd2) > 0 ? cd1 : cd2; // (27)

Optional<CD> optMaxJazzCD = CD.cdList // (28)


.stream()
.filter(CD::isJazz)
.reduce(BinaryOperator.maxBy(cmpByTitle)); // (29a)
// .reduce(maxByTitle); // (29b)
// .max(cmpByTitle); // (29c)
optMaxJazzCD.map(CD::title).ifPresent(out::println); // Keep on Erasing

The accumulator at (29a), returned by the BinaryOperator.maxBy() method, will
compare the previous maximum CD and the current CD by title to compute a new
maximum jazz music CD. The accumulator used at (29b) is implemented at (27). It
also does the same comparison as the accumulator at (29a). At (29c), the max()
method also does the same thing, based on the comparator at (26). Note that the
return value is an Optional<CD>, as the stream might be empty. The Optional<CD> is
mapped to an Optional<String>. If it is not empty, its value—that is, the CD title—
is printed.
The reduce() method does not terminate if applied to an infinite stream, as the
method will never finish processing all stream elements.
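A sketch of such a non-terminating pipeline, analogous to the find example shown earlier:

IntStream.iterate(1, i -> i + 1).reduce(0, Integer::sum); // Never terminates.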

Implementing Mutable Reduction: The collect() Method


The collect(Collector) method accepts a collector that encapsulates the functions
required to perform a mutable reduction. We discuss predefined collectors imple-
mented by the java.util.stream.Collectors class in a later section (p. 978). The code
below uses the collector returned by the Collectors.toList() method that accumu-
lates the result in a list (p. 980).
List<String> titles = CD.cdList.stream()
.map(CD::title).collect(Collectors.toList());
// [Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

The collect(supplier, accumulator, combiner) generic method provides the general
setup for implementing mutable reduction on stream elements using different
kinds of mutable containers—for example, a list, a map, or a StringBuilder. It uses
one or more mutable containers to accumulate partial results that are combined into
a single mutable container that is returned as the result of the reduction operation.
<R,A> R collect(Collector<? super T,A,R> collector)
This terminal operation performs a reduction operation on the elements of this
stream using a Collector (p. 978).
A Collector encapsulates the functions required for performing the reduction.
The result of the reduction is of type R, and the type parameter A is the interme-
diate accumulation type of the Collector.
<R> R collect(
Supplier<R> supplier,
BiConsumer<R,? super T> accumulator,
BiConsumer<R,R> combiner)
This terminal operation performs a mutable reduction on the elements of this
stream. A counterpart to this method is also provided for numeric streams.
The supplier creates a new mutable container of type R—which is typically
empty. Elements are incorporated into such a container during the reduction
process. For a parallel stream, the supplier can be called multiple times, and
the container returned by the supplier must be an identity container in the sense
that it does not mutate any result container with which it is merged.
The accumulator incorporates additional elements into a result container: A
stream element of type T is incorporated into a mutable container of type R.
The combiner merges two values that are mutable containers of type R. It must
be compatible with the accumulator. There is no guarantee that the combiner is
called if the stream is sequential, but it definitely comes into play if the stream is
parallel.
Both the accumulator and the combiner must also be non-interfering and stateless
functions (p. 909).
With the above requirements on the argument functions fulfilled, the collect()
method will produce the same result regardless of whether the stream is
sequential or parallel.

We will use Figure 16.13 to illustrate mutable reduction performed on a sequential
stream by the three-argument collect() method. The figure shows both the code
and the execution of a stream pipeline to create a list containing the number of
tracks on each CD. The stream of CDs is mapped to a stream of Integers at (3), where
each Integer value is the number of tracks on a CD. The collect() method at (4)
accepts three functions as arguments. They are explicitly defined as lambda
expressions to show what the parameters represent and how they are used to per-
form mutable reduction. Implementation of these functions using method refer-
ences can be found in Example 16.12.
• Supplier: The supplier is a Supplier<R> that is used to create new instances of a
mutable result container of type R. Such a container holds the results computed
by the accumulator and the combiner. In Figure 16.13, the supplier at (4)
returns an empty ArrayList<Integer> every time it is called.
• Accumulator: The accumulator is a BiConsumer<R, T> that is used to accumulate
an element of type T into a mutable result container of type R. In Figure 16.13,
type R is ArrayList<Integer> and type T is Integer. The accumulator at (5) mutates
a container of type ArrayList<Integer> by repeatedly adding a new Integer
value to it, as illustrated in Figure 16.13b. It is instructive to contrast this accu-
mulator with the accumulator for sequential functional reduction illustrated in
Figure 16.11, p. 957.
• Combiner: The combiner is a BiConsumer<R, R> that merges the contents of the
second argument container with the contents of the first argument container,
where both containers are of type R. As in the case of the reduce(identity,
accumulator, combiner) method, the combiner is executed when the collect()
method is called on a parallel stream.

Figure 16.13 Sequential Mutable Reduction

// Query: Create a list with the number of tracks on each CD.
List<Integer> tracks = CD.cdList                                // (1)
    .stream()                                                   // (2a)
    .map(CD::noOfTracks)                                        // (3)
    .collect(() -> new ArrayList<>(),                           // (4) Supplier
             (cont, noOfTracks) -> cont.add(noOfTracks),        // (5) Accumulator
             (cont1, cont2) -> cont1.addAll(cont2));            // (6) Combiner

(a) Using the Stream.collect() method on a sequential stream

[Diagram (b) Sequential mutable reduction: the supplier creates a single empty ArrayList, and the accumulator adds the track counts to it one by one: [] -> [8] -> [8, 6] -> [8, 6, 10] -> [8, 6, 10, 8] -> [8, 6, 10, 8, 10].]

Parallel Mutable Reduction


Figure 16.14 shows the stream pipeline from Figure 16.13, where the sequential
stream (2a) has been replaced by a parallel stream (2b); in other words, the
collect() method is called on a parallel stream. One possible parallel execution of
the pipeline is also depicted in Figure 16.14b. We see five instances of the pipeline
being executed in parallel. The supplier creates five empty ArrayLists that are used

Figure 16.14 Parallel Mutable Reduction

// Query: Create a list with the number of tracks on each CD.
List<Integer> tracks = CD.cdList                                // (1)
    .parallelStream()                                           // (2b)
    .map(CD::noOfTracks)                                        // (3)
    .collect(() -> new ArrayList<>(),                           // (4) Supplier
             (cont, noOfTracks) -> cont.add(noOfTracks),        // (5) Accumulator
             (cont1, cont2) -> cont1.addAll(cont2));            // (6) Combiner

(a) Using the Stream.collect() method on a parallel stream

[Diagram (b) Parallel mutable reduction: five pipeline instances run in parallel, each accumulating into its own ArrayList created by the supplier ([8], [6], [10], [8], [10]); the combiner then merges the partial lists pairwise ([8, 6], [10, 8], [10, 8, 10]) into the final result [8, 6, 10, 8, 10].]



as partial result containers by the accumulator, and are later merged by the com-
biner to a final result container. The containers created by the supplier are mutated
by the accumulator and the combiner to perform mutable reduction. The partial
result containers are also merged in parallel by the combiner. It is instructive to
contrast this combiner with the combiner for parallel functional reduction that is
illustrated in Figure 16.12, p. 963.
In Example 16.12, the stream pipeline at (7) also creates a list containing the num-
ber of tracks on each CD, where the stream is parallel, and the lambda expressions
implementing the argument functions of the collect() method are augmented
with print statements so that actions of the functions can be logged. The output
from this parallel mutable reduction shows that the combiner is executed multiple
times to merge partial result lists. The actions of the argument functions shown in
the output are the same as those illustrated in Figure 16.14b. Of course, multiple
runs of the pipeline can show different sequences of operations in the output, but
the final result is the same. Also note that the elements retain their relative position
in the partial result lists as these are combined, preserving the encounter order of
the stream.
Although a stream is executed in parallel to perform mutable reduction, the merg-
ing of the partial containers by the combiner can impact performance if this is too
costly. For example, merging mutable maps can be costly compared to merging
mutable lists. This issue is further explored for parallel streams in §16.9, p. 1009.
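As an illustration of such a container (a sketch, not part of Example 16.12), the three-argument collect() method can also accumulate into a map, in which case the combiner must merge whole partial maps:

Map<String, Integer> titleToTracks = CD.cdList
    .parallelStream()
    .collect(HashMap::new,                                       // Supplier
             (map, cd) -> map.put(cd.title(), cd.noOfTracks()),  // Accumulator
             Map::putAll);                                       // Combiner: merges partial maps.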

Example 16.12 Implementing Mutable Reductions

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

public final class Collecting {


public static void main(String[] args) {

// Query: Create a list with the number of tracks on each CD.


System.out.println("Sequential Mutable Reduction:");
List<Integer> tracks = CD.cdList // (1)
.stream() // (2a)
// .parallelStream() // (2b)
.map(CD::noOfTracks) // (3)
.collect(() -> new ArrayList<>(), // (4) Supplier
(cont, noOfTracks) -> cont.add(noOfTracks), // (5) Accumulator
(cont1, cont2) -> cont1.addAll(cont2)); // (6) Combiner
// .collect(ArrayList::new, ArrayList::add, ArrayList::addAll); // (6a)
// .toList();
System.out.println("Number of tracks on each CD (sequential): " + tracks);
System.out.println();

System.out.println("Parallel Mutable Reduction:");


List<Integer> tracks1 = CD.cdList // (7)
// .stream() // (8a)
.parallelStream() // (8b)
.map(CD::noOfTracks) // (9)
.collect( // (10)
() -> { // (11) Supplier
System.out.println("Supplier: Creating an ArrayList");
return new ArrayList<>();
},
(cont, noOfTracks) -> { // (12) Accumulator
System.out.printf("Accumulator: cont:%s, noOfTracks:%s",
cont, noOfTracks);
cont.add(noOfTracks);
System.out.printf(", mutCont:%s%n", cont);
},
(cont1, cont2) -> { // (13) Combiner
System.out.printf("Combiner: con1:%s, cont2:%s", cont1, cont2);
cont1.addAll(cont2);
System.out.printf(", mutCont:%s%n", cont1);
});
System.out.println("Number of tracks on each CD (parallel): " + tracks1);
System.out.println();

// Query: Create an ordered set with CD titles, according to natural order.


Set<String> cdTitles = CD.cdList // (14)
.stream()
.map(CD::title)
.collect(TreeSet::new, TreeSet::add, TreeSet::addAll);// (15)
System.out.println("CD titles: " + cdTitles);
System.out.println();

// Query: Go bananas.
StringBuilder goneBananas = Stream // (16)
.iterate("ba", b -> b + "na") // (17)
.limit(5)
.peek(System.out::println)
.collect(StringBuilder::new, // (18)
StringBuilder::append,
StringBuilder::append);
System.out.println("Go bananas: " + goneBananas);
}
}

Possible output from the program:


Sequential Mutable Reduction:
Number of tracks on each CD (sequential): [8, 6, 10, 8, 10]

Parallel Mutable Reduction:


Supplier: Creating an ArrayList
Accumulator: cont:[], noOfTracks:8, mutCont:[8]
Supplier: Creating an ArrayList
Accumulator: cont:[], noOfTracks:6, mutCont:[6]
Combiner: con1:[8], cont2:[6], mutCont:[8, 6]
Supplier: Creating an ArrayList
Accumulator: cont:[], noOfTracks:10, mutCont:[10]
Supplier: Creating an ArrayList
Accumulator: cont:[], noOfTracks:8, mutCont:[8]
Combiner: con1:[10], cont2:[8], mutCont:[10, 8]
Supplier: Creating an ArrayList
Accumulator: cont:[], noOfTracks:10, mutCont:[10]
Combiner: con1:[10, 8], cont2:[10], mutCont:[10, 8, 10]
Combiner: con1:[8, 6], cont2:[10, 8, 10], mutCont:[8, 6, 10, 8, 10]
Number of tracks on each CD (parallel): [8, 6, 10, 8, 10]

CD titles: [Hot Generics, Java Jam, Java Jive, Keep on Erasing, Lambda Dancing]

ba
bana
banana
bananana
banananana
Go bananas: babanabananabanananabanananana

Example 16.12 also shows how other kinds of containers can be used for mutable
reduction. The stream pipeline at (14) performs mutable reduction to create an
ordered set with CD titles. The supplier is implemented by the constructor refer-
ence TreeSet::new. The constructor will create a container of type TreeSet<String>
that will maintain the CD titles according to the natural order for Strings. The accu-
mulator and the combiner are implemented by the method references TreeSet::add
and TreeSet::addAll, respectively. The accumulator will add a title to a container of
type TreeSet<String> and the combiner will merge the contents of two containers
of type TreeSet<String>.
In Example 16.12, the mutable reduction performed by the stream pipeline at (16)
uses a mutable container of type StringBuilder. The output from the peek() method
shows that the strings produced by the iterate() method start with the initial
string "ba" and are iteratively concatenated with the postfix "na". The limit() inter-
mediate operation truncates the infinite stream to five elements. The collect()
method appends the strings to a StringBuilder. The supplier creates an empty
StringBuilder. The accumulator and the combiner append a CharSequence to a
StringBuilder. In the case of the accumulator, the CharSequence is a String—that is, a
stream element—in the call to the append() method. But in the case of the combiner,
the CharSequence is a StringBuilder—that is, a partial result container when the
stream is parallel. One might be tempted to use a string instead of a StringBuilder,
but that would not be a good idea as a string is immutable.
Note that the accumulator and combiner of the collect() method do not return a
value. The collect() method does not terminate if applied to an infinite stream, as
the method will never finish processing all the elements in the stream.
Because mutable reduction uses the same mutable result container for accumulat-
ing new results by changing the state of the container, it is more efficient than a
functional reduction where a new partial result always replaces the previous par-
tial result.
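The difference can be sketched with string concatenation (not part of Example 16.12): the functional reduction creates a new String for each partial result, whereas the mutable reduction keeps appending to the same StringBuilder container.

// Functional reduction: every step creates a new immutable String.
String result1 = Stream.of("ba", "na", "na").reduce("", String::concat);        // banana
// Mutable reduction: a single StringBuilder container is mutated.
String result2 = Stream.of("ba", "na", "na")
    .collect(StringBuilder::new, StringBuilder::append, StringBuilder::append)
    .toString();                                                                // banana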

Collecting to an Array
The overloaded method toArray() can be used to collect or accumulate into an
array. It is a special case of a mutable reduction, and as the name suggests, the
mutable container is an array. The numeric stream interfaces also provide a coun-
terpart to the toArray() method that returns an array of a numeric type.
Object[] toArray()
This terminal operation returns an array containing the elements of this
stream. Note that the array returned is of type Object[].
<A> A[] toArray(IntFunction<A[]> generator)
This terminal operation returns an array containing the elements of this
stream. The provided generator function is used to allocate the desired array.
The type parameter A is the element type of the array that is returned. The size
of the array (which is equal to the length of the stream) is passed to the gener-
ator function as an argument.

The zero-argument method toArray() returns an array of objects, Object[], as
generic arrays cannot be created at runtime. The method needs to store all the
elements before creating an array of the appropriate length. The query at (1) finds
the titles of the CDs, and the toArray() method collects them into an array of
objects, Object[].
Object[] objArray = CD.cdList.stream().map(CD::title)
.toArray(); // (1)
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

The toArray(IntFunction<A[]>) method accepts a generator function that creates an
array of type A[], whose length is passed as an argument by the method to the
generator function. The array length is determined from the number of elements
in the stream. The query at (2) also finds the CD titles, but the toArray() method
collects them into an array of strings, String[]. The method reference defining the
generator function is equivalent to the lambda expression (len -> new String[len]).
String[] cdTitles = CD.cdList.stream().map(CD::title)
.toArray(String[]::new); // (2)
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

Examples of numeric streams whose elements are collected into an array are
shown at (3) and (4). The limit() intermediate operation at (3) converts the infinite
stream into a finite one whose elements are collected into an int array.
int[] intArray1 = IntStream.iterate(1, i -> i + 1).limit(5).toArray();// (3)
// [1, 2, 3, 4, 5]
int[] intArray2 = IntStream.range(-5, 5).toArray(); // (4)
// [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]

Not surprisingly, when applied to infinite streams the operation results in a fatal
OutOfMemoryError, as the method cannot determine the length of the array and keeps
storing the stream elements, eventually running out of memory.

int[] intArray3 = IntStream.iterate(1, i -> i + 1)          // (5)
                           .toArray();                      // OutOfMemoryError!

If the sole purpose of using the toArray() operation in a pipeline is to convert
the data source collection to an array, it is far better to use the overloaded
Collection.toArray() methods. For one thing, the size of the array is easily
determined from the size of the collection.
CD[] cdArray1 = CD.cdList.stream().toArray(CD[]::new);       // (6) Not efficient.
CD[] cdArray2 = CD.cdList.toArray(new CD[CD.cdList.size()]); // (7) Preferred.

Like any other mutable reduction operation, the toArray() method does not termi-
nate when applied to an infinite stream, unless it is converted into a finite stream
as at (3) above.

Collecting to a List
The method Stream.toList() implements a terminal operation that can be used to
collect or accumulate the result of processing a stream into a list. In contrast to the
toArray() methods, the toList() method is a default method in the Stream interface.
The default implementation returns an unmodifiable list; that is, elements cannot
be added, removed, or sorted. This unmodifiable list is created from an array into
which the elements are first accumulated.
If the requirement is an unmodifiable list that allows null elements, Stream.toList()
is the clear and concise choice. Many examples of stream pipelines encountered so
far in this chapter use the toList() terminal operation.
List<String> titles = CD.cdList.stream().map(CD::title).toList();
// [Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]
titles.add("Java Jingles"); // UnsupportedOperationException!

Like any other mutable reduction operation, the toList() method does not termi-
nate when applied to an infinite stream, unless the stream is converted into a finite
stream.
default List<T> toList()
Accumulates the elements of this stream into a List, respecting any encounter
order the stream may have. The returned List is unmodifiable (§12.2, p. 649),
and calls to any mutator method will always result in an UnsupportedOperation-
Exception. The unmodifiable list returned allows null values.
See also the toList() method in the Collectors class (p. 980).
The Collectors.toCollection(Supplier) method is recommended for greater
control.
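To make the trade-off concrete, here is a small sketch (class name and string values invented for the illustration) that collects once into the unmodifiable list returned by Stream.toList() and once into a modifiable ArrayList obtained with Collectors.toCollection() (p. 979):

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ToListChoices {
  public static void main(String[] args) {
    // Unmodifiable list; null elements are allowed.
    List<String> fixed = Stream.of("Java Jive", null, "Java Jam").toList();
    // fixed.add("Java Jingles");        // Would throw UnsupportedOperationException.

    // Modifiable ArrayList, for greater control over the type of the list.
    ArrayList<String> editable = Stream.of("Java Jive", "Java Jam")
        .collect(Collectors.toCollection(ArrayList::new));
    editable.add("Java Jingles");        // OK

    System.out.println(fixed);           // [Java Jive, null, Java Jam]
    System.out.println(editable);        // [Java Jive, Java Jam, Java Jingles]
  }
}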

Functional Reductions Exclusive to Numeric Streams


In addition to the counterparts of the methods in the Stream<T> interface, the
following functional reductions are exclusive to the numeric stream interfaces
IntStream, LongStream, and DoubleStream. These reduction operations are designed to
calculate various statistics on numeric streams.
In the methods below, NumType is Int, Long, or Double, and the corresponding numtype is
int, long, or double. These statistical operations do not terminate when applied to an
infinite stream:
numtype sum()
This terminal operation returns the sum of elements in this stream. It returns
zero if the stream is empty.
OptionalDouble average()
This terminal operation returns an OptionalDouble that encapsulates the arith-
metic mean of elements of this stream, or an empty Optional if this stream is
empty.
NumTypeSummaryStatistics summaryStatistics()
This terminal operation returns a NumTypeSummaryStatistics describing various
summary data about the elements of this stream.

Summation
The sum() terminal operation is a special case of a functional reduction that calcu-
lates the sum of numeric values in a stream. The stream pipeline below calculates
the total number of tracks on the CDs in a list. Note that the stream of CD is mapped
to an int stream whose elements represent the number of tracks on a CD. The int
values are cumulatively added to compute the total number of tracks.
int totNumOfTracks = CD.cdList
.stream() // Stream<CD>
.mapToInt(CD::noOfTracks) // IntStream
.sum(); // 42

The query below sums all even numbers between 1 and 100.
int sumEven = IntStream
.rangeClosed(1, 100)
.filter(i -> i % 2 == 0)
.sum(); // 2550

The count() operation is equivalent to mapping each stream element to the value 1
and adding the 1s:
int numOfCDs = CD.cdList
.stream()
.mapToInt(cd -> 1) // CD => 1
.sum(); // 5

For an empty stream, the sum is always zero.


double total = DoubleStream.empty().sum(); // 0.0

Averaging
Another common statistic to calculate is the average of values, defined as the
sum of values divided by the number of values. A loop-based solution to calculate
the average would explicitly sum the values, count the number of values, and
do the calculation. In a stream-based solution, the average() terminal operation can
be used to calculate this value. The stream pipeline below computes the average
number of tracks on a CD. The CD stream is mapped to an int stream whose values
are the number of tracks on a CD. The average() terminal operation adds the number
of tracks and counts the values, returning the average as a double value encapsu-
lated in an OptionalDouble.
OptionalDouble optAverage = CD.cdList
.stream()
.mapToInt(CD::noOfTracks)
.average();
System.out.println(optAverage.orElse(0.0)); // 8.4

The reason for using an Optional is that the average is not defined if there are no
values. The absence of a value in the OptionalDouble returned by the method means
that the stream was empty.
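A minimal sketch of the empty case (the class name is invented for the illustration):

import java.util.OptionalDouble;
import java.util.stream.IntStream;

public class EmptyAverage {
  public static void main(String[] args) {
    // The average of an empty stream is undefined, so an empty OptionalDouble is returned.
    OptionalDouble avg = IntStream.empty().average();
    System.out.println(avg.isPresent());   // false
    System.out.println(avg.orElse(0.0));   // 0.0 (fallback value)
  }
}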

Summarizing
The result of a functional reduction is a single value. This means that calculating
different results—for example, count, sum, average, min, and max—requires
separate reduction operations on a stream.
The method summaryStatistics() performs several common reductions on a stream in a sin-
gle operation and returns the results in an object of type NumTypeSummaryStatistics,
where NumType is Int, Long, or Double. An object of this class encapsulates the count,
sum, average, min, and max values of a stream.
The classes IntSummaryStatistics, LongSummaryStatistics, and DoubleSummaryStatistics
in the java.util package define the following constructor and methods, where NumType
is Int (but it is Integer when used as a type name), Long, or Double, and the corresponding
numtype is int, long, or double:

NumTypeSummaryStatistics()
Creates an empty instance with zero count, zero sum, a min value as Num-
Type.MAX_VALUE, a max value as NumType.MIN_VALUE, and an average value of zero.

double getAverage()
Returns the arithmetic mean of values recorded, or zero if no values have been
recorded.
long getCount()
Returns the count of values recorded.

numtype getMax()
Returns the maximum value recorded, or NumType.MIN_VALUE if no values have
been recorded.

numtype getMin()
Returns the minimum value recorded, or NumType.MAX_VALUE if no values have
been recorded.
numtype getSum()
Returns the sum of values recorded, or zero if no values have been recorded.
The method in the IntSummaryStatistics and LongSummaryStatistics classes
returns a long value. The method in the DoubleSummaryStatistics class returns a
double value.

void accept(numtype value)


Records a new value into the summary information, and updates the various
statistics. The method in the LongSummaryStatistics class is overloaded and can
accept an int value as well.
void combine(NumTypeSummaryStatistics other)
Combines the state of another NumTypeSummaryStatistics into this one.

The summaryStatistics() method is used to calculate various statistics for the
number of tracks on two CDs processed by the stream pipeline below. Various get
methods are called on the IntSummaryStatistics object returned by the
summaryStatistics() method, and the statistics are printed.
IntSummaryStatistics stats1 = List.of(CD.cd0, CD.cd1)
.stream()
.mapToInt(CD::noOfTracks)
.summaryStatistics();
System.out.println("Count=" + stats1.getCount()); // Count=2
System.out.println("Sum=" + stats1.getSum()); // Sum=14
System.out.println("Min=" + stats1.getMin()); // Min=6
System.out.println("Max=" + stats1.getMax()); // Max=8
System.out.println("Average=" + stats1.getAverage()); // Average=7.0

The default format of the statistics printed by the toString() method of the
IntSummaryStatistics class is shown below:
System.out.println(stats1);
//IntSummaryStatistics{count=2, sum=14, min=6, average=7.000000, max=8}

Below, the accept() method records the value 10 (the number of tracks on CD.cd2)
into the summary information referenced by stats1. The resulting statistics show
the new count is 3 (= 2 + 1), the new sum is 24 (= 14 + 10), and the new average is 8.0
(= 24.0/3.0). However, the min value was not affected, but the max value has
changed to 10.
stats1.accept(CD.cd2.noOfTracks()); // Add the value 10.
System.out.println(stats1);
//IntSummaryStatistics{count=3, sum=24, min=6, average=8.000000, max=10}

The code below creates another IntSummaryStatistics object that summarizes the
statistics from two other CDs.
IntSummaryStatistics stats2 = List.of(CD.cd3, CD.cd4)
.stream()
.mapToInt(CD::noOfTracks)
.summaryStatistics();
System.out.println(stats2);
//IntSummaryStatistics{count=2, sum=18, min=8, average=9.000000, max=10}

The combine() method incorporates the state of one IntSummaryStatistics object into
another IntSummaryStatistics object. In the code below, the state of the IntSummary-
Statistics object referenced by stats2 is combined with the state of the IntSummary-
Statistics object referenced by stats1. The resulting summary information is
printed, showing that the new count is 5 (= 3 + 2), the new sum is 42 (= 24 + 18), and
the new average is 8.4 (= 42.0/5.0). However, the min and max values were not
affected.
stats1.combine(stats2); // Combine stats2 with stats1.
System.out.println(stats1);
//IntSummaryStatistics{count=5, sum=42, min=6, average=8.400000, max=10}

Calling the summaryStatistics() method on an empty stream returns an instance of
the IntSummaryStatistics class with a zero value set for all statistics, except for the
min and max values, which are set to Integer.MAX_VALUE and Integer.MIN_VALUE,
respectively. The IntSummaryStatistics class also provides a zero-argument
constructor that creates such an empty instance.
IntSummaryStatistics emptyStats = IntStream.empty().summaryStatistics();
System.out.println(emptyStats);
//IntSummaryStatistics{count=0, sum=0, min=2147483647, average=0.000000,
//max=-2147483648}

The summary statistics classes are not exclusive for use with streams, as they pro-
vide a constructor and appropriate methods to incorporate numeric values in
order to calculate common statistics, as we have seen here. We will return to calcu-
lating statistics when we discuss built-in collectors (p. 978).
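As a small illustration of such stand-alone use, the sketch below (class name invented for the illustration) records the chapter's track counts in a plain loop, without any stream:

import java.util.IntSummaryStatistics;

public class StandaloneStats {
  public static void main(String[] args) {
    IntSummaryStatistics stats = new IntSummaryStatistics();  // Empty instance.
    int[] tracks = {8, 6, 10, 8, 10};
    for (int t : tracks) {
      stats.accept(t);                        // Record each value.
    }
    System.out.println(stats.getCount());     // 5
    System.out.println(stats.getSum());       // 42
    System.out.println(stats.getAverage());   // 8.4
  }
}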

Summary of Terminal Stream Operations


The terminal operations of the Stream<T> class are summarized in Table 16.5. The
type parameter declarations have been simplified, where any bound <? super T> or
<? extends T> has been replaced by <T>, without impacting the intent of a method.
A reference is provided to each method in the first column.
The last column in Table 16.5 indicates the function type of the corresponding
parameter in the previous column. It is instructive to note how the functional inter-
face parameters provide the parameterized behavior of an operation. For example,
the method allMatch() returns a boolean value to indicate whether all elements of a
stream satisfy a given predicate. This predicate is implemented as a functional
interface Predicate<T> that is applied to each element in the stream.

The interfaces IntStream, LongStream, and DoubleStream define analogous methods to
those shown for the Stream<T> interface in Table 16.5. Methods that are only defined
by the numeric stream interfaces are shown in Table 16.6.

Table 16.5 Terminal Stream Operations

Method name (ref.)    Any type parameter +  Functional interface parameters    Function type of
                      return type                                              parameters
forEach (p. 948)      void                  (Consumer<T> action)               T -> void
forEachOrdered        void                  (Consumer<T> action)               T -> void
  (p. 948)
allMatch (p. 949)     boolean               (Predicate<T> predicate)           T -> boolean
anyMatch (p. 949)     boolean               (Predicate<T> predicate)           T -> boolean
noneMatch (p. 949)    boolean               (Predicate<T> predicate)           T -> boolean
findAny (p. 952)      Optional<T>           ()
findFirst (p. 952)    Optional<T>           ()
count (p. 953)        long                  ()
max (p. 954)          Optional<T>           (Comparator<T> cmp)                (T,T) -> int
min (p. 954)          Optional<T>           (Comparator<T> cmp)                (T,T) -> int
reduce (p. 955)       Optional<T>           (BinaryOperator<T> accumulator)    (T,T) -> T
reduce (p. 955)       T                     (T identity,                       T -> T,
                                             BinaryOperator<T> accumulator)    (T,T) -> T
reduce (p. 955)       <U> U                 (U identity,                       U -> U,
                                             BiFunction<U,T,U> accumulator,    (U,T) -> U,
                                             BinaryOperator<U> combiner)       (U,U) -> U
collect (p. 964)      <R,A> R               (Collector<T,A,R> collector)       Parameter is not a
                                                                               functional interface.
collect (p. 964)      <R> R                 (Supplier<R> supplier,             () -> R,
                                             BiConsumer<R,T> accumulator,      (R,T) -> void,
                                             BiConsumer<R,R> combiner)         (R,R) -> void
toArray (p. 971)      Object[]              ()
toArray (p. 971)      <A> A[]               (IntFunction<A[]> generator)       int -> A[]
toList (p. 972)       List<T>               ()



Table 16.6 Additional Terminal Operations in the Numeric Stream Interfaces

Method name (ref.)            Return type
average (p. 974)              OptionalDouble
sum (p. 973)                  numtype, where numtype is int, long, or double
summaryStatistics (p. 974)    NumTypeSummaryStatistics, where NumType is Int, Long, or Double

16.8 Collectors
A collector encapsulates the functions required for performing reduction: the sup-
plier, the accumulator, the combiner, and the finisher. It can provide these func-
tions since it implements the Collector interface (in the java.util.stream package)
that defines the methods to create these functions. It is passed as an argument to
the collect(Collector) method in order to perform a reduction operation. In con-
trast, the collect(Supplier, BiConsumer, BiConsumer) method requires the functions
supplier, accumulator, and combiner, respectively, to be passed as arguments in the
method call.
Details of implementing a collector are not necessary for our purposes, as we will
exclusively use the extensive set of predefined collectors provided by the static fac-
tory methods of the Collectors class in the java.util.stream package (Table 16.7,
p. 1005). In most cases, it should be possible to find a predefined collector for the task
at hand. The collectors use various kinds of containers for performing reduction—
for example, accumulating to a map, or finding the minimum or maximum ele-
ment. For example, the Collectors.toList() factory method creates a collector that
performs mutable reduction using a list as a mutable container. It can be passed to
the collect(Collector) terminal operation of a stream.
It is a common practice to import the static factory methods of the Collectors class
in the code so that the methods can be called by their simple names.
import static java.util.stream.Collectors.*;

However, the practice adopted in this chapter is to assume that only the Collectors
class is imported, enforcing the connection between the static methods and the
class to be done explicitly in the code. Of course, static import of factory methods
can be used once familiarity with the collectors is established.
import java.util.stream.Collectors;

The three-argument collect() method is primarily used to implement mutable
reduction, whereas the Collectors class provides collectors for both functional and
mutable reduction that can be either used in a stand-alone capacity or composed
with other collectors.
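As a rough illustration of this division of labor, the sketch below (class name and string values invented for the illustration) builds the same list twice: once with the three-argument collect() method and once with the predefined collector returned by Collectors.toList():

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectForms {
  public static void main(String[] args) {
    // Three-argument collect(): supplier, accumulator, and combiner are passed explicitly.
    List<String> list1 = Stream.of("Hot Generics", "Java Jam")
        .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);

    // collect(Collector): the same three functions are encapsulated in a predefined collector.
    List<String> list2 = Stream.of("Hot Generics", "Java Jam")
        .collect(Collectors.toList());

    System.out.println(list1.equals(list2));   // true
  }
}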

One group of collectors is designed to collect to a predetermined container, which is
evident from the name of the static factory method that creates it: toCollection,
toList, toSet, and toMap (p. 979). The overloaded toCollection() and toMap() methods
allow a specific implementation of a collection and a map to be used, respectively—for
example, a TreeSet for a collection and a TreeMap for a map. In addition, there is the
joining() method that creates a collector for concatenating the input elements to a
String—however, internally it uses a mutable StringBuilder (p. 984).
Collectors can be composed with other collectors; that is, the partial results from
one collector can be additionally processed by another collector (called the down-
stream collector) to produce the final result. Many collectors that can be used as a
downstream collector perform functional reduction such as counting values, find-
ing the minimum and maximum values, summing values, averaging values, and
summarizing common statistics for values (p. 998).
Composition of collectors is utilized to perform multilevel grouping and partitioning
on stream elements (p. 985). The groupingBy() and partitioningBy() methods return
composed collectors to create classification maps. In such a map, the keys are deter-
mined by a classifier function, and the values are the result of a downstream collec-
tor, called the classification mapping. For example, the CDs in a stream could be
classified into a map where the key represents the number of tracks on a CD and
the associated value of a key can be a list of CDs with the same number of tracks.
The list of CDs with the same number of tracks is the result of an appropriate
downstream collector.

Collecting to a Collection
The method toCollection(Supplier) creates a collector that uses a mutable con-
tainer of a specific Collection type to perform mutable reduction. A supplier to cre-
ate the mutable container is specified as an argument to the method.
The following stream pipeline creates an ArrayList<String> instance with the titles
of all CDs in the stream. The constructor reference ArrayList::new returns an empty
ArrayList<String> instance, where the element type String is inferred from the con-
text.
ArrayList<String> cdTitles1 = CD.cdList.stream() // Stream<CD>
.map(CD::title) // Stream<String>
.collect(Collectors.toCollection(ArrayList::new));
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

static <T,C extends Collection<T>> Collector<T,?,C>


toCollection(Supplier<C> collectionFactory)
Returns a Collector that accumulates the input elements of type T into a new
Collection of type C, in encounter order. A new empty Collection of type C is
created by the collectionFactory supplier, thus the collection created can be of
a specific Collection type.
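A short sketch of using a specific collection type (class name and string values invented for the illustration); the supplier TreeSet::new results in the titles being sorted in their natural order:

import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectToTreeSet {
  public static void main(String[] args) {
    TreeSet<String> sortedTitles = Stream.of("Java Jive", "Hot Generics", "Java Jam")
        .collect(Collectors.toCollection(TreeSet::new));
    System.out.println(sortedTitles);   // [Hot Generics, Java Jam, Java Jive]
  }
}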

static <T> Collector<T,?,List<T>> toList()


static <T> Collector<T,?,List<T>> toUnmodifiableList()
Return a Collector that accumulates the input elements of type T into a new
List or an unmodifiable List of type T, respectively, in encounter order.
There are no guarantees on the type, mutability, or thread-safety of the list returned by the toList() collector.
The unmodifiable list returned does not allow null values.
See also the Stream.toList() terminal operation (p. 972).
static <T> Collector<T,?,Set<T>> toSet()
static <T> Collector<T,?,Set<T>> toUnmodifiableSet()
Return an unordered Collector that accumulates the input elements of type T
into a new Set or an unmodifiable Set of type T, respectively.
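A brief sketch of the unmodifiable variant described above (class name and string values invented for the illustration):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class UnmodifiableCollectors {
  public static void main(String[] args) {
    List<String> titles = Stream.of("Java Jive", "Java Jam")
        .collect(Collectors.toUnmodifiableList());
    System.out.println(titles);          // [Java Jive, Java Jam]
    // titles.add("Java Jingles");       // Would throw UnsupportedOperationException.
    // A null stream element would cause this collector to throw a NullPointerException.
  }
}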

Collecting to a List
The method toList() creates a collector that uses a mutable container of type List
to perform mutable reduction. This collector guarantees to preserve the encounter
order of the input stream, if it has one. For more control over the type of the list,
the toCollection() method can be used. This collector can be used as a downstream
collector.
The following stream pipeline creates a list with the titles of all CDs in the stream
using a collector returned by the Collectors.toList() method. Although the
returned list can be modified, this is implementation dependent and should not be
relied upon.
List<String> cdTitles3 = CD.cdList.stream()      // Stream<CD>
    .map(CD::title)                              // Stream<String>
    .collect(Collectors.toList());
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]
cdTitles3.add("Java Jingles");                   // OK

Collecting to a Set
The method toSet() creates a collector that uses a mutable container of type Set to
perform mutable reduction. The collector does not guarantee to preserve the
encounter order of the input stream. For more control over the type of the set, the
toCollection() method can be used.

The following stream pipeline creates a set with the titles of all CDs in the stream.
Set<String> cdTitles2 = CD.cdList.stream() // Stream<CD>
.map(CD::title) // Stream<String>
.collect(Collectors.toSet());
//[Hot Generics, Java Jive, Lambda Dancing, Keep on Erasing, Java Jam]

Collecting to a Map
The method toMap() creates a collector that performs mutable reduction to a muta-
ble container of type Map.
static <T,K,U> Collector<T,?,Map<K,U>> toMap(
Function<? super T,? extends K> keyMapper,
Function<? super T,? extends U> valueMapper)

static <T,K,U> Collector<T,?,Map<K,U>> toMap(


Function<? super T,? extends K> keyMapper,
Function<? super T,? extends U> valueMapper,
BinaryOperator<U> mergeFunction)

static <T,K,U,M extends Map<K,U>> Collector<T,?,M> toMap(


Function<? super T,? extends K> keyMapper,
Function<? super T,? extends U> valueMapper,
BinaryOperator<U> mergeFunction,
Supplier<M> mapSupplier)
Return a Collector that accumulates elements of type T into a Map whose keys
and values are the result of applying the provided key and value mapping
functions to the input elements.
The keyMapper function produces keys of type K, and the valueMapper function
produces values of type U.
In the first method, the mapped keys cannot have duplicates—an Illegal-
StateException will be thrown if that is the case.
In the second and third methods, the mergeFunction binary operator is used to
resolve collisions between values associated with the same key, as supplied to
Map.merge(Object, Object, BiFunction).
In the third method, the provided mapSupplier function returns a new Map into
which the results will be inserted.

The collector returned by the method toMap() uses either a default map or one that
is supplied. To be able to create an entry in a Map<K,U> from stream elements of type
T, the collector requires two functions:

• keyMapper: T -> K, which is a Function to extract a key of type K from a stream ele-
ment of type T.
• valueMapper: T -> U, which is a Function to extract a value of type U for a given
key of type K from a stream element of type T.
Additional functions as arguments allow various controls to be exercised on the
map:
• mergeFunction: (U,U) -> U, which is a BinaryOperator to merge two values that are
associated with the same key. The merge function must be specified if collision
of values can occur during the mutable reduction, or an IllegalStateException
will be thrown.

• mapSupplier: () -> M, where M extends Map<K,U>, which is a Supplier that creates a map
instance of a specific type to use for mutable reduction. The map created is a
subtype of Map<K,U>. Without this function, the collector uses a default map.
Figure 16.15 illustrates collecting to a map. The stream pipeline creates a map of
CD titles and their release year—that is, a Map<String, Year>, where K is String and
V is Year. The keyMapper CD::title and the valueMapper CD::year extract the title
(String) and the year (Year) from each CD in the stream, respectively. The entries
are accumulated in a default map (Map<String, Year>).
What if we wanted to create a map with CDs and their release year—that is, a
Map<CD, Year>? In that case, the keyMapper should return the CD as the key—that is,
map a CD to itself. That is exactly what the keyMapper Function.identity() does in
the pipeline below.
Map<CD, Year> mapCDToYear = CD.cdList.stream()
.collect(Collectors.toMap(Function.identity(), CD::year)); // Map<CD, Year>

Figure 16.15 Collecting to a Map

// Query: Create a map of CD titles and their release year.
Map<String, Year> mapTitleToYear = CD.cdList.stream()
    .collect(Collectors.toMap(CD::title, CD::year));

(a) Using the Collectors.toMap() method

(b) Stream pipeline (diagram): the collect() operation accumulates the stream elements cd0 to cd4
into a Map<String, Year> with the entries <"Java Jive", 2017>, <"Java Jam", 2017>,
<"Hot Generics", 2018>, <"Keep on Erasing", 2018>, and <"Lambda Dancing", 2018>.

As there were no duplicates of the key in the previous two examples, there was no
collision of values in the map. In the list dupList below, there are duplicates of CDs
(CD.cd0, CD.cd1). Executing the pipeline results in a runtime exception at (1).
List<CD> dupList = List.of(CD.cd0, CD.cd1, CD.cd2, CD.cd0, CD.cd1);
Map<String, Year> mapTitleToYear1 = dupList.stream()
.collect(Collectors.toMap(CD::title, CD::year)); // (1)
// IllegalStateException: Duplicate key 2017

The collision values can be resolved by specifying a merge function. In the pipeline
below, the arguments of the merge function (y1, y2) -> y1 at (1) have the same value
for the year if we assume that a CD can only be released once. Note that y1 and y2
denote the existing value in the map and the value to merge, respectively. The
merge function can return any one of the values to resolve the collision.
Map<String, Year> mapTitleToYear2 = dupList.stream()
.collect(Collectors.toMap(CD::title, CD::year, (y1, y2) -> y1)); // (1)

The stream pipeline below creates a map of CD titles released each year. As more
than one CD can be released in a year, collision of titles can occur for a year. The
merge function (tt, t) -> tt + ":" + t concatenates the titles in each year separated
by a colon, if necessary. Note that tt and t denote the existing value in the map and
the value to merge, respectively.
Map<Year, String> mapTitleToYear3 = CD.cdList.stream()
.collect(Collectors.toMap(CD::year, CD::title,
(tt, t) -> tt + ":" + t));
//{2017=Java Jive:Java Jam, 2018=Lambda Dancing:Keep on Erasing:Hot Generics}

The stream pipeline below creates a map with the longest title released each year.
For greater control over the type of the map in which to accumulate the entries, a
supplier is specified. The supplier TreeMap::new returns an empty instance of a
TreeMap in which the entries are accumulated. The keys in such a map are sorted in
their natural order—the class java.time.Year implements the Comparable<Year>
interface.
TreeMap<Year, String> mapYearToLongestTitle = CD.cdList.stream()
.collect(Collectors.toMap(CD::year, CD::title,
BinaryOperator.maxBy(Comparator.naturalOrder()),
TreeMap::new));
//{2017=Java Jive, 2018=Lambda Dancing}

The merge function specified is equivalent to the following lambda expression,
returning the greater of two strings:
(str1, str2) -> str1.compareTo(str2) > 0 ? str1 : str2
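The merge function can also be used to aggregate the values for duplicate keys rather than merely pick one of them. The self-contained sketch below (class name and string values invented for the illustration) counts occurrences by merging with Integer::sum:

import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ToMapMerge {
  public static void main(String[] args) {
    Map<String, Integer> genreCount = Stream.of("JAZZ", "POP", "JAZZ", "JAZZ")
        .collect(Collectors.toMap(Function.identity(),   // keyMapper: the genre name itself
                                  g -> 1,                // valueMapper: one occurrence
                                  Integer::sum));        // mergeFunction: add the counts
    System.out.println(genreCount);   // {POP=1, JAZZ=3} (entry order may vary)
  }
}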

Collecting to a ConcurrentMap
If the collector returned by the Collectors.toMap() method is used in a parallel
stream, the multiple partial maps created during parallel execution are merged by
the collector to create the final result map. Merging maps can be expensive if keys
from one map are merged into another. To address the problem, the Collectors
class provides the three overloaded methods toConcurrentMap(), analogous to the
three toMap() methods, that return a concurrent collector—that is, a collector that
uses a single concurrent map to perform the reduction. A concurrent map is thread-
safe and unordered. A concurrent map implements the java.util.concurrent.Concur-
rentMap interface, which is a subinterface of java.util.Map interface (§23.7, p. 1482).

Using a concurrent map avoids merging of maps during parallel execution, as a
single map is created that is used concurrently to accumulate the results from the
execution of each substream. However, the concurrent map is unordered—any
encounter order in the stream is ignored. Usage of the toConcurrentMap() method is
illustrated by the following example of a parallel stream to create a concurrent map
of CD titles released each year.
ConcurrentMap<Year, String> concMapYearToTitles = CD.cdList
.parallelStream()
.collect(Collectors.toConcurrentMap(CD::year, CD::title,
(tt, t) -> tt + ":" + t));
//{2017=Java Jam:Java Jive, 2018=Lambda Dancing:Hot Generics:Keep on Erasing}

Joining
The joining() method creates a collector for concatenating the input elements of
type CharSequence to a single immutable String. However, internally it uses a muta-
ble StringBuilder. Note that the collector returned by the joining() methods per-
forms functional reduction, as its result is a single immutable string.
static Collector<CharSequence,?,String> joining()
static Collector<CharSequence,?,String> joining(CharSequence delimiter)
static Collector<CharSequence,?,String> joining(CharSequence delimiter,
CharSequence prefix,
CharSequence suffix)
Return a Collector that concatenates CharSequence elements into a String. The
first method concatenates in encounter order. So does the second method, but
this method separates the elements by the specified delimiter. The third
method in addition applies the specified prefix and suffix to the result of the
concatenation.
The wildcard ? is a type parameter that is used internally by the collector.
The methods preserve the encounter order, if the stream has one.
Among the classes that implement the CharSequence interface are the String,
StringBuffer, and StringBuilder classes.

The stream pipelines below concatenate CD titles to illustrate the three overloaded
joining() methods. The CharSequence elements are Strings. The strings are concate-
nated in the stream encounter order, which is the positional order for lists. The
zero-argument joining() method at (1) performs string concatenation of the CD
titles using a StringBuilder internally, and returns the result as a string.
String concatTitles1 = CD.cdList.stream() // Stream<CD>
.map(CD::title) // Stream<String>
.collect(Collectors.joining()); // (1)
//Java JiveJava JamLambda DancingKeep on ErasingHot Generics

The single-argument joining() method at (2) concatenates the titles using the spec-
ified delimiter.
String concatTitles2 = CD.cdList.stream()
.map(CD::title)
.collect(Collectors.joining(", ")); // (2) Delimiter
//Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics

The three-argument joining() method at (3) concatenates the titles using the spec-
ified delimiter, prefix, and suffix.

String concatTitles3 = CD.cdList.stream()
    .map(CD::title)
    .collect(Collectors.joining(", ", "[", "]")); // (3) Delimiter, Prefix, Suffix
//[Java Jive, Java Jam, Lambda Dancing, Keep on Erasing, Hot Generics]

Grouping
Classifying elements into groups based on some criteria is a very common opera-
tion. An example is classifying CDs into groups according to the number of tracks
on them (this sounds esoteric, but it will illustrate the point). Such an operation can
be accomplished by the collector returned by the groupingBy() method. The method
is passed a classifier function that is used to classify the elements into different
groups. The result of the operation is a classification map whose entries are the dif-
ferent groups into which the elements have been classified. The key in a map entry
is the result of applying the classifier function on the element. The key is extracted
from the element based on some property of the element—for example, the num-
ber of tracks on the CD. The value associated with a key in a map entry comprises
those elements that belong to the same group. The operation is analogous to the
group-by operation in databases.
There are three versions of the groupingBy() method that provide increasingly more
control over the grouping operation.
static <T,K> Collector<T,?,Map<K,List<T>>> groupingBy(
Function<? super T,? extends K> classifier)

static <T,K,A,D> Collector<T,?,Map<K,D>> groupingBy(


Function<? super T,? extends K> classifier,
Collector<? super T,A,D> downstream)

static <T,K,D,A,M extends Map<K,D>> Collector<T,?,M> groupingBy(


Function<? super T,? extends K> classifier,
Supplier<M> mapSupplier,
Collector<? super T,A,D> downstream)
The Collector returned by the groupingBy() methods implements a group-by
operation on input elements to create a classification map.
The classifier function maps elements of type T to keys of some type K. These
keys determine the groups in the classification map.
The collector returned by the single-argument method produces a classifica-
tion map of type Map<K, List<T>>. The keys in this map are the results from
applying the specified classifier function to the input elements. The input ele-
ments that map to the same key are accumulated into a List by the default
downstream collector Collectors.toList().

The two-argument method accepts a downstream collector, in addition to the
classifier function. The collector returned by the method is composed with
the specified downstream collector that performs a reduction operation on the
input elements that map to the same key. It operates on elements of type T and
produces a result of type D. The result of type D produced by the downstream
collector is the value associated with the key of type K. The composed collector
thus results in a classification map of type Map<K, D>.
The three-argument method accepts a map supplier as its second parameter. It
creates an empty classification map of type M that is used by the composed col-
lector. The result is a classification map of type M whose key and value types
are K and D, respectively.

Figure 16.16 illustrates the groupingBy() operation by grouping CDs according to
the number of tracks on them. The classifier function CD::noOfTracks extracts
the number of tracks from a CD that acts as a key in the classification map
(Map<Integer, List<CD>>). Since the call to the groupingBy() method in Figure 16.16
does not specify a downstream collector, the default downstream collector
Collectors.toList() is used to accumulate CDs that have the same number of
tracks. The number of groups—that is, the number of distinct keys—is equal to the
number of distinct values for the number of tracks on the CDs. Each distinct value
for the number of tracks is associated with the list of CDs having that value as the
number of tracks.

Figure 16.16 Grouping

// Query: Group by number of tracks.
Map<Integer, List<CD>> map11 = CD.cdList.stream()
    .collect(Collectors.groupingBy(CD::noOfTracks)); // (1)

(a) Using the Collectors.groupingBy() method

(b) Stream pipeline (diagram): the collect() operation accumulates the stream elements cd0 to cd4
into a Map<Integer, List<CD>> with the entries <6, [cd1]>, <8, [cd0, cd3]>, and <10, [cd2, cd4]>.



The three stream pipelines below result in a classification map that is equivalent to
the one in Figure 16.16. The call to the groupingBy() method at (2) specifies the
downstream collector explicitly, and is equivalent to the call in Figure 16.16.
Map<Integer, List<CD>> map22 = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toList())); // (2)

The call to the groupingBy() method at (3) specifies the supplier TreeMap::new so that
a TreeMap<Integer, List<CD>> is used as the classification map.
Map<Integer, List<CD>> map33 = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, // (3)
TreeMap::new,
Collectors.toList()));

The call to the groupingBy() method at (4) specifies the downstream collector
Collectors.toSet() that uses a set to accumulate the CDs for a group.
Map<Integer, Set<CD>> map44 = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toSet())); // (4)

The classification maps created by the pipelines above will contain the three entries
shown below, but only the groupingBy() method call at (3) can guarantee that the
entries will be sorted in a TreeMap<Integer, List<CD>> according to the natural order
for the Integer keys.
{
6=[<Jaav, "Java Jam", 6, 2017, JAZZ>],
8=[<Jaav, "Java Jive", 8, 2017, POP>,
<Genericos, "Keep on Erasing", 8, 2018, JAZZ>],
10=[<Funkies, "Lambda Dancing", 10, 2018, POP>,
<Genericos, "Hot Generics", 10, 2018, JAZZ>]
}

In general, any collector can be passed as a downstream collector to the
groupingBy() method. In the stream pipeline below, the map value in the classification
map is a count of the number of CDs having the same number of tracks. The
collector Collectors.counting() performs a functional reduction to count the CDs
having the same number of tracks (p. 998).
Map<Integer, Long> map55 = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, Collectors.counting()));
//{6=1, 8=2, 10=2}

Multilevel Grouping
The downstream collector in a groupingBy() operation can be created by another
groupingBy() operation, resulting in a multilevel grouping operation—also known as
a multilevel classification or cascaded grouping operation. We can extend the multi-
level groupingBy() operation to any number of levels by making the downstream
collector be a groupingBy() operation.

The stream pipeline below creates a classification map in which the CDs are first
grouped by the number of tracks in a CD at (1), and then grouped by the musical
genre of a CD at (2).
Map<Integer, Map<Genre, List<CD>>> twoLevelGrp = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, // (1)
Collectors.groupingBy(CD::genre))); // (2)

Printing the contents of the resulting classification map would show the following
three entries, not necessarily in this order:
{
6={JAZZ=[<Jaav, "Java Jam", 6, 2017, JAZZ>]},
8={JAZZ=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>],
POP=[<Jaav, "Java Jive", 8, 2017, POP>]},
10={JAZZ=[<Genericos, "Hot Generics", 10, 2018, JAZZ>],
POP=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}
}

The entries of the resulting classification map can also be illustrated as a two-
dimensional matrix, as shown in Figure 16.17, where the CDs are first grouped into
rows by the number of tracks, and then grouped into columns by the musical
genre. The value of an element in the matrix is a list of CDs which have the same
number of tracks (row) and the same musical genre (column).

Figure 16.17 Multilevel Grouping as a Two-Dimensional Matrix

                             Genre
                     JAZZ           POP
No. of tracks   6    [cd1]
                8    [cd3]          [cd0]
               10    [cd2]          [cd4]

The number of groups in the classification map returned by the above pipeline is
equal to the number of distinct values for the number of tracks, as in the single-
level groupingBy() operation. However, each value associated with a key in the outer
classification map is now an inner classification map that is managed by the second-
level groupingBy() operation. The inner classification map has the type Map<Genre,
List<CD>>; in other words, the key in the inner classification map is the musical
genre of the CD and the value associated with this key is a List of CDs with this
musical genre. It is the second-level groupingBy() operation that is responsible for
grouping each CD in the inner classification map. Since no explicit downstream
collector is specified for the second-level groupingBy() operation, it uses the default
downstream collector Collectors.toList().
We can modify the multilevel groupingBy() operation to count the CDs that have
the same musical genre and the same number of tracks by specifying an explicit
downstream collector for the second-level groupingBy() operation, as shown at (3).

The collector Collectors.counting() at (3) performs a functional reduction by
accumulating the count for CDs with the same number of tracks and the same musical
genre in the inner classification map (p. 998).
Map<Integer, Map<Genre, Long>> twoLevelGrp2 = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks,
Collectors.groupingBy(CD::genre,
Collectors.counting()))); // (3)

Printing the contents of the resulting classification map produced by this multi-
level groupingBy() operation would show the following three entries, again not nec-
essarily in this order:
{6={JAZZ=1}, 8={JAZZ=1, POP=1}, 10={JAZZ=1, POP=1}}

It is instructive to compare the entries in the resulting classification maps in the
two examples illustrated here.
To truly appreciate the groupingBy() operation, the reader is highly encouraged to
implement the multilevel grouping examples in an imperative style, without using
the Stream API. Good luck!
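Taking up that invitation, one possible imperative sketch of the second example is shown below. It is self-contained: the Disc record and its values merely stand in for the chapter's CD class and are invented for the illustration.

import java.util.HashMap;
import java.util.Map;

public class ImperativeGrouping {
  // A simplified stand-in for the chapter's CD class (illustrative only).
  record Disc(int noOfTracks, String genre) {}

  public static void main(String[] args) {
    Disc[] discs = {
        new Disc(8, "POP"), new Disc(6, "JAZZ"), new Disc(10, "POP"),
        new Disc(8, "JAZZ"), new Disc(10, "JAZZ")
    };
    // Two-level classification map: number of tracks -> (genre -> count).
    Map<Integer, Map<String, Long>> twoLevelGrp = new HashMap<>();
    for (Disc d : discs) {
      twoLevelGrp
          .computeIfAbsent(d.noOfTracks(), k -> new HashMap<>())  // inner map per track count
          .merge(d.genre(), 1L, Long::sum);                       // count per genre
    }
    System.out.println(twoLevelGrp);
    // e.g., {6={JAZZ=1}, 8={JAZZ=1, POP=1}, 10={JAZZ=1, POP=1}} (entry order may vary)
  }
}

Even for this small example, the stream-based solution expresses the intent more directly than the bookkeeping required by the nested maps.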

Grouping to a ConcurrentMap
If the collector returned by the Collectors.groupingBy() method is used in a parallel
stream, the partial maps created during execution are merged to create the final
map—as in the case of the Collectors.toMap() method (p. 983). Merging maps can
carry a performance penalty. The Collectors class provides the three groupingBy-
Concurrent() overloaded methods, analogous to the three groupingBy() methods,
that return a concurrent collector—that is, a collector that uses a single concurrent
map to perform the reduction. The entries in such a map are unordered. A concur-
rent map implements the java.util.concurrent.ConcurrentMap interface (§23.7,
p. 1482).
Usage of the groupingByConcurrent() method is illustrated by the following example
of a parallel stream to create a concurrent map of the number of CDs that have the
same number of tracks.
ConcurrentMap<Integer, Long> map66 = CD.cdList
.parallelStream()
.collect(Collectors.groupingByConcurrent(CD::noOfTracks,
Collectors.counting()));
//{6=1, 8=2, 10=2}

Partitioning
Partitioning is a special case of grouping. The classifier function that was used for
grouping is replaced by a partitioning predicate in the partitioningBy() method. The
predicate function returns the boolean value true or false. Whereas the keys of a
classification map are determined by the classifier function, in the case of partitioning
the keys are determined by the partitioning predicate. Thus the keys are always of type
Boolean, implying that the classification map can have, at most, two map entries. In
other words, the partitioningBy() method can only create, at most, two partitions
from the input elements. The map value associated with a key in the resulting map
is managed by a downstream collector, as in the case of the groupingBy() method.
There are two versions of the partitioningBy() method:
static <T> Collector<T,?,Map<Boolean,List<T>>> partitioningBy(
Predicate<? super T> predicate)

static <T,D,A> Collector<T,?,Map<Boolean,D>> partitioningBy(


Predicate<? super T> predicate,
Collector<? super T,A,D> downstream)
The collector returned by the first method produces a classification map of
type Map<Boolean, List<T>>. The keys in this map are the results from applying
the partitioning predicate to the input elements. The input elements that map
to the same Boolean key are accumulated into a List by the default downstream
collector Collectors.toList().
The second method accepts a downstream collector, in addition to the parti-
tioning predicate. The collector returned by the method is composed with the
specified downstream collector that performs a reduction operation on the
input elements that map to the same key. It operates on elements of type T and
produces a result of type D. The result of type D produced by the downstream
collector is the value associated with the key of type Boolean. The composed
collector thus results in a resulting map of type Map<Boolean, D>.

Figure 16.18 illustrates the partitioningBy() operation by partitioning CDs according
to the predicate CD::isPop that determines whether a CD is a pop music CD.
The result of the partitioning predicate acts as the key in the resulting map of type
Map<Boolean, List<CD>>. Since the call to the partitioningBy() method in Figure 16.18
does not specify a downstream collector, the default downstream collector
Collectors.toList() is used to accumulate CDs that map to the same key. The resulting
map has two entries or partitions: one for CDs that are pop music CDs and one for
CDs that are not. The two entries of the resulting map are also shown below:
{false=[<Jaav, "Java Jam", 6, 2017, JAZZ>,
<Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
<Genericos, "Hot Generics", 10, 2018, JAZZ>],
true=[<Jaav, "Java Jive", 8, 2017, POP>,
<Funkies, "Lambda Dancing", 10, 2018, POP>]}

The values in a partition can be obtained by calling the Map.get() method:


List<CD> popCDs = map1.get(true);
List<CD> nonPopCDs = map1.get(false);

The stream pipeline at (2) is equivalent to the one in Figure 16.18, where the down-
stream collector is specified explicitly.
Map<Boolean, List<CD>> map2 = CD.cdList.stream()
.collect(Collectors.partitioningBy(CD::isPop, Collectors.toList())); // (2)

We could have composed a stream pipeline to filter the CDs that are pop music
CDs and collected them into a list. We would have to compose a second pipeline
to find the CDs that are not pop music CDs. However, the partitioningBy() method
does both in a single operation.

Figure 16.18 Partitioning

// Query: Partition by whether it is a pop music CD.
Map<Boolean, List<CD>> map1 = CD.cdList.stream()
    .collect(Collectors.partitioningBy(CD::isPop)); // (1)

(a) Using the Collectors.partitioningBy() method

(b) Stream pipeline (diagram): the collect() operation accumulates the stream elements cd0 to cd4
into a Map<Boolean, List<CD>> with the entries <false, [cd1, cd3, cd4]> and <true, [cd0, cd2]>.

Analogous to the groupingBy() method, any collector can be passed as a downstream
collector to the partitioningBy() method. In the stream pipeline below, the
downstream collector Collectors.counting() performs a functional reduction to
count the number of CDs associated with a key (p. 998).
Map<Boolean, Long> map3 = CD.cdList.stream()
.collect(Collectors.partitioningBy(CD::isPop, Collectors.counting()));
//{false=3, true=2}

Multilevel Partitioning
Like the groupingBy() method, the partitioningBy() operation can be used in mul-
tilevel classification. The downstream collector in a partitioningBy() operation can
be created by another partitioningBy() operation, resulting in a multilevel partition-
ing operation—also known as a cascaded partitioning operation. The downstream
collector can also be a groupingBy() operation.
In the stream pipeline below, the CDs are partitioned at (1): one partition for CDs
that are pop music CDs, and one for those that are not. The CDs that are associated
with a key are grouped by the year in which they were released. Note that the CDs
that were released in a year are accumulated into a List by the default downstream
collector Collectors.toList() that is employed by the groupingBy() operation at (2).
Map<Boolean, Map<Year, List<CD>>> map1 = CD.cdList.stream()
.collect(Collectors.partitioningBy(CD::isPop, // (1)
Collectors.groupingBy(CD::year))); // (2)

Printing the contents of the resulting map would show the following two entries,
not necessarily in this order.
{false={2017=[<Jaav, "Java Jam", 6, 2017, JAZZ>],
2018=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
<Genericos, "Hot Generics", 10, 2018, JAZZ>]},
true={2017=[<Jaav, "Java Jive", 8, 2017, POP>],
2018=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}}

Filtering Adapter for Downstream Collectors


The filtering() method of the Collectors class encapsulates a predicate and a down-
stream collector to create an adapter for a filtering operation. (See also the filter()
intermediate operation, p. 912.)
static <T,A,R> Collector<T,?,R> filtering(
Predicate<? super T> predicate,
Collector<? super T,A,R> downstream)
Returns a Collector that applies the predicate to input elements of type T to
determine which elements should be passed to the downstream collector. This
downstream collector accumulates them into results of type R, where the type
parameter A is the intermediate accumulation type of the downstream collector.

The following code uses the filtering() operation at (2) to group pop music CDs
according to the number of tracks on them. The groupingBy() operation at (1) cre-
ates the groups based on the number of tracks on the CDs, but the filtering() oper-
ation only allows pop music CDs to pass downstream to be accumulated.
// Filtering downstream from grouping.
Map<Integer, List<CD>> grpByTracksFilterByPopCD = CD.cdList.stream()
.collect(Collectors.groupingBy(CD::noOfTracks, // (1)
Collectors.filtering(CD::isPop, Collectors.toList()))); // (2)

Printing the contents of the resulting map would show the entries below, not
necessarily in this order. Note that the output shows there were one or more CDs
with six tracks, but none of them were pop music CDs. Hence the list of CDs
associated with key 6 is empty.
{6=[],
8=[<Jaav, "Java Jive", 8, 2017, POP>],
10=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

However, if we run the same query using the filter() intermediate stream opera-
tion at (1) prior to grouping, the contents of the result map are different, as shown
below.

// Filtering before grouping.
Map<Integer, List<CD>> filterByPopCDGrpByTracks = CD.cdList.stream()
    .filter(CD::isPop)                                                   // (1)
    .collect(Collectors.groupingBy(CD::noOfTracks, Collectors.toList()));

Contents of the result map show that only entries that have a non-empty list as a
value are contained in the map. This is not surprising, as any non-pop music CD is
discarded before grouping, so only pop music CDs are grouped.
{8=[<Jaav, "Java Jive", 8, 2017, POP>],
10=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

There are no surprises with partitioning, regardless of whether filtering is done
before or after the partitioning, as partitioning always results in a map with two
entries: one for the Boolean.TRUE key and one for the Boolean.FALSE key. The code
below partitions CDs released in 2018 according to whether a CD is a pop music
CD or not.
// Filtering downstream from partitioning.
Map<Boolean, List<CD>> partbyPopCDsFilterByYear = CD.cdList.stream()        // (1)
    .collect(Collectors.partitioningBy(CD::isPop,
        Collectors.filtering(cd -> cd.year().equals(Year.of(2018)),
                             Collectors.toList())));

// Filtering before partitioning.
Map<Boolean, List<CD>> filterByYearPartbyPopCDs = CD.cdList.stream()        // (2)
    .filter(cd -> cd.year().equals(Year.of(2018)))
    .collect(Collectors.partitioningBy(CD::isPop, Collectors.toList()));

Both queries at (1) and (2) above will result in the same entries in the result map:
{false=[<Genericos, "Keep on Erasing", 8, 2018, JAZZ>,
<Genericos, "Hot Generics", 10, 2018, JAZZ>],
true=[<Funkies, "Lambda Dancing", 10, 2018, POP>]}

Mapping Adapter for Downstream Collectors


The mapping() method of the Collectors class encapsulates a mapper function and a
downstream collector to create an adapter for a mapping operation. (See also the map()
intermediate operation, p. 921.)
static <T,U,A,R> Collector<T,?,R> mapping(
Function<? super T,? extends U> mapper,
Collector<? super U,A,R> downstream)
Returns a Collector that applies the mapper function to input elements of type T
and provides the mapped results of type U to the downstream collector that accu-
mulates them into results of type R.
In other words, the method adapts a downstream collector accepting elements of
type U to one accepting elements of type T by applying a mapper function to each
input element before accumulation, where type parameter A is the intermedi-
ate accumulation type of the downstream collector.

The mapping() method at (1) creates an adapter that accumulates a set of CD titles
in each year for a stream of CDs. The mapper function maps a CD to its title so that
the downstream collector can accumulate the titles in a set.
Map<Year, Set<String>> titlesByYearInSet = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.mapping( // (1)
CD::title, // Mapper
Collectors.toSet()))); // Downstream collector
System.out.println(titlesByYearInSet);
// {2017=[Java Jive, Java Jam],
// 2018=[Hot Generics, Lambda Dancing, Keep on Erasing]}

The mapping() method at (2) creates an adapter that joins CD titles in each year for
a stream of CDs. The mapper function maps a CD to its title so that the down-
stream collector can join the titles.
Map<Year, String> joinTitlesByYear = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.mapping( // (2)
CD::title,
Collectors.joining(":"))));
System.out.println(joinTitlesByYear);
// {2017=Java Jive:Java Jam,
// 2018=Lambda Dancing:Keep on Erasing:Hot Generics}

The mapping() method at (3) creates an adapter that maps each CD to its number of
tracks before the values are passed downstream. Note, however, that the downstream
collector Collectors.counting() counts the mapped values rather than summing them,
so the result is the number of CDs released in each year.
Map<Year, Long> numOfCDsByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
        CD::year,
        Collectors.mapping(                            // (3)
            CD::noOfTracks,
            Collectors.counting())));
System.out.println(numOfCDsByYear);                    // {2017=2, 2018=3}
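If the goal is instead the total number of tracks per year, a summing collector (one of the arithmetic collectors, p. 998) can be used downstream. A sketch in the style of the fragments above, assuming the chapter's CD class:

Map<Year, Integer> tracksByYear = CD.cdList.stream()
    .collect(Collectors.groupingBy(
        CD::year,
        Collectors.summingInt(CD::noOfTracks)));
System.out.println(tracksByYear);   // {2017=14, 2018=28} (entry order may vary)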

Flat Mapping Adapter for Downstream Collectors


The flatMapping() method of the Collectors class encapsulates a mapper function
and a downstream collector to create an adapter for a flat mapping operation. (See also
the flatMap() intermediate operation, p. 924.)
static <T,U,A,R> Collector<T,?,R> flatMapping(
Function<? super T,? extends Stream<? extends U>> mapper,
Collector<? super U,A,R> downstream)
Returns a Collector that applies the specified mapper function to input elements
of type T and provides the mapped results of type U to the downstream collector
that accumulates them into results of type R.

That is, the method adapts a downstream collector accepting elements of type U
to one accepting elements of type T by applying a flat mapping function to each
input element before accumulation, where type parameter A is the intermedi-
ate accumulation type of the downstream collector.
The flat mapping function maps an input element to a mapped stream whose
elements are flattened (p. 924) and passed downstream. Each mapped stream is
closed after its elements have been flattened. An empty stream is substituted
if the mapped stream is null.

Given the lists of CDs below, we wish to find all unique CD titles in the lists. A
stream of CD lists is created at (1). Each CD list is used to create a stream of CDs
whose elements are flattened into the output stream of CDs at (2). Each CD is then
mapped to its title at (3), and unique CD titles are accumulated into a set at (4).
(Compare this example with the one in Figure 16.9, p. 925, using the flatMap()
stream operation.)
// Given lists of CDs:
List<CD> cdListA = List.of(CD.cd0, CD.cd1);
List<CD> cdListB = List.of(CD.cd0, CD.cd1, CD.cd1);

// Find unique CD titles in the given lists:
Set<String> set =
Stream.of(cdListA, cdListB) // (1) Stream<List<CD>>
.collect(Collectors.flatMapping(List::stream, // (2) Flatten to Stream<CD>
Collectors.mapping(CD::title, // (3) Stream<String>
Collectors.toSet()))); // (4) Set<String>

Set of unique CD titles in the CD lists:


[Java Jive, Java Jam]

The collectors returned by the flatMapping() method are designed to be used in multilevel grouping operations (p. 987), such as downstream from groupingBy() or partitioningBy() operations. Example 16.13 illustrates such a use with the groupingBy() operation.
In Example 16.13, the class RadioPlaylist at (1) represents a radio station by its
name and a list of CDs played by the radio station. Three CD lists are constructed
at (2) and used to construct three radio playlists at (3). The radio playlists are stored
in a common list of radio playlists at (4). A query is formulated at (5) to find the
unique titles of CDs played by each radio station. Referring to the line numbers in
Example 16.13:
(6) A stream of type Stream<RadioPlaylist> is created from the list radioPlaylists of type List<RadioPlaylist>.
(7) The radio playlists are grouped according to the name of the radio station
(String).
(8) Each radio playlist of type RadioPlaylist is used as the source of a stream, which
is then flattened into the output stream of type Stream<CD> by the flatMapping()
operation.
(9) Each CD in the stream is mapped to its title.

(10) Each unique CD title is accumulated into the result set of each radio station
(Set<String>).
The query at (5) uses four collectors. The result map has the type Map<String, Set<String>>. The output shows the unique titles of CDs played by each radio station.

Example 16.13 Flat mapping

import java.util.List;

// Radio station with a playlist.
public class RadioPlaylist {                                       // (1)
  private String radioStationName;
  private List<CD> playlist;

  public RadioPlaylist(String radioStationName, List<CD> cdList) {
    this.radioStationName = radioStationName;
    this.playlist = cdList;
  }

  public String getRadioStationName() { return this.radioStationName; }
  public List<CD> getPlaylist() { return this.playlist; }
}

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class CollectorsFlatMapping {


public static void main(String[] args) {
// Some lists of CDs: (2)
List<CD> cdList1 = List.of(CD.cd0, CD.cd1, CD.cd1, CD.cd2);
List<CD> cdList2 = List.of(CD.cd0, CD.cd0, CD.cd3);
List<CD> cdList3 = List.of(CD.cd0, CD.cd4);

// Some radio playlists: (3)


RadioPlaylist pl1 = new RadioPlaylist("Radio JVM", cdList1);
RadioPlaylist pl2 = new RadioPlaylist("Radio JRE", cdList2);
RadioPlaylist pl3 = new RadioPlaylist("Radio JAR", cdList3);

// List of radio playlists: (4)


List<RadioPlaylist> radioPlaylists = List.of(pl1, pl2, pl3);

// Map of radio station names and set of CD titles they played: (5)
Map<String, Set<String>> map = radioPlaylists.stream() // (6)
.collect(Collectors.groupingBy(RadioPlaylist::getRadioStationName, // (7)
Collectors.flatMapping(rpl -> rpl.getPlaylist().stream(), // (8)
Collectors.mapping(CD::title, // (9)
Collectors.toSet())))); // (10)
System.out.println(map);
}
}

Output from the program (edited to fit on the page):


{Radio JAR=[Hot Generics, Java Jive],
Radio JVM=[Java Jive, Lambda Dancing, Java Jam],
Radio JRE=[Java Jive, Keep on Erasing]}

Finishing Adapter for Downstream Collectors


The collectingAndThen() method encapsulates a downstream collector and a finisher
function to allow the result of the collector to be adapted by the finisher function.
static <T,A,R,RR> Collector<T,A,RR> collectingAndThen(
Collector<T,A,R> downstream,
Function<R,RR> finisher)
Returns a Collector that performs the operation of the downstream collector on
input elements of type T, followed by applying the finisher function on the
result of type R produced by the downstream collector. The final result is of type
RR, the result of the finisher function. In other words, the method adapts a col-
lector to perform an additional finishing transformation.
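A common use of this adapter is to make the collected container unmodifiable once collection has finished; a minimal sketch, assuming the chapter's CD class (Collections is java.util.Collections):

List<String> titles = CD.cdList.stream()
    .map(CD::title)
    .collect(Collectors.collectingAndThen(
        Collectors.toList(),                 // Downstream collector builds the list
        Collections::unmodifiableList));     // Finisher wraps it in an unmodifiable view

Since Java 10, the collector returned by the Collectors.toUnmodifiableList() method (p. 980) achieves the same effect directly.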

In the call to the collectingAndThen() method at (1), the collector Collectors.maxBy() at (2) produces an Optional<CD> result that is the CD with the maximum number of tracks in each group. The finisher function at (3) maps the Optional<CD> result to the number of tracks on that CD, if there is one; otherwise, it returns 0. The collectingAndThen() method thus adapts the Optional<CD> result of its downstream collector to an Integer value by applying the finisher function.
Map<Year, Integer> maxTracksByYear = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.collectingAndThen( // (1)
Collectors.maxBy(Comparator.comparing(CD::noOfTracks)), // (2)
optCD -> optCD.map(CD::noOfTracks).orElse(0))) // (3)
);
System.out.println(maxTracksByYear); // {2017=8, 2018=10}

In the call to the collectingAndThen() method at (4), the collector Collectors.averagingDouble() at (5) produces a result of type Double that is the average number of tracks in each group. The finisher function at (6) maps the Double average value to a string with the specified number format.
Map<Genre, String> avgTracksByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.collectingAndThen( // (4)
Collectors.averagingDouble(CD::noOfTracks), // (5)
d -> String.format("%.1f", d))) // (6)
);
System.out.println(avgTracksByGenre); // {JAZZ=8.0, POP=9.0}

Downstream Collectors for Functional Reduction


The collectors we have seen so far perform a mutable reduction to some mutable con-
tainer, except for the functional reduction implemented by the joining() method
(p. 984). The Collectors class also provides static factory methods that implement
collectors which perform functional reduction to compute common statistics, such
as summing, averaging, finding maximum and minimum values, and the like.
Like any other collector, the collectors that perform functional reduction can also
be used in a standalone capacity as a parameter of the collect() method and as a
downstream collector in a composed collector. However, these collectors are most
useful when used as downstream collectors.
Collectors performing functional reduction have counterparts in terminal opera-
tions for streams that provide equivalent reduction operations (Table 16.8, p. 1008).

Counting
The collector created by the Collectors.counting() method performs a functional
reduction to count the input elements.
static <T> Collector<T,?,Long> counting()
The collector returned counts the number of input elements of type T. If there
are no elements, the result is Long.valueOf(0L). Note that the result is of type
Long.
In the method declaration, the wildcard ? stands in for the intermediate accumulation type, that is, the type of the mutable container used internally by the reduction operation.

In the stream pipeline at (1), the collector Collectors.counting() is used in a standalone capacity to count the number of jazz music CDs.
Long numOfJazzCds1 = CD.cdList.stream().filter(CD::isJazz)
.collect(Collectors.counting()); // (1) Standalone collector
System.out.println(numOfJazzCds1); // 3

In the stream pipeline at (2), the collector Collectors.counting() is used as a downstream collector in a grouping operation that groups the CDs by musical genre and uses the downstream collector to count the number of CDs in each group.
Map<Genre, Long> grpByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.counting())); // (2) Downstream collector
System.out.println(grpByGenre); // {POP=2, JAZZ=3}
System.out.println(grpByGenre.get(Genre.JAZZ)); // 3

The collector Collectors.counting() performs effectively the same functional reduction as the Stream.count() terminal operation (p. 953) at (3).
long numOfJazzCds2 = CD.cdList.stream().filter(CD::isJazz)
.count(); // (3) Stream.count()
System.out.println(numOfJazzCds2); // 3

Finding Min/Max
The collectors created by the Collectors.maxBy() and Collectors.minBy() methods
perform a functional reduction to find the maximum and minimum elements in
the input elements, respectively. As there might not be any input elements, an
Optional<T> is returned as the result.

static <T> Collector<T,?,Optional<T>> maxBy(Comparator<? super T> cmp)


static <T> Collector<T,?,Optional<T>> minBy(Comparator<? super T> cmp)
Return a collector that produces an Optional<T> with the maximum or mini-
mum element of type T according to the specified Comparator, respectively.

The natural order comparator for CDs defined at (1) is used in the stream pipelines
below to find the maximum CD. The collector Collectors.maxBy() is used as a
standalone collector at (2), using the natural order comparator to find the maxi-
mum CD. The Optional<CD> result can be queried for the value.
Comparator<CD> natCmp = Comparator.naturalOrder(); // (1)

Optional<CD> maxCD = CD.cdList.stream()
    .collect(Collectors.maxBy(natCmp));              // (2) Standalone collector
System.out.println("Max CD: "
    + maxCD.map(CD::title).orElse("No CD."));        // Max CD: Java Jive

In the pipeline below, the CDs are grouped by musical genre, and the CDs in
each group are reduced to the maximum CD by the downstream collector
Collectors.maxBy() at (3). Again, the downstream collector uses the natural order
comparator, and the Optional<CD> result in each group can be queried.
// Group CDs by musical genre, and max CD in each group.
Map<Genre, Optional<CD>> grpByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.maxBy(natCmp))); // (3) Downstream collector
System.out.println(grpByGenre);
//{JAZZ=Optional[<Jaav, "Java Jam", 6, 2017, JAZZ>],
// POP=Optional[<Jaav, "Java Jive", 8, 2017, POP>]}

System.out.println("Title of max Jazz CD: "


+ grpByGenre.get(Genre.JAZZ)
.map(CD::title)
.orElse("No CD.")); // Title of max Jazz CD: Java Jam

The collectors created by the Collectors.maxBy() and Collectors.minBy() methods are effectively equivalent to the max() and min() terminal operations provided by the stream interfaces (p. 954), respectively. In the pipeline below, the max() terminal operation reduces the stream of CDs to the maximum CD at (4) using the natural order comparator, and the Optional<CD> result can be queried.
Optional<CD> maxCD1 = CD.cdList.stream()
.max(natCmp); // (4) max() on Stream<CD>.
System.out.println("Title of max CD: "
+ maxCD1.map(CD::title)
.orElse("No CD.")); // Title of max CD: Java Jive

Summing
The summing collectors perform a functional reduction to produce the sum of the
numeric results from applying a numeric-valued function to the input elements.
static <T> Collector<T,?,NumType> summingNumType(
ToNumTypeFunction<? super T> mapper)
Returns a collector that produces the sum of a numtype-valued function applied to the input elements. If there are no input elements, the result is zero. The result is of type NumType.
NumType is Int (but it is Integer when used as a type name), Long, or Double, and
the corresponding numtype is int, long, or double.

The collector returned by the Collectors.summingInt() method is used at (1) as a standalone collector to find the total number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.
Integer sumTracks = CD.cdList.stream()
.collect(Collectors.summingInt(CD::noOfTracks)); // (1) Standalone collector
System.out.println(sumTracks); // 42

In the pipeline below, the CDs are grouped by musical genre, and the number of tracks on the CDs in each group is summed by the downstream collector returned by the Collectors.summingInt() method at (2).
Map<Genre, Integer> grpByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.summingInt(CD::noOfTracks))); // (2) Downstream collector
System.out.println(grpByGenre); // {POP=18, JAZZ=24}
System.out.println(grpByGenre.get(Genre.JAZZ)); // 24

The collector Collectors.summingInt() performs effectively the same functional reduction at (3) as the IntStream.sum() terminal operation (p. 973).

int sumTracks2 = CD.cdList.stream()    // (3) Stream<CD>
    .mapToInt(CD::noOfTracks)          //     IntStream
    .sum();
System.out.println(sumTracks2);        // 42

Averaging
The averaging collectors perform a functional reduction to produce the average of
the numeric results from applying a numeric-valued function to the input elements.
static <T> Collector<T,?,Double> averagingNumType(
ToNumTypeFunction<? super T> mapper)
Returns a collector that produces the arithmetic mean of a numtype-valued func-
tion applied to the input elements. If there are no input elements, the result is
zero. The result is of type Double.
NumType is Int, Long, or Double, and the corresponding numtype is int, long, or double.

The collector returned by the Collectors.averagingInt() method is used at (1) as a standalone collector to find the average number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.
Double avgNoOfTracks1 = CD.cdList.stream()
.collect(Collectors
.averagingInt(CD::noOfTracks)); // (1) Standalone collector
System.out.println(avgNoOfTracks1); // 8.4

In the pipeline below, the CDs are grouped by musical genre, and the downstream
collector Collectors.averagingInt() at (2) calculates the average number of tracks
on the CDs in each group.
Map<Genre, Double> grpByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.averagingInt(CD::noOfTracks) // (2) Downstream collector
));
System.out.println(grpByGenre); // {POP=9.0, JAZZ=8.0}
System.out.println(grpByGenre.get(Genre.JAZZ)); // 8.0

The collector created by the Collectors.averagingInt() method performs effectively the same functional reduction as the IntStream.average() terminal operation (p. 974) at (3).
OptionalDouble avgNoOfTracks2 = CD.cdList.stream() // Stream<CD>
.mapToInt(CD::noOfTracks) // IntStream
.average(); // (3)
System.out.println(avgNoOfTracks2.orElse(0.0)); // 8.4

Summarizing
The summarizing collector performs a functional reduction to produce summary
statistics (count, sum, min, max, average) on the numeric results of applying a
numeric-valued function to the input elements.
static <T> Collector<T,?,NumTypeSummaryStatistics> summarizingNumType(
ToNumTypeFunction<? super T> mapper)
Returns a collector that applies a numtype-valued mapper function to the input
elements, and returns the summary statistics for the resulting values.
NumType is Int (but it is Integer when used as a type name), Long, or Double, and
the corresponding numtype is int, long, or double.

The collector Collectors.summarizingInt() is used at (1) as a standalone collector to summarize the statistics for the number of tracks on the CDs. The mapper function CD::noOfTracks passed as an argument extracts the number of tracks from each CD on which the functional reduction is performed.
IntSummaryStatistics stats1 = CD.cdList.stream()
.collect(
Collectors.summarizingInt(CD::noOfTracks) // (1) Standalone collector
);

System.out.println(stats1);
// IntSummaryStatistics{count=5, sum=42, min=6, average=8.400000, max=10}

The IntSummaryStatistics class provides get methods to access the individual results (p. 974).
In the pipeline below, the CDs are grouped by musical genre, and the downstream
collector created by the Collectors.summarizingInt() method at (2) summarizes the
statistics for the number of tracks on the CDs in each group.
Map<Genre, IntSummaryStatistics> grpByGenre = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::genre,
Collectors.summarizingInt(CD::noOfTracks))); // (2) Downstream collector
System.out.println(grpByGenre);
//{POP=IntSummaryStatistics{count=2, sum=18, min=8, average=9.000000, max=10},
// JAZZ=IntSummaryStatistics{count=3, sum=24, min=6, average=8.000000, max=10}}

System.out.println(grpByGenre.get(Genre.JAZZ));  // Summary stats for Jazz CDs.
// IntSummaryStatistics{count=3, sum=24, min=6, average=8.000000, max=10}

The collector returned by the Collectors.summarizingInt() method performs effectively the same functional reduction as the IntStream.summaryStatistics() terminal operation (p. 974) at (3).
IntSummaryStatistics stats2 = CD.cdList.stream()
.mapToInt(CD::noOfTracks)
.summaryStatistics(); // (3)
System.out.println(stats2);
// IntSummaryStatistics{count=5, sum=42, min=6, average=8.400000, max=10}

Reducing
Collectors that perform common statistical operations, such as counting, averag-
ing, and so on, are special cases of functional reduction that can be implemented
using the Collectors.reducing() method.
static <T> Collector<T,?,Optional<T>> reducing(BinaryOperator<T> bop)
Returns a collector that performs functional reduction, producing an Optional
with the cumulative result of applying the binary operator bop on the input ele-
ments: e1 bop e2 bop e3 ..., where each ei is an input element. If there are no
input elements, an empty Optional<T> is returned.
Note that the collector reduces input elements of type T to a result that is an
Optional of type T.

static <T> Collector<T,?,T> reducing(T identity, BinaryOperator<T> bop)


Returns a collector that performs functional reduction, producing the cumula-
tive result of applying the binary operator bop on the input elements: identity
bop e1 bop e2 ..., where each ei is an input element. The identity value is the
initial value to accumulate. If there are no input elements, the identity value is
returned.
Note that the collector reduces input elements of type T to a result of type T.

static <T,U> Collector<T,?,U> reducing(
    U identity,
    Function<? super T,? extends U> mapper,
    BinaryOperator<U> bop)
Returns a collector that performs a map-reduce operation. It maps each input
element of type T to a mapped value of type U by applying the mapper function,
and performs functional reduction on the mapped values of type U by applying
the binary operator bop. The identity value of type U is used as the initial value
to accumulate. If the stream is empty, the identity value is returned.
Note that the collector reduces input elements of type T to a result of type U.

Collectors returned by the Collectors.reducing() methods effectively perform functional reductions equivalent to those performed by the reduce() methods of the stream interfaces.
However, the three-argument method Collectors.reducing(identity, mapper, bop)
performs a map-reduce operation using a mapper function and a binary operator bop,
whereas the Stream.reduce(identity, accumulator, combiner) performs a reduction
using an accumulator and a combiner (p. 955). The accumulator is a BiFunction<U,T,U>
that accumulates a partially accumulated result of type U with an element of type
T, whereas the bop is a BinaryOperator<U> that accumulates a partially accumulated
result of type U with an element of type U.
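The difference can be seen by computing the total length of all CD titles in both ways; a minimal sketch, assuming the chapter's CD class:

// Stream.reduce(): the accumulator folds a CD directly into the partial result.
Integer totalLength1 = CD.cdList.stream()
    .reduce(0,
            (partial, cd) -> partial + cd.title().length(), // BiFunction<Integer,CD,Integer>
            Integer::sum);                                   // Combiner (used in parallel execution)

// Collectors.reducing(): the mapper first maps a CD to an Integer,
// and the binary operator then accumulates the mapped values.
Integer totalLength2 = CD.cdList.stream()
    .collect(Collectors.reducing(0,
                                 cd -> cd.title().length(),  // Mapper
                                 Integer::sum));             // BinaryOperator<Integer>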
The following comparators are used in the examples below:
// Comparator to compare CDs by title.
Comparator<CD> cmpByTitle = Comparator.comparing(CD::title); // (1)
// Comparator to compare strings by their length.
Comparator<String> byLength = Comparator.comparing(String::length); // (2)

The collector returned by the Collectors.reducing() method is used as a standalone collector at (3) to find the longest CD title. The result of the operation is an Optional<String>, as there might not be any input elements. This operation is equivalent to using the Stream.reduce() terminal operation at (4).
Optional<String> longestTitle1 = CD.cdList.stream()
.map(CD::title)
.collect(Collectors.reducing(
BinaryOperator.maxBy(byLength))); // (3) Standalone collector
System.out.println(longestTitle1.orElse("No title"));// Keep on Erasing

Optional<String> longestTitle2 = CD.cdList.stream()   // Stream<CD>
    .map(CD::title)                                   // Stream<String>
    .reduce(BinaryOperator.maxBy(byLength));          // (4) Stream.reduce(bop)

The collector returned by the one-argument Collectors.reducing() method is used as a downstream collector at (5) to find the CD with the longest title in each group classified by the year a CD was released. The collector at (5) is equivalent to the collector returned by the Collectors.maxBy(cmpByTitle) method.
Map<Year, Optional<CD>> cdWithMaxTitleByYear = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.reducing( // (5) Downstream collector
BinaryOperator.maxBy(cmpByTitle))
));
System.out.println(cdWithMaxTitleByYear);
// {2017=Optional[<Jaav, "Java Jive", 8, 2017, POP>],
// 2018=Optional[<Funkies, "Lambda Dancing", 10, 2018, POP>]}
System.out.println(cdWithMaxTitleByYear.get(Year.of(2018))
.map(CD::title).orElse("No title")); // Lambda Dancing

The collector returned by the three-argument Collectors.reducing() method is used as a downstream collector at (6) to find the longest title in each group classified by the year a CD was released. Note that the collector maps a CD to its title.
The longest title is associated with the map value for each group classified by the
year a CD was released. The collector will return an empty string (i.e., the identity
value "") if there are no CDs in the stream. In comparison, the collector
Collectors.mapping() at (7) also maps a CD to its title, and uses the downstream col-
lector Collectors.maxBy(byLength) at (7) to find the longest title (p. 993). The result
in this case is an Optional<String>, as there might not be any input elements.
Map<Year, String> longestTitleByYear = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.reducing("", CD::title, // (6) Downstream collector
BinaryOperator.maxBy(byLength))
));
System.out.println(longestTitleByYear); // {2017=Java Jive, 2018=Keep on Erasing}
System.out.println(longestTitleByYear.get(Year.of(2018))); // Keep on Erasing

Map<Year, Optional<String>> longestTitleByYear2 = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.mapping(CD::title, // (7) Downstream collector
Collectors.maxBy(byLength))
));
System.out.println(longestTitleByYear2);
// {2017=Optional[Java Jive], 2018=Optional[Keep on Erasing]}
System.out.println(longestTitleByYear2.get(Year.of(2018))
.orElse("No title.")); // Keep on Erasing

The pipeline below groups CDs according to the year they were released. For each
group, the collector returned by the three-argument Collectors.reducing() method
performs a map-reduce operation at (8) to map each CD to its number of tracks and
accumulate the tracks in each group. This map-reduce operation is equivalent to
the collector returned by the Collectors.summingInt() method at (9).
Map<Year, Integer> noOfTracksByYear = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.reducing( // (8) Downstream collector
0, CD::noOfTracks, Integer::sum)));
System.out.println(noOfTracksByYear); // {2017=14, 2018=28}
System.out.println(noOfTracksByYear.get(Year.of(2018)));// 28

Map<Year, Integer> noOfTracksByYear2 = CD.cdList.stream()
.collect(Collectors.groupingBy(
CD::year,
Collectors.summingInt(CD::noOfTracks))); // (9) Special case collector

Summary of Static Factory Methods in the Collectors Class


The static factory methods of the Collectors class that create collectors are summa-
rized in Table 16.7. All methods are static generic methods, except for the over-
loaded joining() methods that are not generic. The keyword static is omitted, as
are the type parameters of a generic method, since these type parameters are evi-
dent from the declaration of the formal parameters to the method. The type param-
eter declarations have also been simplified, where any bound <? super T> or <?
extends T> has been replaced by <T>, without impacting the intent of a method. A
reference is also provided for each method in the first column.
The last column in Table 16.7 indicates the function type of the corresponding
parameter in the previous column. It is instructive to note how the functional inter-
face parameters provide the parameterized behavior to build the collector
returned by a method. For example, the method averagingDouble() returns a collec-
tor that computes the average of the stream elements. The parameter function
mapper with the functional interface type ToDoubleFunction<T> converts an element
of type T to a double when the collector computes the average for the stream ele-
ments.
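For instance, the ToDoubleFunction<CD> mapper required by the averagingDouble() method can be supplied either as a lambda expression or as a method reference; a minimal sketch, assuming the chapter's CD class (ToDoubleFunction is from java.util.function):

ToDoubleFunction<CD> mapper = cd -> cd.noOfTracks();      // T -> double (int widens to double)
Double avg1 = CD.cdList.stream()
    .collect(Collectors.averagingDouble(mapper));          // 8.4
Double avg2 = CD.cdList.stream()
    .collect(Collectors.averagingDouble(CD::noOfTracks));  // 8.4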

Table 16.7 Static Methods in the Collectors Class

averagingDouble (p. 1000)
    Returns:    Collector<T,?,Double>
    Parameters: ToDoubleFunction<T> mapper                      T -> double

averagingInt (p. 1000)
    Returns:    Collector<T,?,Double>
    Parameters: ToIntFunction<T> mapper                         T -> int

averagingLong (p. 1000)
    Returns:    Collector<T,?,Double>
    Parameters: ToLongFunction<T> mapper                        T -> long

collectingAndThen (p. 997)
    Returns:    Collector<T,A,RR>
    Parameters: Collector<T,A,R> downstream                     (T,A) -> R
                Function<R,RR> finisher                         R -> RR

counting (p. 998)
    Returns:    Collector<T,?,Long>
    Parameters: (none)

filtering (p. 992)
    Returns:    Collector<T,?,R>
    Parameters: Predicate<T> predicate                          T -> boolean
                Collector<T,A,R> downstream                     (T,A) -> R

flatMapping (p. 994)
    Returns:    Collector<T,?,R>
    Parameters: Function<T,Stream<U>> mapper                    T -> Stream<U>
                Collector<U,A,R> downstream                     (U,A) -> R

groupingBy (p. 985)
    Returns:    Collector<T,?,Map<K,List<T>>>
    Parameters: Function<T,K> classifier                        T -> K

groupingBy (p. 985)
    Returns:    Collector<T,?,Map<K,D>>
    Parameters: Function<T,K> classifier                        T -> K
                Collector<T,A,D> downstream                     (T,A) -> D

groupingBy (p. 985)
    Returns:    Collector<T,?,Map<K,D>>
    Parameters: Function<T,K> classifier                        T -> K
                Supplier<Map<K,D>> mapSupplier                  () -> Map<K,D>
                Collector<T,A,D> downstream                     (T,A) -> D

joining (p. 984)
    Returns:    Collector<CharSequence,?,String>
    Parameters: (none)

joining (p. 984)
    Returns:    Collector<CharSequence,?,String>
    Parameters: CharSequence delimiter

joining (p. 984)
    Returns:    Collector<CharSequence,?,String>
    Parameters: CharSequence delimiter
                CharSequence prefix
                CharSequence suffix

mapping (p. 993)
    Returns:    Collector<T,?,R>
    Parameters: Function<T,U> mapper                            T -> U
                Collector<U,A,R> downstream                     (U,A) -> R

maxBy (p. 999)
    Returns:    Collector<T,?,Optional<T>>
    Parameters: Comparator<T> comparator                        (T,T) -> T

minBy (p. 999)
    Returns:    Collector<T,?,Optional<T>>
    Parameters: Comparator<T> comparator                        (T,T) -> T

partitioningBy (p. 989)
    Returns:    Collector<T,?,Map<Boolean,List<T>>>
    Parameters: Predicate<T> predicate                          T -> boolean

partitioningBy (p. 989)
    Returns:    Collector<T,?,Map<Boolean,D>>
    Parameters: Predicate<T> predicate                          T -> boolean
                Collector<T,A,D> downstream                     (T,A) -> D

reducing (p. 1002)
    Returns:    Collector<T,?,Optional<T>>
    Parameters: BinaryOperator<T> op                            (T,T) -> T

reducing (p. 1002)
    Returns:    Collector<T,?,T>
    Parameters: T identity
                BinaryOperator<T> op                            (T,T) -> T

reducing (p. 1002)
    Returns:    Collector<T,?,U>
    Parameters: U identity
                Function<T,U> mapper                            T -> U
                BinaryOperator<U> op                            (U,U) -> U

summarizingDouble (p. 1001)
    Returns:    Collector<T,?,DoubleSummaryStatistics>
    Parameters: ToDoubleFunction<T> mapper                      T -> double

summarizingInt (p. 1001)
    Returns:    Collector<T,?,IntSummaryStatistics>
    Parameters: ToIntFunction<T> mapper                         T -> int

summarizingLong (p. 1001)
    Returns:    Collector<T,?,LongSummaryStatistics>
    Parameters: ToLongFunction<T> mapper                        T -> long

summingDouble (p. 978)
    Returns:    Collector<T,?,Double>
    Parameters: ToDoubleFunction<T> mapper                      T -> double

summingInt (p. 978)
    Returns:    Collector<T,?,Integer>
    Parameters: ToIntFunction<T> mapper                         T -> int

summingLong (p. 978)
    Returns:    Collector<T,?,Long>
    Parameters: ToLongFunction<T> mapper                        T -> long

toCollection (p. 979)
    Returns:    Collector<T,?,C>
    Parameters: Supplier<C> collFactory                         () -> C

toList, toUnmodifiableList (p. 980)
    Returns:    Collector<T,?,List<T>>
    Parameters: (none)

toMap (p. 981)
    Returns:    Collector<T,?,Map<K,U>>
    Parameters: Function<T,K> keyMapper                         T -> K
                Function<T,U> valueMapper                       T -> U

toMap (p. 981)
    Returns:    Collector<T,?,Map<K,U>>
    Parameters: Function<T,K> keyMapper                         T -> K
                Function<T,U> valueMapper                       T -> U
                BinaryOperator<U> mergeFunction                 (U,U) -> U

toMap (p. 981)
    Returns:    Collector<T,?,Map<K,U>>
    Parameters: Function<T,K> keyMapper                         T -> K
                Function<T,U> valueMapper                       T -> U
                BinaryOperator<U> mergeFunction                 (U,U) -> U
                Supplier<Map<K,U>> mapSupplier                  () -> Map<K,U>

toSet, toUnmodifiableSet (p. 980)
    Returns:    Collector<T,?,Set<T>>
    Parameters: (none)

Table 16.8 shows a comparison of methods in the stream interfaces that perform
reduction operations and static factory methods in the Collectors class that imple-
ment collectors with equivalent functionality.

Table 16.8 Method Comparison: The Stream Interfaces and the Collectors Class

Method names in the stream interfaces   Static factory method names in the Collectors class
collect (p. 964)                        collectingAndThen (p. 997)
count (p. 953)                          counting (p. 998)
filter (p. 912)                         filtering (p. 992)
flatMap (p. 924)                        flatMapping (p. 994)
map (p. 921)                            mapping (p. 993)
max (p. 954)                            maxBy (p. 999)
min (p. 954)                            minBy (p. 999)
reduce (p. 955)                         reducing (p. 1002)
toList (p. 972)                         toList (p. 980)
average (p. 972)                        averagingInt, averagingLong, averagingDouble (p. 1001)
sum (p. 972)                            summingInt, summingLong, summingDouble (p. 978)
summaryStatistics (p. 972)              summarizingInt, summarizingLong, summarizingDouble (p. 1001)

16.9 Parallel Streams


The Stream API makes it possible to execute a sequential stream in parallel without
rewriting the code. The primary reason for using parallel streams is to improve performance, while ensuring that the results obtained are the same, or at least compatible, regardless of the mode of execution. Although the API goes
a long way to achieve its aim, it is important to understand the pitfalls to avoid
when executing stream pipelines in parallel.

Building Parallel Streams


The execution mode of an existing stream can be set to parallel by calling the
parallel() method on the stream (p. 933). The parallelStream() method of the
Collection interface can be used to create a parallel stream with a collection as
the data source (p. 897). No other code is necessary for parallel execution, as the data
partitioning and thread management for a parallel stream are handled by the API
and the JVM. As with any stream, the stream is not executed until a terminal opera-
tion is invoked on it.
The isParallel() method of the stream interfaces can be used to determine
whether the execution mode of a stream is parallel (p. 933).
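A minimal sketch of the two ways to obtain a parallel stream, and of querying the execution mode (the list of words is illustrative):

List<String> words = List.of("tic", "tac", "toe");

Stream<String> ps1 = words.parallelStream();        // Parallel stream from a collection
Stream<String> ps2 = words.stream().parallel();     // Sequential stream switched to parallel

System.out.println(ps1.isParallel());               // true
System.out.println(ps2.isParallel());               // true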

Parallel Stream Execution


The Stream API allows a stream to be executed either sequentially or in parallel—
meaning that all stream operations can execute either sequentially or in parallel. A
sequential stream is executed in a single thread running on one CPU core. The ele-
ments in the stream are processed sequentially in a single pass by the stream oper-
ations that are executed in the same thread (p. 891).
A parallel stream is executed by different threads, running on multiple CPU cores
in a computer. The stream elements are split into substreams that are processed by
multiple instances of the stream pipeline being executed in multiple threads. The
partial results from processing of each substream are merged (or combined) into a
final result (p. 891).
Parallel streams utilize the Fork/Join Framework (§23.3, p. 1447) under the hood
for executing parallel tasks. This framework provides support for the thread man-
agement necessary to execute the substreams in parallel. The number of threads
employed during parallel stream execution is dependent on the CPU cores in the
computer.
Figure 16.12, p. 963, illustrates parallel functional reduction using the three-argument
reduce(identity, accumulator, combiner) terminal operation (p. 962).

Figure 16.14, p. 967, illustrates parallel mutable reduction using the three-argument
collect(supplier, accumulator, combiner) terminal operation (p. 966).
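A minimal sketch of a parallel functional reduction along these lines: each substream is folded with the accumulator, and the partial results are merged with the combiner.

int totalChars = List.of("tic", "tac", "toe").parallelStream()
    .reduce(0,                                      // Identity value
            (sum, word) -> sum + word.length(),     // Accumulator: partial result + element
            Integer::sum);                          // Combiner: merges partial results
System.out.println(totalChars);                     // 9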

Factors Affecting Performance


There are no guarantees that executing a stream in parallel will improve the per-
formance. In this subsection we look at some factors that can affect performance.

Benchmarking
In general, increasing the number of CPU cores and thereby the number of threads
that can execute in parallel only scales performance up to a threshold for a given
size of data, as some threads might become idle if there is no data left for them to
process. The number of CPU cores boosts performance to a certain extent, but it is
not the only factor that should be considered when deciding to execute a stream in
parallel.
Inherent in the total cost of parallel processing is the start-up cost of setting up the
parallel execution. At the onset, if this cost is already comparable to the cost of
sequential execution, not much can be gained by resorting to parallel execution.
A combination of the following three factors can be crucial in deciding whether a
stream should be executed in parallel:
• Sufficiently large data size
The size of the stream must be large enough to warrant parallel processing;
otherwise, sequential processing is preferable. The start-up cost can be too pro-
hibitive for parallel execution if the stream size is too small.
• Computation-intensive stream operations
If the stream operations are small computations, then the stream size should be proportionately large to warrant parallel execution. If the stream operations are computation-intensive, the stream size is less significant, and parallel execution can boost performance.
• Easily splittable stream
If the cost of splitting the stream into substreams is higher than processing the substreams, employing parallel execution can be futile. Data sources like ArrayLists, HashMaps, and simple arrays are efficiently splittable, whereas LinkedLists and IO-based data sources are less efficient in this regard.

Benchmarking—that is, measuring performance—is strongly recommended to decide whether parallel execution will be beneficial. Example 16.14 illustrates a simple scheme where reading the system clock before and after a stream is executed can be used to get a sense of how well a stream performs.
The class StreamBenchmarks in Example 16.14 defines five methods, at (1) through
(5), that compute the sum of values from 1 to n. These methods compute the sum
in various ways. Each method is executed with four different values of n; that is,
the stream size is the number of values for summation. The program prints the
benchmarks for each method for the different values of n, which of course can vary,
as many factors can influence the results—the most significant one being the num-
ber of CPU cores on the computer.

• The methods seqSumRangeClosed() at (1) and paraSumRangeClosed() at (2) perform the computation on a sequential and a parallel stream, respectively, that are created with the rangeClosed() method.
return LongStream.rangeClosed(1L, n).sum(); // Sequential stream
...
return LongStream.rangeClosed(1L, n).parallel().sum(); // Parallel stream
Benchmarks from Example 16.14:
Size Sequential Parallel
1000 0.05681 0.11031
10000 0.06698 0.13979
100000 0.71274 0.52627
1000000 7.02237 4.37249
The terminal operation sum() is not computation-intensive. The parallel stream
starts to show better performance when the number of values approaches
100000. The stream size is then significantly large for the parallel stream to
show better performance. Note that the range of values defined by the argu-
ments of the rangeClosed() method can be efficiently split into substreams, as its
start and end values are provided.
• The methods seqSumIterate() at (3) and paraSumIterate() at (4) perform the computation on a sequential and a parallel stream, respectively, that are created with the iterate() method.
return LongStream.iterate(1L, i -> i + 1).limit(n).sum(); // Sequential
...
return LongStream.iterate(1L, i -> i + 1).limit(n).parallel().sum(); // Parallel
Benchmarks from Example 16.14:
Size Sequential Parallel
1000 0.08645 0.34696
10000 0.35687 1.27861
100000 3.24083 11.38709
1000000 29.92285 117.87909
The method iterate() creates an infinite stream, and the limit() intermediate
operation truncates the stream according to the value of n. The performance of
both streams degrades fast when the number of values increases. However, the
parallel stream performs worse than the sequential stream in all cases. The val-
ues generated by the iterate() method are not known before the stream is exe-
cuted, and the limit() operation is also stateful, making the process of splitting
the values into substreams inefficient in the case of the parallel stream.
• The method iterSumLoop() at (5) uses a for(;;) loop to compute the sum.
Benchmarks from Example 16.14:
Size Iterative
1000 0.00586
10000 0.02330
100000 0.22352
1000000 2.49677
Using a for(;;) loop to calculate the sum performs best for all values of n com-
pared to the streams, showing that significant overhead is involved in using
streams for summing a sequence of numerical values.

In Example 16.14, the methods measurePerf() at (6) and xqtFunctions() at (13) create
the benchmarks for functions passed as parameters. In the measurePerf() method,
the system clock is read at (8) and the function parameter func is applied at (9). The
system clock is read again at (10) after the function application at (9) has com-
pleted. The execution time calculated at (10) reflects the time for executing the
function. Applying the function func evaluates the lambda expression or the method
reference implementing the LongFunction interface. In Example 16.14, the function
parameter func is implemented by method references that call methods, at (1) through
(5), in the StreamBenchmarks class whose execution time we want to measure.
public static <R> double measurePerf(LongFunction<R> func, long n) { // (6)
// ...
double start = System.nanoTime(); // (8)
result = func.apply(n); // (9)
double duration = (System.nanoTime() - start)/1_000_000; // (10) ms.
// ...
}

Example 16.14 Benchmarking

import java.util.function.LongFunction;
import java.util.stream.LongStream;
/*
* Benchmark the execution time to sum numbers from 1 to n values
* using streams.
*/
public final class StreamBenchmarks {

public static long seqSumRangeClosed(long n) { // (1)


return LongStream.rangeClosed(1L, n).sum();
}

public static long paraSumRangeClosed(long n) { // (2)


return LongStream.rangeClosed(1L, n).parallel().sum();
}

public static long seqSumIterate(long n) { // (3)


return LongStream.iterate(1L, i -> i + 1).limit(n).sum();
}

public static long paraSumIterate(long n) { // (4)


return LongStream.iterate(1L, i -> i + 1).limit(n).parallel().sum();
}

public static long iterSumLoop(long n) { // (5)


long result = 0;
for (long i = 1L; i <= n; i++) {
result += i;
}
return result;
}

/*
* Applies the function parameter func, passing n as parameter.
* Returns the average time (ms.) to execute the function 100 times.
*/
public static <R> double measurePerf(LongFunction<R> func, long n) { // (6)
int numOfExecutions = 100;
double totTime = 0.0;
R result = null;
for (int i = 0; i < numOfExecutions; i++) { // (7)
double start = System.nanoTime(); // (8)
result = func.apply(n); // (9)
double duration = (System.nanoTime() - start)/1_000_000; // (10)
totTime += duration; // (11)
}
double avgTime = totTime/numOfExecutions; // (12)
return avgTime;
}

/*
* Executes the functions in the varargs parameter funcs
* for different stream sizes.
*/
public static <R> void xqtFunctions(LongFunction<R>... funcs) { // (13)
long[] sizes = {1_000L, 10_000L, 100_000L, 1_000_000L}; // (14)

// For each stream size ...


for (int i = 0; i < sizes.length; ++i) { // (15)
System.out.printf("%7d", sizes[i]);
// ... execute the functions passed in the varargs parameter funcs.
for (int j = 0; j < funcs.length; ++j) { // (16)
System.out.printf("%10.5f", measurePerf(funcs[j], sizes[i]));
}
System.out.println();
}
}

public static void main(String[] args) { // (17)

System.out.println("Streams created with the rangeClosed() method:");// (18)


System.out.println(" Size Sequential Parallel");
xqtFunctions(StreamBenchmarks::seqSumRangeClosed,
StreamBenchmarks::paraSumRangeClosed);

System.out.println("Streams created with the iterate() method:"); // (19)


System.out.println(" Size Sequential Parallel");
xqtFunctions(StreamBenchmarks::seqSumIterate,
StreamBenchmarks::paraSumIterate);

System.out.println("Iterative solution with an explicit loop:"); // (20)


System.out.println(" Size Iterative");
xqtFunctions(StreamBenchmarks::iterSumLoop);
}
}

Possible output from the program:


Streams created with the rangeClosed() method:
Size Sequential Parallel
1000 0.05681 0.11031
10000 0.06698 0.13979
100000 0.71274 0.52627
1000000 7.02237 4.37249
Streams created with the iterate() method:
Size Sequential Parallel
1000 0.08645 0.34696
10000 0.35687 1.27861
100000 3.24083 11.38709
1000000 29.92285 117.87909
Iterative solution with an explicit loop:
Size Iterative
1000 0.00586
10000 0.02330
100000 0.22352
1000000 2.49677

Side Effects
Efficient execution of parallel streams that produces the desired results requires the
stream operations (and their behavioral parameters) to avoid certain side effects.
• Non-interfering behaviors
The behavioral parameters of stream operations should be non-interfering
(p. 909)—both for sequential and parallel streams. Unless the stream data
source is concurrent, the stream operations should not modify it during the
execution of the stream. See building streams from collections (p. 897).
• Stateless behaviors
The behavioral parameters of stream operations should be stateless (p. 909)—
both for sequential and parallel streams. A behavioral parameter implemented
as a lambda expression should not depend on any state that might change dur-
ing the execution of the stream pipeline. The results from a stateful behavioral
parameter can be nondeterministic or even incorrect. For a stateless behavioral
parameter, the results are always the same.
Sharing state that is accessed by the behavioral parameters of stream operations in a pipeline is not a good idea. Executing the pipeline in parallel can lead to race conditions when accessing the shared state, and using synchronization code to provide thread safety may defeat the purpose of parallelization. Using the three-argument reduce() or collect() method can be a better solution for encapsulating shared state, as illustrated in the sketch following this list.
The intermediate operations distinct(), skip(), limit(), and sorted() are stateful (p. 915, p. 915, p. 917, p. 929). See also Table 16.3, p. 938. They can carry extra performance overhead when executed in a parallel stream, as such an operation can entail multiple passes over the data and may require significant data buffering.
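A minimal sketch of the shared-state pitfall mentioned above, and of the remedy (the values are illustrative): accumulating into a shared ArrayList from forEach() is subject to race conditions when the stream runs in parallel, whereas the three-argument collect() lets each thread fill its own container before the containers are merged.

// Don't: side effect on shared, non-thread-safe state.
List<Integer> wrong = new ArrayList<>();
IntStream.rangeClosed(1, 1000).parallel()
    .forEach(wrong::add);                // Race conditions: size and contents are unreliable

// Do: mutable reduction with thread-confined containers.
List<Integer> right = IntStream.rangeClosed(1, 1000).parallel().boxed()
    .collect(ArrayList::new, ArrayList::add, ArrayList::addAll); // Supplier, accumulator, combiner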

Ordering
An ordered stream (p. 891) processed by operations that preserve the encounter
order will produce the same results, regardless of whether it is executed sequen-
tially or in parallel. However, repeated execution of an unordered stream—
sequential or parallel—can produce different results.
Preserving the encounter order of elements in an ordered parallel stream can incur
a performance penalty. The performance of an ordered parallel stream can be
improved if the ordering constraint is removed by calling the unordered() interme-
diate operation on the stream (p. 932).
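A minimal sketch: relaxing the ordering constraint with unordered() before a stateful operation such as limit() signals that any n elements will do, which can help a parallel execution.

List<Integer> anyThree = IntStream.rangeClosed(1, 1_000).boxed()
    .parallel()
    .unordered()          // The encounter order need not be preserved
    .limit(3)             // May keep any three elements, not necessarily the first three
    .toList();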
The three stateful intermediate operations distinct(), skip(), and limit() can
improve performance in a parallel stream that is unordered, as compared to one
that is ordered (p. 915, p. 915, p. 917). The distinct() operation need only buffer
any occurrence of a duplicate value in the case of an unordered parallel stream,
rather than the first occurrence. The skip() operation can skip any n elements in the
case of an unordered parallel stream, not necessarily the first n elements. The
limit() operation can truncate the stream after any n elements in the case of an
unordered parallel stream, and not necessarily after the first n elements.
The terminal operation findAny() is intentionally nondeterministic, and can return any element in the stream (p. 952). It is especially suited for parallel streams.
The forEach() terminal operation ignores the encounter order, but the forEachOrdered() terminal operation preserves the order (p. 948). The sorted() stateful intermediate operation, on the other hand, enforces a specific encounter order, regardless of whether it is executed in a parallel pipeline (p. 929).
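A minimal sketch of the forEach()/forEachOrdered() difference on a parallel stream:

Stream.of("A", "B", "C", "D").parallel()
    .forEach(System.out::print);          // Some permutation of ABCD, e.g. CDAB
System.out.println();
Stream.of("A", "B", "C", "D").parallel()
    .forEachOrdered(System.out::print);   // Always ABCD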

Autoboxing and Unboxing of Numeric Values


As the Stream API allows both object and numeric streams, and provides support
for conversion between them (p. 934), choosing a numeric stream when possible
can offset the overhead of autoboxing and unboxing in object streams.
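A minimal sketch: the first pipeline below boxes every value into an Integer and unboxes it again during the reduction, whereas the second works on primitive int values throughout.

// Object stream: Integer elements are unboxed and the partial sums reboxed repeatedly.
int sum1 = Stream.of(1, 2, 3, 4, 5)             // Stream<Integer>
    .reduce(0, Integer::sum);                   // 15

// Numeric stream: no boxing during the reduction.
int sum2 = IntStream.rangeClosed(1, 5).sum();   // 15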
As we have seen, in order to take full advantage of parallel execution, composition
of a stream pipeline must follow certain rules to facilitate parallelization. In sum-
mary, the benefits of using parallel streams are best achieved when:
• The stream data source is of a sufficient size and the stream is easily splittable
into substreams.
• The stream operations have no adverse side effects and are computation-
intensive enough to warrant parallelization.

Review Questions

16.1 Given the following code:


import java.util.*;

public class RQ1 {


public static void main(String[] args) {
List<String> values = Arrays.asList("X", "XXX", "XX", "XXXX");
int value = values.stream()
.mapToInt(v -> v.length())
.filter(v -> v != 4)
.reduce(1, (x, y) -> x * y);
System.out.println(value);
}
}

What is the result?


Select the one correct answer.
(a) 4
(b) 6
(c) 24
(d) The program will throw an exception at runtime.

16.2 Which statement is true about the Stream methods?


(a) The filter() method discards elements from the stream that match the given
Predicate.
(b) The findFirst() method always returns the first element in the stream.
(c) The reduce() method removes elements from the stream that match the given
Predicate.
(d) The sorted() method sorts the elements in a stream according to their natural
order, or according to a given Comparator.

16.3 Given the following code:


import java.util.stream.*;

public class RQ3 {


public static void main(String[] args) {
IntStream values = IntStream.range(0, 5);
// (1) INSERT CODE HERE
System.out.println(sum);
}
}

Which of the following statements when inserted independently at (1) will result
in a compile-time error?
Select the two correct answers.
(a) int sum = values.reduce(0, (x, y) -> x + y);
(b) int sum = values.parallel().reduce(0, (x, y) -> x + y);

(c) int sum = values.reduce((x, y) -> x + y).orElse(0);


(d) int sum = values.reduce(0, (x, y) -> x + y).orElse(0);
(e) int sum = values.parallel().reduce((x, y) -> x + sum).orElse(0);
(f) int sum = values.sum();

16.4 Given the following code:


import java.util.stream.*;

public class RQ4 {


public static void main(String[] args) {
IntStream values = IntStream.range(0, 5);
// (1) INSERT CODE HERE
System.out.println(value);
}
}

Which of the following statements, when inserted independently at (1), will result
in the value 4 being printed?
Select the two correct answers.
(a) int value = values.reduce(0, (x, y) -> x + 1);
(b) int value = values.reduce((x, y) -> x + 1).orElse(0);
(c) int value = values.reduce(0, (x, y) -> y + 1);
(d) int value = values.reduce(0, (x, y) -> y);
(e) int value = values.reduce(1, (x, y) -> y + 1);
(f) long value = values.count();

16.5 Given the following code:


import java.util.*;
import java.util.stream.*;

public class RQ5 {


public static void main(String[] args) {
List<String> values = List.of("AA", "BBB", "C", "DD", "EEE");
Map<Integer, List<String>> map = null;
// (1) INSERT CODE HERE
map.forEach((i, s) -> System.out.println(i + " " + s));
}
}

Which statement when inserted independently at (1) will result in the output
1 [C]?
Select the one correct answer.
(a) map = values.stream()
.collect(Collectors.groupingBy(s -> s.length(),
Collectors.filtering(s -> !s.contains("C"),
Collectors.toList())));
(b) map = values.stream()
.collect(Collectors.groupingBy(s -> s.length(),
Collectors.filtering(s -> s.contains("C"),
Collectors.toList())));

(c) map = values.stream()
.filter(s -> !s.contains("C"))
.collect(Collectors.groupingBy(s -> s.length(),
Collectors.toList()));
(d) map = values.stream()
.filter(s -> s.contains("C"))
.collect(Collectors.groupingBy(s -> s.length(),
Collectors.toList()));

16.6 Given the following code:


import java.util.stream.*;

public class RQ7 {


public static void main(String[] args) {
Stream<String> values = Stream.generate(() -> "A");
boolean value = values.peek(v -> System.out.print("B"))
.takeWhile(v -> !v.equals("A"))
.peek(v -> System.out.print("C"))
.anyMatch(v -> v.equals("A"));
System.out.println(value);
}
}

What is the result?


Select the one correct answer.
(a) Btrue
(b) Ctrue
(c) BCtrue
(d) Bfalse
(e) Cfalse
(f) BCfalse

16.7 Given the following code:


import java.util.stream.*;

public class RQ9 {


public static void main(String[] args) {
IntStream.range('a', 'e')
.mapToObj(i -> String.valueOf((char) i).toUpperCase())
.filter(s -> "AEIOU".contains(s))
.forEach(s -> System.out.print(s));
}
}

What is the result?


Select the one correct answer.
(a) A
(b) AE
(c) BCD
(d) The program will fail to compile.

16.8 Given the following code:


import java.util.stream.*;

public class RQ10 {


public static void main(String[] args) {
IntStream.range(0, 5)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));
}
}

Which of the following statements will produce the same result as the program?
Select the two correct answers.
(a) IntStream.rangeClosed(0, 5)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));
(b) IntStream.range(0, 10)
.takeWhile(i -> i < 5)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));
(c) IntStream.range(0, 10)
.limit(5)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));
(d) IntStream.generate(() -> {int x = 0; return x++;})
.takeWhile(i -> i < 4)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));
(e) var x = 0;
IntStream.generate(() -> return x++)
.limit(5)
.filter(i -> i % 2 != 0)
.forEach(i -> System.out.println(i));

16.9 Given the following code:


import java.util.function.*;
import java.util.stream.*;

public class RQ11 {


public static void main(String[] args) {
Stream<String> abc = Stream.of("A", "B", "C");
Stream<String> xyz = Stream.of("X", "Y", "Z");
String value = Stream.concat(xyz, abc).reduce((a, b) -> b + a).get();
System.out.println(value);
}
}

What is the result?


Select the one correct answer.
(a) ABCXYZ
(b) XYZABC

(c) ZYXCBA
(d) CBAZYX

16.10 Which statement produces a different result from the other statements?
Select the one correct answer.
(a) Stream.of("A", "B", "C", "D", "E")
.filter(s -> s.compareTo("B") < 0)
.collect(Collectors.groupingBy(s -> "AEIOU".contains(s)))
.forEach((x, y) -> System.out.println(x + " " + y));
(b) Stream.of("A", "B", "C", "D", "E")
.filter(s -> s.compareTo("B") < 0)
.collect(Collectors.partitioningBy(s -> "AEIOU".contains(s)))
.forEach((x, y) -> System.out.println(x + " " + y));
(c) Stream.of("A", "B", "C", "D", "E")
.collect(Collectors.groupingBy(s -> "AEIOU".contains(s),
Collectors.filtering(s -> s.compareTo("B") < 0,
Collectors.toList())))
.forEach((x, y) -> System.out.println(x + " " + y));
(d) Stream.of("A", "B", "C", "D", "E")
.collect(Collectors.partitioningBy(s -> "AEIOU".contains(s),
Collectors.filtering(s -> s.compareTo("B") < 0,
Collectors.toList())))
.forEach((x, y) -> System.out.println(x + " " + y));

16.11 Given the following code:


import java.util.stream.*;

public class RQ13 {


public static void main(String[] args) {
Stream<String> strings = Stream.of("i", "am", "ok").parallel();
IntStream chars = strings.flatMapToInt(line -> line.chars()).sorted();
chars.forEach(c -> System.out.print((char)c));
}
}

What is the result?


Select the one correct answer.
(a) iamok
(b) aikmo
(c) amiok
(d) The result from running the program is unpredictable.
(e) The program will throw an exception at runtime.

16.12 Which of the following statements are true about the Stream methods?
Select the two correct answers.
(a) The filter() method accepts a Function.
(b) The peek() method accepts a Function.
(c) The peek() method accepts a Consumer.

(d) The forEach() method accepts a Consumer.


(e) The map() method accepts a Predicate.
(f) The max() method accepts a Predicate.
(g) The findAny() method accepts a Predicate.

16.13 Which Stream methods are terminal operations?


Select the two correct answers.
(a) peek()
(b) forEach()
(c) map()
(d) filter()
(e) sorted()
(f) min()

16.14 Which Stream methods have short-circuit execution?


Select the two correct answers.
(a) collect()
(b) limit()
(c) flatMap()
(d) anyMatch()
(e) reduce()
(f) sum()

16.15 Given the following code:


import java.util.stream.*;

public class RQ17 {


public static void main(String[] args) {
Stream<String> values = Stream.of("is", "this", "", null, "ok", "?");
// (1) INSERT CODE HERE
System.out.println(c);
}
}

Which statement inserted independently at (1) produces the output 6?


Select the one correct answer.
(a) long c = values.count();
(b) long c = values.collect(Collectors.counting());
(c) int c = values.mapToInt(v -> 1).reduce(0, (x, y) -> x + 1);
(d) long c = values.collect(Collectors.reducing(0L, v -> 1L, Long::sum));
(e) int c = values.mapToInt(v -> 1).sum();
(f) Insert any of the above.

16.16 Which code produces identical results?


Select the two correct answers.
(a) Set<String> set1 = Stream.of("XX", "XXXX", "", null, "XX", "X")
.filter(v -> v != null)
.collect(Collectors.toSet());

set1.stream()
.mapToInt(v -> v.length())
.sorted()
.forEach(v -> System.out.print(v));
(b) Set<Integer> set2 = Stream.of("XX", "XXXX", "", null, "XX", "X")
.map(v -> (v == null) ? 0 : v.length())
.filter(v -> v != 0)
.collect(Collectors.toSet());
set2.stream()
.sorted()
.forEach(v -> System.out.print(v));
(c) List<Integer> list1 = Stream.of("XX", "XXXX", "", null, "XX", "X")
.map(v -> (v == null) ? 0 : v.length())
.filter(v -> v != 0)
.toList();
list1.stream()
.sorted()
.forEach(v -> System.out.print(v));
(d) List<Integer> list2 = Stream.of("XX", "XXXX", "", null, "XX", "X")
.map(v -> (v == null) ? 0 : v.length())
.filter(v -> v != 0)
.distinct()
.toList();
list2.stream()
.sorted()
.forEach(v -> System.out.print(v));
