Last Updated on July 22, 2024

In Java Streams, Collectors are a crucial component that helps you transform a stream of elements into a summarized result.

Collectors are passed to the terminal collect() operation at the end of a stream pipeline. They take the processed elements from the stream and combine them into a desired output format.

They can transform a stream into various data structures like Lists, Sets, Maps, or even custom collections. Additionally, they can perform aggregations like calculating sums, averages, or finding minimum/maximum values.
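
As a quick illustration of the aggregations mentioned above, here is a minimal sketch using three built-in collectors from java.util.stream.Collectors. The class name AggregationExamples and its helper methods are made up for this example.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class AggregationExamples {

    // Adds up all elements with the built-in summingInt collector.
    static int sum(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    // Computes the arithmetic mean with the built-in averagingInt collector.
    static double average(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    // Finds the largest element; the Optional is empty for an empty list.
    static Optional<Integer> max(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.maxBy(Comparator.naturalOrder()));
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(sum(numbers));      // 21
        System.out.println(average(numbers));  // 3.5
        System.out.println(max(numbers));      // Optional[6]
    }
}
```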

Collectors are part of the Java Streams API, so let's first take a look at what streams are in Java.

Java Streams

Java Streams are a powerful feature introduced in Java 8 that allows you to process collections of data in a concise and declarative way.

Streams operate on existing data sources such as arrays, Lists, Sets, or even custom data structures, without modifying the source.

Stream operations differ from traditional collection processing in three important ways: parallel processing, a declarative approach, and lazy evaluation.

Let's see what each concept means in the context of streams.

Parallel Processing

Streams can be processed in parallel utilizing multiple cores of the system.

They leverage the Fork/Join Framework introduced in Java 7 for parallel processing. This framework divides the work of processing a stream into smaller tasks and distributes them across available cores on your CPU.
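
A minimal sketch of the idea: the same pipeline is run sequentially and in parallel, and only the call to parallel() differs. The class and method names are hypothetical, and whether the parallel version is actually faster depends on the data size and the machine.

```java
import java.util.stream.LongStream;

public class ParallelSum {

    // Sums the squares of 1..n on the calling thread.
    static long sequentialSumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .map(x -> x * x)
                .sum();
    }

    // Same pipeline, but parallel() lets the runtime split the range
    // into chunks and process them on multiple worker threads.
    static long parallelSumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sequentialSumOfSquares(1_000)); // 333833500
        System.out.println(parallelSumOfSquares(1_000));   // 333833500
    }
}
```

Both versions return the same value because summing does not depend on the order in which the chunks are processed.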

Declarative Approach

Traditional programming often uses an imperative approach, where you write code with detailed instructions for the computer to follow.

The declarative style refers to how you express what you want to achieve with your data, rather than explicitly defining every step involved in processing it.

You focus on what you want the final outcome to be, and the stream handles the underlying processing details.
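
To make the contrast concrete, the sketch below computes the squares of the even numbers in a list twice: once with an explicit loop and once with a stream pipeline. The class and method names are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class DeclarativeVsImperative {

    // Imperative: spell out the iteration, the condition and the result list.
    static List<Integer> evenSquaresImperative(List<Integer> numbers) {
        List<Integer> result = new ArrayList<>();
        for (Integer n : numbers) {
            if (n % 2 == 0) {
                result.add(n * n);
            }
        }
        return result;
    }

    // Declarative: describe the steps (keep evens, square them, collect).
    static List<Integer> evenSquaresDeclarative(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(evenSquaresImperative(numbers));  // [4, 16, 36]
        System.out.println(evenSquaresDeclarative(numbers)); // [4, 16, 36]
    }
}
```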

Lazy Evaluation

With traditional loops and eager collection operations, each processing step often runs over the entire data collection, even if you only need a specific subset of elements or a single value.

Intermediate stream operations such as filter() and map() are lazy: they only describe the pipeline, and the actual computations on the data elements are deferred until a terminal operation is called.

This can improve efficiency for large datasets.
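
The following sketch makes the laziness visible by recording which elements a filter actually inspects. The class and method names are hypothetical, and the side effect inside the lambda is there only for demonstration; it is not a pattern to copy into production code.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyEvaluation {

    // findFirst() is a short-circuiting terminal operation, so the
    // filter stops being called once the first match has been found.
    static List<Integer> inspectedByFindFirst(List<Integer> numbers) {
        List<Integer> inspected = new ArrayList<>();
        numbers.stream()
                .filter(n -> {
                    inspected.add(n); // record the element (demo only)
                    return n > 2;
                })
                .findFirst();
        return inspected;
    }

    // Without a terminal operation the filter never runs at all.
    static List<Integer> inspectedWithoutTerminalOperation(List<Integer> numbers) {
        List<Integer> inspected = new ArrayList<>();
        numbers.stream()
                .filter(n -> {
                    inspected.add(n);
                    return n > 2;
                });
        return inspected;
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(inspectedByFindFirst(numbers));              // [1, 2, 3]
        System.out.println(inspectedWithoutTerminalOperation(numbers)); // []
    }
}
```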

Java Stream Collectors

Stream Collectors are a powerful feature in the Java 8 Stream API that allow you to collect and process data efficiently. 

collect() method

The collect() method is a terminal operation that accumulates the elements of a stream into a result container. In its most common form it accepts a Collector that describes how the elements should be collected and aggregated.

There are two variants of the Java Stream collect() method:

<R> R collect(Supplier<R> supplier, BiConsumer<R, ? super T> accumulator, BiConsumer<R, R> combiner)

Performs a mutable reduction operation on the elements of this stream. 

A mutable reduction is one in which the reduced value is a mutable result container, such as an ArrayList, and elements are incorporated by updating the state of the result rather than by replacing the result.
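
A short sketch of this three-argument form, modeled on the examples in the Stream Javadoc. The wrapper class and method names are made up; the calls to collect() themselves use only standard JDK methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class ThreeArgCollect {

    // Collects the strings into an ArrayList.
    static List<String> toArrayList(Stream<String> words) {
        return words.collect(
                ArrayList::new,      // supplier: creates the result container
                ArrayList::add,      // accumulator: adds one element to a container
                ArrayList::addAll);  // combiner: merges two partial containers
    }

    // Concatenates the strings using a StringBuilder as the mutable container.
    static String concat(Stream<String> words) {
        return words.collect(
                        StringBuilder::new,
                        StringBuilder::append,
                        StringBuilder::append)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(toArrayList(Stream.of("a", "b", "c"))); // [a, b, c]
        System.out.println(concat(Stream.of("a", "b", "c")));      // abc
    }
}
```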

<R, A> R collect(Collector<? super T, A, R> collector)

Performs a mutable reduction operation on the elements of this stream using a Collector.

A Collector encapsulates the functions used as arguments to collect(Supplier, BiConsumer, BiConsumer), allowing for reuse of collection strategies and composition of collect operations such as multiple-level grouping or partitioning.
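
As an example of the composition mentioned above, the sketch below nests one groupingBy collector inside another to build a two-level grouping. The class name, method name and sample words are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class MultiLevelGrouping {

    // Groups words by their first letter, then by length within each letter.
    static Map<Character, Map<Integer, List<String>>> byInitialThenLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        word -> word.charAt(0),                  // first-level key
                        Collectors.groupingBy(String::length))); // downstream collector
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "blueberry", "cherry");
        Map<Character, Map<Integer, List<String>>> grouped = byInitialThenLength(words);
        System.out.println(grouped.get('a').get(5)); // [apple]
        System.out.println(grouped.get('b').get(9)); // [blueberry]
    }
}
```

The inner collector is an ordinary Collector value, so it could just as well be stored in a variable and reused in other pipelines.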

Collectors Class

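The Collectors class is a utility class whose static factory methods return ready-made implementations of the Collector interface. In simplified form, that interface looks like this: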

public interface Collector<T, A, R> {
   Supplier<A> supplier();
   BiConsumer<A, T> accumulator();
   Function<A, R> finisher();
   BinaryOperator<A> combiner();
   Set<Characteristics> characteristics();
}

Supplier<A> supplier()

This method defines a function that creates a new container to hold the final result. For example, a List collector might use an empty ArrayList as the supplier.

BiConsumer<A, T> accumulator()

This method defines an accumulator function that combines each element from the stream with the current result in the container. This function iteratively builds the final output.

BinaryOperator<A> combiner()

Every Collector defines a combiner function, which is used when a stream is processed in parallel. This function merges the partial results from different threads into a single final result.

Function<A, R> finisher()

Some collectors might utilize a finisher function to perform any final modifications or transformations on the accumulated result before returning it.
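
To see the four functions side by side, here is a minimal sketch that builds a collector with the static factory Collector.of(). The class CollectorOfExample and the method commaSeparated() are hypothetical names; the built-in Collectors.joining(", ") already covers this use case, so the example is purely illustrative.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

public class CollectorOfExample {

    // T = String (input), A = StringJoiner (container), R = String (result).
    static Collector<String, StringJoiner, String> commaSeparated() {
        return Collector.of(
                () -> new StringJoiner(", "), // supplier: new empty container
                StringJoiner::add,            // accumulator: add one element
                StringJoiner::merge,          // combiner: merge two partial containers
                StringJoiner::toString);      // finisher: container to final result
    }

    public static void main(String[] args) {
        String joined = List.of("a", "b", "c").stream().collect(commaSeparated());
        System.out.println(joined); // a, b, c
    }
}
```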

The Collectors class implements various useful reduction operations, such as accumulating elements into collections and summarizing elements according to various criteria.

It provides many ready-made Collector implementations to help us out.

Some of them are shown in the examples below.

collect() to List

The toList collector can be used for collecting all Stream elements into a List instance. 

package org.codeline;
import java.util.List;
import java.util.stream.Collectors;

public class ToList {
    public static void main(String[] args) {

        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

        List<Integer> tripleNumbers = numbers.stream().map(number -> number * 3).collect(Collectors.toList());
        System.out.println(tripleNumbers);  // [3, 6, 9, 12, 15, 18]
    }
}

collect() to Set

package org.codeline;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ToSet {
    public static void main(String[] args) {

        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

        Set<Integer> tripleNumbers = numbers.stream().map(number -> number * 3).collect(Collectors.toSet());
        System.out.println(tripleNumbers);  // e.g. [18, 3, 6, 9, 12, 15]; a Set from toSet() has no guaranteed order
    }
}

collect() to Map

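Collectors.toMap() accepts two functions: one that maps each element to a key and one that maps it to the corresponding value. With this two-argument form, duplicate keys cause an IllegalStateException.

In this example, Function.identity() is used to create a function that takes a number and returns the same number, so each number serves as its own key.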

package org.codeline;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMap {
    public static void main(String[] args) {

        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

        Map<Integer, String> mapTripleNumbers = numbers.parallelStream().map(number -> number * 3)
                .collect(Collectors.toMap(Function.identity(), x -> String.valueOf(x)));
        System.out.println(mapTripleNumbers); // e.g. {18=18, 3=3, 6=6, 9=9, 12=12, 15=15}; map order is not guaranteed

    }
}
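
When keys can repeat, toMap() also has a three-argument form that takes a merge function for values that share a key. The sketch below counts words per length this way; the class and method names are invented for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ToMapMerge {

    // Counts how many words have each length; Integer::sum merges
    // the values of words that map to the same key.
    static Map<Integer, Integer> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(String::length, word -> 1, Integer::sum));
    }

    // The two-argument toMap() throws IllegalStateException on duplicate keys.
    static boolean failsOnDuplicateKeys(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(String::length, word -> word));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = List.of("one", "two", "three");
        System.out.println(countByLength(words).get(3)); // 2
        System.out.println(failsOnDuplicateKeys(words)); // true
    }
}
```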

collect(Collectors.counting())

Counting is a simple collector that allows for the counting of all Stream elements.

package org.codeline;
import java.util.List;
import java.util.stream.Collectors;

public class Counting {
    public static void main(String[] args) {

        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);

        Long count = numbers.stream().collect(Collectors.counting());
        System.out.println(count); // 6
    }
}

Custom Collector

We’ll create a custom collector that calculates the sum of squares for a list of integers. This custom collector will demonstrate how to create a more specialized reduction operation.

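package org.codeline;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class SumOfSquaresCollector implements Collector<Integer, long[], Long> {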

    @Override
    public Supplier<long[]> supplier() {
        return () -> new long[1];
    }

    @Override
    public BiConsumer<long[], Integer> accumulator() {
        return (sum, num) -> sum[0] += (long) num * num;
    }

    @Override
    public BinaryOperator<long[]> combiner() {
        return (sum1, sum2) -> {
            sum1[0] += sum2[0];
            return sum1;
        };
    }

    @Override
    public Function<long[], Long> finisher() {
        return sum -> sum[0];
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Set.of(Characteristics.UNORDERED);
    }
}

The supplier() method returns a supplier that creates a new accumulator. We use a single-element long array to store the running sum.

The accumulator() method defines how to incorporate a new element into the accumulator. It squares the input number and adds it to the running sum.

The combiner() method combines two accumulators, which is needed for parallel processing. It adds the value from the second accumulator to the first.

The finisher() method converts the accumulator to the final result type. It simply returns the value stored in the accumulator array.

The characteristics() method declares the characteristics of the collector. UNORDERED means the result does not depend on the encounter order of the elements.

The main program creates a stream from the list and applies the collector to calculate the sum of squares.

package org.codeline;
import java.util.List;
public class SquaresCollector {
    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);
        long sumOfSquares = numbers.stream()
                .collect(new SumOfSquaresCollector());
        System.out.println("Sum of squares: " + sumOfSquares); // Sum of squares: 55
    }
}
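
As an alternative to implementing the interface in a separate class, the same logic can be sketched more compactly with Collector.of(). The class SumOfSquaresWithCollectorOf and the factory method sumOfSquares() are hypothetical names; running the collector on a parallel stream also exercises the combiner.

```java
import java.util.List;
import java.util.stream.Collector;

public class SumOfSquaresWithCollectorOf {

    // Same four functions as the class-based collector, passed as lambdas.
    static Collector<Integer, long[], Long> sumOfSquares() {
        return Collector.of(
                () -> new long[1],                         // supplier
                (sum, num) -> sum[0] += (long) num * num,  // accumulator
                (left, right) -> {                         // combiner
                    left[0] += right[0];
                    return left;
                },
                sum -> sum[0],                             // finisher
                Collector.Characteristics.UNORDERED);
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);
        System.out.println(numbers.stream().collect(sumOfSquares()));         // 55
        System.out.println(numbers.parallelStream().collect(sumOfSquares())); // 55
    }
}
```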

Conclusion

Collectors eliminate the need for explicit loops and conditional statements for common aggregation tasks. This leads to cleaner and more readable code.

Java provides a rich set of built-in collectors for operations like counting, summing, averaging, finding minimum or maximum values, grouping, partitioning, and joining elements.
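
Two of the built-in collectors named above, partitioningBy and joining, can be sketched as follows. The class and method names are made up for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class BuiltInCollectors {

    // Splits the numbers into two groups: true for even, false for odd.
    static Map<Boolean, List<Integer>> evenAndOdd(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Joins the numbers into one string with a delimiter, prefix and suffix.
    static String joined(List<Integer> numbers) {
        return numbers.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 4, 5, 6);
        System.out.println(evenAndOdd(numbers).get(true));  // [2, 4, 6]
        System.out.println(evenAndOdd(numbers).get(false)); // [1, 3, 5]
        System.out.println(joined(numbers));                // [1, 2, 3, 4, 5, 6]
    }
}
```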

We can also create custom collectors for specific needs.
