Java 8 – Find Duplicate Elements in a Stream

Introduction

Finding duplicate elements in a collection is a common task in many applications. The Java 8 Stream API offers a concise way to do it: by combining a Set with the filter() operation and a Collector, we can identify the elements of a Stream that occur more than once.

In this guide, we will learn how to find duplicate elements in a stream using Java 8.

Solution Steps

  1. Define the Input Stream: Create a Stream of elements to be processed.
  2. Use a Set to Track Seen Elements: Use a Set to store elements that have been encountered.
  3. Filter Duplicates: Keep only the elements that are already present in the Set; these are the duplicates.
  4. Collect and Display the Duplicates: Collect the duplicates into a List or print them.

Java Program

import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.HashSet;
import java.util.stream.Collectors;

public class FindDuplicatesInStream {
    public static void main(String[] args) {
        // Step 1: Define the input list and create a Stream
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 3, 2, 6, 7, 2, 4);

        // Step 2-3: Use a Set to track seen elements and filter duplicates
        Set<Integer> seen = new HashSet<>();
        List<Integer> duplicates = numbers.stream()
            .filter(n -> !seen.add(n))  // add() returns false if the element was already in the Set
            .collect(Collectors.toList());  // Collect duplicates into a List

        // Step 4: Display the duplicates
        System.out.println("Duplicate elements: " + duplicates);
    }
}

Output

Duplicate elements: [3, 2, 2, 4]
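
Note that 2 is listed twice in this output because it occurs three times in the input, so two of its occurrences count as repeats. If each duplicated value should be reported only once, one option is to collect into a Set instead of a List. The following is a minimal sketch under that assumption; the class name DistinctDuplicates and the helper findDistinctDuplicates are hypothetical names chosen for illustration, and a LinkedHashSet is used only so that the printed order is predictable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DistinctDuplicates {
    // Hypothetical helper: returns each duplicated value once,
    // in the order in which its first repeat is encountered.
    public static <T> Set<T> findDistinctDuplicates(List<T> items) {
        Set<T> seen = new HashSet<>();
        return items.stream()                      // sequential stream; 'seen' is not thread-safe
            .filter(item -> !seen.add(item))       // keep only elements already in 'seen'
            .collect(Collectors.toCollection(LinkedHashSet::new)); // drop repeated duplicates
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 3, 2, 6, 7, 2, 4);
        // prints "Distinct duplicates: [3, 2, 4]"
        System.out.println("Distinct duplicates: " + findDistinctDuplicates(numbers));
    }
}
```

Collecting with Collectors.toSet() would also remove the repeated entries, but it makes no guarantee about iteration order.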

Explanation

Step 1: Define the Input Stream

We begin by defining a list of integers:

List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 3, 2, 6, 7, 2, 4);

This list contains three values that occur more than once: 2 appears three times, and 3 and 4 each appear twice.

Step 2-3: Use a Set to Track Seen Elements and Filter Duplicates

We use a HashSet to keep track of elements that we have already encountered:

Set<Integer> seen = new HashSet<>();

The filter() method checks if adding the current element to the Set is successful. If seen.add(n) returns false, it means the element is already present, and thus it is a duplicate:

.filter(n -> !seen.add(n))

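Every element that passes the filter is collected into a List by Collectors.toList(). Each repeat occurrence is reported, so a value that appears three times, such as 2, shows up twice in the result.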
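
The filter() lambda above changes the seen Set as a side effect, so it depends on the stream being processed sequentially. A possible alternative that avoids this shared state is to count the occurrences of each element with Collectors.groupingBy() and Collectors.counting(), then keep the values whose count is greater than one. The sketch below is only one way to do this; the class name DuplicatesByCount and the helper findDuplicatesByCount are hypothetical, and the LinkedHashMap is an assumption made so that results follow first-occurrence order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DuplicatesByCount {
    // Hypothetical helper: returns each value that occurs more than once,
    // ordered by where the value first appears in the input.
    public static <T> List<T> findDuplicatesByCount(List<T> items) {
        // Count how often each element occurs; LinkedHashMap keeps insertion order.
        Map<T, Long> counts = items.stream()
            .collect(Collectors.groupingBy(
                Function.identity(), LinkedHashMap::new, Collectors.counting()));

        // Keep only the keys whose count is greater than one.
        return counts.entrySet().stream()
            .filter(entry -> entry.getValue() > 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 3, 2, 6, 7, 2, 4);
        // prints "Duplicates by count: [2, 3, 4]"
        System.out.println("Duplicates by count: " + findDuplicatesByCount(numbers));
    }
}
```

Unlike the Set-based filter, this version reports each duplicated value once and also makes the occurrence counts available, at the cost of building a map with an entry for every distinct element.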

Step 4: Display the Duplicates

The resulting list of duplicates is printed to the console:

System.out.println("Duplicate elements: " + duplicates);

Conclusion

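In Java 8, duplicate elements in a Stream can be found with a filter() call and a Set that tracks previously seen elements. Each element costs one HashSet insertion, so the whole input is handled in a single pass. Because the filter relies on a shared, mutable HashSet, the approach is suited to sequential streams; HashSet is not thread-safe, so the same code should not be run on a parallel stream. The resulting list of duplicates can then be passed on to further operations.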
