Introduction
Removing duplicate words from a string is a common text-processing task. This can be useful in various scenarios, such as cleaning up user input, preparing data for analysis, or simply improving the readability of text. In this blog post, we'll explore how to remove duplicate words from a string using traditional methods as well as Java 8 features.
Table of Contents
- Using a Traditional Approach
- Using Java 8 Streams
- Complete Example Program
- Conclusion
1. Using a Traditional Approach
The traditional approach uses a HashSet to keep track of the words seen so far as we iterate through the string. Because a HashSet does not store duplicate values, a word is appended to the result only the first time it is encountered.
Example:
import java.util.HashSet;
import java.util.Set;

public class RemoveDuplicateWordsTraditional {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun and Java is powerful";
        String result = removeDuplicateWords(input);
        System.out.println("Original String: " + input);
        System.out.println("String after removing duplicates: " + result);
    }

    public static String removeDuplicateWords(String input) {
        // Split on runs of whitespace to get the individual words
        String[] words = input.split("\\s+");
        // Tracks the words that have already been appended
        Set<String> wordSet = new HashSet<>();
        StringBuilder result = new StringBuilder();
        for (String word : words) {
            // Append a word only the first time it is seen
            if (!wordSet.contains(word)) {
                wordSet.add(word);
                result.append(word).append(" ");
            }
        }
        // Remove the trailing space left by the last append
        return result.toString().trim();
    }
}
Output:
Original String: Java is great and Java is fun and Java is powerful
String after removing duplicates: Java is great and fun powerful
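The contains-then-add check above can also be folded into a single call, because Set.add returns false when the element is already present. The following is a minimal sketch of that variant, not a replacement for the example above; the class name RemoveDuplicateWordsSetAdd is made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class name, used only for this sketch
public class RemoveDuplicateWordsSetAdd {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun and Java is powerful";
        // prints: Java is great and fun powerful
        System.out.println(removeDuplicateWords(input));
    }

    public static String removeDuplicateWords(String input) {
        Set<String> seen = new HashSet<>();
        StringBuilder result = new StringBuilder();
        for (String word : input.split("\\s+")) {
            // add returns true only the first time a word is inserted
            if (seen.add(word)) {
                // Put a separator before every word except the first
                if (result.length() > 0) {
                    result.append(' ');
                }
                result.append(word);
            }
        }
        return result.toString();
    }
}
```

Adding the separator before each word rather than after it means no final trim() is needed.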
2. Using Java 8 Streams
Java 8 Streams provide a modern and concise way to handle this task. We can stream the words, collect them into a LinkedHashSet (which drops duplicates while keeping insertion order), and then join the result back into a string.
Example:
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class RemoveDuplicateWordsStreams {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun and Java is powerful";
        String result = removeDuplicateWords(input);
        System.out.println("Original String: " + input);
        System.out.println("String after removing duplicates: " + result);
    }

    public static String removeDuplicateWords(String input) {
        // LinkedHashSet drops duplicates and keeps first-appearance order
        Set<String> wordSet = Arrays.stream(input.split("\\s+"))
                .collect(Collectors.toCollection(LinkedHashSet::new));
        // Join the unique words back together with single spaces
        return String.join(" ", wordSet);
    }
}
Output:
Original String: Java is great and Java is fun and Java is powerful
String after removing duplicates: Java is great and fun powerful
Explanation:
- Traditional Approach:
  - Split the input string into words using split("\\s+").
  - Use a HashSet to store unique words.
  - Iterate through the words, adding each unique word to the HashSet and appending it to the result string.
  - Trim the result string to remove any trailing spaces.
- Java 8 Streams:
  - Split the input string into a stream of words using Arrays.stream(input.split("\\s+")).
  - Collect the words into a LinkedHashSet to maintain insertion order while removing duplicates.
  - Join the words back into a single string using String.join(" ", wordSet).
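The stream steps above can also be written without an intermediate set. As a rough sketch, Stream.distinct() drops repeated elements and, for an ordered stream such as one created from an array, keeps the first occurrence of each, while Collectors.joining(" ") builds the final string. The class name RemoveDuplicateWordsDistinct is an assumption made for this illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch
public class RemoveDuplicateWordsDistinct {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun and Java is powerful";
        // prints: Java is great and fun powerful
        System.out.println(removeDuplicateWords(input));
    }

    public static String removeDuplicateWords(String input) {
        return Arrays.stream(input.split("\\s+"))
                // Keep only the first occurrence of each word, in encounter order
                .distinct()
                // Join the remaining words with single spaces
                .collect(Collectors.joining(" "));
    }
}
```

Whether this reads better than collecting into a LinkedHashSet is largely a matter of taste; both keep the words in order of first appearance.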
3. Complete Example Program
Here is a complete program that demonstrates both methods to remove duplicate words from a string.
Example Code:
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class RemoveDuplicateWordsExample {
    public static void main(String[] args) {
        String input = "Java is great and Java is fun and Java is powerful";

        // Using Traditional Approach
        String resultTraditional = removeDuplicateWordsTraditional(input);
        System.out.println("Using Traditional Approach:");
        System.out.println("Original String: " + input);
        System.out.println("String after removing duplicates: " + resultTraditional);

        // Using Java 8 Streams
        String resultStreams = removeDuplicateWordsStreams(input);
        System.out.println("\nUsing Java 8 Streams:");
        System.out.println("Original String: " + input);
        System.out.println("String after removing duplicates: " + resultStreams);
    }

    public static String removeDuplicateWordsTraditional(String input) {
        String[] words = input.split("\\s+");
        Set<String> wordSet = new HashSet<>();
        StringBuilder result = new StringBuilder();
        for (String word : words) {
            if (!wordSet.contains(word)) {
                wordSet.add(word);
                result.append(word).append(" ");
            }
        }
        return result.toString().trim();
    }

    public static String removeDuplicateWordsStreams(String input) {
        Set<String> wordSet = Arrays.stream(input.split("\\s+"))
                .collect(Collectors.toCollection(LinkedHashSet::new));
        return String.join(" ", wordSet);
    }
}
Output:
Using Traditional Approach:
Original String: Java is great and Java is fun and Java is powerful
String after removing duplicates: Java is great and fun powerful

Using Java 8 Streams:
Original String: Java is great and Java is fun and Java is powerful
String after removing duplicates: Java is great and fun powerful
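One case the sample input does not exercise is leading whitespace. With an input such as " Java is Java", split("\\s+") yields an empty first token; the traditional version hides it through its final trim(), but the LinkedHashSet version joins it back in as a leading space. A minimal sketch of one way to guard against that is to trim before splitting. The class name and the choice to return an empty string for null input are assumptions made for this example, not part of the program above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch
public class RemoveDuplicateWordsEdgeCases {
    public static void main(String[] args) {
        // prints: [Java is great]
        System.out.println("[" + removeDuplicateWords("  Java is  Java great ") + "]");
    }

    public static String removeDuplicateWords(String input) {
        // Assumption: treat null like an empty string instead of throwing
        if (input == null) {
            return "";
        }
        // Trimming first avoids an empty leading token from split("\\s+")
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        Set<String> wordSet = Arrays.stream(trimmed.split("\\s+"))
                .collect(Collectors.toCollection(LinkedHashSet::new));
        return String.join(" ", wordSet);
    }
}
```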
4. Conclusion
Removing duplicate words from a string can be efficiently achieved using traditional approaches and Java 8 Streams. The traditional approach is straightforward and easy to understand, while Java 8 Streams provides a more modern and concise way to handle the task. Both methods ensure that the resulting string contains only unique words, maintaining the order of their first appearance in the input string.
By understanding these different methods, you can choose the one that best fits your needs and coding style. Happy coding!