About Ryan Wang

Backreferences in Java Regular Expressions

Backreferences are another useful feature of Java regular expressions. To understand backreferences, we first need to understand groups. A group in a regular expression treats multiple characters as a single unit. Groups are created by placing the characters to be grouped inside a set of parentheses, "()". Each set of parentheses corresponds to one group, and groups are numbered from 1 by the position of their opening parenthesis, counting from left to right.

Backreferences are convenient because they let us match the same text again without rewriting the pattern that captured it. We refer to a previously defined group with \# (where # is the group number). In a Java string literal the backslash itself must be escaped, so \1 is written as "\\1". This will make more sense after you read the following two examples.
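As a quick illustration of group numbering, here is a minimal sketch. The class name BackrefBasics and the helper isMirrored are made up for this illustration; only String.matches comes from the standard library.

```java
public class BackrefBasics {
    // Hypothetical helper: true when the input is exactly four word
    // characters arranged like "abba".
    public static boolean isMirrored(String s) {
        // Group 1 captures the first character and group 2 the second;
        // \2 followed by \1 then requires those same two characters in reverse order.
        return s.matches("(\\w)(\\w)\\2\\1");
    }

    public static void main(String[] args) {
        System.out.println(isMirrored("abba")); // true
        System.out.println(isMirrored("abab")); // false
        System.out.println(isMirrored("xyyx")); // true
    }
}
```

The second call fails because the third character must equal whatever group 2 captured ("b"), and "abab" has "a" in that position.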
 
 

Example 1: Finding Repeated Pattern

(\d\d\d)\1 matches 123123, but does not match 123456. This shows that the backreference must match exactly the same text that the group captured, not merely text that fits the same pattern.

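import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "123123";
// Group 1 captures three digits; \1 must then match those same three digits again.
Pattern p = Pattern.compile("(\\d\\d\\d)\\1");
Matcher m = p.matcher(str);
System.out.println(m.groupCount()); // number of capturing groups in the pattern
while (m.find()) {
	String word = m.group();
	System.out.println(word + " " + m.start() + " " + m.end());
}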
1
123123 0 6
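To see why the backreference matters here, it may help to compare it with plain repetition. The sketch below is not from the original post, and the class and field names are invented for the comparison. It assumes that (\d\d\d){2} is the natural alternative a reader might try.

```java
import java.util.regex.Pattern;

public class BackrefVsRepeat {
    // \1 must match the same text that group 1 captured.
    public static final Pattern BACKREF = Pattern.compile("(\\d\\d\\d)\\1");
    // {2} only repeats the sub-pattern, so any six digits are accepted.
    public static final Pattern REPEAT = Pattern.compile("(\\d\\d\\d){2}");

    public static void main(String[] args) {
        for (String s : new String[] {"123123", "123456"}) {
            System.out.println(s
                    + " backreference: " + BACKREF.matcher(s).matches()
                    + ", repetition: " + REPEAT.matcher(s).matches());
        }
    }
}
```

Both patterns accept 123123, but only the repetition pattern accepts 123456, because the second half is different text even though it has the same shape.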

Example 2: Finding Duplicate Words

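// Group 1 captures a whole word; [\w\W]* skips any characters (including spaces
// and punctuation); \1 then requires the same word to appear again.
String pattern = "\\b(\\w+)\\b[\\w\\W]*\\b\\1\\b";
// CASE_INSENSITIVE also applies to the backreference, so "Duplicate" matches "duplicate".
Pattern p = Pattern.compile(pattern, Pattern.CASE_INSENSITIVE);
String phrase = "unique is not duplicate but unique, Duplicate is duplicate.";
Matcher m = p.matcher(phrase);
while (m.find()) {
	String val = m.group();
	System.out.println("Matching subsequence is \"" + val + "\"");
	System.out.println("Duplicate word: " + m.group(1) + "\n");
}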
Matching subsequence is "unique is not duplicate but unique"
Duplicate word: unique

Matching subsequence is "Duplicate is duplicate"
Duplicate word: Duplicate

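Note: This pattern is not a reliable way to find duplicate words. In the example above, the first “duplicate” is never reported: it lies inside the first match ("unique is not duplicate but unique"), and find() resumes searching only after the end of the previous match.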
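One possible way around this limitation, offered as a sketch rather than as the original author's method, is to move the search for the repeat into a lookahead. A lookahead checks the rest of the text without consuming it, so each match covers only a single word. The class RepeatedWords and the helper wordsRepeatedLater are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedWords {
    // Hypothetical helper: returns every word that occurs again later in the text.
    public static List<String> wordsRepeatedLater(String text) {
        // (?= ... ) is a lookahead: it requires a later copy of group 1 but
        // consumes nothing, so the next find() resumes right after the current word.
        Pattern p = Pattern.compile(
                "\\b(\\w+)\\b(?=[\\w\\W]*\\b\\1\\b)", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        List<String> found = new ArrayList<>();
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String phrase = "unique is not duplicate but unique, Duplicate is duplicate.";
        // Prints [unique, is, duplicate, Duplicate]
        System.out.println(wordsRepeatedLater(phrase));
    }
}
```

With this variant the first “duplicate” is reported, along with “is”, which also occurs twice. The lookahead rescans the rest of the input for every word, so this is likely to be slow on long texts; for real workloads, splitting the text and counting words in a map is probably the simpler choice.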

Why Use Backreferences?

 

Reference: Backreferences in Java Regular Expressions from our JCG partner Xiaoran Wang at the Programcreek blog.



