Tom Jefferys

Guava Splitter vs StringUtils

I recently wrote a post about good old reliable Apache Commons StringUtils, which provoked a couple of comments, one of which was that Google Guava provides better mechanisms for joining and splitting Strings. I have to admit, this is a corner of Guava I’ve yet to explore. So I thought I ought to take a closer look and compare it with StringUtils, and I was surprised at what I found.

Splitting strings eh? There can’t be many different ways of doing this surely?

Well, Guava and StringUtils do take stylistically different approaches. Let’s start with the basic usage.

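// Package names assume Commons Lang 3 and Guava (Commons Lang 2 uses org.apache.commons.lang)
import org.apache.commons.lang3.StringUtils;
import com.google.common.base.Splitter;

// Apache StringUtils...
String[] tokens1 = StringUtils.split("one,two,three", ',');

// Guava splitter...
Iterable<String> tokens2 = Splitter.on(',').split("one,two,three");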

So, my first observation is that Splitter is more object-oriented. You have to create a splitter object, which you then use to do the splitting. The StringUtils split methods, by contrast, use a more functional style, with static methods.

Here I much prefer Splitter. Need a reusable splitter that splits comma separated lists? A splitter that also trims leading and trailing white space, and ignores empty elements? Not a problem:

// Trim whitespace around each token and drop empty tokens
Splitter niceCommaSplitter = Splitter.on(',')
                                     .omitEmptyStrings()
                                     .trimResults();

niceCommaSplitter.split("one,, two,  three"); // "one","two","three"
niceCommaSplitter.split("  four  ,  five  "); // "four","five"

That looks really useful, any other differences?

The other thing to notice is that Splitter returns an Iterable<String>, whereas StringUtils.split returns a String array.

Don’t really see that making much of a difference, most of the time I just want to loop through the tokens in order anyway!

I also didn’t think it was a big deal, until I examined the performance of the two approaches. To do this I tried running the following code:

final String numberList = "One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten";

long start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
    StringUtils.split(numberList, ',');
}
System.out.println(System.currentTimeMillis() - start);

start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
    Splitter.on(',').split(numberList);
}
System.out.println(System.currentTimeMillis() - start);

On my machine this output the following times:


Guava’s Splitter is almost 10 times faster!

Now this is a much bigger difference than I was expecting: Splitter is around 10 times faster than StringUtils. How can this be? Well, I suspect it’s something to do with the return type. Splitter returns an Iterable<String>, whereas StringUtils.split gives you an array of Strings! So Splitter’s split method doesn’t actually need to create any new String objects before it returns.

It’s also worth noting you can cache your Splitter object, which results in an even faster runtime.

Blimey, end of argument? Guava’s Splitter wins every time?

Hold on a second. This isn’t quite the full story. Notice we’re not actually doing anything with the resulting Strings? Like I mentioned, it looks like the Splitter isn’t actually creating any new Strings. I suspect it’s actually deferring this to the Iterator object it returns.
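To make the suspected deferral concrete, here is a minimal sketch using only the JDK. It is not Guava's actual implementation: LazySplitDemo, eagerSplit, lazySplit and the substringsCreated counter are hypothetical names invented for illustration, and unlike StringUtils.split the eager version keeps empty tokens. It only shows how a method that returns an Iterable can hand back a result before any substring exists.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only: this is NOT how Guava or Commons Lang are written.
final class LazySplitDemo {

    // Counts how many substrings have been created so far.
    static int substringsCreated = 0;

    // Eager: creates every substring up front, like a method that returns String[].
    static String[] eagerSplit(String input, char sep) {
        List<String> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= input.length(); i++) {
            if (i == input.length() || input.charAt(i) == sep) {
                out.add(input.substring(start, i));
                substringsCreated++;
                start = i + 1;
            }
        }
        return out.toArray(new String[0]);
    }

    // Lazy: returns immediately; each substring is created only when next() is called.
    static Iterable<String> lazySplit(final String input, final char sep) {
        return () -> new Iterator<String>() {
            private int start = 0; // start of the next token, or -1 once exhausted

            @Override
            public boolean hasNext() {
                return start >= 0;
            }

            @Override
            public String next() {
                if (start < 0) {
                    throw new NoSuchElementException();
                }
                int end = input.indexOf(sep, start);
                String token = (end < 0) ? input.substring(start) : input.substring(start, end);
                substringsCreated++;
                start = (end < 0) ? -1 : end + 1;
                return token;
            }
        };
    }

    public static void main(String[] args) {
        String numbers = "One,Two,Three";

        eagerSplit(numbers, ',');
        System.out.println("eager, after split: " + substringsCreated);   // 3

        substringsCreated = 0;
        Iterable<String> tokens = lazySplit(numbers, ',');
        System.out.println("lazy, after split: " + substringsCreated);    // 0

        for (String token : tokens) {
            // Iterating is what actually creates the substrings.
        }
        System.out.println("lazy, after iterating: " + substringsCreated); // 3
    }
}
```

If a library splitter behaves anything like lazySplit here, a timing loop that never iterates the result is measuring almost nothing, and a caller that stops after the first token only pays for one substring.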

So can we test this?

Sure thing. Here’s some code to repeatedly check the lengths of the generated substrings:

final String numberList = "One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten";
long start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
    final String[] numbers = StringUtils.split(numberList, ',');
    for (String number : numbers) {
        number.length(); // check the length of each substring
    }
}
System.out.println(System.currentTimeMillis() - start);

Splitter splitter = Splitter.on(',');
start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
    Iterable<String> numbers = splitter.split(numberList);
    for (String number : numbers) {
        number.length(); // forces the iterator to produce each substring
    }
}
System.out.println(System.currentTimeMillis() - start);

On my machine this outputs:


Guava’s Splitter is almost 4 times slower!

Indeed, I was expecting them to be about the same, or maybe Guava slightly faster, so this is another surprising result. It looks like, by returning an Iterable, Splitter is trading immediate gains for longer-term pain. There’s also a moral here about making sure performance tests are actually testing something useful.
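One way to act on that moral is to make every timed loop consume its results and print a value derived from them, so the work cannot be optimised away unnoticed. The sketch below is only an illustration under stated assumptions: SplitBenchSketch, totalLength and run are hypothetical names, and java.lang.String's own split stands in for the library calls so that it runs without extra jars. A hand-rolled loop like this is still a rough measurement, not a rigorous benchmark.

```java
import java.util.function.ToIntFunction;

// Hypothetical harness; the names are illustrative and not from any library.
final class SplitBenchSketch {

    // Sums the token lengths so that every substring is actually produced and used.
    static int totalLength(String[] tokens) {
        int sum = 0;
        for (String token : tokens) {
            sum += token.length();
        }
        return sum;
    }

    // Runs the work repeatedly and returns a checksum accumulated from every result.
    static long run(ToIntFunction<String> work, String input, int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += work.applyAsInt(input);
        }
        return checksum;
    }

    public static void main(String[] args) {
        final String numberList = "One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Ten";

        // Assumption: String.split stands in for StringUtils.split or Splitter here.
        ToIntFunction<String> work = s -> totalLength(s.split(","));

        // Warm-up pass, so the timed pass is less dominated by JIT compilation.
        run(work, numberList, 100_000);

        long start = System.nanoTime();
        long checksum = run(work, numberList, 1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Printing the checksum keeps the result observable.
        System.out.println(checksum + ": " + elapsedMs + " ms");
    }
}
```

The ten tokens add up to 39 characters, so a million iterations should print a checksum of 39000000 next to the elapsed time. To compare the two libraries, the same harness could be given one lambda per splitter, each summing the lengths of the tokens it returns.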

In conclusion I think I’ll still use Splitter most of the time. On small lists the difference in performance is going to be negligible, and Splitter just feels much nicer to use. Still I was surprised by the result, and if you’re splitting lots of Strings and performance is an issue, it might be worth considering switching back to Commons StringUtils.

Reference: Guava Splitter vs StringUtils from our JCG partner Tom Jefferys at the Tom’s Programming Blog blog.

  1. Except for Guava’s Splitter and Joiner, I think that StringUtils is richer. It’s time for the Guava team to improve their ‘Strings’ utility class ;)

  2. It’s worth mentioning that the Splitter Iterator delays the actual splitting, whereas StringUtils.split does it all up front. This may make a difference in certain use cases, like when searching for the first match and still needing all the preceding values, but not the following ones, or when you only need a subset of the return values and never store the others to variables. It’s also a boon when parsing large strings as it doesn’t have to store the whole array in memory at one time. There might also be cases where returning an Iterator makes the code simpler than having to wrap an array with one. Sometimes maintenance cost is worth more than actual performance, especially if it’s not in the critical path.

    Granted it’s probably true that most splitting is done on short strings and optimizing on edge cases may not be the best idea for general purpose tools. But now you have two tools that are good at two different scenarios!

  3. Beware of micro benchmarks: http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java For example, it is conceivable that the second loop in your first example is simply ignored by the JVM because it does not have any side effects. There are many factors that could significantly affect your results.

  4. “There’s also a moral here about making sure performance tests are actually testing something useful.”

    Maybe follow your own advice. How is your test useful?

    Also, were your tests written in Groovy? All the strings have single quotes.

    Here are my results if you sum the lengths and print it out:

    39000000: 375
    39000000: 427

    summing the lengths with a leading whitespace in one value

    40000000: 357
    40000000: 436

    … and trimming the results
    39000000: 456
    39000000: 586

    Not nearly as dramatic as you exclaim. Now, this difference is over 1 million trials, so the only question I would think would be one of style. If you are using StringUtils and not using Guava anywhere else, is it worth loading the jar just to split a string with a more functional vs imperative style? Vice versa, if you are not using StringUtils, is it worth switching for negligible impact? You might even make the same argument for using java.lang.String’s own split.
