Core Java

A Tale of Two Iterators

When you look at the most popular Java interview questions, you might encounter the one about fail-fast and fail-safe iterators:

What’s the difference between fail-fast and fail-safe iterators?

The simplified answer is that:

Fail-fast iterator throws ConcurrentModificationException if the collection is modified while iterating, but fail-safe doesn’t.

Even though it totally makes sense, it’s not clear what the interviewer means by fail-safe. Java specification does not define this term when it comes to iterators. But there are four policies for concurrent modification instead.

Concurrent Modification

Firstly, let’s define what concurrent modification is. Concurrent modification occurs when for example we have an active iterator from the collection and there are some changes made to that collection, but they do not come from our iterator. The most obvious example is when we have multiple threads – one thread is iterating and the second one adds or removes the elements from the same collection. However, we can also get ConcurrentModificationException when we work in a single-threaded environment:

List<String> cities = new ArrayList<>();
cities.add(“Warsaw”);
cities.add(“Prague”);
cities.add(“Budapest”);
 
Iterator<String> cityIterator = cities.iterator();
cityIterator.next();
cities.remove(1);
cityIterator.next(); // throws ConcurrentModificationException

Fail-Fast

The snippet above is the example of fail-fast iterator. As you can see, as soon as we had tried to get the second element from the iterator, the ConcurrentModificationException was thrown. How can an iterator know if the collection was modified after you had created it? You could have a timestamp such as lastModified in the collection. When you create an iterator, you would need to make a copy of this field and store it in the iterator object. Then, whenever you would call next() method, you just need to compare lastModified from the collection with the copy from the iterator. A very similar approach can be found in ArrayList implementation, for instance. There is a modCount instance variable which holds the number of modifications made to the list:

final void checkForComodification() {
   if (modCount != expectedModCount)
       throw new ConcurrentModificationException();
}

It is important to mention that fail-fast iterators work on the best-effort basis – there is no guarantee that ConcurrentModificationException will be thrown if there is a concurrent modification, so we should not rely on that behavior – it should be rather used to detect bugs. Most of the non-concurrent collections provide fail-fast iterators.

Weakly Consistent

Most concurrent collections from java.util.concurrent package (such as ConcurrentHashMap and most Queues) provide weakly consistent iterators. What it means is very well explained in the documentation:

  • they may proceed concurrently with other operations
  • they will never throw ConcurrentModificationException
  • they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.

Snapshot

In this policy, the iterator is associated with the state of the collection from the moment when the iterator was created – our snapshot of the collection. Any change made to the initial collection creates a fresh version of the underlying data structure. Of course, our snapshot is untouched, so it doesn’t reflect any changes made to the collection after the iterator was created. This is the old good copy-on-write (COW) technique. It completely solves the concurrent modification problem, so no ConcurrentModificationException can be thrown. Additionally, the iterators do not support element-changing operations. Copy-on-write collections are usually too expensive to use, but it might be a good idea to give it a try if mutations happen significantly less often the traversals. The examples are CopyOnWriteArrayList and CopyOnWriteArraySet.

Undefined

Undefined behavior can be found in the legacy collections such as Vector and Hashtables. Both of them have standard iterators with fail-fast behavior, but they also expose the implementations of Enumeration interface, which do not define behavior when a concurrent modification occurs. You might see some items being repeated or skipped, or even some weird exceptions flying around. It’s better not to play with this beast!

Published on Java Code Geeks with permission by Grzegorz Mirek, partner at our JCG program. See the original article here: A Tale of Two Iterators

Opinions expressed by Java Code Geeks contributors are their own.

Grzegorz Mirek

Grzegorz is a software developer from Cracow, Poland. He started his adventure with Java roughly 6 years ago when he was at university and since that time, he keeps expanding his knowledge in this field. He is especially interested in JVM performance and optimisations and this is what he mostly blogs about.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button