Blame Iterator

java.util.Enumeration and java.util.Iterator have a prominent place in the Java Collections Framework. Personally, I only use them locally inside a method to traverse a collection. I don't use them as return types or pass them in as parameters. So far so good.

Recently we had to invoke a public method in another module, which happened to return an instance of java.util.Iterator. The hard part was that, in certain cases, we needed to traverse the data twice, but the java.util.Iterator interface offers no way to rewind. A workaround would be to save the data in a class-level collection during the first pass, so that the second pass could operate on the cached data. This hideous idea was quickly dismissed.

We could've asked the other team to return the underlying collection. But it was close to release and they had every reason to turn it down. In addition, they are hard to work with ...

I forget how we solved it. But the lesson is: don't use java.util.Iterator or java.util.Enumeration in a public API, either as a return type or as a method parameter. Instead, use the underlying collection interface directly (e.g., java.util.List, java.util.Set, java.util.Queue). More reasons:

  • Iterator is too generic.
    You may view this as an advantage, since it shields the client code from the underlying implementation. But most of the time, I do need to know a thing or two about the underlying data structure. For instance, if it's a java.util.List, I know it's ordered and can traverse it by index instead of with an Iterator. If it's a java.util.Set, I know it contains distinct elements and I would traverse it with an Iterator. Interfaces like List and Set already provide enough isolation and transparency.

  • Iterator is too restrictive.
    It can't rewind or tell you its size. It exposes a very narrow view of the underlying data structure. The interface has only 3 core methods: hasNext(), next(), and remove() (Java 8 later added a default forEachRemaining()). I've never used remove(), since removals made this way are very hard to keep track of.


Anonymous said...

Under Java 5, they could have used java.lang.Iterable to allow iterating more than once.
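To illustrate the Iterable suggestion, here is a minimal sketch. The class and method names (IterableExample, findNames, countAndJoin) are made up for illustration and are not from the module discussed in the post. It assumes the API is backed by an in-memory collection; in that case each for-each loop asks the Iterable for a fresh Iterator, so a second pass simply works.

```java
import java.util.Arrays;

public class IterableExample {
    // Hypothetical API method: it returns an Iterable rather than an Iterator,
    // so callers can request a new iterator whenever they need one.
    static Iterable<String> findNames() {
        return Arrays.asList("alice", "bob", "carol");
    }

    // Traverses the same Iterable twice: once to count, once to join.
    static String countAndJoin(Iterable<String> names) {
        int count = 0;
        for (String ignored : names) {
            count++;
        }
        StringBuilder joined = new StringBuilder();
        for (String name : names) {
            if (joined.length() > 0) {
                joined.append(",");
            }
            joined.append(name);
        }
        return count + ":" + joined;
    }

    public static void main(String[] args) {
        System.out.println(countAndJoin(findNames())); // prints 3:alice,bob,carol
    }
}
```

Note that the Iterable contract itself does not promise repeatable iteration; collection-backed Iterables behave this way, but a one-shot implementation would not.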

Under older Java (or if you just get an iterator anyway), you can always batch it up yourself, unless it has way too many elements and memory becomes an issue.
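"Batching it up yourself" could look roughly like the sketch below. IteratorBuffer and toList are hypothetical names; the idea is just to drain the one-shot Iterator into a local List once and then loop over that List as often as needed. It assumes the data fits in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorBuffer {
    // Drains a one-shot Iterator into a List so the data can be traversed repeatedly.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> buffer = new ArrayList<T>();
        while (it.hasNext()) {
            buffer.add(it.next());
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Stand-in for the Iterator returned by someone else's API.
        Iterator<Integer> oneShot = Arrays.asList(1, 2, 3).iterator();
        List<Integer> cached = toList(oneShot);

        int sum = 0;
        for (int n : cached) {   // first pass
            sum += n;
        }
        int max = Integer.MIN_VALUE;
        for (int n : cached) {   // second pass over the same data
            max = Math.max(max, n);
        }
        System.out.println(sum + " " + max); // prints 6 3
    }
}
```

Because the buffer is a local variable rather than a class-level field, it avoids the shared state that made the workaround in the post unattractive.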

The "remove()" method is handy for avoiding ConcurrentModificationExceptions if you need to change data while iterating, but it doesn't help for adding or reordering items.
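A small sketch of that point, using java.util.ArrayList and hypothetical names (IteratorRemoveExample, dropOdd, removeInForEachFails): removing through the Iterator is safe, while removing through the list inside a for-each loop trips the fail-fast check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class IteratorRemoveExample {
    // Removes all odd numbers in place via Iterator.remove().
    static List<Integer> dropOdd(List<Integer> numbers) {
        for (Iterator<Integer> it = numbers.iterator(); it.hasNext(); ) {
            if (it.next() % 2 != 0) {
                it.remove(); // safe: the iterator keeps its own state in sync
            }
        }
        return numbers;
    }

    // Tries the same removal through the list itself while iterating.
    // Returns true if a ConcurrentModificationException was thrown.
    static boolean removeInForEachFails(List<Integer> numbers) {
        try {
            for (Integer n : numbers) {
                if (n % 2 != 0) {
                    numbers.remove(n); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(dropOdd(a)); // prints [2, 4]

        List<Integer> b = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(removeInForEachFails(b)); // prints true
    }
}
```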

And another interesting item is that the basic idea of Iterables (but immutable) is the core data concept in LISP.

Mostly, I don't much mind Iterators/Iterables. What if it comes from a huge data source and couldn't be in memory to start with? If the language made Iterators easier (see for instance LISP or also Python's generators), then you get scalability and convenience together.

Anonymous said...

Depending on their implementation, you could have asked for a ResettableIterator from commons-collections.

Anna said...

Great and Useful Article.

Jack sparrow said...

That is a nice article from you, and this is informative stuff. I hope to see more articles from you. I also want to share some information about devops training online and devops tutorial videos.