Java allows you to process data in collections or streams. It’s very easy to think of streams as a technique to turn one collection into another. This can lead to some rather casual code where streaming data is repeatedly collected to some sort of collection, passed as a whole collection, and then processed some more.
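As a hypothetical illustration of that pattern (the class name, method name and sample data are invented for this sketch), each step below collects to a list only so that the next step can stream it again:

```java
import java.util.List;
import java.util.stream.Collectors;

class CasualCollecting {
    // Every step builds a complete intermediate list before the next step re-streams it.
    static List<String> process(List<String> input) {
        List<String> nonBlank = input.stream()
                .filter(s -> !s.isBlank())
                .collect(Collectors.toList());

        List<String> trimmed = nonBlank.stream()
                .map(String::trim)
                .collect(Collectors.toList());

        return trimmed.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Six elements, two of them blank
        List<String> six = List.of(" pear", "apple ", "", "fig", "  ", "kiwi");
        System.out.println(process(six)); // prints [PEAR, APPLE, FIG, KIWI]
    }
}
```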
For 6 elements, who cares!
The above suffers from a code smell: the constant collecting and re-streaming of a stream. Most people would probably notice that and remove some of the interim lists if it were all one method.
Most people would. I’ve seen people not do this.
However, if the above were using subroutines to process things, it’s quite easy to optimise for the simplicity of the subroutines’ APIs and make them receive and return a collection. This way you can end up with the above behaviour.
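A sketch of how that can happen, assuming two hypothetical helpers (`removeBlanks` and `normalise` are invented names). Each helper looks tidy on its own, but because both take and return a `List`, composing them forces a collect and a re-stream at every boundary:

```java
import java.util.List;
import java.util.stream.Collectors;

class CollectionApis {
    // Simple signature, but the caller must hand over a fully built list
    // and gets a fully built list back.
    static List<String> removeBlanks(List<String> lines) {
        return lines.stream()
                .filter(s -> !s.isBlank())
                .collect(Collectors.toList());
    }

    static List<String> normalise(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // The interim list is hidden between the two calls rather than removed.
    static List<String> process(List<String> input) {
        return normalise(removeBlanks(input));
    }
}
```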
The solution is to look at the pipeline of data processing at the high level in terms of filter, map, and reduce type functions and try to model it around streams.
Treat Streams as Though They’re Infinite
We have small containers these days and we want them to get the most out of their resources. A small container, if running continuously, can process an unbounded stream of data. If we imagine that all our data is a potentially infinite stream, and design our software to use streaming to avoid getting all of it into memory, then two good things happen:
- We optimise the max memory requirement of our streams to be as low as possible for ALL cases
- We HAVE to use the Stream API properly and we end up with cleaner code, as the declarative aspect of the Stream API helps describe what’s happening in the data conversion. We probably even lose some horribly named temporary variables in the process…
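To make the "infinite" idea concrete, here is a minimal sketch (the class and method names are invented for illustration). The source is genuinely unbounded, yet the pipeline still terminates because elements are pulled through lazily, one at a time, and `limit` stops the flow once enough results have been produced:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamDemo {
    // Stream.iterate never ends on its own; nothing is held in memory
    // apart from the results we explicitly collect at the end.
    static List<Integer> firstEvenSquares(int count) {
        return Stream.iterate(1, n -> n + 1)
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .limit(count)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenSquares(5)); // prints [4, 16, 36, 64, 100]
    }
}
```

Collecting the source into a list first would not be possible here at all, which is exactly the discipline the infinite mindset enforces.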
The above code then becomes:
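A minimal sketch of what that could look like, using hypothetical helper names (`removeBlanks`, `normalise`) rather than the original listing. Each helper takes and returns a `Stream`, so the stages compose into a single lazy pipeline and a collection is only built at the very end, if at all:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamedPipeline {
    // Helpers describe a transformation; they never materialise the data.
    static Stream<String> removeBlanks(Stream<String> lines) {
        return lines.filter(s -> !s.isBlank());
    }

    static Stream<String> normalise(Stream<String> lines) {
        return lines.map(String::trim).map(String::toUpperCase);
    }

    // The only terminal operation sits at the edge of the pipeline.
    static List<String> process(Stream<String> input) {
        return normalise(removeBlanks(input)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Stream<String> six = Stream.of(" pear", "apple ", "", "fig", "  ", "kiwi");
        System.out.println(process(six)); // prints [PEAR, APPLE, FIG, KIWI]
    }
}
```

If the consumer can also work on a stream, even the final `collect` can go, and the result can be handed on as a `Stream<String>`.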