Foreword: The two issues described here were discovered and fixed more than a year ago. This article serves only as historical proof, and as a beginner’s guide to tackling file descriptor leaks in Java.
In Ultra ESB we use an in-memory RAM disk file cache for fast and garbage-free payload handling. Some time back, we faced an issue on our shared SaaS AS2 Gateway where this cache was leaking file descriptors over time, eventually leading to “too many open files” errors when the system ulimit was hit.
The Legion of the Bouncy Castle: leftovers from your stream-backed MIME parts?
With some simple tooling we found that BC had the habit of calling getContent() on MIME parts in order to determine their type (say, for instanceof checks). True, this wasn’t a crime in itself; but most of our MIME parts were file-backed, with a file-cache file on the other end – meaning that every getContent() call opens a new stream to the file. So now there are stray streams (and hence file descriptors) pointing to our file cache.
Enough of these, and we would exhaust the file descriptor quota allocated to the Ultra ESB (Java) process.
Solution? Make ’em lazy!
We didn’t want to mess with the BC codebase. So we found a simple solution: create all file-backed MIME parts with “lazy” streams. Our (former) colleague Rajind wrote a LazyFileInputStream – inspired by jboss-vfs – that opens the actual file only when a read is attempted.
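The idea can be sketched in a few lines of Java. This is not the actual Ultra ESB class, only a minimal illustration; the isOpened() helper and the error message are assumptions added to make the behaviour visible. Constructing the stream costs no file descriptor, and one is taken only on the first read.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

/**
 * Illustrative sketch only (not the real implementation).
 * Defers opening the file, and hence taking a file descriptor, until the first read.
 */
class LazyFileInputStream extends InputStream {
    private final File file;
    private InputStream delegate; // stays null until the first read
    private boolean closed;

    LazyFileInputStream(File file) {
        this.file = file; // no file handle is opened here
    }

    /** Hypothetical helper: true once a real file descriptor has been acquired. */
    synchronized boolean isOpened() {
        return delegate != null;
    }

    private synchronized InputStream delegate() throws IOException {
        if (closed) {
            throw new IOException("Stream closed");
        }
        if (delegate == null) {
            delegate = new FileInputStream(file); // the descriptor is taken only now
        }
        return delegate;
    }

    @Override
    public int read() throws IOException {
        return delegate().read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        return delegate().read(buf, off, len);
    }

    @Override
    public synchronized void close() throws IOException {
        closed = true;
        if (delegate != null) {
            delegate.close(); // nothing to release if the file was never opened
        }
    }
}
```

A caller that only inspects the stream object (for a type check, say) and then drops it never touches the file, so nothing is left behind for the GC to clean up.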
BC was happy, and so was the file cache; but we were the happiest.
Hibernate JPA: cleaning up after supper, a.k.a. closing consumed streams
Another bug we spotted was that some database operations were leaving behind unclosed file handles. This apparently happened only when we were feeding stream-backed blobs to Hibernate, where the streams often came from file cache entries.
After some digging, we came up with a theory that Hibernate was not closing the underlying streams of these blob entries. (It made sense because the java.sql.Blob interface does not expose any methods that Hibernate could use to manipulate the underlying data sources.) This was a problem, though, because the discarded streams (and the associated file handles) would not get released until the next GC.
This would have been fine for a short-lived app, but a long-running one like ours could easily run out of file descriptors; for instance, during a sudden and sustained spike in load.
Solution? Make ’em self-closing!
We didn’t want to lose the benefits of streaming, but we didn’t have control over our streams either. You might say we should have placed our streams in auto-closeable constructs (say, try-with-resources). Nice try; but sadly, Hibernate was reading them outside of our execution scope (especially in @Transactional flows). As soon as we started closing the streams within our code scope, our database operations started to fail miserably – screaming “stream already closed!”.
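The failure mode is easy to reproduce without Hibernate. In the sketch below, persistLater() is a hypothetical stand-in for any consumer that reads the stream after our method has returned; it is not a Hibernate API, and the class and method names are made up for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Callable;

class DeferredReadDemo {
    /** Hypothetical stand-in for a persistence layer that reads the blob stream later. */
    static Callable<Integer> persistLater(InputStream blob) {
        return () -> blob.read(); // nothing is read until call() is invoked
    }

    /** Hands the stream over inside try-with-resources, as one might naively do. */
    static Callable<Integer> handOver(File file) throws IOException {
        try (InputStream in = new FileInputStream(file)) {
            return persistLater(in);
        } // the stream is closed as we leave this scope
    }

    /** Runs the deferred read and reports what happened. */
    static String run(File file) throws Exception {
        Callable<Integer> pendingWrite = handOver(file);
        try {
            pendingWrite.call(); // the "commit" happens after our scope has ended
            return "read succeeded";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

By the time the deferred read runs, the file stream has already been closed, so the read fails with an IOException instead of returning data.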
When in Rome, do as the Romans do, they say.
So, instead of messing with Hibernate, we decided we would take care of the streams ourselves.
Rajind (yeah, him again) hacked together a SelfClosingInputStream wrapper. This would keep track of the amount of data read from the underlying stream, and close it up as soon as the last byte was read.
(We did consider using existing options like AutoCloseInputStream from Apache commons-io; but it turned out that we needed some customizations here and there – like detailed trace logging.)
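A rough sketch of such a wrapper follows. It is not the original code: the expectedLength constructor parameter assumes the payload size is known up front (as it would be for a file cache entry), the isClosed() helper is hypothetical, and the trace logging mentioned above is left out.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative sketch only: closes the wrapped stream once all expected bytes are read. */
class SelfClosingInputStream extends FilterInputStream {
    private final long expectedLength; // assumption: the caller knows the payload size
    private long bytesRead;
    private boolean closed;

    SelfClosingInputStream(InputStream in, long expectedLength) {
        super(in);
        this.expectedLength = expectedLength;
    }

    /** Hypothetical helper for inspection and tests. */
    boolean isClosed() {
        return closed;
    }

    @Override
    public int read() throws IOException {
        if (closed) {
            return -1; // behave like an exhausted stream instead of throwing
        }
        int b = super.read();
        afterRead(b < 0 ? -1 : 1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (closed) {
            return -1;
        }
        int n = super.read(buf, off, len);
        afterRead(n);
        return n;
    }

    /** Updates the byte counter and releases the stream at end of data. */
    private void afterRead(int n) throws IOException {
        if (n > 0) {
            bytesRead += n;
        }
        if (n < 0 || bytesRead >= expectedLength) {
            close(); // last byte consumed, or EOF reached early
        }
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            closed = true;
            super.close(); // releases the underlying file descriptor
        }
    }
}
```

Returning -1 after the automatic close, rather than throwing, is a deliberate choice in this sketch: a consumer that issues one more read to confirm end-of-stream then sees a normal EOF. Bytes passed over with skip() are not counted here, so a caller that skips would still rely on the EOF check.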
When it comes to resource management in Java, it is quite easy to over-focus on memory and CPU (processing), and forget about the rest. But virtual resources – like ephemeral ports and per-process file descriptors – can be just as important, if not more so.
Especially on long-running processes like our AS2 Gateway SaaS application, they can literally become silent killers.
You can detect this type of “leak” in two main ways:
- “single-cycle” resource analysis: run a single, complete processing cycle, comparing resource usage before and after
- long-term monitoring: continuously recording and analyzing resource metrics to identify trends and anomalies
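For the single-cycle approach, the JVM can report its own open descriptor count on Unix-like systems through com.sun.management.UnixOperatingSystemMXBean. The helper below is a minimal sketch; the FdLeakCheck class and its method names are made up for illustration, and the count can also move for unrelated reasons (class loading, for example), so treat one measurement as a hint and repeat the cycle a few times.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdLeakCheck {
    /** Open descriptors held by this JVM, or -1 where the Unix bean is unavailable. */
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return (os instanceof UnixOperatingSystemMXBean)
                ? ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount()
                : -1;
    }

    /** Runs one processing cycle and returns the change in the open descriptor count. */
    static long leakedBy(Runnable cycle) {
        long before = openFds();
        cycle.run();
        long after = openFds();
        return after - before;
    }
}
```

The same openFds() reading, sampled on a timer and pushed to whatever metrics system you already use, covers the long-term monitoring approach: a count that keeps climbing across otherwise idle periods is the trend to look for.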
In any case, fixing the leak is not too difficult once you have a clear picture of what you are dealing with.
Good luck with hunting down your resource-hog d(a)emons!