Recently, I’ve been working on a Java application that suffered from some serious performance issues. Among many problems, the one that really got my attention was a relatively slow allocation rate of new objects (the application was allocating a massive number of rather large objects). As it later turned out, the reason was that a significant number of allocations were happening outside TLAB.
What is TLAB?
In Java, new objects are allocated in Eden, a memory space shared between threads. Since multiple threads can allocate new objects at the same time, some sort of synchronization mechanism is needed. How could it be solved? An allocation queue? Some kind of mutex? Even though these are decent solutions, there is a better one. Here is where TLAB comes into play. TLAB stands for Thread Local Allocation Buffer, and it is a region inside Eden which is exclusively assigned to a thread. In other words, only a single thread can allocate new objects in this area. Each thread has its own TLAB. Thanks to that, as long as objects are allocated in TLABs, there is no need for any type of synchronization. Allocation inside TLAB is a simple pointer bump (that’s why it’s sometimes called pointer bump allocation): the object is placed at the next free memory address, and the pointer is moved forward by the object’s size.
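To make the pointer bump concrete, here is a minimal sketch of the idea in Java. It is a toy model, not JVM code: the class name `TlabSketch` is hypothetical, and addresses are plain `long` offsets rather than real memory.

```java
// Toy model of a TLAB. Hypothetical class; addresses are plain offsets, not real memory.
final class TlabSketch {
    private final long end; // first address past the end of the buffer
    private long top;       // next free address inside the buffer

    TlabSketch(long start, long sizeInBytes) {
        this.top = start;
        this.end = start + sizeInBytes;
    }

    /** Returns the address of the new object, or -1 if it does not fit. */
    long allocate(long sizeInBytes) {
        if (sizeInBytes > end - top) {
            return -1; // does not fit: the caller has to take a slower path
        }
        long address = top;
        top += sizeInBytes; // the "bump": no lock needed, only one thread owns this buffer
        return address;
    }

    /** Bytes still available in this buffer. */
    long free() {
        return end - top;
    }

    public static void main(String[] args) {
        TlabSketch tlab = new TlabSketch(0, 64);
        System.out.println(tlab.allocate(24)); // prints 0
        System.out.println(tlab.allocate(24)); // prints 24
        System.out.println(tlab.allocate(24)); // prints -1 (only 16 bytes left)
    }
}
```

The whole fast path is one comparison and one addition, which is why allocation inside a TLAB is so cheap compared to anything that needs coordination between threads.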
TLAB Gets Full
As you can imagine, TLAB is not infinite and at some point, it starts getting full. If a thread needs to allocate a new object which does not fit into the current TLAB (because it’s almost full), two things can happen:
- thread gets a new TLAB
- the object is allocated outside TLAB
The JVM decides what will happen based on several parameters. If the first option is chosen, the thread’s current TLAB becomes “retired”, and the allocation is done in a new TLAB. In the second scenario, the allocation is done in a shared region of Eden, and that’s why some sort of synchronization is needed. As usual, synchronization comes at a price.
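The decision can be sketched roughly as follows. This is a simplified model, assuming the commonly described HotSpot behaviour in which the free space left in the TLAB is compared against a "refill waste limit" (influenced by `-XX:TLABRefillWasteFraction`). The class, the method, and all parameter names are hypothetical, and the real logic lives in native JVM code with more inputs than shown here.

```java
// Simplified, hypothetical model of the TLAB slow-path decision. Not actual JVM code.
final class TlabSlowPathSketch {
    enum Decision { ALLOCATE_IN_CURRENT_TLAB, RETIRE_AND_REFILL, ALLOCATE_OUTSIDE_TLAB }

    static Decision decide(long objectSize, long freeInTlab,
                           long refillWasteLimit, long maxTlabSize) {
        if (objectSize <= freeInTlab) {
            return Decision.ALLOCATE_IN_CURRENT_TLAB; // fast path: plain pointer bump
        }
        if (objectSize > maxTlabSize) {
            return Decision.ALLOCATE_OUTSIDE_TLAB;    // would not fit even into a fresh TLAB
        }
        if (freeInTlab > refillWasteLimit) {
            // Retiring now would throw away too much free space,
            // so keep the TLAB and put this one object in shared Eden.
            return Decision.ALLOCATE_OUTSIDE_TLAB;
        }
        return Decision.RETIRE_AND_REFILL;            // little space is wasted: get a new TLAB
    }

    public static void main(String[] args) {
        System.out.println(decide(16, 40, 16, 1024));   // ALLOCATE_IN_CURRENT_TLAB
        System.out.println(decide(100, 40, 16, 1024));  // ALLOCATE_OUTSIDE_TLAB
        System.out.println(decide(100, 8, 16, 1024));   // RETIRE_AND_REFILL
        System.out.println(decide(4096, 8, 16, 1024));  // ALLOCATE_OUTSIDE_TLAB
    }
}
```

The trade-off in this model is between wasting the unused tail of the current TLAB and paying for a synchronized allocation in shared Eden.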
Too Large Objects
By default, TLABs are dynamically resized for each thread individually. The size of a TLAB is recalculated based on the size of Eden, the number of threads, and their allocation rates. Changing any of these might affect TLAB sizing; however, because the allocation rate usually varies, there is no easy formula for that. When a thread needs to allocate a large object (e.g. a large array) which would never fit into the TLAB, it will be allocated in a shared region of Eden, which, again, means synchronization. This is exactly what was going on in my application. Because certain objects were just too big, they were never allocated in TLAB.
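One way to check whether this is happening is to count the allocation events that Java Flight Recorder emits. The sketch below is an assumption about tooling, not a description of how the problem above was diagnosed. It assumes JDK 14 or newer (for `RecordingStream`) and the JFR events `jdk.ObjectAllocationInNewTLAB` and `jdk.ObjectAllocationOutsideTLAB`. The class `TlabEventCounter`, its `measure` method, and the array sizes in `main` are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

// Hypothetical helper: counts TLAB-related JFR events while a workload runs.
final class TlabEventCounter {
    static volatile Object sink; // keeps the allocations from being optimized away

    /** Returns {eventsInNewTlab, eventsOutsideTlab} observed during the workload. */
    static long[] measure(Runnable workload, long settleMillis) {
        AtomicLong inNewTlab = new AtomicLong();
        AtomicLong outsideTlab = new AtomicLong();
        try (RecordingStream stream = new RecordingStream()) {
            stream.enable("jdk.ObjectAllocationInNewTLAB");
            stream.enable("jdk.ObjectAllocationOutsideTLAB");
            stream.onEvent("jdk.ObjectAllocationInNewTLAB", event -> inNewTlab.incrementAndGet());
            stream.onEvent("jdk.ObjectAllocationOutsideTLAB", event -> outsideTlab.incrementAndGet());
            stream.startAsync();
            workload.run();
            try {
                Thread.sleep(settleMillis); // events arrive with a delay; wait before closing
            } catch (InterruptedException interrupted) {
                Thread.currentThread().interrupt();
            }
        }
        return new long[] {inNewTlab.get(), outsideTlab.get()};
    }

    public static void main(String[] args) {
        long[] counts = measure(() -> {
            for (int i = 0; i < 500; i++) {
                sink = new byte[1024 * 1024]; // assumed to be "large" relative to the TLAB
            }
        }, 2000);
        System.out.println("in new TLAB: " + counts[0] + ", outside TLAB: " + counts[1]);
    }
}
```

Note that the first event fires only when an allocation causes a new TLAB to be handed out, not for every allocation inside a TLAB, so the two counters are a rough signal rather than an exact inside-to-outside ratio.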
Having some objects allocated outside TLAB isn’t necessarily a bad thing – this is a typical situation that happens before minor GC. The problem is when there is a huge number of allocations outside TLAB as compared to the ones inside TLAB. If that’s the case, there are two options available:
- make the objects smaller
- try to adjust TLAB sizing
In my case, adjusting TLAB size manually was not the best option. There were only a few object types that were consistently allocated outside TLAB. As usual, fixing the code was the best option. After I had slimmed the objects down significantly, they fit into TLAB, and the ratio of allocations inside TLAB to allocations outside TLAB was back to normal.
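Where adjusting TLAB sizing is worth trying, it helps to first see what the JVM is currently using. Here is a minimal sketch, assuming a HotSpot JVM where `HotSpotDiagnosticMXBean` is available. The flags listed are ones that, as far as I know, govern TLAB behaviour (they are set on the command line, e.g. `-XX:TLABSize=...` or `-XX:-ResizeTLAB`). The class name `TlabFlags` is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reads the current values of TLAB-related HotSpot flags.
final class TlabFlags {
    static final String[] FLAGS = {
        "UseTLAB", "ResizeTLAB", "TLABSize", "MinTLABSize", "TLABRefillWasteFraction"
    };

    static Map<String, String> read() {
        Map<String, String> values = new LinkedHashMap<>();
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        for (String flag : FLAGS) {
            try {
                // bean is null on JVMs that do not provide this MXBean
                values.put(flag, bean == null ? "<not available>" : bean.getVMOption(flag).getValue());
            } catch (IllegalArgumentException unknownFlag) {
                values.put(flag, "<not available>"); // flag does not exist in this JVM
            }
        }
        return values;
    }

    public static void main(String[] args) {
        read().forEach((flag, value) -> System.out.println(flag + " = " + value));
    }
}
```

A `TLABSize` of 0 generally means the JVM picks the size itself, which matches the dynamic resizing described earlier.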