Migrating from javaagent to JVMTI: our experience

Nikita Salnikov Tarnovski
March 24th, 2014 (Last Updated: March 23rd, 2014)


When you need to gather data from within the JVM, you will find yourself working dangerously close to the Java Virtual Machine internals. Luckily, there are ways you can avoid getting bogged down by JVM implementation details. The fathers of Java have given you not one but two beautiful tools to work with.

In this post we explain the differences between the two approaches and why we recently ported a significant part of our algorithms from a javaagent to JVMTI.

Javaagent

The first option is to use the java.lang.instrument package. This approach loads your monitoring code into the JVM itself using the -javaagent startup parameter. Being an all-Java option, javaagents tend to be the first path to take if your background is in Java development. The best way to illustrate how you can benefit from the approach is via an example.

Let us create a truly simple agent that monitors all method invocations in your code. Whenever a method is invoked, the agent logs the invocation to the standard output stream:

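import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.Type;

public class MethodVisitorNotifyOnMethodEntry extends MethodVisitor {
   public MethodVisitorNotifyOnMethodEntry(MethodVisitor mv) {
       super(Opcodes.ASM4, mv);
   }

   // visitCode() marks the start of the method body, so the call to callback()
   // is emitted here rather than in the constructor.
   @Override
   public void visitCode() {
       super.visitCode();
       mv.visitMethodInsn(Opcodes.INVOKESTATIC, Type.getInternalName(MethodVisitorNotifyOnMethodEntry.class), "callback", "()V");
   }

   // Invoked by the instrumented code on every method entry.
   public static void callback() {
       System.out.println("Method called!");
   }
}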

You can package the example above as a javaagent (essentially a small JAR file with a special MANIFEST.MF) and attach it at startup. The JVM then calls the agent’s premain() method before your application starts:

java -javaagent:path-to/your-agent.jar com.yourcompany.YourClass

When launched, you would see a bunch of “Method called!” messages on the standard output. And in our case nothing more. But the concept is powerful, especially when combined with bytecode instrumentation tools such as ASM (used in our example above) or cglib.

In order to keep the example easy to understand, we have skipped some details. But it is relatively simple: when using the java.lang.instrument package you start by writing your own agent class, implementing public static void premain(String agentArgs, Instrumentation inst). Then you need to register your ClassFileTransformer with inst.addTransformer. As you most likely wish to avoid direct manipulation of class bytecode, you would use some bytecode manipulation library, such as ASM in the example we used. With it, you just have to extend a couple more classes: ClassVisitor (skipped for brevity) and MethodVisitor.
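The agent class described above might look roughly like the following sketch. The names MethodLoggingAgent and CountingTransformer are hypothetical, and the ASM wiring is deliberately left out so that the code depends only on the JDK: the transformer merely counts the classes it is offered and returns null, which tells the JVM to leave the class unchanged. The agent class is named in the Premain-Class attribute of the JAR's MANIFEST.MF.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical agent class. It is package-private only so that the sketch
// compiles in any file; declare it public in its own file for a real agent.
final class MethodLoggingAgent {
    // Number of classes offered to the transformer so far.
    static final AtomicLong seenClasses = new AtomicLong();

    // Called by the JVM before the application's main() when -javaagent is used.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }

    static final class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            seenClasses.incrementAndGet();
            // A real agent would pass classfileBuffer through a bytecode library
            // such as ASM here and return the rewritten bytes.
            // Returning null means "no transformation performed".
            return null;
        }
    }
}
```

Replacing the body of transform() with a ClassReader/ClassWriter pass that installs MethodVisitorNotifyOnMethodEntry would turn this skeleton into the logging agent from the example.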

JVMTI

The second path to take will eventually lead you to JVMTI. JVM Tool Interface (JVM TI) is a standard native API that allows native libraries to capture events and control the Java Virtual Machine. Access to JVMTI is usually packaged in a specific library called an agent.

The example below demonstrates the very same callback registration already seen in the javaagent section, but this time it is implemented as a JVMTI call:

#include <jvmti.h>
#include <stdio.h>
#include <string.h>

void JNICALL notifyOnMethodEntry(jvmtiEnv *jvmti_env, JNIEnv* jni_env, jthread thread, jmethodID method) {
    fputs("method was called!\n", stdout);
}

int prepareNotifyOnMethodEntry(jvmtiEnv *jvmti) {
    jvmtiError error;
    jvmtiCapabilities requestedCapabilities, potentialCapabilities;
    memset(&requestedCapabilities, 0, sizeof(requestedCapabilities));

    if((error = (*jvmti)->GetPotentialCapabilities(jvmti, &potentialCapabilities)) != JVMTI_ERROR_NONE) return 0;

    if(potentialCapabilities.can_generate_method_entry_events) {
       requestedCapabilities.can_generate_method_entry_events = 1;
    }
    else {
       //not possible on this JVM
       return 0;
    }

    if((error = (*jvmti)->AddCapabilities(jvmti, &requestedCapabilities)) != JVMTI_ERROR_NONE) return 0;

    jvmtiEventCallbacks callbacks;
    memset(&callbacks, 0, sizeof(callbacks));
    callbacks.MethodEntry = notifyOnMethodEntry;

    if((error = (*jvmti)->SetEventCallbacks(jvmti, &callbacks, sizeof(callbacks))) != JVMTI_ERROR_NONE) return 0;
    if((error = (*jvmti)->SetEventNotificationMode(jvmti, JVMTI_ENABLE, JVMTI_EVENT_METHOD_ENTRY, (jthread)NULL)) != JVMTI_ERROR_NONE) return 0;

    return 1;
}

There are several differences between the approaches. For example, you can get more information via JVMTI than via a javaagent. But the most crucial difference between the two derives from the loading mechanics. Instrumentation agents are loaded inside the heap and are governed by the same JVM as the application they monitor. JVMTI agents, on the other hand, are not governed by the JVM rules and are thus not affected by JVM internals such as the GC or runtime error handling. What this means is best explained via our own experience.

Migrating from -javaagent to JVMTI

When we started building our memory leak detector three years ago, we did not pay much attention to the pros and cons of these approaches. Without much hesitation we implemented the solution as a -javaagent.

Over the years we have come to understand the implications, some of which were not too pleasant. So in our latest release we ported a significant part of our memory leak detection mechanics to native code. What made us reach that conclusion?

First and foremost: when residing in the heap, you need to fit in next to the application’s own memory structures, which, as we learned through painful experience, can cause problems of its own. When your app has already filled the heap almost completely, the last thing you need is a memory leak detector that only speeds up the arrival of the OutOfMemoryError.

But the added heap consumption was the lesser of the evils haunting us. The real problem was that our data structures were cleaned by the same garbage collector that the monitored application itself was using. This resulted in longer and more frequent GC pauses.

While most applications did not mind the few extra percentage points we added to heap consumption, we learned that the unpredictable impact on Full GC pauses was something we needed to get rid of.
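One way to observe this kind of GC overhead is to compare the JVM's cumulative collection counters with and without the agent attached. The sketch below is not how Plumbr measured it; it only assumes the standard java.lang.management API, and the GcSnapshot class is a hypothetical helper.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums the cumulative GC counters over all collectors.
final class GcSnapshot {
    final long collections; // total number of collections so far
    final long millis;      // approximate accumulated collection time in ms

    private GcSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    static GcSnapshot take() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new GcSnapshot(count, time);
    }
}
```

Taking one snapshot before and one after a fixed workload, once with and once without the agent, gives a rough picture of how many extra collections and how much extra collection time the agent causes.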

To make things worse, Plumbr works by monitoring all object creations and collections. When you monitor something, you need to keep track. Keeping track tends to create objects. Created objects will eventually be eligible for GC. And when it is the GC itself you are monitoring, you have created a vicious circle: the more objects are garbage collected, the more monitors you create, triggering even more frequent GC runs, and so on.
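The circle is easy to see in a toy on-heap tracker. The sketch below is a hypothetical illustration, not Plumbr's implementation: it assumes tracking is done with weak references, and it shows that every tracked object costs at least one additional heap object that the collector must later process.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Hypothetical on-heap tracker: bookkeeping lives in the same heap as the application.
final class OnHeapTracker {
    private final ReferenceQueue<Object> dead = new ReferenceQueue<>();
    private final Set<WeakReference<Object>> live = new HashSet<>();
    private long bookkeepingAllocations;

    // Registering an object allocates a WeakReference (plus a set entry),
    // which is itself work for the garbage collector later on.
    void track(Object o) {
        live.add(new WeakReference<>(o, dead));
        bookkeepingAllocations++;
    }

    // Removes the references whose targets were collected; returns how many.
    int reap() {
        int removed = 0;
        for (Object ref; (ref = dead.poll()) != null; ) {
            if (live.remove(ref)) {
                removed++;
            }
        }
        return removed;
    }

    int liveCount() {
        return live.size();
    }

    long bookkeepingAllocations() {
        return bookkeepingAllocations;
    }
}
```

Tracking N application objects produces at least N extra objects in the same heap, which is exactly the feedback loop described above.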

When keeping track of objects, we are notified about the death of objects by the JVMTI. However, JVMTI does not permit the use of JNI during those callbacks. So if we keep the statistics about tracked objects in Java, it is not possible to instantly update the statistics when we are notified of changes. Instead the changes need to be cached and applied when we know the JVM is in the correct state. This created unnecessary complexity and delays in updating the actual statistics.
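The "record now, apply later" workaround can be modelled as follows. This is a simplified, single-threaded sketch with hypothetical names (DeferredStats, onFreed, flush). In a real agent the pending events would have to be buffered on the native side, since the callback cannot call into Java; the Java model only illustrates why the statistics lag behind the actual events.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of statistics that cannot be updated at notification time.
final class DeferredStats {
    private final Queue<String> pendingFrees = new ArrayDeque<>();
    private final Map<String, Long> liveByClass = new HashMap<>();

    void onAllocated(String className) {
        liveByClass.merge(className, 1L, Long::sum);
    }

    // Stands in for the death notification: remember the event, touch no statistics.
    void onFreed(String className) {
        pendingFrees.add(className);
    }

    // Called later, once it is safe to update the statistics.
    void flush() {
        for (String c; (c = pendingFrees.poll()) != null; ) {
            liveByClass.merge(c, -1L, Long::sum);
        }
    }

    long live(String className) {
        return liveByClass.getOrDefault(className, 0L);
    }
}
```

Between onFreed() and the next flush() the reported live count is stale, which is the delay and extra complexity that keeping the statistics in Java introduces.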

Reference:

Migrating from javaagent to JVMTI: our experience from our JCG partner Ago Allikmaa at the Plumbr Blog blog.