About Tomasz Nurkiewicz

Java EE developer, Scala enthusiast. Enjoying data analysis and visualization. Strongly believes in the power of testing and automation.

Which Java thread consumes my CPU?

What do you do when your Java application consumes 100% of the CPU? Turns out you can easily find the problematic thread(s) using built-in UNIX and JDK tools. No profilers or agents required.

For the purpose of testing we’ll use this simple program:

public class Main {
    public static void main(String[] args) {
        new Thread(new Idle(), 'Idle').start();
        new Thread(new Busy(), 'Busy').start();
    }
}

class Idle implements Runnable {

    @Override
    public void run() {
        try {
            TimeUnit.HOURS.sleep(1);
        } catch (InterruptedException e) {
        }
    }
}

class Busy implements Runnable {
    @Override
    public void run() {
        while(true) {
            'Foo'.matches('F.*');
        }
    }
}

As you can see, it starts two threads. Idle is not consuming any CPU (remember, sleeping threads consume memory, but not CPU) while Busy eats the whole core as regular expression parsing and executing is a surprisingly complex process. Let’s run this program and forget about it. How can we quickly find out that Busy is the problematic piece of our software? First of all we use top to find out the process id ( PID) of the java process consuming most of the CPU. This is quite straightforward:

$ top -n1 | grep -m1 java

This will display the first line of top output containing ‘ java‘ sentence:

22614 tomek     20   0 1360m 734m  31m S    6 24.3   7:36.59 java

The first column is the PID, let’s extract it. Unfortunately it turned out that top uses ANSI escape codes for colors – invisible characters that are breaking tools like grep and cut. Luckily I found a perl script to remove these characters and was finally able to extract the PID of java process exhausting my CPU:

$ top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

The cut -f1 -d' ' invocation simply takes the first value out of space-separated columns:

22614

Now when we now the problematic JVM PID, we can use top -H to find problematic Linux threads. The -H option prints a list of all threads as opposed to processes, the PID column now represents the internal Linux thread ID:

$ top -n1 -H | grep -m1 java
$ top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '

The output is surprisingly similar, but the first value is now the thread ID:

25938 tomek     20   0 1360m 748m  31m S    2 24.8   0:15.15 java
25938

So we have a process ID of our busy JVM and Linux thread ID (most likely from that process) consuming our CPU. Here comes the best part: if you look at jstack output (available in JDK), each thread has some mysterious ID printed next to its name:

'Busy' prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000]
    java.lang.Thread.State: RUNNABLE
        at java.util.regex.Pattern$Node.study(Pattern.java:3010)

That’s right, the nid=0x645a parameter is the same as thread ID printed by top -H. Of course to not make it too simple, top uses decimal notation while jstack prints in hex. Again there is a simple solution, printf ‘%x’:

$ printf '%x' 25938
6552

Let’s wrap all we have now into a script and combine the results:

#!/bin/bash
PID=$(top -n1 | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' ')
NID=$(printf '%x' $(top -n1 -H | grep -m1 java | perl -pe 's/\e\[?.*?[\@-~] ?//g' | cut -f1 -d' '))
jstack $PID | grep -A500 $NID | grep -m1 '^$' -B 500

PID holds the java PID and NID holds the thread ID, most likely from that JVM. The last line simply dumps the JVM stack trace of the given PID and filters out (using grep) the thread which has matching nid. Guess what, it works:

$ ./profile.sh
'Busy' prio=10 tid=0x7f3bf800 nid=0x6552 runnable [0x7f25c000]
    java.lang.Thread.State: RUNNABLE
        at java.util.regex.Pattern$Node.study(Pattern.java:3010)
        at java.util.regex.Pattern$Curly.study(Pattern.java:3854)
        at java.util.regex.Pattern$CharProperty.study(Pattern.java:3355)
        at java.util.regex.Pattern$Start.<init>(Pattern.java:3044)
        at java.util.regex.Pattern.compile(Pattern.java:1480)
        at java.util.regex.Pattern.<init>(Pattern.java:1133)
        at java.util.regex.Pattern.compile(Pattern.java:823)
        at java.util.regex.Pattern.matches(Pattern.java:928)
        at java.lang.String.matches(String.java:2090)
        at com.blogspot.nurkiewicz.Busy.run(Main.java:27)
        at java.lang.Thread.run(Thread.java:662)

Running the script multiple times (or with watch, see below) will capture Busy thread in different places, but almost always inside regular expression parsing – which is our problematic piece!

Multiple threads

In case your application has multiple CPU-hungry threads, you can use watch -n1 ./profile.sh command to run the script every second and get semi real-time stack dumps, most likely from different threads. Testing with the following program:

new Thread(new Idle(), 'Idle').start();
new Thread(new Busy(), 'Busy-1').start();
new Thread(new Busy(), 'Busy-2').start();

you’ll see stack traces either of Busy-1 or of Busy-2 threads (in different places inside Pattern class), but never Idle.

Reference: Which Java thread consumes my CPU? from our JCG partner Tomasz Nurkiewicz at the Java and neighbourhood blog.

Related Whitepaper:

Bulletproof Java Code: A Practical Strategy for Developing Functional, Reliable, and Secure Java Code

Use Java? If you do, you know that Java software can be used to drive application logic of Web services or Web applications. Perhaps you use it for desktop applications? Or, embedded devices? Whatever your use of Java code, functional errors are the enemy!

To combat this enemy, your team might already perform functional testing. Even so, you're taking significant risks if you have not yet implemented a comprehensive team-wide quality management strategy. Such a strategy alleviates reliability, security, and performance problems to ensure that your code is free of functionality errors.Read this article to learn about this simple four-step strategy that is proven to make Java code more reliable, more secure, and easier to maintain.

Get it Now!  

One Response to "Which Java thread consumes my CPU?"

  1. Mariano says:

    You should consider taking a look at this one: https://github.com/despegar/threadsanalyzer.
    It shows the top N cpu consuming threads of a JVM process.

Leave a Reply


− 2 = five



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
Do you want to know how to develop your skillset and become a ...
Java Rockstar?

Subscribe to our newsletter to start Rocking right now!

To get you started we give you two of our best selling eBooks for FREE!

Get ready to Rock!
You can download the complementary eBooks using the links below:
Close