About Tal Weiss

Tal is co-founder and CEO at Takipi where he leads a team of hackers that build tools that help fix Java and Scala code in production. His main interests are tools for developers, solving nasty bugs (preferably multi-threaded) and Jazz drumming.

5 techniques to improve your server logging

In the recent time we’ve seen a plethora of tools that help you make sense of your logs. Open-source projects such as Scribe and LogStash, on-premise tools like Splunk, and hosted services such as SumoLogic and PaperTrail. These all help you reduce mass amounts of log data into something more meaningful.

And there’s one thing they all share in common. They’re all dependent on the data you actually put into the log. That task of getting the quality and quantity of the data right  then falls on to you. With that in mind, here are 5 things that you should be constantly mindful of when logging:

1. Hello my (Thread) name is..

The Thread name property is one of Java’s most undervalued methods, as it’s mainly descriptive. The place where it plays the biggest part is in multi-threaded logging. Most logging frameworks will automatically log the current thread’s name. However, this would mostly look something like “http-nio-8080-exec-3” – a name assigned by the thread-pool or container.

The thread name is prime real-estate in your log, and you want to make sure you use it right. This means assigning it with detailed context, such as the Servlet or task its fulfilling right now, and some dynamic context such as a user or message ID.

As such, an entry point into your code should start with something like this:

Thread.currentThread().setName(MyTask.class.getName() + “: “+ message.getID());

2. Distributed identifiers

In a SOA or message driven architecture, the execution of a task or request may span across multiple machines. When processing of one fails, connecting the dots between the machines and their state will be key to understanding what happened.

For log analyzers to group relevant log messages across machines, you’ll need to provide them with a unique ID by by which they can do that,  as part of each log message. This means that every inbound operation into your system should have a unique ID which will travel with it until it is completed.

A persistent identifier such as user ID may not be a good vessel, as one user may have multiple operations happening for him, which will make isolation harder. UUIDs (albeit long) can be a good choice here, and can also loaded into the thread name or a dedicated TLS.

3. Don’t text + drive. Don’t log + loop.

Often times you’ll see a piece of code running in a tight loop and performs a log operation. The underlying assumption is that the code will run a limited number of times.

That may very well be the case when things are working well. But when the code gets unexpected input the loop may not break. In that case you’re not just dealing with an infinite loop, you’re also dealing with code that’s writing infinite amounts of data to disk or network.

Left to its own devices this can take a server or an entire cluster down.

When possible, do not log within tight loops. This is especially true when catching errors.

void readData {
	while (hasNext()) {
		try {
			readData();
		}
		catch (Exception e) {
			// this isn’t recommend - you can catch, but log outside the loop
			logger.error("error reading " X + " from " Y, e);
		}
	}
}

4. Uncaught handlers

Westeros has the Wall its a last line of defense (fat good that’s gonna help them). You have Thread.uncaughtExcceptionHandlers. So make sure you use them. If you don’t install one such a handler, you run the risk of throwing exceptions into the ether with very little context, and very little control of if and where they end up being logged.

Notice that even within an uncaught exception handler, which has no access to variables within the terminated thread you still get a reference to the Thread object. If you adhered to step #1, you’ll still get a meaningful thread.getName() to log to provide you with some more context.

5. Catch external calls

Whenever you make an API call which leaves the JVM, the chances of an exception increase dramatically. This includes Web service, Http, DB, file , OS or any other JNI calls. Treat each call as if it will explode (it most likely will at one point).

For the most part, the reasons for API call fail have to do with unexpected input provided to them by you. Having those values available for you in the log is a key part in fixing the issue.

try {
	return s3client.generatePresignedUrl(request);
}
catch (Exception e) {
	String err = String.format("Error generating request: %s bucket: %s key: %s. method: %s", request, bucket, path, method);
	log.error(err, e); //you can also throw a nested exception here with err instead.
}

Server debugging with Takipi

Takipi is built to make server debugging even better by making your logs smarter and more rich in information. Here are 3 features which may make your life easier next time you’re debugging a server:

  1. Server debugging. When Takipi detects an exception or error within your JVM, it collects the code and variables which caused it. The end result is that for each error in your app you can jump right into the source code which executed on that machine, and see all the variable values which caused it – locals, objects, arrays and strings that were there, right at the moment of error.

    Takipi1

  2. Log integration. Takipi automatically add a tiny debug hyperlink to each log error , so you can use to jump directly into the actual source code and variable values which caused it.

    takipi2

  3. Distributed debugging. If the call into the code which failed was made from another machine running Takipi, you’ll see the source code and variable values across the distributed call chain. So if machine A called into B which called into C which failed, you’ll see the code and variable across that entire chain between.

    Takipi3

Click here to try Takipi

Related Whitepaper:

Functional Programming in Java: Harnessing the Power of Java 8 Lambda Expressions

Get ready to program in a whole new way!

Functional Programming in Java will help you quickly get on top of the new, essential Java 8 language features and the functional style that will change and improve your code. This short, targeted book will help you make the paradigm shift from the old imperative way to a less error-prone, more elegant, and concise coding style that’s also a breeze to parallelize. You’ll explore the syntax and semantics of lambda expressions, method and constructor references, and functional interfaces. You’ll design and write applications better using the new standards in Java 8 and the JDK.

Get it Now!  

6 Responses to "5 techniques to improve your server logging"

  1. Peter Verhas says:

    After reading the article I had two thought in my head:

    1. The items “4. Uncaught handlers” and “5. Catch external calls” have nothing to do with logging.
    2. The article was written to promote Takipi.

  2. Tal Weiss says:

    Hey Peter,

    Thanks for the comment. I wrote the post, so I’ll address points #4 and #5 and see if I can provide some value :) With regards to uncaught exception handlers, these callbacks were placed there by the designers of the language primarily for the purpose of providing developers with a centralized location where they can log uncaught exceptions.

    At the point, where the thread has already terminated there isn’t really anything else you can do other than logging the event. The language provides you with both the thread object and the exception which caused it to fail. It’s imperative for an app to have a callback placed there to avoid those exceptions not being logged, or logged in the wrong placed (such as the container log).

    The combination of these technique with setting of an informative thread.name value and additional TLS variables loaded into the thread to describe its state at the moment of failure is a fundamental technique for logging in a high scale server environment,

    With regards to point #5, more often often than not you see calls made to sensitive API functions that have a high probability of failing with no logging built into the code. When they do and not enough info is available to understand why as a result of partial For example if you’re doing a file.open which is the kind of method which likes to throw exceptions at you, you need to need to make sure your log file will contain enough information to explain why it failed when it does.

    If the exception it throws contains that info – don’t catch. If on the other hand the name of the file was parsed dynamically based on variable state which isn’t expressed in the final file name then I would recommend to catch, parse a much more meaningful exception message – comprising of those variables and then re-throw or log.

    Hope that helps,

    Tal

  3. Cd-MaN says:

    An other point: ThreadPoolExecutor by default “swallows” exceptions in the runnables / callables (uncaughtExceptionHandler won’t be called). The proper way to handle this is to subclass ThreadPoolExecutor and implement afterExecute: http://stackoverflow.com/a/2248203/1265

  4. Thanks for the article. I am very committed to ensuring that all our applications use logging properly and efficiently. Most logging frameworks provide a format option to indicate the thread. Personally I favor using SLF4j over Logback. SLF4j allows your application to capture all Apache and java util (even System.out/err are available) into a single logging strategy. Another common mistake is the unnecessary inclusion of isDebugEnabled(), isInfoEnabled(), etc, for single lines it is overkill as it is done in the source. Using SLF4j, you can avoid unnecessary concatenation using parameters log.debug(“here is your value {}, myvar);.

Leave a Reply


seven − = 6



Java Code Geeks and all content copyright © 2010-2014, Exelixis Media Ltd | Terms of Use | Privacy Policy
All trademarks and registered trademarks appearing on Java Code Geeks are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries.
Java Code Geeks is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.

Sign up for our Newsletter

20,709 insiders are already enjoying weekly updates and complimentary whitepapers! Join them now to gain exclusive access to the latest news in the Java world, as well as insights about Android, Scala, Groovy and other related technologies.

As an extra bonus, by joining you will get our brand new e-books, published by Java Code Geeks and their JCG partners for your reading pleasure! Enter your info and stay on top of things,

  • Fresh trends
  • Cases and examples
  • Research and insights
  • Two complimentary e-books