The Stack Trace Strikes Back

Howdy all. Welcome to part two of three of what was originally conceived as a one-part series. It's entirely possible that I'll get all George Lucas on you years from now and produce some more of these posts that are a complete letdown and an affront to your childhood memories, but I digress. For now, rest assured that this post will knock your socks off as a follow-up to my last tidbit on decompiling Java code.

Without further ado, I give you...Stack Wars II: The Stack Trace Strikes Back (I'm completely aware that I'm abusing the metaphor here, but isn't that really what blogging is all about?).

Standard disclaimer: This post is intended for a technical audience with a focus on production support. Also, everything here is Java-focused, but you can certainly apply some of the concepts in a .NET environment as well...you'll just have to create your own screencaps to replace the examples I've included below.

So, what is a stack trace, and why should you care? Well, one question at a time please.

What is a stack trace?

Wikipedia says a stack trace is "A report of the active stack frames instantiated by the execution of a program." Now, I vaguely understand the Wikipedia definition, but I also have a computer science degree from a second-tier state university, so let me try to translate for those of you who were smart enough to get degrees in something besides CompSci: a stack trace is a snapshot of what a program was doing at a point in time. In the Java world, a stack trace will tell you which method was being executed at the time the trace was generated, along with its complete call stack, and usually line numbers as well. Take a look at the simple stack trace below as an example:

stack_trace.png
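If you'd like a trace of this shape to experiment with, here is a minimal sketch of a program that produces one. Only the class name Broken and the method names come from this post; the method bodies, and the choice to make the methods static, are assumptions made for illustration, so the line numbers in your trace will not match the screenshot.

```java
// Hypothetical reconstruction: only the class and method names come from the post.
class Broken {
    static void doStuff() {
        doMoreStuff();
    }

    static void doMoreStuff() {
        doEvenMoreStuff();
    }

    static void doEvenMoreStuff() {
        String value = null;   // assumed bug: nothing was ever assigned
        value.length();        // dereferencing null throws NullPointerException
    }

    public static void main(String[] args) {
        // The exception is never caught, so the JVM prints the stack trace and exits.
        doStuff();
    }
}
```

Running it should print a NullPointerException followed by one "at ..." line per method in the call chain, with doEvenMoreStuff at the top and main at the bottom.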

Why should you care?

Good question. I'd venture a guess and say that about 99.99% of the world neither needs to know nor cares about stack traces. But here you are reading this post nonetheless, so here's why they're important:

1) Good programmers almost always print stack traces out in log files when an error in a program occurs. This gives us a useful tool to track down bugs. Whether you're just reporting information to a support team somewhere, or getting a little sassy and trying to fix a problem yourself, the stack trace is like a map for finding treasure buried deep in code. Except that instead of finding actual treasure, you're just finding a logic error. And instead of getting rich, you just get to complain about a problem, and maybe fix it.

2) You can tell the JVM to generate a stack trace for a running process (often called a thread dump). Doing so allows us to take a snapshot of the JVM at an arbitrary point in time, and see what all its threads are up to. This is useful when trying to figure out why a process (Tomcat, for instance) is zombied (i.e. it's running, but not responding to requests), or when you're trying to fix deadlock issues, which are particularly difficult to track down.
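Both reasons can be sketched in a few lines of Java. The class below is illustrative only: TraceBasics, parsePort, dumpAllThreads and countDeadlockedThreads are made-up names, not part of any product mentioned here. The first method shows the usual pattern of handing the exception to the logger so the full trace lands in the log; the other two take a snapshot of every thread from inside the JVM using the standard Thread.getAllStackTraces() and ThreadMXBean.findDeadlockedThreads() calls. From outside the process, the JDK's jstack tool (or sending SIGQUIT with kill -3 on Unix-like systems) gives you a similar dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper class, for illustration only.
class TraceBasics {
    private static final Logger LOG = Logger.getLogger(TraceBasics.class.getName());

    // Reason 1: log the stack trace when something goes wrong.
    static int parsePort(String raw) {
        try {
            return Integer.parseInt(raw);
        } catch (NumberFormatException e) {
            // Passing the exception as the last argument makes the logger record its trace.
            LOG.log(Level.SEVERE, "Could not parse port value: " + raw, e);
            return -1;
        }
    }

    // Reason 2: snapshot what every live thread in this JVM is doing right now.
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("\tat ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    // Returns how many threads the JVM currently reports as deadlocked.
    static int countDeadlockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();   // null when no deadlock is found
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        parsePort("eighty");
        System.out.print(dumpAllThreads());
        System.out.println("Deadlocked threads: " + countDeadlockedThreads());
    }
}
```

The in-process approach only helps if the application can still run your code; for a truly hung server, an external tool is usually the more practical route.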

Read on to learn more about reading and interpreting stack traces!

Reading Stack Traces

Stack traces range from very easy to very difficult to read, depending on what you're looking at. For instance, a stack trace of a single-threaded, stand-alone Java program is almost trivial to interpret, whereas trying to find deadlocks in a Java application server that has 100+ concurrent threads can be pretty cumbersome. Let's use our last example again to walk through interpreting a "simple" stack trace:

stack_trace.png

When reading traces, I recommend first taking a look at the top line of the stack trace to see what error is occurring. In this case, it looks like we're getting a NullPointerException in the "doEvenMoreStuff" method on line 22 of Broken.java:

stack_trace_1st_line.png

After getting a rough idea of the error being thrown, it's worthwhile to read the stack from the bottom up; least specific to most specific. In our example, for instance, we see from the last line of the trace that the first call in the stack originated in the main method of Broken.java. Since we all know that "main" is the first thing invoked when running a program from the command-line, we can feel confident that stepping through Broken.java beginning on line 36 in the main method will help us fully understand what's gone wrong in the code. So we look at the next line up in the trace, where we see a call from main to the method, "doStuff". "doStuff" calls "doMoreStuff" on line 16 of Broken.java, which in turn calls the "doEvenMoreStuff" method. Finally we get back to the NullPointerException, which occurs on line 22:

stack_trace_read_info.png
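The same top-line-first, then bottom-up reading can be done in code, which is handy if you ever need to summarize traces automatically. This is a rough sketch: BottomUpReader, callPath, outer and inner are hypothetical names, and the demo exception is thrown on purpose. It relies only on Throwable.getStackTrace(), where index 0 is the top of the printed trace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class BottomUpReader {
    // Returns the frames ordered from the first call made (bottom of the trace)
    // to the method where the exception was thrown (top of the trace).
    static List<String> callPath(Throwable t) {
        StackTraceElement[] frames = t.getStackTrace();   // index 0 is the top of the trace
        List<String> path = new ArrayList<>();
        for (int i = frames.length - 1; i >= 0; i--) {
            StackTraceElement f = frames[i];
            path.add(f.getClassName() + "." + f.getMethodName() + " (line " + f.getLineNumber() + ")");
        }
        return path;
    }

    static void outer() {
        inner();
    }

    static void inner() {
        throw new IllegalStateException("demo failure");
    }

    public static void main(String[] args) {
        try {
            outer();
        } catch (IllegalStateException e) {
            System.out.println("Error: " + e);            // step 1: what went wrong
            for (String step : callPath(e)) {             // step 2: how we got there
                System.out.println("  -> " + step);
            }
        }
    }
}
```

The last entry printed is the method that actually threw, which is the same place the top line of the raw trace points to.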

At this point, you have a few options to continue your troubleshooting.

1) Assuming you're trying to fix a problem with some commercially supported software, you can open a trouble ticket with the vendor. Provide the symptoms that you've noticed and the stack trace you've captured. The support team can then pass off the trace to an engineer for a look-see, and maybe you end up with a patch somewhere down the road.

2) Google the hell out of the exception. The internet is a strange and beautiful place, and you'd be amazed how many times you can find answers out there. For instance, let's say that instead of seeing, "NullPointerException" in the first line of your stack trace, you instead see something like, "AnalyticsConfigurationException". You start Googling around for things like, "ALUI AnalyticsConfigurationException", "AnalyticsConfigurationException", etc. and see what you see. Maybe you find nothing, but maybe you find some blog post that tells you this error means that you have a typo in line 18 of some configuration file.

Just to repeat...Google the hell out of the exception. Low cost, high potential reward in doing so. You know those smart tech guys who sit in the corner and don't really get enough sun...I guarantee you that one of their best tricks is knowing how to use the internet to find information.

3) You want to be a maverick like John McCain, and decide to decompile the code to fix the problem yourself. Well God bless you, and God bless the United States of America...take a look here for more details.

4) You happen to have the source code for this application available, and decide to match up the stack trace to the code. This is pretty easy to do when you have the original source, as the line numbers in your stack trace will match up to the code you have. So just read the line numbers in the stack from bottom up, and step through the code accordingly; like this:

stacktrace_with_code_main.png

dostuff.png

doMoreStuff.png

even_more_stuff.png
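If you have a lot of frames to match up, the lookup itself is easy to script. The sketch below is an assumption-laden illustration, not a tool from this post: TraceToSource and sourceLineFor are hypothetical names, and the three-line "source file" and the hand-built frame in main exist only for the demo. It uses StackTraceElement.getLineNumber(), which is 1-based and negative when the line is unavailable.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class TraceToSource {
    // Returns the source line a frame points at, or a note if it cannot be resolved.
    static String sourceLineFor(StackTraceElement frame, List<String> sourceLines) {
        int line = frame.getLineNumber();   // 1-based; negative when unavailable
        if (line < 1 || line > sourceLines.size()) {
            return frame.getFileName() + ": line not available";
        }
        return frame.getFileName() + ":" + line + "  " + sourceLines.get(line - 1).trim();
    }

    public static void main(String[] args) {
        // Made-up source lines and a hand-built frame, standing in for a real file and trace.
        List<String> source = Arrays.asList("class Demo {", "    void run() { fail(); }", "}");
        StackTraceElement frame = new StackTraceElement("Demo", "run", "Demo.java", 2);
        System.out.println(sourceLineFor(frame, source));
    }
}
```

In real use you would load sourceLines from the file named in the frame, and this only works when the source you have is the same version that was compiled into the running application.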

That's it for now. I hope this rambling has helped you get a handle on why stack traces are useful, and how to start making sense of them. Be sure to stay tuned for next week's installment, "Return of the Stack Trace", where we walk through interpreting some more involved traces...Same bat time, same bat channel. Until then, may the force be with you (Mixing metaphors is also totally acceptable in the wonderful world of blogging).

1 Comment

omidk.myopenid.com on November 11, 2008 11:03 AM

This should be mandatory reading for anyone in technical consulting or working with any kind of enterprise software.

How do you think I can enforce this?
