Debugging JITed Code With GDB
  1. Introduction
  2. Quickstart
  3. Example with clang and lli
Written by Reid Kleckner
Introduction

Without special runtime support, debugging dynamically generated code with GDB (as well as most debuggers) can be quite painful. Debuggers generally read debug information from the object file of the code, but for JITed code, there is no such file to look for.

Depending on the architecture, this can impact the debugging experience in different ways. For example, on most 32-bit x86 architectures, you can simply compile with -fno-omit-frame-pointer for GCC and -disable-fp-elim for LLVM. When GDB creates a backtrace, it can then properly unwind the stack, but the stack frames owned by JITed code show ??'s instead of the appropriate symbol names. On Linux x86_64 in particular, however, GDB relies on the DWARF CFA debug information to unwind the stack, so even if you compile your program to leave the frame pointer untouched, GDB will usually be unable to unwind the stack past any JITed code stack frames.

In order to communicate the necessary debug info to GDB, an interface for registering JITed code with debuggers has been designed and implemented for GDB and LLVM. At a high level, whenever LLVM generates new machine code, it also generates an object file in memory containing the debug information. LLVM then adds the object file to the global list of object files and calls a special function (__jit_debug_register_code) marked noinline that GDB knows about. When GDB attaches to a process, it puts a breakpoint in this function and loads all of the object files in the global list. When LLVM calls the registration function, GDB catches the breakpoint signal, loads the new object file from LLVM's memory, and resumes the execution. In this way, GDB can get the necessary debug information.
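The registration steps described above can be sketched in C. This is a minimal sketch, not LLVM's actual implementation: the struct layouts and the __jit_debug_descriptor symbol follow GDB's documented JIT interface rather than anything stated in this text, and register_object is a hypothetical helper introduced only for illustration. The noinline attribute and empty asm statement assume GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Actions the JIT reports to the debugger. */
typedef enum {
    JIT_NOACTION = 0,
    JIT_REGISTER_FN,
    JIT_UNREGISTER_FN
} jit_actions_t;

/* One node per in-memory object file. */
struct jit_code_entry {
    struct jit_code_entry *next_entry;
    struct jit_code_entry *prev_entry;
    const char *symfile_addr;  /* start of the in-memory object file */
    uint64_t symfile_size;     /* size of that object file in bytes */
};

/* Head of the global list of object files that GDB reads. */
struct jit_descriptor {
    uint32_t version;
    uint32_t action_flag;      /* holds a jit_actions_t value */
    struct jit_code_entry *relevant_entry;
    struct jit_code_entry *first_entry;
};

/* GDB places a breakpoint in this function. The empty asm statement
   keeps the compiler from optimizing calls to it away. */
void __attribute__((noinline)) __jit_debug_register_code(void) {
    __asm__ volatile("");
}

/* The version is set statically because the debugger may read it
   before any code in the process has run. */
struct jit_descriptor __jit_debug_descriptor = { 1, JIT_NOACTION, NULL, NULL };

/* Hypothetical helper (not an LLVM or GDB function): link a new object
   file at the head of the global list, then notify the debugger. */
void register_object(struct jit_code_entry *entry,
                     const char *image, uint64_t size) {
    assert(entry != NULL && image != NULL);
    entry->symfile_addr = image;
    entry->symfile_size = size;
    entry->prev_entry = NULL;
    entry->next_entry = __jit_debug_descriptor.first_entry;
    if (entry->next_entry != NULL)
        entry->next_entry->prev_entry = entry;
    __jit_debug_descriptor.first_entry = entry;
    __jit_debug_descriptor.relevant_entry = entry;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code();  /* GDB's breakpoint fires here */
}
```

A JIT following this sketch would call register_object once for each object file it emits, keeping both the entry and the object file buffer alive for as long as the generated code may appear in a backtrace.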

At the time of this writing, LLVM only supports architectures that use ELF object files, and it only generates symbols and DWARF CFA information. However, because the debug information travels in an ordinary object file, it would be easy to add more information to it, and we don't need to coordinate with GDB to get better debug information.

Quickstart

In order to debug code JITed by LLVM, you need to install a recent version of GDB. The interface was added on 2009-08-19, so you need a snapshot of GDB more recent than that. Either download a snapshot of GDB or check out the source from CVS. The following commands check out and build the code:

$ cvs -z 3 -d :pserver:anoncvs@sourceware.org:/cvs/src co gdb
$ mv src gdb   # You probably don't want this checkout called "src".
$ cd gdb
$ ./configure --prefix="$GDB_INSTALL"
$ make
$ make install

You can then use -jit-emit-debug in the LLVM command line arguments to enable the interface.

Example with clang and lli

For example, consider using GDB to debug lli as it runs the following C code in foo.c:

#include <stdio.h>

void foo() {
    printf("%d\n", *(int*)NULL);  // Crash here
}

void bar() {
    foo();
}

void baz() {
    bar();
}

int main(int argc, char **argv) {
    baz();
}

Here are the commands to run that application under GDB and print the stack trace at the crash:

# Compile foo.c to bitcode.  You can use either clang or llvm-gcc with this
# command line.  Both require -fexceptions, or the calls are all marked
# 'nounwind' which disables DWARF CFA info.
$ clang foo.c -fexceptions -emit-llvm -c -o foo.bc

# Run foo.bc under lli with -jit-emit-debug.  If you built lli in debug mode,
# -jit-emit-debug defaults to true.
$ $GDB_INSTALL/gdb --args lli -jit-emit-debug foo.bc
...

# Run the code.
(gdb) run
Starting program: /tmp/gdb/lli -jit-emit-debug foo.bc
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7f55164 in foo ()

# Print the backtrace, this time with symbols instead of ??.
(gdb) bt
#0  0x00007ffff7f55164 in foo ()
#1  0x00007ffff7f550f9 in bar ()
#2  0x00007ffff7f55099 in baz ()
#3  0x00007ffff7f5502a in main ()
#4  0x00000000007c0225 in llvm::JIT::runFunction(llvm::Function*,
    std::vector<llvm::GenericValue,
    std::allocator<llvm::GenericValue> > const&) ()
#5  0x00000000007d6d98 in
    llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*,
    std::vector<std::string,
    std::allocator<std::string> > const&, char const* const*) ()
#6  0x00000000004dab76 in main ()

As you can see, GDB can correctly unwind the stack and has the appropriate function names.


The LLVM Compiler Infrastructure
Last modified: $Date: 2009-01-01 23:10:51 -0800 (Thu, 01 Jan 2009) $