This document contains the release notes for the LLVM compiler
infrastructure, release 1.1. Here we describe the status of LLVM, including any
known problems and bug fixes from the previous release. The most up-to-date
version of this document can be found on the LLVM 1.1 web site. If you are
not reading this on the LLVM web pages, you should probably go there because
this document may be updated after the release.
For more information about LLVM, including information about potentially more
current releases, please check out the main LLVM
web site. If you have questions or comments, the LLVM developer's mailing
list is a good place to send them.
Note that if you are reading this file from CVS, this document applies
to the next release, not the current one. To see the release notes for
the current or previous releases, see the releases page.
This is the second public release of the LLVM compiler infrastructure. This
release is primarily a bugfix release, dramatically improving the C/C++
front-end and improving support for C++ in the LLVM core. This release also
includes a few new features, such as a simple profiler, support for Mac OS X,
better interoperability with external source bases, a new example language
front-end, and improvements in a few optimizations. The performance of several
LLVM components has been improved, and several gratuitous type-safety issues in
the C front-end have been fixed.
At this time, LLVM is known to correctly compile and run all non-unwinding C
& C++ SPEC CPU2000 benchmarks, the Olden benchmarks, and the Ptrdist
benchmarks. It has also been used to compile many other programs. LLVM
now also works with a broad variety of C++ programs, though it has still
received much less testing than the C front-end.
The LLVM native code generators are very stable but do not currently support
unwinding (exception throwing or longjmping), which prevents them from
working with programs like 253.perlbmk in SPEC CPU2000. The C
backend and the rest of LLVM support these programs, so you can
still use LLVM with them. Support for unwinding will be added in a future
release.
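As an illustration of what "unwinding" means here, the following is a minimal sketch of a C program fragment that abandons a call chain with longjmp. The function names (checked_double, try_double) are hypothetical and chosen only for this example; they do not come from LLVM or from any benchmark.

```c
#include <setjmp.h>

/* Jump buffer shared between the recovery point and the failing callee. */
static jmp_buf recover_point;

/* Hypothetical helper: abandons the current call chain on negative input. */
static int checked_double(int value) {
    if (value < 0)
        longjmp(recover_point, 1);  /* unwinds back to the setjmp below */
    return value * 2;
}

/* Returns 2 * value, or -1 if checked_double unwound via longjmp. */
int try_double(int value) {
    if (setjmp(recover_point) != 0)
        return -1;                  /* reached only through longjmp */
    return checked_double(value);
}
```

Calling try_double(21) takes the normal return path, while try_double(-5) exercises the longjmp path. Per the paragraph above, code of this shape is expected to need the C backend or the rest of LLVM rather than the native code generators in this release.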
This release implements the following new features:
- A new LLVM profiler, similar to gprof, is available.
- LLVM and the C/C++ front-end now compile on Mac OS X! Mac OS X users can
now explore the LLVM optimizer with the C backend and interpreter. Note that
LLVM requires GCC 3.3 on Mac OS X.
- LLVM has been moved into an 'llvm' C++ namespace for easier integration with
third-party code. Note that due to lack of namespace support in GDB 5.x, you
will probably want to upgrade to GDB 6 or better to debug LLVM code.
- The build system now copies Makefiles dynamically from the source tree to the
object tree as subdirectories are built. This means that:
- New directories can be added to the source tree, and the build will
automatically pick them up (i.e. no need to edit configure.ac and
re-run configure).
- You will need to build LLVM from the top of the object tree once to ensure
that all of the Makefiles are copied into the object tree subdirectories.
- A front-end for "Stacker" (a simple Forth-like language) is now
included in the main LLVM tree.
Additionally, Reid Spencer, the author, contributed a document
describing his experiences writing Stacker and the language itself.
This document is invaluable for others writing front-ends targeting LLVM.
- The configure script will now configure all projects placed in the
llvm/projects directory.
- The -tailcallelim pass can now introduce "accumulator" variables
to transform functions in many common cases that it could not before.
- The -licm pass can now sink instructions out the bottom of loops
in addition to being able to hoist them out the top.
- The -basicaa pass (the default alias analysis pass) has been
upgraded to be significantly more precise.
- LLVM 1.1 implements a simple size optimization for LLVM bytecode files.
This means that the 1.1 files are smaller than 1.0, but LLVM 1.0 won't
read 1.1 bytecode files.
- The gccld program produces a runner script that includes command-line options to load the necessary shared objects.
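The accumulator transformation mentioned for -tailcallelim above can be pictured at the source level. The sketch below is an illustration only: the pass operates on LLVM code, not on C, and the second function is a hand-written approximation of the effect, not actual compiler output. Both function names are made up for this example.

```c
/* Recursive form: the multiply happens after the recursive call returns,
   so the call is not in tail position. */
unsigned factorial_recursive(unsigned n) {
    if (n <= 1)
        return 1;
    return n * factorial_recursive(n - 1);
}

/* Hand-written equivalent of the idea: the pending multiplies are folded
   into an accumulator variable, which turns the recursion into a loop. */
unsigned factorial_accumulated(unsigned n) {
    unsigned acc = 1;
    while (n > 1) {
        acc *= n;   /* work that used to happen after the call returned */
        n -= 1;
    }
    return acc;
}
```

The two functions compute the same result for every input. The second needs no stack growth, which is the benefit an accumulator variable provides for functions of this shape.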
In this release, the following missing features were implemented:
- The interpreter does not support invoke or unwind
- Interpreter does not support the vaarg instruction
- llvm-nm cannot read archive files
- Interpreter does not handle setne constant expression
In this release, the following Quality of Implementation issues were
fixed:
- The C++ front-end now compiles functions to
use the linkonce linkage type
more, giving the optimizer more freedom.
- The C front-end now generates
type-safe code in several cases that it did not before, allowing
optimization of code that could not be optimized previously.
- The LLVM build system has been taught to catch some common configuration
problems that caused it to get
horribly confused.
- The LLVM header files are now -Wold-style-cast clean.
- The LLVM bytecode reader has been sped up a lot (up to 4x in some
cases).
- In C++, methods and functions in anonymous namespaces now get internal linkage.
- Constant initializers now generate loops instead of potentially huge amounts of straight-line code.
- Code for running C++ destructors is now properly shared when possible. Before, the C++ front-end
generated N^2 amounts of duplicated cleanup code in some cases.
- The JIT used to generate code for
all functions pointed to by globals before the program
started execution. Now, it waits until the first time the functions are
called to
compile them. This dramatically speeds up short runs of large C++ programs,
which often have large numbers of functions pointed to by vtables.
In this release, the following bugs in the previous release were fixed:
Bugs in the LLVM Core:
- [inliner] Inlining invoke with PHI in unwind target is broken
- [linker] linkonce globals should link successfully to external globals
- [constmerge] Constant merging pass merges constants with external linkage
- [scalarrepl] Scalar Replacement of aggregates is decimating structures it shouldn't be
- [instcombine] Resolving invoke inserts cast after terminator
- llvm-as crashes when labels are used in phi nodes
- [build problem] Callgraph.cpp not pulled in from libipa.a
- Variables in scope of output setjmp
calls should be volatile (Note that this does not affect correctness on
many platforms, such as X86).
- [X86] Emission of global bool initializers broken
- [gccld] The -r (relinking) option does not work correctly
- [bcreader] Cannot read shift constant expressions from bytecode file
- [lowersetjmp] Lowersetjmp pass breaks dominance properties!
- SymbolTable::getUniqueName is very inefficient
- bugpoint must not pass -R<directory> to Mach-O linker
- [buildscripts] Building into objdir with .o in it fails
- [setjmp/longjmp] Linking C programs which use setjmp/longjmp sometimes fail with references to the C++ runtime library!
- AsmParser Misses Symbol Redefinition Error
- gccld -Lfoo -lfoo fails to find ./foo/libfoo.a
- [bcreader] Incorrect cast causes misread forward constant references
- [adce] ADCE considers blocks without postdominators to be unreachable
- [X86] div and rem constant exprs invalidate iterators!
- [vmcore] Symbol table doesn't rename colliding variables during type resolution
- Archive reader does not understand 4.4BSD/Mac OS X long filenames
- [llvm-ar] Command line arguments have funny syntax
Bugs in the C/C++ front-end:
- C++ frontend can crash when compiling virtual base classes
- C backend fails on constant cast expr to ptr-to-anonymous struct
- #ident is not recognized by C frontend
- C front-end miscompiles the builtin_expect intrinsic!
- 1.0 precompiled libstdc++ does not include wchar_t support
- llvmgcc asserts when compiling functions renamed with asm's
- C frontend crashes on some programs with lots of types.
- llvm-gcc crashes compiling global union initializer
- C front-end crash on empty structure
- CFrontend crashes when compiling C99 compound expressions
- llvm-gcc infinite loops on "case MAXINT:"
- [C++] Catch blocks make unparsable labels
- [C++] Initializing array with constructible objects fail
- llvm-gcc tries to add bools
- [c++] C++ Frontend lays out superclasses like anonymous bitfields!
- C front-end miscompiles unsigned enums whose LLVM types are signed
- Casting a string constant to void crashes llvm-gcc
- [llvmg++] Enum types are incorrectly shrunk to smaller than 'int' size
- [llvmg++] Cannot use pointer to member to initialize global
- [llvm-gcc] ?: operator as lvalue not implemented
- [C/C++] Bogus warning about taking the address of 'register' variable
- crash assigning into an array in a struct which contains a bitfield.
- Oversized integer bitfields cause crash
- [llvm-gcc] Bitfields & large array don't mix well
- [llvm-gcc] Complex division is not supported
- [llvm-gcc] Illegal union field reference
- [llvmg++] Front-end attempts to return structure by value
- [llvmg++] Pointer to member initializers not supported in constructors
- [llvm-gcc] crash on union initialization
- [llvm-g++] ?: expressions do not run correct number of destructors!
- [llvm-gcc] Pointer & constant results in invalid shift
- [llvmg++] call through array of pointers to member functions causes assertion
LLVM has been extensively tested on Intel and AMD machines running Red
Hat Linux and has been tested on Sun UltraSPARC workstations running Solaris 8.
Additionally, LLVM works on Mac OS X 10.3 and above, but only with the C
backend or interpreter (no native backend for the PowerPC is available yet).
NOTE: You may have to reset the LLVMGCCDIR variable in your
Makefile.config to use the target triplet
powerpc-apple-darwin7.0.0 instead of whatever configure
detected; otherwise you will get errors trying to build crtend. This
is fixed in 1.2.
The core LLVM infrastructure uses "autoconf" for portability, so LLVM may
work on more platforms than those listed above. However, it is likely that we
missed something and that minor porting is required to get LLVM to work on
new platforms. We welcome portability patches and error messages.
This section contains all known problems with the LLVM system, listed by
component. As new problems are discovered, they will be added to these
sections. If you run into a problem, please check the LLVM bug database and submit a bug if
there isn't already one.
- Inline assembly is not yet supported.
- "long double" is transformed by the front-end into "double". There is no
support for floating point data types of any size other than 32 and 64
bits.
- The following Unix system functionality has not been tested and may not
work:
- sigsetjmp, siglongjmp - These are not turned into the
appropriate invoke/unwind instructions. Note that
setjmp and longjmp are compiled correctly.
- getcontext, setcontext, makecontext - These functions have not been tested.
- Although many GCC extensions are supported, some are not. In particular,
the following extensions are known to not be supported:
- Local Labels: Labels local to a block.
- Labels as Values: Getting pointers to labels and computed gotos.
- Nested Functions: As in Algol and Pascal, lexical scoping of functions.
- Constructing Calls: Dispatching a call to another function.
- Extended Asm: Assembler instructions with C expressions as operands.
- Constraints: Constraints for asm operands.
- Asm Labels: Specifying the assembler name to use for a C symbol.
- Explicit Reg Vars: Defining variables residing in specified registers.
- Return Address: Getting the return or frame address of a function.
- Vector Extensions: Using vector instructions through built-in functions.
- Target Builtins: Built-in functions specific to particular targets.
- Thread-Local: Per-thread variables.
- Pragmas: Pragmas accepted by GCC.
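Code that relies on one of the unsupported extensions above can often be restated in standard C. As a hedged sketch, the example below shows a dispatch loop that might otherwise use Labels as Values (computed gotos), written with a plain switch instead. The opcode names and run_program are hypothetical and exist only for this illustration.

```c
/* Hypothetical opcodes for a tiny accumulator machine. */
enum opcode { OP_INC, OP_DOUBLE, OP_HALT };

/* Runs `program` starting from `start` and returns the final accumulator.
   A GCC-specific version would index a table of label addresses and jump
   with `goto *table[op]`; the switch below expresses the same dispatch
   without that extension. */
int run_program(const enum opcode *program, int start) {
    int acc = start;
    for (;;) {
        switch (*program++) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return acc;  /* unknown opcode: stop defensively */
        }
    }
}
```

A program of { OP_INC, OP_DOUBLE, OP_INC, OP_HALT } started at 3 steps through 4, 8, and 9 before halting.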
The following GCC extensions are partially supported. An ignored
attribute means that the LLVM compiler ignores the presence of the attribute,
but the code should still work. An unsupported attribute is one which is
ignored by the LLVM compiler and will cause a different interpretation of
the program.
- Variable Length: Arrays whose length is computed at run time.
Supported, but allocated stack space is not freed until the function returns (noted above).
- Function Attributes: Declaring that functions have no side effects or that
they can never return.
Supported: format, format_arg, non_null, constructor, destructor, unused, deprecated, warn_unused_result, weak
Ignored: noreturn, noinline, always_inline, pure, const, nothrow, malloc, no_instrument_function, cdecl
Unsupported: used, section, alias, visibility, regparm, stdcall, fastcall, all other target specific attributes
- Variable Attributes: Specifying attributes of variables.
Supported: cleanup, common, nocommon, deprecated, transparent_union, unused, weak
Unsupported: aligned, mode, packed, section, shared, tls_model, vector_size, dllimport, dllexport, all target specific attributes.
- Type Attributes: Specifying attributes of types.
Supported: transparent_union, unused, deprecated, may_alias
Unsupported: aligned, packed, all target specific attributes.
- Other Builtins: Other built-in functions.
We support all builtins which have a C language equivalent (e.g.,
__builtin_cos), __builtin_alloca,
__builtin_types_compatible_p, __builtin_choose_expr,
__builtin_constant_p, and __builtin_expect (ignored).
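To make the supported/ignored distinction for function attributes concrete, here is a small sketch that uses two attributes listed as supported (constructor, format) and one listed as ignored (noinline). The function names are invented for this example, and the comments restate the lists above rather than describing verified compiler behavior.

```c
#include <stdarg.h>
#include <stdio.h>

static int initialized = 0;

/* `constructor` is listed as supported: this runs before main() starts. */
__attribute__((constructor))
static void init_module(void) {
    initialized = 1;
}

/* `format` is listed as supported: arguments are checked against `fmt`
   the way printf arguments are. */
__attribute__((format(printf, 3, 4)))
int format_message(char *buf, size_t size, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}

/* `noinline` is listed as ignored: the attribute has no effect, but the
   function itself should still behave the same. */
__attribute__((noinline))
int add_one(int x) {
    return x + 1;
}

/* Reports whether the constructor above has run. */
int module_initialized(void) {
    return initialized;
}
```

Because an ignored attribute only drops a hint, add_one returns the same value whether or not the compiler honors noinline. An unsupported attribute such as section or alias is different, since dropping it changes what the program means.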
The following extensions are known to be supported:
- Statement Exprs: Putting statements and declarations inside expressions.
- Typeof: typeof: referring to the type of an expression.
- Lvalues: Using ?:, "," and casts in lvalues.
- Conditionals: Omitting the middle operand of a ?: expression.
- Long Long: Double-word integers.
- Complex: Data types for complex numbers.
- Hex Floats: Hexadecimal floating-point constants.
- Zero Length: Zero-length arrays.
- Empty Structures: Structures with no members.
- Variadic Macros: Macros with a variable number of arguments.
- Escaped Newlines: Slightly looser rules for escaped newlines.
- Subscripting: Any array can be subscripted, even if not an lvalue.
- Pointer Arith: Arithmetic on void-pointers and function pointers.
- Initializers: Non-constant initializers.
- Compound Literals: Compound literals give structures, unions,
or arrays as values.
- Designated Inits: Labeling elements of initializers.
- Cast to Union: Casting to union type from any member of the union.
- Case Ranges: `case 1 ... 9' and such.
- Mixed Declarations: Mixing declarations and code.
- Function Prototypes: Prototype declarations and old-style definitions.
- C++ Comments: C++ comments are recognized.
- Dollar Signs: Dollar sign is allowed in identifiers.
- Character Escapes: \e stands for the character <ESC>.
- Alignment: Inquiring about the alignment of a type or variable.
- Inline: Defining inline functions (as fast as macros).
- Alternate Keywords: __const__, __asm__, etc., for header files.
- Incomplete Enums: enum foo;, with details to follow.
- Function Names: Printable strings which are the name of the current function.
- Unnamed Fields: Unnamed struct/union fields within structs/unions.
- Attribute Syntax: Formal syntax for attributes.
If you run into GCC extensions which have not been included in any of these
lists, please let us know (also including whether or not they work).
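For readers unfamiliar with the extensions listed as supported above, the following sketch exercises a few of them together: Statement Exprs, Typeof, Case Ranges, Conditionals with an omitted middle operand, and Designated Inits. All names are hypothetical. The example spells typeof as __typeof__, the alternate-keyword form that GCC also accepts in strict ISO modes.

```c
/* Statement expression + typeof: each argument is evaluated exactly once. */
#define MAX_OF(a, b) \
    ({ __typeof__(a) a_ = (a); __typeof__(b) b_ = (b); a_ > b_ ? a_ : b_; })

int max_of_ints(int a, int b) {
    return MAX_OF(a, b);
}

/* Case ranges: classify a character as digit (1), lowercase (2), or other (0). */
int classify(int c) {
    switch (c) {
    case '0' ... '9': return 1;
    case 'a' ... 'z': return 2;
    default:          return 0;
    }
}

/* Omitted middle operand: `x ?: y` yields x when x is nonzero, else y. */
int value_or_default(int value, int fallback) {
    return value ?: fallback;
}

/* Designated initializer: only `y` is named, so `x` is zero-initialized. */
struct point { int x, y; };

int designated_sum(void) {
    struct point p = { .y = 7 };
    return p.x + p.y;
}
```

Since MAX_OF copies its arguments into locals before comparing them, an argument with a side effect such as i++ is evaluated only once, which a plain conditional-expression macro cannot guarantee.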
For this release, the C++ front-end is considered to be fully functional but
has not been tested as thoroughly as the C front-end. It has been tested and
works for a number of non-trivial programs, but there may be lurking bugs.
Please report any bugs or problems.
A wide variety of additional information is available on the LLVM web page,
including mailing lists and publications describing algorithms and components
implemented in LLVM. The web page also contains a version of the API
documentation that is up-to-date with the CVS version of the source code. You
can access versions of these documents specific to this release by going into
the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the mailing
lists.