This document contains the release notes for the LLVM Compiler Infrastructure, release 3.1. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various subprojects of LLVM, and some of the current users of the code. All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM Developer's Mailing List is a good place to send them.
Note that if you are reading this file from a Subversion checkout or the main LLVM web page, this document applies to the next release, not the current one. To see the release notes for a specific release, please see the releases page.
The LLVM 3.1 distribution currently consists of code from the core LLVM repository (which roughly includes the LLVM optimizers, code generators and supporting tools), and the Clang repository. In addition to this code, the LLVM Project includes other sub-projects that are in development. Here we include updates on these subprojects.
Clang is an LLVM front end for the C, C++, and Objective-C languages. Clang aims to provide a better user experience through expressive diagnostics, a high level of conformance to language standards, fast compilation, and low memory use. Like LLVM, Clang provides a modular, library-based architecture that makes it suitable for creating or integrating with other development tools. Clang is considered a production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and for Darwin/ARM targets.
In the LLVM 3.1 time-frame, the Clang team has made many improvements. Highlights include:
For more details about the changes to Clang since the 3.0 release, see the Clang release notes.
If Clang rejects your code but another compiler accepts it, please take a look at the language compatibility guide to make sure this is not intentional or a known issue.
DragonEgg is a gcc plugin that replaces GCC's optimizers and code generators with LLVM's. It works with gcc-4.5 and gcc-4.6 (and partially with gcc-4.7), can target the x86-32/x86-64 and ARM processor families, and has been successfully used on the Darwin, FreeBSD, KFreeBSD, Linux and OpenBSD platforms. It fully supports Ada, C, C++ and Fortran. It has partial support for Go, Java, Obj-C and Obj-C++.
The 3.1 release has the following notable changes:
The new LLVM compiler-rt project is a simple library that provides an implementation of the low-level target-specific hooks required by code generation and other runtime components. For example, when compiling for a 32-bit target, converting a double to a 64-bit unsigned integer is compiled into a runtime call to the "__fixunsdfdi" function. The compiler-rt library provides highly optimized implementations of this and other low-level routines (some are 3x faster than the equivalent libgcc routines).
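The conversion mentioned above can be sketched in plain C (the function name below is illustrative; only the helper name __fixunsdfdi comes from the text):

```c
#include <stdint.h>

/* Converting a double to a 64-bit unsigned integer. On a 32-bit
   target there is no single machine instruction for this, so the
   compiler emits a call to the runtime helper __fixunsdfdi, which
   compiler-rt provides; on 64-bit x86 it lowers to a short native
   instruction sequence instead. */
uint64_t double_to_u64(double d) {
    return (uint64_t)d;  /* this cast is the runtime-call site */
}
```

Nothing in the source changes between targets; only the generated code differs, which is why the helper lives in a runtime library rather than in the compiler proper.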
As of 3.1, compiler-rt includes the helper functions for atomic operations, allowing atomic operations on arbitrary-sized quantities to work. These functions follow the specification defined by gcc and are used by clang.
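A minimal sketch of how these helpers get used (the function here is illustrative; the __atomic_* builtins follow the gcc specification the text refers to):

```c
#include <stdint.h>

/* When the target has no native atomic instruction for the operand
   size (e.g. 64-bit atomics on some 32-bit targets), clang lowers
   these builtins to calls to the corresponding __atomic_* helper
   functions that compiler-rt now provides. On targets with native
   support, they compile to inline atomic instructions instead. */
uint64_t atomic_add_and_read(uint64_t *p, uint64_t n) {
    __atomic_fetch_add(p, n, __ATOMIC_SEQ_CST);
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}
```

Because the libcall fallback and the inline form share one interface, the same source works on arbitrary-sized quantities across targets.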
LLDB is a ground-up implementation of a command line debugger, as well as a debugger API that can be used from other applications. LLDB makes use of the Clang parser to provide high-fidelity expression parsing (particularly for C++) and uses the LLVM JIT for target support.
Like compiler-rt, libc++ is now dual licensed under the MIT and UIUC licenses, allowing it to be used more permissively.
Within the LLVM 3.1 time-frame there were the following highlights:
The <atomic> header is now passing all tests when compiling with clang and linking against the support code from compiler-rt.
The VMKit project is an implementation of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time compilation.
In the LLVM 3.1 time-frame, VMKit has had significant improvements on both runtime and startup performance.
Polly is an experimental optimizer for data locality and parallelism. It currently provides high-level loop optimizations and automatic parallelization (using the OpenMP run time). Work in the area of automatic SIMD and accelerator code generation has been started.
Within the LLVM 3.1 time-frame there were the following highlights:
An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. This section lists some of the projects that have already been updated to work with LLVM 3.1.
Crack aims to provide the ease of development of a scripting language with the performance of a compiled language. The language derives concepts from C++, Java and Python, incorporating object-oriented programming, operator overloading and strong typing.
GHC is an open source compiler and programming suite for Haskell, a lazy functional programming language. It includes an optimizing static compiler generating good code for a variety of platforms, together with an interactive system for convenient, quick development.
GHC 7.0 and onwards include an LLVM code generator, supporting LLVM 2.8 and later.
Julia is a high-level, high-performance dynamic language for technical computing. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. The compiler uses type inference to generate fast code without any type declarations, and uses LLVM's optimization passes and JIT compiler. The Julia Language is designed around multiple dispatch, giving programs a large degree of flexibility. It is ready for use on many kinds of problems.
LLVM D Compiler (LDC) is a compiler for the D programming Language. It is based on the DMD frontend and uses LLVM as backend.
Open Shading Language (OSL) is a small but rich language for programmable shading in advanced global illumination renderers and other applications, ideal for describing materials, lights, displacement, and pattern generation. It uses LLVM to JIT complex shader networks to x86 code at runtime.
OSL was developed by Sony Pictures Imageworks for use in its in-house renderer used for feature film animation and visual effects, and is distributed as open source software with the "New BSD" license.
In addition to producing an easily portable open source OpenCL implementation, another major goal of pocl is improving performance portability of OpenCL programs with compiler optimizations, reducing the need for target-dependent manual optimizations. An important part of pocl is a set of LLVM passes used to statically parallelize multiple work-items with the kernel compiler, even in the presence of work-group barriers. This enables static parallelization of the fine-grained static concurrency in the work groups in multiple ways (SIMD, VLIW, superscalar,...).
Pure is an algebraic/functional programming language based on term rewriting. Programs are collections of equations which are used to evaluate expressions in a symbolic fashion. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code. Pure offers dynamic typing, eager and lazy evaluation, lexical closures, a hygienic macro system (also based on term rewriting), built-in list and matrix support (including list and matrix comprehensions) and an easy-to-use interface to C and other programming languages (including the ability to load LLVM bitcode modules, and inline C, C++, Fortran and Faust code in Pure programs if the corresponding LLVM-enabled compilers are installed).
Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and continues to work with older LLVM releases >= 2.5).
TCE is a toolset for designing application-specific processors (ASP) based on the Transport triggered architecture (TTA). The toolset provides a complete co-design flow from C/C++ programs down to synthesizable VHDL/Verilog and parallel program binaries. Processor customization points include the register files, function units, supported operations, and the interconnection network.
TCE uses Clang and LLVM for C/C++ language support, target-independent optimizations, and parts of code generation. It generates new LLVM-based code generators "on the fly" for the designed TTA processors and loads them into the compiler backend as runtime libraries, avoiding per-target recompilation of large parts of the compiler chain.
This release includes a huge number of bug fixes, performance tweaks and minor improvements. Some of the major improvements and new features are listed in this section.
LLVM 3.1 includes several major changes and big features:
LLVM IR has several new features for better support of new targets and that expose new optimization opportunities:
In addition to many minor performance tweaks and bug fixes, this release includes a few major enhancements and additions to the optimizers:
Pass -vectorize to run this pass along with some associated post-vectorization cleanup passes. For more information, see the EuroLLVM 2012 slides: Autovectorization with LLVM.
The LLVM Machine Code (aka MC) subsystem was created to solve a number of problems in the realm of assembly, disassembly, object file format handling, and a number of other related areas that CPU instruction-set level tools work in. For more information, please see the Intro to the LLVM MC Project Blog Post.
We have changed the way that the Type Legalizer legalizes vectors. The type legalizer now attempts to promote integer elements. This enabled the implementation of vector-select. Additionally, we see a performance boost on workloads which use vectors of chars and shorts, since they are now promoted to 32-bit types, which are better supported by the SIMD instruction set. Floating point types are still widened as before.
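The kind of workload that benefits can be sketched with the GCC/clang vector extension (the type and function below are illustrative, not from the release notes):

```c
/* A vector of eight chars. Under the new legalization rule, these
   8-bit elements are promoted to 32-bit types during code
   generation, which map better onto the target's SIMD instructions.
   The promotion is internal; source-level semantics are unchanged. */
typedef char v8qi __attribute__((vector_size(8)));

int sum_after_add(void) {
    v8qi a = {1, 2, 3, 4, 5, 6, 7, 8};
    v8qi b = {10, 10, 10, 10, 10, 10, 10, 10};
    v8qi c = a + b;          /* element-wise add on char elements */
    int sum = 0;
    for (int i = 0; i < 8; i++)
        sum += c[i];
    return sum;              /* 11 + 12 + ... + 18 */
}
```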
We have put a significant amount of work into the code generator infrastructure, which allows us to implement more aggressive algorithms and make it run faster:
MachineRegisterInfo now allows the reserved registers to be frozen when register allocation starts. Target hooks should use the MRI->canReserveReg(FramePtr) method to avoid accidentally disabling frame pointer elimination during register allocation.
MachineOperand provides a compact representation of large clobber lists on call instructions. The register mask operand references a bit mask of preserved registers; everything else is clobbered.
We added new TableGen infrastructure to support bundling for Very Long Instruction Word (VLIW) architectures. TableGen can now automatically generate a deterministic finite automaton from a VLIW target's schedule description which can be queried to determine legal groupings of instructions in a bundle.
We have added a new target independent VLIW packetizer based on the DFA infrastructure to group machine instructions into bundles.
A probability-based block placement and code layout algorithm was added to LLVM's code generator. This layout pass supports probabilities derived from static heuristics as well as source code annotations such as __builtin_expect.
New features and major changes in the X86 target include:
New features of the ARM target include:
The ARM target now includes a full-featured macro assembler, including direct-to-object module support for clang. The assembler is currently enabled by default only for Darwin, pending testing and any additional platform-specific support needed for Linux.
Full support is included for Thumb1, Thumb2 and ARM modes, along with subtarget and CPU specific extensions for VFP2, VFP3 and NEON.
The assembler supports Unified Syntax only (see the ARM Architectural Reference Manual for details). While support for pre-unified (divided) syntax exists and is growing, there are still significant gaps in it.
An outstanding conditional inversion bug was fixed in this release.
NOTE: LLVM 3.1 marks the last release of the PTX back-end in its current form. It is being replaced by the NVPTX back-end, currently in SVN ToT.
If you're already an LLVM user or developer with out-of-tree changes based on LLVM 3.1, this section lists some "gotchas" that you may run into upgrading from the previous release.
In addition, many APIs have changed in this release. Some of the major LLVM API changes are:
TargetOptions class, which is local to each TargetMachine. As a consequence, the associated flags will no longer be accepted by clang -mllvm. This includes:
llvm::DisableFramePointerElim(const MachineFunction &)
The MDBuilder class has been added to simplify the creation of metadata.
In addition, some tools have changed in this release. Some of the changes are:
Officially supported Python bindings have been added! Feature support is far from complete. The current bindings support interfaces to:
Using the Object File Interface, it is possible to inspect binary object files. Think of it as a Python version of readelf or llvm-objdump.
Support for additional features is currently being developed by community contributors. If you are interested in shaping the direction of the Python bindings, please express your intent on IRC or the developers list.
LLVM is generally a production-quality compiler, used by a broad range of applications and shipping in many products. That said, not every subsystem is as mature as the aggregate, particularly the more obscure targets. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one, or ask on the LLVMdev list.
Known problem areas include:
A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also contains versions of the API documentation which are up-to-date with the Subversion version of the source code. You can access versions of these documents specific to this release by going into the "llvm/doc/" directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact us via the mailing lists.