This document contains the release notes for the LLVM Compiler Infrastructure, release 3.6. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various subprojects of LLVM, and some of the current users of the code. All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM Developer’s Mailing List is a good place to send them.
The semantics of the prefix attribute have been changed. Users that want the previous prefix semantics should instead use prologue. To motivate this change, let’s examine the primary use cases that these attributes aim to serve:
1. Code sanitization metadata (e.g. Clang’s undefined behavior sanitizer)
2. Function hot-patching: Enable the user to insert nop operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility.
3. Language runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this for its tables-next-to-code feature.
Previously prefix served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach because it required the prefix data to be valid executable code.
In this release the concept of prefix data has been redefined to be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code.
The previous notion of prefix data now goes under the name “prologue data” to emphasize its duality with the function epilogue.
The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. See the language reference for further details on the semantics of these attributes.
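As a rough illustration of the split described above, the following sketch attaches prologue data to one function and prefix data to another. The function names and constants are made up for the example, and the nop encoding assumes an x86 target; consult the language reference for the authoritative rules.

```llvm
; Illustrative only: @patchable, @with_table and the constants are hypothetical.

; Cases (1) and (2): prologue data is placed at the entry point, so it must
; be valid code. On x86, i8 144 (0x90) encodes a single nop.
define void @patchable() prologue i8 144 {
entry:
  ret void
}

; Case (3): prefix data is placed immediately before the symbol address and
; is never executed, so it does not need to be valid code.
define void @with_table() prefix i32 123 {
entry:
  ret void
}
```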
This refactoring arose out of discussions with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3).
Metadata nodes (!{...}) and strings (!"...") are no longer values. They have no use-lists and no type, cannot be replaced via RAUW (replace-all-uses-with), and cannot be function-local.
LLVM intrinsics can reference metadata using the metadata type, and metadata nodes can reference constant values.
Function-local metadata is limited to direct arguments to LLVM intrinsics.
The following old IR:
@global = global i32 0
define void @foo(i32 %v) {
entry:
call void @llvm.md(metadata !{i32 %v})
call void @llvm.md(metadata !{i32* @global})
call void @llvm.md(metadata !0)
call void @llvm.md(metadata !{metadata !"string"})
call void @llvm.md(metadata !{metadata !{metadata !1, metadata !"string"}})
ret void, !bar !1, !baz !2
}
declare void @llvm.md(metadata)
!0 = metadata !{metadata !1, metadata !2, metadata !3, metadata !"some string"}
!1 = metadata !{metadata !2, null, metadata !"other", i32* @global, i32 7}
!2 = metadata !{}
should now be written as:
@global = global i32 0
define void @foo(i32 %v) {
entry:
call void @llvm.md(metadata i32 %v) ; The only legal place for function-local
; metadata.
call void @llvm.md(metadata i32* @global)
call void @llvm.md(metadata !0)
call void @llvm.md(metadata !{!"string"})
call void @llvm.md(metadata !{!{!1, !"string"}})
ret void, !bar !1, !baz !2
}
declare void @llvm.md(metadata)
!0 = !{!1, !2, !3, !"some string"}
!1 = !{!2, null, !"other", i32* @global, i32 7}
!2 = !{}
Metadata nodes can opt out of uniquing, using the keyword distinct. Distinct nodes are still owned by the context, but are stored in a side table, and not uniqued.
In LLVM 3.5, metadata nodes would drop uniquing if an operand changed to null during optimizations. This is no longer true. However, if an operand change causes a uniquing collision, they become distinct. Unlike LLVM 3.5, where serializing to assembly or bitcode would re-unique the nodes, they now remain distinct.
The following IR:
!named = !{!0, !1, !2, !3, !4, !5, !6, !7, !8}
!0 = !{}
!1 = !{}
!2 = distinct !{}
!3 = distinct !{}
!4 = !{!0}
!5 = distinct !{!0}
!6 = !{!4, !{}, !5}
!7 = !{!{!0}, !0, !5}
!8 = distinct !{!{!0}, !0, !5}
is equivalent to the following:
!named = !{!0, !0, !1, !2, !3, !4, !5, !5, !6}
!0 = !{}
!1 = distinct !{}
!2 = distinct !{}
!3 = !{!0}
!4 = distinct !{!0}
!5 = !{!3, !0, !4}
!6 = distinct !{!3, !0, !4}
During graph construction, if a metadata node transitively references a forward declaration, the node itself is considered “unresolved” until the forward declaration resolves. An unresolved node can RAUW itself to support uniquing. Nodes automatically resolve once all their operands have resolved.
However, cyclic graphs prevent the nodes from resolving. An API client that constructs a cyclic graph must call resolveCycles() to resolve nodes in the cycle.
To save self-references from that burden, self-referencing nodes are implicitly distinct. So the following IR:
!named = !{!0, !1, !2, !3, !4}
!0 = !{!0}
!1 = !{!1}
!2 = !{!2, !1}
!3 = !{!2, !1}
!4 = !{!2, !1}
is equivalent to:
!named = !{!0, !1, !2, !3, !3}
!0 = distinct !{!0}
!1 = distinct !{!1}
!2 = distinct !{!2, !1}
!3 = !{!2, !1}
There’s a new first-class metadata construct called MDLocation (to be followed in subsequent releases by others). It’s used for the locations referenced by !dbg metadata attachments.
For example, if an old !dbg attachment looked like this:
define i32 @foo(i32 %a, i32 %b) {
entry:
%add = add i32 %a, %b, !dbg !0
ret i32 %add, !dbg !1
}
!0 = metadata !{i32 10, i32 3, metadata !2, metadata !1}
!1 = metadata !{i32 20, i32 7, metadata !3}
!2 = metadata !{...}
!3 = metadata !{...}
the new attachment looks like this:
define i32 @foo(i32 %a, i32 %b) {
entry:
%add = add i32 %a, %b, !dbg !0
ret i32 %add, !dbg !1
}
!0 = !MDLocation(line: 10, column: 3, scope: !2, inlinedAt: !1)
!1 = !MDLocation(line: 20, column: 7, scope: !3)
!2 = !{...}
!3 = !{...}
The fields are named, can be reordered, and have sane defaults if left out (although scope: is required).
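To make the point about named fields concrete, here is a small sketch. It assumes that omitted line: and column: fields default to 0, and it uses an empty node as a stand-in for a real scope.

```llvm
; Same location as !MDLocation(line: 10, column: 3, scope: !2);
; the order of the named fields does not matter.
!0 = !MDLocation(scope: !2, column: 3, line: 10)

; line: and column: left out (assumed to default to 0); scope: is required.
!1 = !MDLocation(scope: !2)

; Placeholder scope node for the example only.
!2 = !{}
```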
The syntax for aliases is now closer to what is used for global variables:
@a = weak global ...
@b = weak alias ...
The order of the alias keyword and the linkage was swapped before.
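Spelled out in full, the change looks roughly like this; the names @a and @b and the i32 type are only for illustration.

```llvm
@a = weak global i32 0

; LLVM 3.6: linkage first, then the alias keyword, as for global variables.
@b = weak alias i32* @a

; The pre-3.6 spelling of the same alias was:
;   @b = alias weak i32* @a
```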
The old JIT has been removed. All users should transition to MCJIT.
object::Binary no longer owns the file buffer. It is now just a wrapper, which simplifies using object::Binary with other users of the underlying file.
Regular object files can contain IR in a section named .llvmbc.
The gold plugin is now implemented directly on top of lib/Linker instead of lib/LTO. The API of lib/LTO is sufficiently different from gold’s view of the linking process that some cases could not be conveniently implemented.
The new implementation is also lazier and has a save-temps option.
Lazy loaded functions are now represented in a way that isDeclaration returns the correct answer even before reading the body.
The opt option -std-compile-opts was removed. It was effectively an alias of -O3.
Python 2.7 is now required. This was done to simplify compatibility with Python 3.
In practice, tools like asan and valgrind were finding way more bugs than the old leak detector, so it was removed.
The syntax of comdats was changed to
$c = comdat any
@g = global i32 0, comdat($c)
@c = global i32 0, comdat
The version without the parentheses is syntactic sugar for a comdat with the same name as the global.
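The same two spellings can be used on function definitions. A minimal sketch, with hypothetical names, assuming an object format that supports comdats:

```llvm
$f = comdat any
$group = comdat any

; Shorthand: equivalent to comdat($f), since the function is named @f.
define void @f() comdat {
  ret void
}

; Explicit name: @helper and @data are placed in the same comdat, $group.
define void @helper() comdat($group) {
  ret void
}
@data = global i32 0, comdat($group)
```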
LLVM now obeys the Win64 prologue and epilogue conventions documented by Microsoft. Unwind information is also emitted into the .xdata section.
As a result of the ABI-required prologue changes, it is now no longer possible to unwind the stack using a standard frame pointer walk on Win64. Instead, users should call CaptureStackBackTrace, or implement equivalent functionality by consulting the unwind tables present in the binary.
lib/Linker and lib/Bitcode now use the diagnostic handler to print errors and warnings. This provides better error messages and simpler error handling.
The PreserveSource linker mode was fairly broken and was removed.
The mode is currently still available in the C API for source compatibility, but it doesn’t have any effect.
A new experimental mechanism for describing a garbage collection safepoint was added to LLVM. The new mechanism was not complete at the point this release was branched so it is recommended that anyone interested in using this mechanism track the ongoing development work on tip of tree. The hope is that these intrinsics will be ready for general use by 3.7. Documentation can be found here.
The existing gc.root implementation is still supported and as fully featured as it ever was. However, two features from GCStrategy will likely be removed in the 3.7 release (performCustomLowering and findCustomSafePoints). If you have a use case for either, please mention it on llvm-dev so that it can be considered for future development.
We are expecting to migrate away from gc.root in the 3.8 time frame, but both mechanisms will be supported in 3.7.
During this release the MIPS target has reached a few major milestones. The compiler has gained support for MIPS-II and MIPS-III; become ABI-compatible with GCC for big and little endian O32, N32, and N64; and is now able to compile the Linux kernel for 32-bit targets. Additionally, LLD now supports microMIPS for the O32 ABI on little endian targets, and code generation for microMIPS is almost completely passing the test-suite.
A large number of bugs have been fixed for big-endian MIPS targets using the N32 and N64 ABIs as well as a small number of bugs affecting other ABIs. Please note that some of these bugs will still affect LLVM-IR generated by LLVM 3.5 since correct code generation depends on appropriate usage of the inreg, signext, and zeroext attributes on all function arguments and returns.
There are far too many corrections to provide a complete list but here are a few notable ones:
It is now possible to compile the Linux kernel. This currently requires a small number of kernel patches. See the LLVMLinux project for details.
There are numerous improvements to the PowerPC target in this release:
An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. This section lists some of the projects that have already been updated to work with LLVM 3.6.
In addition to producing an easily portable open source OpenCL implementation, another major goal of pocl is improving performance portability of OpenCL programs with compiler optimizations, reducing the need for target-dependent manual optimizations. An important part of pocl is a set of LLVM passes used to statically parallelize multiple work-items with the kernel compiler, even in the presence of work-group barriers. This enables static parallelization of the fine-grained static concurrency in the work groups in multiple ways.
TCE is a toolset for designing customized exposed datapath processors based on the Transport triggered architecture (TTA).
The toolset provides a complete co-design flow from C/C++ programs down to synthesizable VHDL/Verilog and parallel program binaries. Processor customization points include the register files, function units, supported operations, and the interconnection network.
TCE uses Clang and LLVM for C/C++/OpenCL C language support, target independent optimizations and also for parts of code generation. It generates new LLVM-based code generators “on the fly” for the designed processors and loads them into the compiler backend as runtime libraries to avoid per-target recompilation of larger parts of the compiler chain.
Likely is an embeddable just-in-time Lisp for image recognition and heterogeneous computing. Algorithms are just-in-time compiled using LLVM’s MCJIT infrastructure to execute on single or multi-threaded CPUs and potentially OpenCL SPIR or CUDA enabled GPUs. Likely seeks to explore new optimizations for statistical learning algorithms by moving them from an offline model generation step to the compile-time evaluation of a function (the learning algorithm) with constant arguments (the training data).
D is a language with C-like syntax and static typing. It pragmatically combines efficiency, control, and modeling power, with safety and programmer productivity. D supports powerful concepts like Compile-Time Function Execution (CTFE) and Template Meta-Programming, provides an innovative approach to concurrency and offers many classical paradigms.
LDC uses the frontend from the reference compiler combined with LLVM as backend to produce efficient native code. LDC targets x86/x86_64 systems like Linux, OS X, FreeBSD and Windows and also Linux on PowerPC (32/64 bit). Ports to other architectures like ARM, AArch64 and MIPS64 are underway.
LLVMSharp and ClangSharp are type-safe C# bindings for Microsoft.NET and Mono that Platform Invoke into the native libraries. ClangSharp is self-hosted and is used to generate LLVMSharp using the LLVM-C API.
LLVMSharp Kaleidoscope Tutorials are instructive examples of writing a compiler in C#, with certain improvements like using the visitor pattern to generate LLVM IR.
ClangSharp PInvoke Generator is the self-hosting mechanism for LLVM/ClangSharp and is demonstrative of using LibClang to generate Platform Invoke (PInvoke) signatures for C APIs.
A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also contains versions of the API documentation which are up to date with the Subversion version of the source code. You can access versions of these documents specific to this release by going into the llvm/docs/ directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact us via the mailing lists.