LLVM Developer Policy

Introduction

This document contains the LLVM Developer Policy which defines the project’s policy towards developers and their contributions. The intent of this policy is to eliminate miscommunication, rework, and confusion that might arise from the distributed nature of LLVM’s development. By stating the policy in clear terms, we hope each developer can know ahead of time what to expect when making LLVM contributions. This policy covers all llvm.org subprojects, including Clang, LLDB, libc++, etc.

This policy is also designed to accomplish the following objectives:

  1. Attract both users and developers to the LLVM project.
  2. Make life as simple and easy for contributors as possible.
  3. Keep the top of Subversion trees as stable as possible.
  4. Establish awareness of the project’s copyright, license, and patent policies with contributors to the project.

This policy is aimed at frequent contributors to LLVM. People interested in contributing one-off patches can do so in an informal way by sending them to the llvm-commits mailing list and engaging another developer to see it through the process.

Developer Policies

This section contains policies that pertain to frequent LLVM developers. We always welcome one-off patches from people who do not routinely contribute to LLVM, but we expect more from frequent contributors to keep the system as efficient as possible for everyone. Frequent LLVM contributors are expected to meet the following requirements in order for LLVM to maintain a high standard of quality.

Stay Informed

Developers should stay informed by reading at least the “dev” mailing list for the projects they are interested in, such as llvm-dev for LLVM, cfe-dev for Clang, or lldb-dev for LLDB. If you are doing anything more than just casual work on LLVM, it is suggested that you also subscribe to the “commits” mailing list for the subproject you’re interested in, such as llvm-commits, cfe-commits, or lldb-commits. Reading the “commits” list and paying attention to changes being made by others is a good way to see what other people are interested in and to watch the flow of the project as a whole.

We recommend that active developers register an email account with LLVM Bugzilla and preferably subscribe to the llvm-bugs email list to keep track of bugs and enhancements occurring in LLVM. We really appreciate people who are proactive at catching incoming bugs in their components and dealing with them promptly.

Please be aware that all LLVM mailing lists are public and archived, and that notices of confidentiality or non-disclosure cannot be respected.

Making and Submitting a Patch

When making a patch for review, the goal is to make it as easy as possible for the reviewer to read. As such, we recommend that you:

  1. Make your patch against the Subversion trunk, not a branch, and not an old version of LLVM. This makes it easy to apply the patch. For information on how to check out SVN trunk, please see the Getting Started Guide.
  2. Similarly, patches should be submitted soon after they are generated. Old patches may not apply correctly if the underlying code changes between the time the patch was created and the time it is applied.
  3. Patches should be made with svn diff, or similar. If you use a different tool, make sure it uses the diff -u format and that it doesn’t contain clutter which makes it hard to read.
  4. If you are modifying generated files, such as the top-level configure script, please separate out those changes into a separate patch from the rest of your changes.
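
As a rough illustration of the diff -u format mentioned in item 3, the following Python sketch uses the standard library’s difflib module to produce a unified diff between two versions of a file. The file name and its contents are made up for this example; in practice svn diff generates this output for you.

```python
import difflib


def make_unified_diff(old_text, new_text, path):
    """Return a unified (diff -u style) patch between two versions of a file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=path + " (original)",
        tofile=path + " (modified)",
    )
    return "".join(diff)


# Hypothetical file contents, for illustration only.
old = "int add(int a, int b) {\n  return a - b;\n}\n"
new = "int add(int a, int b) {\n  return a + b;\n}\n"
patch = make_unified_diff(old, new, "lib/Example.cpp")
print(patch)
```

The output has a two-line header naming the old and new file, followed by hunks in which removed lines start with “-”, added lines start with “+”, and unchanged context lines start with a space.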

Once your patch is ready, submit it by emailing it to the appropriate project’s commit mailing list (or commit it directly if applicable). Alternatively, some patches get sent to the project’s development list or component of the LLVM bug tracker, but the commit list is the primary place for reviews and should generally be preferred.

When sending a patch to a mailing list, it is a good idea to send it as an attachment to the message, not embedded into the text of the message. This ensures that your mailer will not mangle the patch when it sends it (e.g. by making whitespace changes or by wrapping lines).

For Thunderbird users: Before submitting a patch, please open Preferences > Advanced > General > Config Editor, find the key mail.content_disposition_type, and set its value to 1. Without this setting, Thunderbird sends your attachment using Content-Disposition: inline rather than Content-Disposition: attachment. Apple Mail gamely displays such a file inline, making it difficult to work with for reviewers using that program.

When submitting patches, please do not add confidentiality or non-disclosure notices to the patches themselves. These notices conflict with the LLVM License and may result in your contribution being excluded.

Code Reviews

LLVM has a code review policy. Code review is one way to increase the quality of software. We generally follow these policies:

  1. All developers are required to have significant changes reviewed before they are committed to the repository.
  2. Code reviews are conducted by email on the relevant project’s commit mailing list, or alternatively on the project’s development list or bug tracker.
  3. Code can be reviewed either before it is committed or after. We expect major changes to be reviewed before being committed, but smaller changes (or changes where the developer owns the component) can be reviewed after commit.
  4. The developer responsible for a code change is also responsible for making all necessary review-related changes.
  5. Code review can be an iterative process, which continues until the patch is ready to be committed. Specifically, once a patch is sent out for review, it needs an explicit “looks good” before it is committed. Do not assume that silence means approval, and do not set a deadline for objections in place of an explicit approval.

Sometimes code reviews will take longer than you would hope for, especially for larger features. Accepted ways to speed up review times for your patches are:

  • Review other people’s patches. If you help out, everybody will be more willing to do the same for you; goodwill is our currency.
  • Ping the patch. If it is urgent, provide reasons why it is important to you to get this patch landed and ping it every couple of days. If it is not urgent, the common courtesy ping rate is one week. Remember that you’re asking for valuable time from other professional developers.
  • Ask for help on IRC. Developers on IRC will be able to either help you directly, or tell you who might be a good reviewer.
  • Split your patch into multiple smaller patches that build on each other. The smaller your patch, the higher the probability that somebody will take a quick look at it.

Developers should participate in code reviews as both reviewers and reviewees. If someone is kind enough to review your code, you should return the favor for someone else. Note that anyone is welcome to review and give feedback on a patch, but only people with Subversion write access can approve it.

There is a web-based code review tool that can optionally be used for code reviews. See Code Reviews with Phabricator.

Code Owners

The LLVM Project relies on two features of its process to maintain rapid development alongside a high-quality source base: pre-commit code review, and post-commit review for trusted maintainers. Having both is a great way for the project to take advantage of the fact that most people do the right thing most of the time, and only commit patches without pre-commit review when they are confident they are right.

The trick to this is that the project has to guarantee that all patches that are committed are reviewed after they go in: you don’t want everyone to assume someone else will review it, allowing the patch to go unreviewed. To solve this problem, we have a notion of an ‘owner’ for a piece of the code. The sole responsibility of a code owner is to ensure that a commit to their area of the code is appropriately reviewed, either by themselves or by someone else. The list of current code owners can be found in the file CODE_OWNERS.TXT in the root of the LLVM source tree.

Note that code ownership is entirely separate from reviewing: anyone can review a piece of code, and we welcome code review from anyone who is interested. Code owners are the “last line of defense” to guarantee that all patches that are committed are actually reviewed.

Being a code owner is a somewhat unglamorous position, but it is incredibly important for the ongoing success of the project. Because people get busy, interests change, and unexpected things happen, code ownership is purely opt-in, and anyone can choose to resign their “title” at any time. For now, we do not have an official policy on how one gets elected to be a code owner.

Test Cases

Developers are required to create test cases for any bugs fixed and any new features added. Some tips for getting your testcase approved:

  • All feature and regression test cases are added to the llvm/test directory. The appropriate sub-directory should be selected (see the Testing Guide for details).
  • Test cases should be written in LLVM assembly language.
  • Test cases, especially for regressions, should be reduced as much as possible, by bugpoint or manually. It is unacceptable to place an entire failing program into llvm/test as this creates a time-to-test burden on all developers. Please keep them short.

Note that llvm/test and clang/test are designed for regression and small feature tests only. More extensive test cases (e.g., entire applications, benchmarks, etc) should be added to the llvm-test test suite. The llvm-test suite is for coverage (correctness, performance, etc) testing, not feature or regression testing.

Quality

The minimum quality standards that any change must satisfy before being committed to the main development branch are:

  1. Code must adhere to the LLVM Coding Standards.
  2. Code must compile cleanly (no errors, no warnings) on at least one platform.
  3. Bug fixes and new features should include a testcase so we know if the fix/feature ever regresses in the future.
  4. Code must pass the llvm/test test suite.
  5. The code must not cause regressions on a reasonable subset of llvm-test, where “reasonable” depends on the contributor’s judgement and the scope of the change (more invasive changes require more testing). A reasonable subset might be something like “llvm-test/MultiSource/Benchmarks”.

Additionally, the committer is responsible for addressing any problems found in the future that the change is responsible for. For example:

  • The code should compile cleanly on all supported platforms.
  • The changes should not cause any correctness regressions in the llvm-test suite and must not cause any major performance regressions.
  • The change set should not cause performance or correctness regressions for the LLVM tools.
  • The changes should not cause performance or correctness regressions in code compiled by LLVM on all applicable targets.
  • You are expected to address any Bugzilla bugs that result from your change.

We prefer for this to be handled before submission but understand that it isn’t possible to test all of this for every submission. Our build bots and nightly testing infrastructure normally find these problems. A good rule of thumb is to check the nightly testers for regressions the day after your change. Build bots will directly email you if a group of commits that included yours caused a failure. You are expected to check the build bot messages to see if the failure is your fault and, if so, fix the breakage.

Commits that violate these quality standards (e.g. are very broken) may be reverted. This is necessary when the change blocks other developers from making progress. The developer is welcome to re-commit the change after the problem has been fixed.

Commit messages

Although we don’t enforce the format of commit messages, we prefer that you follow these guidelines to help with review, log searches, email formatting and so on. These guidelines are very similar to rules used by other open source projects.

Most importantly, the contents of the message should be carefully written to convey the rationale of the change (without delving too much into detail). It should also be neither vague nor overly specific. For example, “bits were not set right” will leave the reviewer wondering about which bits, and why they weren’t right, while “Correctly set overflow bits in TargetInfo” conveys almost all there is to the change.

Below are some guidelines about the format of the message itself:

  • Separate the commit message into title, body and, if you’re not the original author, a “Patch by” attribution line (see below).
  • The title should be concise. Because all commits are emailed to the list with the first line as the subject, long titles are frowned upon. Short titles also look better in git log.
  • When the changes are restricted to a specific part of the code (e.g. a back-end or optimization pass), it is customary to add a tag to the beginning of the line in square brackets. For example, “[SCEV] ...” or “[OpenMP] ...”. This helps email filters and searches for post-commit reviews.
  • The body, if it exists, should be separated from the title by an empty line.
  • The body should be concise, but explanatory, including a complete reasoning. Unless it is required to understand the change, examples, code snippets and gory details should be left to bug comments, web review or the mailing list.
  • If the patch fixes a bug in bugzilla, please include the PR# in the message.
  • Attribution of Changes should be in a separate line, after the end of the body, as simple as “Patch by John Doe.”. This is how we officially handle attribution, and there are automated processes that rely on this format.
  • Text formatting and spelling should follow the same rules as documentation and in-code comments, e.g. capitalization, full stops, etc.
  • If the commit is a bug fix on top of another recently committed patch, or a revert or reapply of a patch, include the svn revision number of the prior related commit. This could be as simple as “Revert rNNNN because it caused PR#”.
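
The layout described by these guidelines can be sketched in a few lines of Python. This is only an illustration: the helper name format_commit_message and the body text are hypothetical, and the empty line before the attribution is an assumption of the sketch (the guidelines only ask for a separate line after the body).

```python
def format_commit_message(title, body="", tag=None, patch_by=None):
    """Assemble a commit message from a title, optional body and attribution."""
    # An optional tag in square brackets goes at the start of the title line.
    first_line = f"[{tag}] {title}" if tag else title
    sections = [first_line]
    if body:
        sections.append(body.strip())
    if patch_by:
        # Attribution sits on its own line after the end of the body.
        sections.append(f"Patch by {patch_by}.")
    # An empty line separates the title from the body. Putting an empty line
    # before the attribution as well is an assumption of this sketch.
    return "\n\n".join(sections) + "\n"


# Hypothetical change description, for illustration only.
message = format_commit_message(
    title="Correctly set overflow bits in TargetInfo",
    body="Hypothetical explanation of which bits were wrong and why.",
    patch_by="John Doe",
)
print(message)
```

Passing tag="SCEV" to the same helper would produce a title line that starts with “[SCEV]”, matching the tagging convention above.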

For minor violations of these recommendations, the community normally favors reminding the contributor of this policy over reverting. Minor corrections and omissions can be handled by sending a reply to the commits mailing list.

Obtaining Commit Access

We grant commit access to contributors with a track record of submitting high quality patches. If you would like commit access, please send an email to Chris with the following information:

  1. The user name you want to commit with, e.g. “hacker”.
  2. The full name and email address you want messages to llvm-commits to come from, e.g. “J. Random Hacker <hacker@yoyodyne.com>”.
  3. A “password hash” of the password you want to use, e.g. “2ACR96qjUqsyM”. Note that you don’t ever tell us what your password is; you just give it to us in hashed form. To get this, run “htpasswd” (a utility that comes with apache) in crypt mode (often enabled with “-d”), or find a web page that will do it for you. Note that our system does not work with MD5 hashes. These are significantly longer than a crypt hash, e.g. “$apr1$vea6bBV2$Z8IFx.AfeD8LhqlZFqJer0”; we only accept the shorter crypt hash.
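
If you are unsure which kind of hash you have, a quick shape check can tell the two apart. The Python sketch below is not part of any LLVM tooling, and the helper name is_crypt_hash is hypothetical. It relies on the fact that a traditional crypt hash is 13 characters long and uses only the characters ., /, 0-9, A-Z and a-z.

```python
import re

# Traditional crypt(3) hashes are 13 characters drawn from [./0-9A-Za-z].
CRYPT_HASH = re.compile(r"[./0-9A-Za-z]{13}")


def is_crypt_hash(value):
    """Return True if value has the shape of a traditional crypt hash."""
    return CRYPT_HASH.fullmatch(value) is not None


# The two example hashes quoted in the text above.
print(is_crypt_hash("2ACR96qjUqsyM"))                          # True
print(is_crypt_hash("$apr1$vea6bBV2$Z8IFx.AfeD8LhqlZFqJer0"))  # False
```

The check only looks at the shape of the string; it cannot confirm that the hash was generated from the password you intended.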

Once you’ve been granted commit access, you should be able to check out an LLVM tree with an SVN URL of “https://username@llvm.org/...” instead of the normal anonymous URL of “http://llvm.org/...”. The first time you commit you’ll have to type in your password. Note that you may get a warning from SVN about an untrusted key; you can ignore this. To verify that your commit access works, please do a test commit (e.g. change a comment or add a blank line). Your first commit to a repository may require the autogenerated email to be approved by a mailing list. This is normal and will be done when the mailing list owner has time.

If you have recently been granted commit access, these policies apply:

  1. You are granted commit-after-approval to all parts of LLVM. To get approval, submit a patch to llvm-commits. When approved, you may commit it yourself.
  2. You are allowed to commit patches without approval which you think are obvious. This is clearly a subjective decision — we simply expect you to use good judgement. Examples include: fixing build breakage, reverting obviously broken patches, documentation/comment changes, any other minor changes.
  3. You are allowed to commit patches without approval to those portions of LLVM that you have contributed or maintain (i.e., have been assigned responsibility for), with the proviso that such commits must not break the build. This is a “trust but verify” policy, and commits of this nature are reviewed after they are committed.
  4. Multiple violations of these policies or a single egregious violation may cause commit access to be revoked.

In any case, your changes are still subject to code review (either before or after they are committed, depending on the nature of the change). You are encouraged to review other people’s patches as well, but you aren’t required to do so.

Making a Major Change

When a developer begins a major new project with the aim of contributing it back to LLVM, they should inform the community with an email to the llvm-dev email list, to the extent possible. The reason for this is to:

  1. keep the community informed about future changes to LLVM,
  2. avoid duplication of effort by preventing multiple parties working on the same thing and not knowing about it, and
  3. ensure that any technical issues around the proposed work are discussed and resolved before any significant work is done.

The design of LLVM is carefully controlled to ensure that all the pieces fit together well and are as consistent as possible. If you plan to make a major change to the way LLVM works or want to add a major new extension, it is a good idea to get consensus with the development community before you start working on it.

Once the design of the new feature is finalized, the work itself should be done as a series of incremental changes, not as a long-term development branch.

Incremental Development

In the LLVM project, we do all significant changes as a series of incremental patches. We have a strong dislike for huge changes or long-term development branches. Long-term development branches have a number of drawbacks:

  1. Branches must have mainline merged into them periodically. If the branch development and mainline development occur in the same pieces of code, resolving merge conflicts can take a lot of time.
  2. Other people in the community tend to ignore work on branches.
  3. Huge changes (produced when a branch is merged back onto mainline) are extremely difficult to code review.
  4. Branches are not routinely tested by our nightly tester infrastructure.
  5. Changes developed as monolithic large changes often don’t work until the entire set of changes is done. Breaking it down into a set of smaller changes increases the odds that any of the work will be committed to the main repository.

To address these problems, LLVM uses an incremental development style and we require contributors to follow this practice when making a large/invasive change. Some tips:

  • Large/invasive changes usually have a number of secondary changes that are required before the big change can be made (e.g. API cleanup, etc). These sorts of changes can often be done before the major change is done, independently of that work.
  • The remaining inter-related work should be decomposed into unrelated sets of changes if possible. Once this is done, define the first increment and get consensus on what the end goal of the change is.
  • Each change in the set can be stand alone (e.g. to fix a bug), or part of a planned series of changes that works towards the development goal.
  • Each change should be kept as small as possible. This simplifies your work (into a logical progression), simplifies code review and reduces the chance that you will get negative feedback on the change. Small increments also facilitate the maintenance of a high quality code base.
  • Often, an independent precursor to a big change is to add a new API and slowly migrate clients to use the new API. Each change to use the new API is often “obvious” and can be committed without review. Once the new API is in place and used, it is much easier to replace the underlying implementation of the API. This implementation change is logically separate from the API change.

If you are interested in making a large change, and this scares you, please make sure to first discuss the change and gather consensus, then ask about the best way to go about making the change.

Attribution of Changes

When contributors submit a patch to an LLVM project, other developers with commit access may commit it for the author once appropriate (based on the progression of code review, etc.). When doing so, it is important to retain correct attribution of contributions to their contributors. However, we do not want the source code to be littered with random attributions such as “this code written by J. Random Hacker” (this is noisy and distracting). In practice, the revision control system keeps a perfect history of who changed what, and the CREDITS.txt file describes higher-level contributions. If you commit a patch for someone else, please attribute it in the simple manner outlined in the commit messages section. Overall, please do not add contributor names to the source code.

Also, don’t commit patches authored by others unless they have submitted the patch to the project or you have been authorized to submit them on their behalf (you work together and your company authorized you to contribute the patches, etc.). The author should first submit them to the relevant project’s commit list, development list, or LLVM bug tracker component. If someone sends you a patch privately, encourage them to submit it to the appropriate list first.

IR Backwards Compatibility

When the IR format has to be changed, keep in mind that we try to maintain some backwards compatibility. The rules are intended as a balance between convenience for llvm users and not imposing a big burden on llvm developers:

  • The textual format is not backwards compatible. We don’t change it too often, but there are no specific promises.
  • Additions and changes to the IR should be reflected in test/Bitcode/compatibility.ll.
  • The current LLVM version supports loading any bitcode since version 3.0.
  • After each X.Y release, compatibility.ll must be copied to compatibility-X.Y.ll. The corresponding bitcode file should be assembled using the X.Y build and committed as compatibility-X.Y.ll.bc.
  • Newer releases can ignore features from older releases, but they cannot miscompile them. For example, if nsw is ever replaced with something else, dropping it would be a valid way to upgrade the IR.
  • Debug metadata is special in that it is currently dropped during upgrades.
  • Non-debug metadata is defined to be safe to drop, so a valid way to upgrade it is to drop it. That is not very user friendly and a bit more effort is expected, but no promises are made.
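
The per-release file naming described above can be sketched as follows. The helper name compatibility_files and the release string “3.6” are hypothetical and used only for illustration; the sketch just builds the names and does not copy or assemble anything.

```python
def compatibility_files(release):
    """Return the (IR file, bitcode file) names for a release such as '3.6'."""
    ir_name = f"compatibility-{release}.ll"
    # The bitcode file is the IR file assembled with that release's build.
    return ir_name, ir_name + ".bc"


ir_file, bitcode_file = compatibility_files("3.6")
print(ir_file)
print(bitcode_file)
```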

C API Changes

  • Stability Guarantees: The C API is, in general, a “best effort” for stability. This means that we make every attempt to keep the C API stable, but that stability will be limited by the abstractness of the interface and the stability of the C++ API that it wraps. In practice, this means that things like “create debug info” or “create this type of instruction” are likely to be less stable than “take this IR file and JIT it for my current machine”.
  • Release stability: We won’t break the C API on the release branch with patches that go on that branch, with the exception that we will fix an unintentional C API break that will keep the release consistent with both the previous and next release.
  • Testing: Patches to the C API are expected to come with tests just like any other patch.
  • Including new things into the API: If an LLVM subcomponent has a C API already included, then expanding that C API is acceptable. Adding C API for subcomponents that don’t currently have one needs to be discussed on the mailing list for design and maintainability feedback prior to implementation.
  • Documentation: Any changes to the C API are required to be documented in the release notes so that it’s clear to external users who do not follow the project how the C API is changing and evolving.

New Targets

LLVM is very receptive to new targets, even experimental ones, but a number of problems can appear when adding new large portions of code, and back-ends are normally added in bulk. We have found that landing large pieces of new code and then trying to fix emergent problems in-tree is problematic for a variety of reasons.

For these reasons, new targets are always added as experimental until they can be proven stable, and later moved to non-experimental. The difference between the two classes is that experimental targets are not built by default (they need to be added to -DLLVM_TARGETS_TO_BUILD at CMake time).

The basic rules for a back-end to be upstreamed in experimental mode are:

  • Every target must have a code owner. The CODE_OWNERS.TXT file has to be updated as part of the first merge. The code owner makes sure that changes to the target get reviewed and steers the overall effort.
  • There must be an active community behind the target. This community will help maintain the target by providing buildbots, fixing bugs, answering the LLVM community’s questions and making sure the new target doesn’t break any of the other targets, or generic code. This behavior is expected to continue throughout the lifetime of the target’s code.
  • The code must be free of contentious issues, for example, large changes in how the IR behaves or should be formed by the front-ends, unless the majority of the community has agreed to them via a refactoring of the IR standard before the new target changes are merged, following the IR Backwards Compatibility rules.
  • The code conforms to all of the policies laid out in this developer policy document, including license, patent, and coding standards.
  • The target should have either reasonable documentation on how it works (ISA, ABI, etc.) or a publicly available simulator/hardware (either free or cheap enough) - preferably both. This allows developers to validate assumptions, understand constraints and review code that can affect the target.

In addition, the rules for a back-end to be promoted to official are:

  • The target must have addressed every other minimum requirement and have been stable in tree for at least 3 months. This cool down period is to make sure that the back-end and the target community can endure continuous upstream development for the foreseeable future.
  • The target’s code must have been completely adapted to this policy as well as the coding standards. Any exceptions that were made to move into experimental mode must have been fixed before becoming official.
  • The test coverage needs to be broad and well written (small tests, well documented). The build target check-all must pass with the new target built, and where applicable, the test-suite must also pass without errors, in at least one configuration (publicly demonstrated, for example, via buildbots).
  • Public buildbots need to be created and actively maintained, unless the target requires no additional buildbots (e.g. check-all covers all tests). The more relevant and public the new target’s CI infrastructure is, the more the LLVM community will embrace it.

To continue as a supported and official target:

  • The maintainer(s) must continue following these rules throughout the lifetime of the target. Continuous violations of aforementioned rules and policies could lead to complete removal of the target from the code base.
  • Degradation in support, documentation or test coverage will make the target a nuisance to other targets; such a target will be considered a candidate for deprecation and, ultimately, removal.

In essence, these rules are necessary for targets to gain and retain their status. They also serve as markers for bit-rot, and will be used to clean unmaintained targets out of the tree.