1 TableGen Programmer’s Reference

1.1 Introduction

The purpose of TableGen is to generate complex output files based on information from source files that are significantly easier to code than the output files would be, and also easier to maintain and modify over time. The information is coded in a declarative style involving classes and records, which are then processed by TableGen. The internalized records are passed on to various backends, which extract information from a subset of the records and generate one or more output files. These output files are typically .inc files for C++, but may be any type of file that the backend developer needs.

This document describes the LLVM TableGen facility in detail. It is intended for the programmer who is using TableGen to produce code for a project. If you are looking for a simple overview, check out the TableGen Overview. The various *-tblgen commands used to invoke TableGen are described in tblgen Family - Description to C++ Code.

An example of a backend is RegisterInfo, which generates the register file information for a particular target machine, for use by the LLVM target-independent code generator. See TableGen Backends for a description of the LLVM TableGen backends, and TableGen Backend Developer’s Guide for a guide to writing a new backend.

Here are a few of the things backends can do.

  • Generate the register file information for a particular target machine.

  • Generate the instruction definitions for a target.

  • Generate the patterns that the code generator uses to match instructions to intermediate representation (IR) nodes.

  • Generate semantic attribute identifiers for Clang.

  • Generate abstract syntax tree (AST) declaration node definitions for Clang.

  • Generate AST statement node definitions for Clang.

1.1.1 Concepts

TableGen source files contain two primary items: abstract records and concrete records. In this and other TableGen documents, abstract records are called classes. (These classes are different from C++ classes and do not map onto them.) In addition, concrete records are usually just called records, although sometimes the term record refers to both classes and concrete records. The distinction should be clear in context.

Each class and each concrete record has a unique name, either chosen by the programmer or generated by TableGen. Associated with that name is a list of fields with values and an optional list of parent classes (sometimes called base or super classes). The fields are the primary data that backends will process. Note that TableGen assigns no meanings to fields; the meanings are entirely up to the backends and the programs that incorporate the output of those backends.

Note

The term “parent class” can refer to a class that is a parent of another class, and also to a class from which a concrete record inherits. This nonstandard use of the term arises because TableGen treats classes and concrete records similarly.

A backend processes some subset of the concrete records built by the TableGen parser and emits the output files. These files are usually C++ .inc files that are included by the programs that require the data in those records. However, a backend can produce any type of output file. For example, it could produce a data file containing messages tagged with identifiers and substitution parameters. In a complex use case such as the LLVM code generator, there can be many concrete records and some of them can have an unexpectedly large number of fields, resulting in large output files.

In order to reduce the complexity of TableGen files, classes are used to abstract out groups of record fields. For example, a few classes may abstract the concept of a machine register file, while other classes may abstract the instruction formats, and still others may abstract the individual instructions. TableGen allows an arbitrary hierarchy of classes, so that the abstract classes for two concepts can share a third superclass that abstracts common “sub-concepts” from the two original concepts.

In order to make classes more useful, a concrete record (or another class) can request a class as a parent class and pass template arguments to it. These template arguments can be used in the fields of the parent class to initialize them in a custom manner. That is, record or class A can request parent class S with one set of template arguments, while record or class B can request S with a different set of arguments. Without template arguments, many more classes would be required, one for each combination of the template arguments.
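As a minimal sketch of the idea, the following fragment shows one parent class requested with two different template arguments. The names S, A, B, Width, and Bits are hypothetical and chosen only for illustration.

```tablegen
// Parent class S takes one template argument and stores it in a field.
class S<int Width> {
  int Bits = Width;
}

// Two records request the same parent class with different arguments.
def A : S<8>;   // A.Bits is 8
def B : S<16>;  // B.Bits is 16
```

Without the template argument, a separate class would be needed for each width.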

Both classes and concrete records can include fields that are uninitialized. The uninitialized “value” is represented by a question mark (?). Classes often have uninitialized fields that are expected to be filled in when those classes are inherited by concrete records. Even so, some fields of concrete records may remain uninitialized.
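The fragment below sketches how uninitialized fields might look in practice. The class Insn and its fields are hypothetical.

```tablegen
// Both fields are declared without a usable value.
class Insn {
  string Mnemonic = ?;  // expected to be filled in by each record
  int Latency = ?;
}

def ADD : Insn {
  let Mnemonic = "add";
  // Latency is not set here, so it remains uninitialized (?).
}
```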

TableGen provides multiclasses to collect a group of record definitions in one place. A multiclass is a sort of macro that can be “invoked” to define multiple concrete records all at once. A multiclass can inherit from other multiclasses, which means that the multiclass inherits all the definitions from its parent multiclasses.
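As a rough illustration, assuming a hypothetical class Inst and multiclass RRI, a single defm can define two records at once:

```tablegen
class Inst<string asm> {
  string Asm = asm;
}

// Each def inside the multiclass becomes one record per defm.
multiclass RRI<string op> {
  def rr : Inst<!strconcat(op, " reg, reg")>;
  def ri : Inst<!strconcat(op, " reg, imm")>;
}

// Defines the records ADDrr and ADDri.
defm ADD : RRI<"add">;
```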

Appendix C: Sample Record illustrates a complex record in the Intel X86 target and the simple way in which it is defined.

1.2 Source Files

TableGen source files are plain ASCII text files. The files can contain statements, comments, and blank lines (see Lexical Analysis). The standard file extension for TableGen files is .td.

TableGen files can grow quite large, so there is an include mechanism that allows one file to include the content of another file (see Include Files). This allows large files to be broken up into smaller ones, and also provides a simple library mechanism where multiple source files can include the same library file.

TableGen supports a simple preprocessor that can be used to conditionalize portions of .td files. See Preprocessing Facilities for more information.

1.3 Lexical Analysis

The lexical and syntax notation used here is intended to imitate Python’s notation. In particular, for lexical definitions, the productions operate at the character level and there is no implied whitespace between elements. The syntax definitions operate at the token level, so there is implied whitespace between tokens.

TableGen supports BCPL-style comments (// ...) and nestable C-style comments (/* ... */). TableGen also provides simple Preprocessing Facilities.

Formfeed characters may be used freely in files to produce page breaks when the file is printed for review.

The following are the basic punctuation tokens:

- + [ ] { } ( ) < > : ; . ... = ? #

1.3.1 Literals

Numeric literals take one of the following forms:

TokInteger     ::=  DecimalInteger | HexInteger | BinInteger
DecimalInteger ::=  ["+" | "-"] ("0"..."9")+
HexInteger     ::=  "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
BinInteger     ::=  "0b" ("0" | "1")+

Observe that the DecimalInteger token includes the optional + or - sign, unlike most languages where the sign would be treated as a unary operator.
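The three literal forms can be sketched as follows. The record name Literals and its field names are hypothetical.

```tablegen
def Literals {
  int Dec = 42;
  int Neg = -42;          // the "-" is part of the DecimalInteger token
  int Hex = 0x2A;         // 42 in hexadecimal
  bits<6> Bin = 0b101010; // 42 written as six binary digits
}
```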

TableGen has two kinds of string literals:

TokString ::=  '"' (non-'"' characters and escapes) '"'
TokCode   ::=  "[{" (shortest text not containing "}]") "}]"

A TokCode is nothing more than a multi-line string literal delimited by [{ and }]. It can break across lines and the line breaks are retained in the string.

The current implementation accepts the following escape sequences:

\\ \' \" \t \n
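A short sketch of both kinds of string literal follows. The record and field names are hypothetical, and the C++ text inside the code literal is only sample content.

```tablegen
def Strings {
  // An ordinary string literal containing a tab escape.
  string Plain = "tab\there";

  // A code literal; the line breaks inside [{ ... }] are retained.
  code Body = [{
    return X + 1;
  }];
}
```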

1.3.2 Identifiers

TableGen has name- and identifier-like tokens, which are case-sensitive.

ualpha        ::=  "a"..."z" | "A"..."Z" | "_"
TokIdentifier ::=  ("0"..."9")* ualpha (ualpha | "0"..."9")*
TokVarName    ::=  "$" ualpha (ualpha |  "0"..."9")*

Note that, unlike most languages, TableGen allows TokIdentifier to begin with an integer. In case of ambiguity, a token is interpreted as a numeric literal rather than an identifier.

TableGen has the following reserved keywords, which cannot be used as identifiers:

assert     bit           bits          class         code
dag        def           dump          else          false
foreach    defm          defset        defvar        field
if         in            include       int           let
list       multiclass    string        then          true

Warning

The field reserved word is deprecated, except when used with the CodeEmitterGen backend where it’s used to distinguish normal record fields from encoding fields.

1.3.3 Bang operators

TableGen provides “bang operators” that have a wide variety of uses:

BangOperator ::=  one of
                  !add         !and         !cast        !con         !dag
                  !div         !empty       !eq          !exists      !filter
                  !find        !foldl       !foreach     !ge          !getdagarg
                  !getdagname  !getdagop    !gt          !head        !if
                  !interleave  !isa         !le          !listconcat  !listremove
                  !listsplat   !logtwo      !lt          !mul         !ne
                  !not         !or          !range       !repr        !setdagarg
                  !setdagname  !setdagop    !shl         !size        !sra
                  !srl         !strconcat   !sub         !subst       !substr
                  !tail        !tolower     !toupper     !xor

The !cond operator has a slightly different syntax compared to other bang operators, so it is defined separately:

CondOperator ::=  !cond

See Appendix A: Bang Operators for a description of each bang operator.

1.3.4 Include files

TableGen has an include mechanism. The content of the included file lexically replaces the include directive and is then parsed as if it were originally in the main file.

IncludeDirective ::=  "include" TokString

Portions of the main file and included files can be conditionalized using preprocessor directives.

PreprocessorDirective ::=  "#define" | "#ifdef" | "#ifndef"
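The following sketch combines an include with a conditional region. The file name Common.td and the macro ENABLE_EXTRAS are hypothetical; the included file must exist on the include path for this fragment to parse. The region is closed with #endif, which the preprocessor also provides (see Preprocessing Facilities).

```tablegen
// Hypothetical library file shared by several .td files.
include "Common.td"

// Hypothetical macro controlling the conditional region below.
#define ENABLE_EXTRAS

#ifdef ENABLE_EXTRAS
def Extra;
#endif
```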

1.4 Types

The TableGen language is statically typed, using a simple but complete type system. Types are used to check for errors, to perform implicit conversions, and to help interface designers constrain the allowed input. Every value is required to have an associated type.

TableGen supports a mixture of low-level types (e.g., bit) and high-level types (e.g., dag). This flexibility allows you to describe a wide range of records conveniently and compactly.

Type    ::=  "bit" | "int" | "string" | "dag"
            | "bits" "<" TokInteger ">"
            | "list" "<" Type ">"
            | ClassID
ClassID ::=  TokIdentifier

bit

A bit is a boolean value that can be 0 or 1.

int

The int type represents a simple 64-bit integer value, such as 5 or -42.

string

The string type represents an ordered sequence of characters of arbitrary length.

bits<n>

The bits type is a fixed-sized integer of arbitrary length n that is treated as separate bits. These bits can be accessed individually. A field of this type is useful for representing an instruction operation code, register number, or address mode/register/displacement. The bits of the field can be set individually or as subfields. For example, in an instruction address, the addressing mode, base register number, and displacement can be set separately.
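As an illustration, the hypothetical record below sets a three-bit subfield and a single bit of an 8-bit field separately.

```tablegen
def Enc {
  bits<8> Inst = 0;
  let Inst{7...5} = 0b101;  // set bits 7 through 5 as a subfield
  let Inst{0} = 1;          // set bit 0 individually
  // Inst now holds the bit pattern 10100001.
}
```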

list<type>

This type represents a list whose elements are of the type specified in angle brackets. The element type is arbitrary; it can even be another list type. List elements are indexed from 0.

dag

This type represents a nestable directed acyclic graph (DAG) of nodes. Each node has an operator and zero or more arguments (or operands). An argument can be another dag object, allowing an arbitrary tree of nodes and edges. As an example, DAGs are used to represent code patterns for use by the code generator instruction selection algorithms. See Directed acyclic graphs (DAGs) for more details.

ClassID

Specifying a class name in a type context indicates that the type of the defined value must be a subclass of the specified class. This is useful in conjunction with the list type; for example, to constrain the elements of the list to a common base class (e.g., a list<Register> can only contain definitions derived from the Register class). The ClassID must name a class that has been previously declared or defined.
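A minimal sketch of a class used as a list element type is shown here. The Register class in this fragment is a simplified stand-in, not the one defined by LLVM, and the record names are hypothetical.

```tablegen
class Register<string n> {
  string Name = n;
}

def R0 : Register<"r0">;
def R1 : Register<"r1">;

def Pair {
  // Every element must be a record derived from Register.
  list<Register> Regs = [R0, R1];
}
```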

1.5 Values and Expressions

There are many contexts in TableGen statements where a value is required. A common example is in the definition of a record, where each field is specified by a name and an optional value. TableGen allows for a reasonable number of different forms when building up value expressions. These forms allow the TableGen file to be written in a syntax that is natural for the application.

Note that all of the values have rules for converting them from one type to another. For example, these rules allow you to assign a value like 7 to an entity of type bits<4>.

Value         ::=  SimpleValue ValueSuffix*
                  | Value "#" [Value]
ValueSuffix   ::=  "{" RangeList "}"
                  | "[" SliceElements "]"
                  | "." TokIdentifier
RangeList     ::=  RangePiece ("," RangePiece)*
RangePiece    ::=  TokInteger
                  | TokInteger "..." TokInteger
                  | TokInteger "-" TokInteger
                  | TokInteger TokInteger
SliceElements ::=  (SliceElement ",")* SliceElement ","?
SliceElement  ::=  Value
                  | Value "..." Value
                  | Value "-" Value
                  | Value TokInteger

Warning

The peculiar last form of RangePiece and SliceElement is due to the fact that the “-” is included in the TokInteger, hence 1-5 gets lexed as two consecutive tokens, with values 1 and -5, instead of “1”, “-”, and “5”. The use of hyphen as the range punctuation is deprecated.

1.5.1 Simple values

The SimpleValue has a number of forms.

SimpleValue ::=  TokInteger | TokString+ | TokCode

A value can be an integer literal, a string literal, or a code literal. Multiple adjacent string literals are concatenated as in C/C++; the simple value is the concatenation of the strings. Code literals become strings and are then indistinguishable from them.
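For instance, in this hypothetical record the two adjacent literals form a single string value:

```tablegen
def Msg {
  // Equivalent to "Hello, world".
  string Text = "Hello, " "world";
}
```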

SimpleValue2 ::=  "true" | "false"

The true and false literals are essentially syntactic sugar for the integer values 1 and 0. They improve the readability of TableGen files when boolean values are used in field initializations, bit sequences, if statements, etc. When parsed, these literals are converted to integers.

Note

Although true and false are literal names for 1 and 0, we recommend as a stylistic rule that you use them for boolean values only.

SimpleValue3 ::=  "?"

A question mark represents an uninitialized value.

SimpleValue4 ::=  "{" [ValueList] "}"
ValueList    ::=  ValueListNE
ValueListNE  ::=  Value ("," Value)*

This value represents a sequence of bits, which can be used to initialize a bits<n> field (note the braces). When doing so, the values must represent a total of n bits.
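A small sketch, using a hypothetical record, of a braces initializer that supplies exactly four bits for a bits<4> field:

```tablegen
def BitSeq {
  // Four values, one per bit of the field.
  bits<4> Flags = { 1, 0, 1, 1 };
}
```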

SimpleValue5 ::=  "[" ValueList "]" ["<" Type ">"]

This value is a list initializer (note the brackets). The values in brackets are the elements of the list. The optional Type can be used to indicate a specific element type; otherwise the element type is inferred from the given values. TableGen can usually infer the type, although sometimes not when the value is the empty list ([]).
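The two forms of list initializer can be sketched as follows; the record and field names are hypothetical.

```tablegen
def Lists {
  // Element type inferred from the values.
  list<int> Primes = [2, 3, 5, 7];

  // Empty list with the element type stated explicitly.
  list<int> Empty = []<int>;
}
```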

SimpleValue6 ::=  "(" DagArg [DagArgList] ")"
DagArgList   ::=  DagArg ("," DagArg)*
DagArg       ::=  Value [":" TokVarName] | TokVarName

This represents a DAG initializer (note the parentheses). The first DagArg is called the “operator” of the DAG and must be a record. See Directed acyclic graphs (DAGs) for more details.
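As a sketch, assuming a hypothetical operator record named add_op, DAG initializers with plain, named, and nested arguments might look like this:

```tablegen
// The operator of a DAG must be a record.
def add_op;

def Pattern {
  dag Simple = (add_op 1, 2);
  dag Named  = (add_op 1:$lhs, 2:$rhs);      // arguments tagged with names
  dag Nested = (add_op (add_op 1, 2), 3);    // an argument can be a DAG
}
```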

SimpleValue7 ::=  TokIdentifier

The resulting value is the value of the entity named by the identifier. The possible identifiers are described here, but the descriptions will make more sense after reading the remainder of this guide.

  • A template argument of a class, such as the use of Bar in:

    class Foo <int Bar> {
      int Baz = Bar;
    }
    
  • The implicit template argument NAME in a class or multiclass definition (see NAME).

  • A field local to a class, such as the use of Bar in:

    class Foo {
      int Bar = 5;
      int Baz = Bar;
    }
    
  • The name of a record definition, such as the use of Bar in the definition of Foo:

    def Bar : SomeClass {
      int X = 5;
    }
    
    def Foo {
      SomeClass Baz = Bar;
    }
    
  • A field local to a record definition, such as the use of Bar in:

    def Foo {
      int Bar = 5;
      int Baz = Bar;
    }
    

    Fields inherited from the record’s parent classes can be accessed the same way.

  • A template argument of a multiclass, such as the use of Bar in:

    multiclass Foo <int Bar> {
      def : SomeClass<Bar>;
    }
    
  • A variable defined with the defvar or defset statements.

  • The iteration variable of a foreach, such as the use of i in:

    foreach i = 0...5 in
      def Foo#i;
    
SimpleValue8 ::=  ClassID "<" ArgValueList ">"

This form creates a new anonymous record definition (as would be created by an unnamed def inheriting from the given class with the given template arguments; see def) and the value is that record. A field of the record can be obtained using a suffix; see Suffixed Values.
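A brief sketch of this form, with the hypothetical class Square acting as a one-argument subroutine:

```tablegen
class Square<int n> {
  int Result = !mul(n, n);
}

def Calc {
  // Square<3> creates an anonymous record; .Result extracts its field.
  int Nine = Square<3>.Result;
}
```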

Invoking a class in this manner can provide a simple subroutine facility. See Using Classes as Subroutines for more information.

SimpleValue9 ::=  BangOperator ["<" Type ">"] "(" ValueListNE ")"
                 | CondOperator "(" CondClause ("," CondClause)* ")"
CondClause   ::=  Value ":" Value

The bang operators provide functions that are not available with the other simple values. Except in the case of !cond, a bang operator takes a list of arguments enclosed in parentheses and performs some function on those arguments, producing a value for that bang operator. The !cond operator takes a list of pairs of arguments separated by colons. See Appendix A: Bang Operators for a description of each bang operator.
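The fragment below sketches a few bang operators, including the colon-separated clauses of !cond. The record and field names are hypothetical.

```tablegen
def Bang {
  int Sum = !add(1, 2, 3);                    // 6
  string Joined = !strconcat("foo", "bar");   // "foobar"

  // The first clause whose condition is true supplies the value.
  string Size = !cond(!lt(Sum, 5):  "small",
                      !lt(Sum, 10): "medium",
                      true:         "large");  // "medium"
}
```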

1.5.2 Suffixed values

The SimpleValue values described above can be specified with certain suffixes. The purpose of a suffix is to obtain a subvalue of the primary value. Here are the possible suffixes for some primary value.

value{17}

The final value is bit 17 of the integer value (note the braces).

value{8...15}

The final value is bits 8–15 of the integer value. The order of the bits can be reversed by specifying {15...8}.

value[i]

The final value is element i of the list value (note the brackets). In other words, the brackets act as a subscripting operator on the list. This is the case only when a single element is specified.

value[i,]

The final value is a list that contains a single element i of the list. In short, a list slice with a single element.

value[4...7,17,2...3,4]

The final value is a new list that is a slice of the list value. The new list contains elements 4, 5, 6, 7, 17, 2, 3, and 4. Elements may be included multiple times and in any order. This is the result only when more than one element is specified.

value[i,m...n,j,ls]

Each element may be an expression (variables, bang operators). The type of m and n should be int. The type of i, j, and ls should be either int or list<int>.

value.field

The final value is the value of the specified field in the specified record value.
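The suffix forms can be sketched together in one place. All record and field names here are hypothetical.

```tablegen
def Other {
  int Field = 7;
}

def Suffix {
  bits<8> B = 0xA1;
  bit Low = B{0};                 // bit 0 of B
  bits<4> High = B{7...4};        // bits 7 through 4 of B

  list<int> L = [10, 20, 30, 40];
  int Second = L[1];              // a single element: 20
  list<int> Tail = L[2...3];      // a slice: [30, 40]

  int FromOther = Other.Field;    // a field of another record
}
```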

1.5.3 The paste operator

The paste operator (#) is the only infix operator available in TableGen expressions. It allows you to concatenate strings or lists, but has a few unusual features.

The paste operator can be used when specifying the record name in a Def or Defm statement, in which case it must construct a string. If an operand is an undefined name (TokIdentifier) or the name of a global Defvar or Defset, it is treated as a verbatim string of characters. The value of a global name is not used.

The paste operator can be used in all other value expressions, in which case it can construct a string or a list. Rather oddly, but consistent with the previous case, if the right-hand-side operand is an undefined name or a global name, it is treated as a verbatim string of characters. The left-hand-side operand is treated normally.

Values can have a trailing paste operator, in which case the left-hand-side operand is concatenated to an empty string.
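As a small sketch of both uses, the loop below pastes an iteration variable into record names and into a string argument. The class Reg is hypothetical.

```tablegen
class Reg<string n> {
  string Name = n;
}

// In the record name, R is an undefined name and is taken verbatim;
// i is the foreach variable and contributes its value.
foreach i = 0...3 in
  def R#i : Reg<"r" # i>;   // defines R0 through R3
```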

Appendix B: Paste Operator Examples presents examples of the behavior of the paste operator.

1.6 Statements

The following statements may appear at the top level of TableGen source files.

TableGenFile ::=  (Statement | IncludeDirective
                 | PreprocessorDirective)*
Statement    ::=  Assert | Class | Def | Defm | Defset | Defvar
                 | Dump  | Foreach | If | Let | MultiClass

The following sections describe each of these top-level statements.

1.6.1 class — define an abstract record class

A class statement defines an abstract record class from which other classes and records can inherit.

Class           ::=  "class" ClassID [TemplateArgList] RecordBody
TemplateArgList ::=  "<" TemplateArgDecl ("," TemplateArgDecl)* ">"
TemplateArgDecl ::=  Type TokIdentifier ["=" Value]

A class can be parameterized by a list of “template arguments,” whose values can be used in the class’s record body. These template arguments are specified each time the class is inherited by another class or record.

If a template argument is not assigned a default value with =, it is uninitialized (has the “value” ?) and must be specified in the template argument list when the class is inherited (required argument). If an argument is assigned a default value, then it need not be specified in the argument list (optional argument). In the declaration, all required template arguments must precede any optional arguments. The template argument default values are evaluated from left to right.
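A minimal sketch of required and optional template arguments, using the hypothetical class Op:

```tablegen
// name is required; latency is optional and defaults to 1.
class Op<string name, int latency = 1> {
  string Name = name;
  int Latency = latency;
}

def ADD : Op<"add">;      // Latency is 1
def DIV : Op<"div", 20>;  // Latency is 20
```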

The RecordBody is defined below. It can include a list of parent classes from which the current class inherits, along with field definitions and other statements. When a class C inherits from another class D, the fields of D are effectively merged into the fields of C.

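A given class can only be defined once. A class statement is considered to define the class if any of the following are true (the RecordBody elements are described below).

  • The TemplateArgList is present, or

  • The ParentClassList in the RecordBody is present, or

  • The Body in the RecordBody is present and not empty.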

You can declare an empty class by specifying an empty TemplateArgList and an empty RecordBody. This can serve as a restricted form of forward declaration. Note that records derived from a forward-declared class will inherit no fields from it, because those records are built when their declarations are parsed, and thus before the class is finally defined.

Every class has an implicit template argument named NAME (uppercase), which is bound to the name of the Def or Defm inheriting from the class. If the class is inherited by an anonymous record, the name is unspecified but globally unique.
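For example, in this sketch the hypothetical class Named copies the implicit argument into a field:

```tablegen
class Named {
  string Label = NAME;
}

def Foo : Named;  // Foo.Label is "Foo"
```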

See Examples: classes and records for examples.

1.6.1.1 Record Bodies

Record bodies appear in both class and record definitions. A record body can include a parent class list, which specifies the classes from which the current class or record inherits fields. Such classes are called the parent classes of the class or record. The record body also includes the main body of the definition, which contains the specification of the fields of the class or record.

RecordBody             ::=  ParentClassList Body
ParentClassList        ::=  [":" ParentClassListNE]
ParentClassListNE      ::=  ClassRef ("," ClassRef)*
ClassRef               ::=  (ClassID | MultiClassID) ["<" [ArgValueList] ">"]
ArgValueList           ::=  PositionalArgValueList [","] NamedArgValueList
PositionalArgValueList ::=  [Value ("," Value)*]
NamedArgValueList      ::=  [NameValue "=" Value ("," NameValue "=" Value)*]

A ParentClassList containing a MultiClassID is valid only in the class list of a defm statement. In that case, the ID must be the name of a multiclass.

The argument values can be specified in two forms:

  • Positional argument (value). The value is assigned to the argument in the corresponding position. For Foo<a0, a1>, a0 will be assigned to the first argument and a1 will be assigned to the second argument.

  • Named argument (name=value). The value is assigned to the argument with the specified name. For Foo<a=a0, b=a1>, a0 will be assigned to the argument with name a and a1 will be assigned to the argument with name b.

Required arguments can also be specified as named arguments.

Note that an argument can be specified only once, whether by position or by name, and that positional arguments must precede named arguments.
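The two argument forms can be sketched with a hypothetical class Box. All three records below end up with the same field values.

```tablegen
class Box<int width, int height = 1> {
  int Width = width;
  int Height = height;
}

def B1 : Box<2, 3>;                   // both positional
def B2 : Box<2, height = 3>;          // positional, then named
def B3 : Box<width = 2, height = 3>;  // both named, including the required one
```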

Body     ::=  ";" | "{" BodyItem* "}"
BodyItem ::=  (Type | "code") TokIdentifier ["=" Value] ";"
             | "let" TokIdentifier ["{" RangeList "}"] "=" Value ";"
             | "defvar" TokIdentifier "=" Value ";"
             | Assert

A field definition in the body specifies a field to be included in the class or record. If no initial value is specified, then the field’s value is uninitialized. The type must be specified; TableGen will not infer it from the value. The keyword code may be used to emphasize that the field has a string value that is code.

The let form is used to reset a field to a new value. This can be done for fields defined directly in the body or fields inherited from parent classes. A RangeList can be specified to reset certain bits in a bits<n> field.

The defvar form defines a variable whose value can be used in other value expressions within the body. The variable is not a field: it does not become a field of the class or record being defined. Variables are provided to hold temporary values while processing the body. See Defvar in a Record Body for more details.
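A brief sketch of a body-level defvar, with hypothetical names:

```tablegen
def Area {
  // side is a temporary variable, not a field of Area.
  defvar side = 4;
  int Value = !mul(side, side);  // 16
}
```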

When class C2 inherits from class C1, it acquires all the field definitions of C1. As those definitions are merged into class C2, any template arguments passed to C1 by C2 are substituted into the definitions. In other words, the abstract record fields defined by C1 are expanded with the template arguments before being merged into C2.

1.6.2 def — define a concrete record

A def statement defines a new concrete record.

Def       ::=  "def" [NameValue] RecordBody
NameValue ::=  Value (parsed in a special mode)

The name value is optional. If specified, it is parsed in a special mode where undefined (unrecognized) identifiers are interpreted as literal strings. In particular, global identifiers are considered unrecognized. These include global variables defined by defvar and defset. A record name can be the null string.

If no name value is given, the record is anonymous. The final name of an anonymous record is unspecified but globally unique.

Special handling occurs if a def appears inside a multiclass statement. See the multiclass section below for details.

A record can inherit from one or more classes by specifying the ParentClassList clause at the beginning of its record body. All of the fields in the parent classes are added to the record. If two or more parent classes provide the same field, the record ends up with the field value of the last parent class.
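The rule for a field provided by more than one parent can be sketched as follows. The classes P1 and P2 are hypothetical.

```tablegen
class P1 { int X = 1; }
class P2 { int X = 2; }

// Both parents provide X; the last parent in the list wins.
def D : P1, P2;  // D.X is 2
```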

As a special case, the name of a record can be passed as a template argument to that record’s parent classes. For example:

class A <dag d> {
  dag the_dag = d;
}

def rec1 : A<(ops rec1)>;

The DAG (ops rec1) is passed as a template argument to class A. Notice that the DAG includes rec1, the record being defined.

The steps taken to create a new record are somewhat complex. See How records are built.

See Examples: classes and records for examples.

1.6.3 Examples: classes and records

Here is a simple TableGen file with one class and two record definitions.

class C {
  bit V = true;
}

def X : C;
def Y : C {
  let V = false;
  string Greeting = "Hello!";
}

First, the abstract class C is defined. It has one field named V that is a bit initialized to true.

Next, two records are defined, derived from class C; that is, with C as their parent class. Thus they both inherit the V field. Record Y also defines another string field, Greeting, which is initialized to "Hello!". In addition, Y overrides the inherited V field, setting it to false.

A class is useful for isolating the common features of multiple records in one place. A class can initialize common fields to default values, but records inheriting from that class can override the defaults.

TableGen supports the definition of parameterized classes as well as nonparameterized ones. Parameterized classes specify a list of variable declarations, which may optionally have defaults, that are bound when the class is specified as a parent class of another class or record.

class FPFormat <bits<3> val> {
  bits<3> Value = val;
}

def NotFP      : FPFormat<0>;
def ZeroArgFP  : FPFormat<1>;
def OneArgFP   : FPFormat<2>;
def OneArgFPRW : FPFormat<3>;
def TwoArgFP   : FPFormat<4>;
def CompareFP  : FPFormat<5>;
def CondMovFP  : FPFormat<6>;
def SpecialFP  : FPFormat<7>;

The purpose of the FPFormat class is to act as a sort of enumerated type. It provides a single field, Value, which holds a 3-bit number. Its template argument, val, is used to set the Value field. Each of the eight records is defined with FPFormat as its parent class. The enumeration value is passed in angle brackets as the template argument. Each record will inherit the Value field with the appropriate enumeration value.

Here is a more complex example of classes with template arguments. First, we define a class similar to the FPFormat class above. It takes a template argument and uses it to initialize a field named Value. Then we define four records that inherit the Value field with its four different integer values.

class ModRefVal <bits<2> val> {
  bits<2> Value = val;
}

def None   : ModRefVal<0>;
def Mod    : ModRefVal<1>;
def Ref    : ModRefVal<2>;
def ModRef : ModRefVal<3>;

This is somewhat contrived, but let’s say we would like to examine the two bits of the Value field independently. We can define a class that accepts a ModRefVal record as a template argument and splits up its value into two fields, one bit each. Then we can define records that inherit from ModRefBits and so acquire two fields from it, one for each bit in the ModRefVal record passed as the template argument.

class ModRefBits <ModRefVal mrv> {
  // Break the value up into its bits, which can provide a nice
  // interface to the ModRefVal values.
  bit isMod = mrv.Value{0};
  bit isRef = mrv.Value{1};
}

// Example uses.
def foo   : ModRefBits<Mod>;
def bar   : ModRefBits<Ref>;
def snork : ModRefBits<ModRef>;

This illustrates how one class can be defined to reorganize the fields in another class, thus hiding the internal representation of that other class.

Running llvm-tblgen on the example prints the following definitions:

def bar {      // Value
  bit isMod = 0;
  bit isRef = 1;
}
def foo {      // Value
  bit isMod = 1;
  bit isRef = 0;
}
def snork {      // Value
  bit isMod = 1;
  bit isRef = 1;
}

1.6.4 let — override fields in classes or records

A let statement collects a set of field values (sometimes called bindings) and applies them to all the classes and records defined by statements within the scope of the let.

Let     ::=   "let" LetList "in" "{" Statement* "}"
            | "let" LetList "in" Statement
LetList ::=  LetItem ("," LetItem)*
LetItem ::=  TokIdentifier ["<" RangeList ">"] "=" Value

The let statement establishes a scope, which is a sequence of statements in braces or a single statement with no braces. The bindings in the LetList apply to the statements in that scope.

The field names in the LetList must name fields in classes inherited by the classes and records defined in the statements. The field values are applied to the classes and records after the records inherit all the fields from their parent classes. So the let acts to override inherited field values. A let cannot override the value of a template argument.

Top-level let statements are often useful when a few fields need to be overridden in several records. Here are two examples. Note that let statements can be nested.

let isTerminator = true, isReturn = true, isBarrier = true, hasCtrlDep = true in
  def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;

let isCall = true in
  // All calls clobber the non-callee saved registers...
  let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
              MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, XMM0, XMM1, XMM2,
              XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
    def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst, variable_ops),
                           "call\t${dst:call}", []>;
    def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
                        "call\t{*}$dst", [(X86call GR32:$dst)]>;
    def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
                        "call\t{*}$dst", []>;
  }

Note that a top-level let will not override fields defined in the classes or records themselves.
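
As a minimal sketch of this ordering (the class Inst and the records below are hypothetical and are not taken from any LLVM target), a top-level let overrides an inherited default, while a let in the record body is applied afterwards and therefore wins:

class Inst {
  bit isBranch = false;
  bit isReturn = false;
}

let isBranch = true in {
  // Inherits isBranch = false, then the top-level let sets it to true.
  def JMP : Inst;

  // The body let is applied after the top-level let, so both can coexist.
  def RET : Inst {
    let isReturn = true;
  }

  // The body let runs last, so isBranch ends up false here.
  def NOP : Inst {
    let isBranch = false;
  }
}

Under these assumptions, JMP and RET end up with isBranch set, and NOP does not.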

1.6.5 multiclass — define multiple records

While classes with template arguments are a good way to factor out commonality between multiple records, multiclasses allow a convenient method for defining many records at once. For example, consider a 3-address instruction architecture whose instructions come in two formats: reg = reg op reg and reg = reg op imm (e.g., SPARC). We would like to specify in one place that these two common formats exist, then in a separate place specify what all the operations are. The multiclass and defm statements accomplish this goal. You can think of a multiclass as a macro or template that expands into multiple records.

MultiClass          ::=  "multiclass" TokIdentifier [TemplateArgList]
                         ParentClassList
                         "{" MultiClassStatement+ "}"
MultiClassID        ::=  TokIdentifier
MultiClassStatement ::=  Assert | Def | Defm | Defvar | Foreach | If | Let

As with regular classes, the multiclass has a name and can accept template arguments. A multiclass can inherit from other multiclasses, which causes the other multiclasses to be expanded and contribute to the record definitions in the inheriting multiclass. The body of the multiclass contains a series of statements that define records, using Def and Defm. In addition, Defvar, Foreach, and Let statements can be used to factor out even more common elements. The If and Assert statements can also be used.

Also as with regular classes, the multiclass has the implicit template argument NAME (see NAME). When a named (non-anonymous) record is defined in a multiclass and the record’s name does not include a use of the template argument NAME, such a use is automatically prepended to the name. That is, the following are equivalent inside a multiclass:

def Foo ...
def NAME # Foo ...

The records defined in a multiclass are created when the multiclass is “instantiated” or “invoked” by a defm statement outside the multiclass definition. Each def statement in the multiclass produces a record. As with top-level def statements, these definitions can inherit from multiple parent classes.

See Examples: multiclasses and defms for examples.

1.6.6 defm — invoke multiclasses to define multiple records

Once multiclasses have been defined, you use the defm statement to “invoke” them and process the multiple record definitions in those multiclasses. Those record definitions are specified by def statements in the multiclasses, and indirectly by defm statements.

Defm ::=  "defm" [NameValue] ParentClassList ";"

The optional NameValue is formed in the same way as the name of a def. The ParentClassList is a colon followed by a list of at least one multiclass and any number of regular classes. The multiclasses must precede the regular classes. Note that the defm does not have a body.

This statement instantiates all the records defined in all the specified multiclasses, either directly by def statements or indirectly by defm statements. These records also receive the fields defined in any regular classes included in the parent class list. This is useful for adding a common set of fields to all the records created by the defm.

The name is parsed in the same special mode used by def. If the name is not included, an unspecified but globally unique name is provided. That is, the following examples end up with different names:

defm    : SomeMultiClass<...>;   // A globally unique name.
defm "" : SomeMultiClass<...>;   // An empty name.

The defm statement can be used in a multiclass body. When this occurs, the second variant is equivalent to:

defm NAME : SomeMultiClass<...>;

More generally, when defm occurs in a multiclass and its name does not include a use of the implicit template argument NAME, then NAME will be prepended automatically. That is, the following are equivalent inside a multiclass:

defm Foo        : SomeMultiClass<...>;
defm NAME # Foo : SomeMultiClass<...>;

See Examples: multiclasses and defms for examples.

1.6.7 Examples: multiclasses and defms

Here is a simple example using multiclass and defm. Consider a 3-address instruction architecture whose instructions come in two formats: reg = reg op reg and reg = reg op imm (immediate). SPARC is an example of such an architecture.

def ops;
def GPR;
def Imm;
class inst <int opc, string asmstr, dag operandlist>;

multiclass ri_inst <int opc, string asmstr> {
  def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
  def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
}

// Define records for each instruction in the RR and RI formats.
defm ADD : ri_inst<0b111, "add">;
defm SUB : ri_inst<0b101, "sub">;
defm MUL : ri_inst<0b100, "mul">;

Each use of the ri_inst multiclass defines two records, one with the _rr suffix and one with _ri. Recall that the name of the defm that uses a multiclass is prepended to the names of the records defined in that multiclass. So the resulting definitions are named:

ADD_rr, ADD_ri
SUB_rr, SUB_ri
MUL_rr, MUL_ri

Without the multiclass feature, the instructions would have to be defined as follows.

def ops;
def GPR;
def Imm;
class inst <int opc, string asmstr, dag operandlist>;

class rrinst <int opc, string asmstr>
  : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;

class riinst <int opc, string asmstr>
  : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;

// Define records for each instruction in the RR and RI formats.
def ADD_rr : rrinst<0b111, "add">;
def ADD_ri : riinst<0b111, "add">;
def SUB_rr : rrinst<0b101, "sub">;
def SUB_ri : riinst<0b101, "sub">;
def MUL_rr : rrinst<0b100, "mul">;
def MUL_ri : riinst<0b100, "mul">;

A defm can be used in a multiclass to “invoke” other multiclasses and create the records defined in those multiclasses in addition to the records defined in the current multiclass. In the following example, the basic_s and basic_p multiclasses contain defm statements that refer to the basic_r multiclass. The basic_r multiclass contains only def statements.

class Instruction <bits<4> opc, string Name> {
  bits<4> opcode = opc;
  string name = Name;
}

multiclass basic_r <bits<4> opc> {
  def rr : Instruction<opc, "rr">;
  def rm : Instruction<opc, "rm">;
}

multiclass basic_s <bits<4> opc> {
  defm SS : basic_r<opc>;
  defm SD : basic_r<opc>;
  def X : Instruction<opc, "x">;
}

multiclass basic_p <bits<4> opc> {
  defm PS : basic_r<opc>;
  defm PD : basic_r<opc>;
  def Y : Instruction<opc, "y">;
}

defm ADD : basic_s<0xf>, basic_p<0xf>;

The final defm creates the following records, five from the basic_s multiclass and five from the basic_p multiclass:

ADDSSrr, ADDSSrm
ADDSDrr, ADDSDrm
ADDX
ADDPSrr, ADDPSrm
ADDPDrr, ADDPDrm
ADDY

A defm statement, both at top level and in a multiclass, can inherit from regular classes in addition to multiclasses. The rule is that the regular classes must be listed after the multiclasses, and there must be at least one multiclass.

class XD {
  bits<4> Prefix = 11;
}
class XS {
  bits<4> Prefix = 12;
}
class I <bits<4> op> {
  bits<4> opcode = op;
}

multiclass R {
  def rr : I<4>;
  def rm : I<2>;
}

multiclass Y {
  defm SS : R, XD;    // First multiclass R, then regular class XD.
  defm SD : R, XS;
}

defm Instr : Y;

This example will create four records, shown here in alphabetical order with their fields.

def InstrSDrm {
  bits<4> opcode = { 0, 0, 1, 0 };
  bits<4> Prefix = { 1, 1, 0, 0 };
}

def InstrSDrr {
  bits<4> opcode = { 0, 1, 0, 0 };
  bits<4> Prefix = { 1, 1, 0, 0 };
}

def InstrSSrm {
  bits<4> opcode = { 0, 0, 1, 0 };
  bits<4> Prefix = { 1, 0, 1, 1 };
}

def InstrSSrr {
  bits<4> opcode = { 0, 1, 0, 0 };
  bits<4> Prefix = { 1, 0, 1, 1 };
}

It’s also possible to use let statements inside multiclasses, providing another way to factor out commonality from the records, especially when using several levels of multiclass instantiations.

multiclass basic_r <bits<4> opc> {
  let Predicates = [HasSSE2] in {
    def rr : Instruction<opc, "rr">;
    def rm : Instruction<opc, "rm">;
  }
  let Predicates = [HasSSE3] in
    def rx : Instruction<opc, "rx">;
}

multiclass basic_ss <bits<4> opc> {
  let IsDouble = false in
    defm SS : basic_r<opc>;

  let IsDouble = true in
    defm SD : basic_r<opc>;
}

defm ADD : basic_ss<0xf>;

1.6.8 defset — create a definition set

The defset statement is used to collect a set of records into a global list of records.

Defset ::=  "defset" Type TokIdentifier "=" "{" Statement* "}"

All records defined inside the braces via def and defm are defined as usual, and they are also collected in a global list of the given name (TokIdentifier).

The specified type must be list<class>, where class is some record class. The defset statement establishes a scope for its statements. It is an error to define a record in the scope of the defset that is not of type class.

The defset statement can be nested. The inner defset adds the records to its own set, and all those records are also added to the outer set.

Anonymous records created inside initialization expressions using the ClassID<...> syntax are not collected in the set.
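
The following sketch shows one way nested defsets might be used. The class Reg and all record and set names are hypothetical, chosen only for illustration.

class Reg<int n> {
  int Num = n;
}

defset list<Reg> AllRegs = {
  def R0 : Reg<0>;

  // Records in the inner set are also added to the outer set.
  defset list<Reg> HighRegs = {
    def R8 : Reg<8>;
    def R9 : Reg<9>;
  }
}

def Summary {
  int Total = !size(AllRegs);    // R0, R8, and R9
  int High  = !size(HighRegs);   // R8 and R9
}

Here Total should resolve to 3 and High to 2, since R8 and R9 belong to both sets.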

1.6.9 defvar — define a variable

A defvar statement defines a global variable. Its value can be used throughout the statements that follow the definition.

Defvar ::=  "defvar" TokIdentifier "=" Value ";"

The identifier on the left of the = is defined to be a global variable whose value is given by the value expression on the right of the =. The type of the variable is automatically inferred.

Once a variable has been defined, it cannot be set to another value.

Variables defined in a top-level foreach go out of scope at the end of each loop iteration, so their value in one iteration is not available in the next iteration. The following defvar will not work:

defvar i = !add(i, 1);

Variables can also be defined with defvar in a record body. See Defvar in a Record Body for more details.
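
As a small illustration (all names below are hypothetical), a global defvar can hold a constant that later statements reuse:

defvar WordBits  = 32;
defvar WordBytes = !div(WordBits, 8);   // Type inferred as int.

class Mem<int words> {
  int SizeInBytes = !mul(words, WordBytes);
}

def Cache : Mem<16>;   // SizeInBytes = 64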

1.6.10 foreach — iterate over a sequence of statements

The foreach statement iterates over a series of statements, varying a variable over a sequence of values.

Foreach         ::=  "foreach" ForeachIterator "in" "{" Statement* "}"
                    | "foreach" ForeachIterator "in" Statement
ForeachIterator ::=  TokIdentifier "=" ("{" RangeList "}" | RangePiece | Value)

The body of the foreach is a series of statements in braces or a single statement with no braces. The statements are re-evaluated once for each value in the range list, range piece, or single value. On each iteration, the TokIdentifier variable is set to the value and can be used in the statements.

The statement list establishes an inner scope. Variables local to a foreach go out of scope at the end of each loop iteration, so their values do not carry over from one iteration to the next. Foreach loops may be nested.

foreach i = [0, 1, 2, 3] in {
  def R#i : Register<...>;
  def F#i : Register<...>;
}

This loop defines records named R0, R1, R2, and R3, along with F0, F1, F2, and F3.
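
The iterator can also be a range piece or a range list in braces. The sketch below assumes a hypothetical class Reg and shows these forms, together with a defvar that is local to each iteration:

class Reg<int n> {
  int Num = n;
}

// Range piece: defines A0, A1, A2, and A3.
foreach i = 0...3 in
  def A#i : Reg<i>;

// Range list in braces: defines B0, B1, and B4.
foreach i = {0...1, 4} in
  def B#i : Reg<i>;

// A defvar in the body is redefined on every iteration.
foreach i = 0...3 in {
  defvar doubled = !mul(i, 2);
  def D#i : Reg<doubled>;   // D0..D3 with Num = 0, 2, 4, 6
}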

1.6.11 dump — print messages to stderr

A dump statement prints the input string to standard error output. It is intended for debugging purposes.

  • At top level, the message is printed immediately.

  • Within a record/class/multiclass, dump gets evaluated at each instantiation point of the containing record.

Dump ::=  "dump" string ";"

For example, it can be used in combination with !repr to investigate the values passed to a multiclass:

multiclass MC<dag s> {
  dump "s = " # !repr(s);
}

1.6.12 if — select statements based on a test

The if statement allows one of two statement groups to be selected based on the value of an expression.

If     ::=  "if" Value "then" IfBody
           | "if" Value "then" IfBody "else" IfBody
IfBody ::=  "{" Statement* "}" | Statement

The value expression is evaluated. If it evaluates to true (in the same sense used by the bang operators), then the statements following the then reserved word are processed. Otherwise, if there is an else reserved word, the statements following the else are processed. If the value is false and there is no else arm, no statements are processed.

Because the braces around the then statements are optional, this grammar rule has the usual ambiguity with “dangling else” clauses, and it is resolved in the usual way: in a case like if v1 then if v2 then {...} else {...}, the else associates with the inner if rather than the outer one.

The IfBody of each arm (then and else) establishes an inner scope. Any defvar variables defined in the bodies go out of scope when the bodies are finished (see Defvar in a Record Body for more details).

The if statement can also be used in a record Body.
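
A common use is selecting records inside a multiclass based on a template argument. The following is a sketch only; the class Inst and the multiclass BinOp are hypothetical names.

class Inst<string fmt> {
  string Format = fmt;
}

multiclass BinOp<bit hasImm> {
  def _rr : Inst<"rr">;
  // The immediate form is defined only when requested.
  if hasImm then
    def _ri : Inst<"ri">;
}

defm ADD : BinOp<true>;    // Defines ADD_rr and ADD_ri.
defm XOR : BinOp<false>;   // Defines XOR_rr only.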

1.6.13 assert — check that a condition is true

The assert statement checks a boolean condition to be sure that it is true and prints an error message if it is not.

Assert ::=  "assert" condition "," message ";"

If the boolean condition is true, the statement does nothing. If the condition is false, it prints a nonfatal error message. The message, which can be an arbitrary string expression, is included in the error message as a note. The exact behavior of the assert statement depends on its placement.

  • At top level, the assertion is checked immediately.

  • In a record definition, the statement is saved and all assertions are checked after the record is completely built.

  • In a class definition, the assertions are saved and inherited by all the subclasses and records that inherit from the class. The assertions are then checked when the records are completely built.

  • In a multiclass definition, the assertions are saved with the other components of the multiclass and then checked each time the multiclass is instantiated with defm.

Using assertions in TableGen files can simplify record checking in TableGen backends. Here is an example of an assert in two class definitions.

class PersonName<string name> {
  assert !le(!size(name), 32), "person name is too long: " # name;
  string Name = name;
}

class Person<string name, int age> : PersonName<name> {
  assert !and(!ge(age, 1), !le(age, 120)), "person age is invalid: " # age;
  int Age = age;
}

def Rec20 : Person<"Donald Knuth", 60> {
  ...
}

1.7 Additional Details

1.7.1 Directed acyclic graphs (DAGs)

A directed acyclic graph can be represented directly in TableGen using the dag datatype. A DAG node consists of an operator and zero or more arguments (or operands). Each argument can be of any desired type. By using another DAG node as an argument, an arbitrary graph of DAG nodes can be built.

The syntax of a dag instance is:

( operator argument1, argument2, ... )

The operator must be present and must be a record. There can be zero or more arguments, separated by commas. The operator and arguments can have three formats.

Format        Meaning
value         argument value
value:name    argument value and associated name
name          argument name with unset (uninitialized) value

The value can be any TableGen value. The name, if present, must be a TokVarName, which starts with a dollar sign ($). The purpose of a name is to tag an operator or argument in a DAG with a particular meaning, or to associate an argument in one DAG with a like-named argument in another DAG.

The following bang operators are useful for working with DAGs: !con, !dag, !empty, !foreach, !getdagarg, !getdagname, !getdagop, !setdagarg, !setdagname, !setdagop, !size.
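
As a brief sketch of building and inspecting a DAG (the records ops, GPR, and Imm and the record Example are placeholders, in the spirit of the multiclass examples earlier in this document):

def ops;
def GPR;
def Imm;

def Example {
  // A DAG with operator ops and two named arguments.
  dag Base = (ops GPR:$dst, GPR:$src1);

  // Concatenate a second DAG that has the same operator.
  dag Full = !con(Base, (ops Imm:$src2));

  // The operator is not counted as an argument.
  int NumArgs = !size(Full);
}

With these definitions, Full should resolve to (ops GPR:$dst, GPR:$src1, Imm:$src2) and NumArgs to 3.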

1.7.2 Defvar in a record body

In addition to defining global variables, the defvar statement can be used inside the Body of a class or record definition to define local variables. Template arguments of the enclosing class or multiclass can be used in the value expression. The scope of the variable extends from the defvar statement to the end of the body. It cannot be set to a different value within its scope. The defvar statement can also be used in the statement list of a foreach, which establishes a scope.

A variable named V in an inner scope shadows (hides) any variables V in outer scopes. In particular, there are several cases:

  • V in a record body shadows a global V.

  • V in a record body shadows template argument V.

  • V in template arguments shadows a global V.

  • V in a foreach statement list shadows any V in surrounding record or global scopes.

Variables defined in a foreach go out of scope at the end of each loop iteration, so their value in one iteration is not available in the next iteration. The following defvar will not work:

defvar i = !add(i, 1);
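
By contrast, a defvar in a class body that is computed from template arguments works as expected. This is a minimal sketch with hypothetical names:

class Rect<int w, int h> {
  // Local to the class body; not itself a field of the record.
  defvar area = !mul(w, h);

  int Area = area;
  int DoubleArea = !mul(area, 2);
}

def R : Rect<3, 4>;   // Area = 12, DoubleArea = 24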

1.7.3 How records are built

The following steps are taken by TableGen when a record is built. Classes are simply abstract records and so go through the same steps.

  1. Build the record name (NameValue) and create an empty record.

  2. Parse the parent classes in the ParentClassList from left to right, visiting each parent class’s ancestor classes from top to bottom.

       a. Add the fields from the parent class to the record.

       b. Substitute the template arguments into those fields.

       c. Add the parent class to the record’s list of inherited classes.

  3. Apply any top-level let bindings to the record. Recall that top-level bindings only apply to inherited fields.

  4. Parse the body of the record.

       • Add any fields to the record.

       • Modify the values of fields according to local let statements.

       • Define any defvar variables.

  5. Make a pass over all the fields to resolve any inter-field references.

  6. Add the record to the final record list.

Because references between fields are resolved (step 5) after let bindings are applied (step 3), the let statement has unusual power. For example:

class C <int x> {
  int Y = x;
  int Yplus1 = !add(Y, 1);
  int xplus1 = !add(x, 1);
}

let Y = 10 in {
  def rec1 : C<5> {
  }
}

def rec2 : C<5> {
  let Y = 10;
}

In both cases, one where a top-level let is used to bind Y and one where a local let does the same thing, the results are:

def rec1 {      // C
  int Y = 10;
  int Yplus1 = 11;
  int xplus1 = 6;
}
def rec2 {      // C
  int Y = 10;
  int Yplus1 = 11;
  int xplus1 = 6;
}

Yplus1 is 11 because the let Y is performed before the !add(Y, 1) is resolved. Use this power wisely.

1.8 Using Classes as Subroutines

As described in Simple values, a class can be invoked in an expression and passed template arguments. This causes TableGen to create a new anonymous record inheriting from that class. As usual, the record receives all the fields defined in the class.

This feature can be employed as a simple subroutine facility. The class can use the template arguments to define various variables and fields, which end up in the anonymous record. Those fields can then be retrieved in the expression invoking the class as follows. Assume that the field ret contains the final value of the subroutine.

int Result = ... CalcValue<arg>.ret ...;

The CalcValue class is invoked with the template argument arg. It calculates a value for the ret field, which is then retrieved at the “point of call” in the initialization for the Result field. The anonymous record created in this example serves no other purpose than to carry the result value.

Here is a practical example. The class isValidSize determines whether a specified number of bytes represents a valid data size. The bit ret is set appropriately. The field ValidSize obtains its initial value by invoking isValidSize with the data size and retrieving the ret field from the resulting anonymous record.

class isValidSize<int size> {
  bit ret = !cond(!eq(size,  1): 1,
                  !eq(size,  2): 1,
                  !eq(size,  4): 1,
                  !eq(size,  8): 1,
                  !eq(size, 16): 1,
                  true: 0);
}

def Data1 {
  int Size = ...;
  bit ValidSize = isValidSize<Size>.ret;
}
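
A fully spelled-out variant of the same idea, using hypothetical names, might look like the following sketch. The class Max acts as a two-argument subroutine whose result is read from its ret field.

class Max<int a, int b> {
  int ret = !if(!gt(a, b), a, b);
}

def Limits {
  // Each use creates an anonymous record and reads its ret field.
  int Biggest = Max<7, 12>.ret;   // 12
}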

1.9 Preprocessing Facilities

The preprocessor embedded in TableGen is intended only for simple conditional compilation. It supports the following directives, which are specified somewhat informally.

LineBegin              ::=  beginning of line
LineEnd                ::=  newline | return | EOF
WhiteSpace             ::=  space | tab
CComment               ::=  "/*" ... "*/"
BCPLComment            ::=  "//" ... LineEnd
WhiteSpaceOrCComment   ::=  WhiteSpace | CComment
WhiteSpaceOrAnyComment ::=  WhiteSpace | CComment | BCPLComment
MacroName              ::=  ualpha (ualpha | "0"..."9")*
PreDefine              ::=  LineBegin (WhiteSpaceOrCComment)*
                            "#define" (WhiteSpace)+ MacroName
                            (WhiteSpaceOrAnyComment)* LineEnd
PreIfdef               ::=  LineBegin (WhiteSpaceOrCComment)*
                            ("#ifdef" | "#ifndef") (WhiteSpace)+ MacroName
                            (WhiteSpaceOrAnyComment)* LineEnd
PreElse                ::=  LineBegin (WhiteSpaceOrCComment)*
                            "#else" (WhiteSpaceOrAnyComment)* LineEnd
PreEndif               ::=  LineBegin (WhiteSpaceOrCComment)*
                            "#endif" (WhiteSpaceOrAnyComment)* LineEnd

A MacroName can be defined anywhere in a TableGen file. The name has no value; it can only be tested to see whether it is defined.

A macro test region begins with an #ifdef or #ifndef directive. If the macro name is defined (#ifdef) or undefined (#ifndef), then the source code between the directive and the corresponding #else or #endif is processed. If the test fails but there is an #else clause, the source code between the #else and the #endif is processed. If the test fails and there is no #else clause, then no source code in the test region is processed.

Test regions may be nested, but they must be properly nested. A region started in a file must end in that file; that is, must have its #endif in the same file.

A MacroName may be defined externally using the -D option on the *-tblgen command line:

llvm-tblgen self-reference.td -Dmacro1 -Dmacro3
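
As an illustration, a source file could select between two definitions based on a macro. The file name config.td, the macro WIDE_REGS, and the record RegInfo are all hypothetical.

#ifdef WIDE_REGS
def RegInfo { int Width = 64; }
#else
def RegInfo { int Width = 32; }
#endif

Invoking the tool with the macro defined would then select the first definition, and omitting -DWIDE_REGS would select the second:

llvm-tblgen config.td -DWIDE_REGS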

1.10 Appendix A: Bang Operators

Bang operators act as functions in value expressions. A bang operator takes one or more arguments, operates on them, and produces a result. If the operator produces a boolean result, the result value will be 1 for true or 0 for false. When an operator tests a boolean argument, it interprets 0 as false and non-0 as true.

Warning

The !getop and !setop bang operators are deprecated in favor of !getdagop and !setdagop.

!add(a, b, ...)

This operator adds a, b, etc., and produces the sum.

!and(a, b, ...)

This operator does a bitwise AND on a, b, etc., and produces the result. A logical AND can be performed if all the arguments are either 0 or 1.

!cast<type>(a)

This operator performs a cast on a and produces the result. If a is not a string, then a straightforward cast is performed, say between an int and a bit, or between record types. This allows casting a record to a class. If a record is cast to string, the record’s name is produced.

If a is a string, then it is treated as a record name and looked up in the list of all defined records. The resulting record is expected to be of the specified type.

For example, if !cast<type>(name) appears in a multiclass definition, or in a class instantiated inside a multiclass definition, and the name does not reference any template arguments of the multiclass, then a record by that name must have been instantiated earlier in the source file. If name does reference a template argument, then the lookup is delayed until defm statements instantiating the multiclass (or later, if the defm occurs in another multiclass and template arguments of the inner multiclass that are referenced by name are substituted by values that themselves contain references to template arguments of the outer multiclass).

If the type of a does not match type, TableGen raises an error.

!con(a, b, ...)

This operator concatenates the DAG nodes a, b, etc. Their operators must be equal.

!con((op a1:$name1, a2:$name2), (op b1:$name3))

results in the DAG node (op a1:$name1, a2:$name2, b1:$name3).

!cond(cond1 : val1, cond2 : val2, ..., condn : valn)

This operator tests cond1 and returns val1 if the result is true. If false, the operator tests cond2 and returns val2 if the result is true. And so forth. An error is reported if no conditions are true.

This example produces the sign word for an integer:

!cond(!lt(x, 0) : "negative", !eq(x, 0) : "zero", true : "positive")

!dag(op, arguments, names)

This operator creates a DAG node with the given operator and arguments. The arguments and names arguments must be lists of equal length or uninitialized (?). The names argument must be of type list<string>.

Due to limitations of the type system, arguments must be a list of items of a common type. In practice, this means that they should either have the same type or be records with a common parent class. Mixing dag and non-dag items is not possible. However, ? can be used.

Example: !dag(op, [a1, a2, ?], ["name1", "name2", "name3"]) results in (op a1-value:$name1, a2-value:$name2, ?:$name3).

!div(a, b)

This operator performs signed division of a by b, and produces the quotient. Division by 0 produces an error. Division of INT64_MIN by -1 produces an error.

!empty(a)

This operator produces 1 if the string, list, or DAG a is empty; 0 otherwise. A dag is empty if it has no arguments; the operator does not count.

!eq(a, b)

This operator produces 1 if a is equal to b; 0 otherwise. The arguments must be bit, bits, int, string, or record values. Use !cast<string> to compare other types of objects.

!exists<type>(name)

This operator produces 1 if a record of the given type whose name is name exists; 0 otherwise. name should be of type string.
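
The following sketch combines !cast and !exists to look up records by name. The class Reg and the records R0 and Lookup are hypothetical.

class Reg<int n> {
  int Num = n;
}

def R0 : Reg<0>;

def Lookup {
  Reg ByName  = !cast<Reg>("R0");     // Look up the record named R0.
  string Name = !cast<string>(R0);    // "R0"
  bit HasR0   = !exists<Reg>("R0");   // 1
  bit HasR9   = !exists<Reg>("R9");   // 0, since no such record is defined.
}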

!filter(var, list, predicate)

This operator creates a new list by filtering the elements in list. To perform the filtering, TableGen binds the variable var to each element and then evaluates the predicate expression, which presumably refers to var. The predicate must produce a boolean value (bit, bits, or int). The value is interpreted as with !if: if the value is 0, the element is not included in the new list. If the value is anything else, the element is included.
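
For example, in this small sketch (the record and field names are hypothetical), only the elements greater than 3 are kept:

def Nums {
  list<int> All = [1, 2, 3, 4, 5, 6];
  list<int> Big = !filter(x, All, !gt(x, 3));   // [4, 5, 6]
}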

!find(string1, string2[, start])

This operator searches for string2 in string1 and produces its position. The starting position of the search may be specified by start, which can range between 0 and the length of string1; the default is 0. If the string is not found, the result is -1.
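
A short sketch of the three cases (the record and field names are hypothetical):

def Search {
  int First   = !find("abcabc", "bc");      // 1
  int Second  = !find("abcabc", "bc", 2);   // 4, searching from position 2
  int Missing = !find("abcabc", "xyz");     // -1
}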

!foldl(init, list, acc, var, expr)

This operator performs a left-fold over the items in list. The variable acc acts as the accumulator and is initialized to init. The variable var is bound to each element in the list. The expression is evaluated for each element and presumably uses acc and var to calculate the accumulated value, which !foldl stores back in acc. The type of acc is the same as init; the type of var is the same as the elements of list; expr must have the same type as init.

The following example computes the total of the Number field in the list of records in RecList:

int x = !foldl(0, RecList, total, rec, !add(total, rec.Number));

If your goal is to filter the list and produce a new list that includes only some of the elements, see !filter.

!foreach(var, sequence, expr)

This operator creates a new list/dag in which each element is a function of the corresponding element in the sequence list/dag. To perform the function, TableGen binds the variable var to an element and then evaluates the expression. The expression presumably refers to the variable var and calculates the result value.

If you simply want to create a list of a certain length containing the same value repeated multiple times, see !listsplat.
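
As a minimal sketch of !foreach over a list (the record and field names are hypothetical), each element is replaced by its square:

def Squares {
  list<int> In  = [1, 2, 3];
  list<int> Out = !foreach(x, In, !mul(x, x));   // [1, 4, 9]
}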

!ge(a, b)

This operator produces 1 if a is greater than or equal to b; 0 otherwise. The arguments must be bit, bits, int, or string values.

!getdagarg<type>(dag, key)

This operator retrieves the argument from the given dag node by the specified key, which is either an integer index or a string name. If that argument is not convertible to the specified type, ? is returned.

!getdagname(dag, index)

This operator retrieves the argument name from the given dag node by the specified index. If that argument has no name associated, ? is returned.

!getdagop(dag) –or– !getdagop<type>(dag)

This operator produces the operator of the given dag node. Example: !getdagop((foo 1, 2)) results in foo. Recall that DAG operators are always records.

The result of !getdagop can be used directly in a context where any record class at all is acceptable (typically placing it into another dag value). But in other contexts, it must be explicitly cast to a particular class. The <type> syntax is provided to make this easy.

For example, to assign the result to a value of type BaseClass, you could write either of these:

BaseClass b = !getdagop<BaseClass>(someDag);
BaseClass b = !cast<BaseClass>(!getdagop(someDag));

But to create a new DAG node that reuses the operator from another, no cast is necessary:

dag d = !dag(!getdagop(someDag), args, names);

!gt(a, b)

This operator produces 1 if a is greater than b; 0 otherwise. The arguments must be bit, bits, int, or string values.

!head(a)

This operator produces the zeroth element of the list a. (See also !tail.)

!if(test, then, else)

This operator evaluates the test, which must produce a bit or int. If the result is not 0, the then expression is produced; otherwise the else expression is produced.

!interleave(list, delim)

This operator concatenates the items in the list, interleaving the delim string between each pair, and produces the resulting string. The list can be a list of string, int, bits, or bit. An empty list results in an empty string. The delimiter can be the empty string.
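
For instance, in this sketch with hypothetical record and field names:

def Joined {
  string Csv    = !interleave(["a", "b", "c"], ", ");   // "a, b, c"
  string Dotted = !interleave([1, 2, 3], ".");          // "1.2.3"
}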

!isa<type>(a)

This operator produces 1 if the type of a is a subtype of the given type; 0 otherwise.

!le(a, b)

This operator produces 1 if a is less than or equal to b; 0 otherwise. The arguments must be bit, bits, int, or string values.

!listconcat(list1, list2, ...)

This operator concatenates the list arguments list1, list2, etc., and produces the resulting list. The lists must have the same element type.

!listremove(list1, list2)

This operator returns a copy of list1 with all elements that also occur in list2 removed. The lists must have the same element type.
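
A brief sketch of !listconcat and !listremove together (the record and field names are hypothetical):

def Lists {
  list<int> Both = !listconcat([1, 2], [3, 4]);         // [1, 2, 3, 4]
  list<int> Kept = !listremove([1, 2, 3, 4], [2, 4]);   // [1, 3]
}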

!listsplat(value, count)

This operator produces a list of length count whose elements are all equal to the value. For example, !listsplat(42, 3) results in [42, 42, 42].

!logtwo(a)

This operator computes the base-2 logarithm of a and produces the integer result, rounded down (a flooring operation). The log of 0 or a negative number produces an error.

!lt(a, b)

This operator produces 1 if a is less than b; 0 otherwise. The arguments must be bit, bits, int, or string values.

!mul(a, b, ...)

This operator multiplies a, b, etc., and produces the product.

!ne(a, b)

This operator produces 1 if a is not equal to b; 0 otherwise. The arguments must be bit, bits, int, string, or record values. Use !cast<string> to compare other types of objects.

!not(a)

This operator performs a logical NOT on a, which must be an integer. The argument 0 results in 1 (true); any other argument results in 0 (false).

!or(a, b, ...)

This operator does a bitwise OR on a, b, etc., and produces the result. A logical OR can be performed if all the arguments are either 0 or 1.

!range([start,] end[, step])

This operator produces the half-open range sequence [start : end : step) as a list<int>. By default, start is 0 and step is 1. The step can be negative but cannot be 0. If start < end and step is negative, or start > end and step is positive, the result is an empty list of type list<int>.

For example:

  • !range(4) is equivalent to !range(0, 4, 1) and the result is [0, 1, 2, 3].

  • !range(1, 4) is equivalent to !range(1, 4, 1) and the result is [1, 2, 3].

  • The result of !range(0, 4, 2) is [0, 2].

  • The results of !range(0, 4, -1) and !range(4, 0, 1) are empty.

!range(list)

Equivalent to !range(0, !size(list)).
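
One plausible use is iterating over the indexes of a list in a foreach loop. The sketch below assumes a hypothetical variable regNames and hypothetical record names Slot0 through Slot2.

```tablegen
defvar regNames = ["r0", "r1", "r2"];

// Defines Slot0, Slot1, and Slot2, one per element of regNames.
foreach i = !range(regNames) in
  def Slot # i {
    int Index = i;
  }
```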

!repr(value)

This operator produces a string representation of value. The format of the string is not guaranteed to be stable, so the operator is intended for debugging purposes only.

!setdagarg(dag, key, arg)

This operator produces a DAG node with the same operator and arguments as dag, but replacing the value of the argument specified by the key with arg. The key can be either an integer index or a string name.

!setdagname(dag, key, name)

This operator produces a DAG node with the same operator and arguments as dag, but replacing the name of the argument specified by the key with name. The key can be either an integer index or a string name.
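
The sketch below illustrates both key forms, a string name for !setdagarg and an integer index for !setdagname. The operator record op, the variable orig, and the record SetDagExample are hypothetical.

```tablegen
def op;

defvar orig = (op 1:$a, 2:$b);

def SetDagExample {
  dag NewArg  = !setdagarg(orig, "b", 7);   // (op 1:$a, 7:$b)
  dag NewName = !setdagname(orig, 0, "x");  // (op 1:$x, 2:$b)
}
```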

!setdagop(dag, op)

This operator produces a DAG node with the same arguments as dag, but with its operator replaced with op.

Example: !setdagop((foo 1, 2), bar) results in (bar 1, 2).

!shl(a, count)

This operator shifts a left logically by count bits and produces the resulting value. The operation is performed on a 64-bit integer; the result is undefined for shift counts outside 0…63.

!size(a)

This operator produces the size of the string, list, or dag a. The size of a DAG is the number of arguments; the operator does not count.
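
As an illustration of the three accepted argument types (the records op and SizeExample are hypothetical):

```tablegen
def op;

def SizeExample {
  int StrSize  = !size("hello");       // 5
  int ListSize = !size([10, 20, 30]);  // 3
  int DagSize  = !size((op 1, 2));     // 2: the operator is not counted
}
```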

!sra(a, count)

This operator shifts a right arithmetically by count bits and produces the resulting value. The operation is performed on a 64-bit integer; the result is undefined for shift counts outside 0…63.

!srl(a, count)

This operator shifts a right logically by count bits and produces the resulting value. The operation is performed on a 64-bit integer; the result is undefined for shift counts outside 0…63.
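
The three shift operators can be compared in a small sketch. The record name ShiftExample is hypothetical.

```tablegen
def ShiftExample {
  int Left   = !shl(1, 4);    // 16
  int ArithR = !sra(-16, 2);  // -4: the sign bit is replicated
  int LogicR = !srl(16, 2);   // 4
}
```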

!strconcat(str1, str2, ...)

This operator concatenates the string arguments str1, str2, etc., and produces the resulting string.

!sub(a, b)

This operator subtracts b from a and produces the arithmetic difference.

!subst(target, repl, value)

This operator replaces all occurrences of the target in the value with the repl and produces the resulting value. The value can be a string, in which case substring substitution is performed.

The value can be a record name, in which case the operator produces the repl record if the target record name equals the value record name; otherwise it produces the value.
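
A minimal sketch of the string form follows. The record name SubstExample is hypothetical.

```tablegen
def SubstExample {
  // Every occurrence of "add" in the value is replaced with "sub".
  string Renamed = !subst("add", "sub", "add32rr");  // "sub32rr"
}
```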

!substr(string, start[, length])

This operator extracts a substring of the given string. The starting position of the substring is specified by start, which can range between 0 and the length of the string. The length of the substring is specified by length; if not specified, the rest of the string is extracted. The start and length arguments must be integers.
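
For example, in this sketch (the record name SubstrExample is hypothetical) the first field omits length and the second supplies it:

```tablegen
def SubstrExample {
  string Rest   = !substr("ADD32rr", 3);     // "32rr"
  string Prefix = !substr("ADD32rr", 0, 3);  // "ADD"
}
```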

!tail(a)

This operator produces a new list with all the elements of the list a except for the zeroth one. (See also !head.)
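
Together, !head and !tail split a list into its first element and the remainder, as in this sketch. The variable regs and the record HeadTailExample are hypothetical.

```tablegen
defvar regs = ["r0", "r1", "r2"];

def HeadTailExample {
  string First      = !head(regs);  // "r0"
  list<string> Rest = !tail(regs);  // ["r1", "r2"]
}
```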

!tolower(a)

This operator converts a string input a to lower case.

!toupper(a)

This operator converts a string input a to upper case.
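
A short sketch of both case conversions follows. The record name CaseExample is hypothetical.

```tablegen
def CaseExample {
  string Lower = !tolower("Add32RR");  // "add32rr"
  string Upper = !toupper("Add32RR");  // "ADD32RR"
}
```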

!xor(a, b, ...)

This operator does a bitwise EXCLUSIVE OR on a, b, etc., and produces the result. A logical XOR can be performed if all the arguments are either 0 or 1.
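
The bitwise operators !or and !xor, and the logical !not, might be used as in the following sketch. The record name BitwiseExample is hypothetical.

```tablegen
def BitwiseExample {
  int Or      = !or(4, 1);   // 5 (0b100 | 0b001)
  int Xor     = !xor(6, 3);  // 5 (0b110 ^ 0b011)
  bit NotZero = !not(0);     // 1
  bit NotFive = !not(5);     // 0: any nonzero argument yields 0
}
```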

1.11 Appendix B: Paste Operator Examples

Here is an example illustrating the use of the paste operator in record names.

defvar suffix = "_suffstring";
defvar some_ints = [0, 1, 2, 3];

def name # suffix {
}

foreach i = [1, 2] in {
def rec # i {
}
}

The first def does not use the value of the suffix variable. The second def does use the value of the i iterator variable, because it is not a global name. The following records are produced.

def namesuffix {
}
def rec1 {
}
def rec2 {
}

Here is a second example illustrating the paste operator in field value expressions.

def test {
  string strings = suffix # suffix;
  list<int> integers = some_ints # [4, 5, 6];
}

The strings field expression uses suffix on both sides of the paste operator. It is evaluated normally on the left-hand side, but taken verbatim on the right-hand side. The integers field expression uses the value of the some_ints variable and a literal list. The following record is produced.

def test {
  string strings = "_suffstringsuffix";
  list<int> integers = [0, 1, 2, 3, 4, 5, 6];
}

1.12 Appendix C: Sample Record

One target machine supported by LLVM is the Intel x86. The following output from TableGen shows the record that is created to represent the 32-bit register-to-register ADD instruction.

def ADD32rr { // InstructionEncoding Instruction X86Inst I ITy Sched BinOpRR BinOpRR_RF
  int Size = 0;
  string DecoderNamespace = "";
  list<Predicate> Predicates = [];
  string DecoderMethod = "";
  bit hasCompleteDecoder = 1;
  string Namespace = "X86";
  dag OutOperandList = (outs GR32:$dst);
  dag InOperandList = (ins GR32:$src1, GR32:$src2);
  string AsmString = "add{l}  {$src2, $src1|$src1, $src2}";
  EncodingByHwMode EncodingInfos = ?;
  list<dag> Pattern = [(set GR32:$dst, EFLAGS, (X86add_flag GR32:$src1, GR32:$src2))];
  list<Register> Uses = [];
  list<Register> Defs = [EFLAGS];
  int CodeSize = 3;
  int AddedComplexity = 0;
  bit isPreISelOpcode = 0;
  bit isReturn = 0;
  bit isBranch = 0;
  bit isEHScopeReturn = 0;
  bit isIndirectBranch = 0;
  bit isCompare = 0;
  bit isMoveImm = 0;
  bit isMoveReg = 0;
  bit isBitcast = 0;
  bit isSelect = 0;
  bit isBarrier = 0;
  bit isCall = 0;
  bit isAdd = 0;
  bit isTrap = 0;
  bit canFoldAsLoad = 0;
  bit mayLoad = ?;
  bit mayStore = ?;
  bit mayRaiseFPException = 0;
  bit isConvertibleToThreeAddress = 1;
  bit isCommutable = 1;
  bit isTerminator = 0;
  bit isReMaterializable = 0;
  bit isPredicable = 0;
  bit isUnpredicable = 0;
  bit hasDelaySlot = 0;
  bit usesCustomInserter = 0;
  bit hasPostISelHook = 0;
  bit hasCtrlDep = 0;
  bit isNotDuplicable = 0;
  bit isConvergent = 0;
  bit isAuthenticated = 0;
  bit isAsCheapAsAMove = 0;
  bit hasExtraSrcRegAllocReq = 0;
  bit hasExtraDefRegAllocReq = 0;
  bit isRegSequence = 0;
  bit isPseudo = 0;
  bit isExtractSubreg = 0;
  bit isInsertSubreg = 0;
  bit variadicOpsAreDefs = 0;
  bit hasSideEffects = ?;
  bit isCodeGenOnly = 0;
  bit isAsmParserOnly = 0;
  bit hasNoSchedulingInfo = 0;
  InstrItinClass Itinerary = NoItinerary;
  list<SchedReadWrite> SchedRW = [WriteALU];
  string Constraints = "$src1 = $dst";
  string DisableEncoding = "";
  string PostEncoderMethod = "";
  bits<64> TSFlags = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0 };
  string AsmMatchConverter = "";
  string TwoOperandAliasConstraint = "";
  string AsmVariantName = "";
  bit UseNamedOperandTable = 0;
  bit FastISelShouldIgnore = 0;
  bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
  Format Form = MRMDestReg;
  bits<7> FormBits = { 0, 1, 0, 1, 0, 0, 0 };
  ImmType ImmT = NoImm;
  bit ForceDisassemble = 0;
  OperandSize OpSize = OpSize32;
  bits<2> OpSizeBits = { 1, 0 };
  AddressSize AdSize = AdSizeX;
  bits<2> AdSizeBits = { 0, 0 };
  Prefix OpPrefix = NoPrfx;
  bits<3> OpPrefixBits = { 0, 0, 0 };
  Map OpMap = OB;
  bits<3> OpMapBits = { 0, 0, 0 };
  bit hasREX_WPrefix = 0;
  FPFormat FPForm = NotFP;
  bit hasLockPrefix = 0;
  Domain ExeDomain = GenericDomain;
  bit hasREPPrefix = 0;
  Encoding OpEnc = EncNormal;
  bits<2> OpEncBits = { 0, 0 };
  bit HasVEX_W = 0;
  bit IgnoresVEX_W = 0;
  bit EVEX_W1_VEX_W0 = 0;
  bit hasVEX_4V = 0;
  bit hasVEX_L = 0;
  bit ignoresVEX_L = 0;
  bit hasEVEX_K = 0;
  bit hasEVEX_Z = 0;
  bit hasEVEX_L2 = 0;
  bit hasEVEX_B = 0;
  bits<3> CD8_Form = { 0, 0, 0 };
  int CD8_EltSize = 0;
  bit hasEVEX_RC = 0;
  bit hasNoTrackPrefix = 0;
  bits<7> VectSize = { 0, 0, 1, 0, 0, 0, 0 };
  bits<7> CD8_Scale = { 0, 0, 0, 0, 0, 0, 0 };
  string FoldGenRegForm = ?;
  string EVEX2VEXOverride = ?;
  bit isMemoryFoldable = 1;
  bit notEVEX2VEXConvertible = 0;
}

On the first line of the record, you can see that the ADD32rr record inherited from eight classes. Although the inheritance hierarchy is complex, using parent classes is much simpler than specifying the 109 individual fields for each instruction.

Here is the code fragment used to define ADD32rr and multiple other ADD instructions:

defm ADD : ArithBinOp_RF<0x00, 0x02, 0x04, "add", MRM0r, MRM0m,
                         X86add_flag, add, 1, 1, 1>;

The defm statement tells TableGen that ArithBinOp_RF is a multiclass, which contains multiple concrete record definitions that inherit from BinOpRR_RF. That class, in turn, inherits from BinOpRR, which inherits from ITy and Sched, and so forth. The fields are inherited from all the parent classes; for example, isIndirectBranch is inherited from the Instruction class.
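
The real ArithBinOp_RF multiclass is too large to reproduce here. As a rough sketch of the same mechanism, the following uses hypothetical, much-simplified classes named SimpleInst and SimpleBinOp, which are not part of the X86 backend, to show how one defm can produce several records that share inherited fields.

```tablegen
// Hypothetical stand-in for the instruction class hierarchy.
class SimpleInst<bits<8> opcode, string mnemonic> {
  bits<8> Opcode = opcode;
  string AsmString = mnemonic;
  bit isCommutable = 0;
}

// Hypothetical multiclass: each def inside becomes one concrete record.
multiclass SimpleBinOp<bits<8> regOpcode, bits<8> memOpcode, string mnemonic> {
  def rr : SimpleInst<regOpcode, mnemonic>;
  def rm : SimpleInst<memOpcode, mnemonic>;
}

// Creates the records ADDrr and ADDrm.
defm ADD : SimpleBinOp<0x01, 0x03, "add">;
```

Each generated record carries the Opcode, AsmString, and isCommutable fields from SimpleInst, so no field needs to be written out per instruction.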