Function Signatures

A key job of the IREE compiler and runtime is capturing function call semantics from the originating system and providing mechanisms so that invocations can be performed in as similar a way as possible in various target languages. In general, this requires additional metadata on top of the raw characteristics of a function. Where possible, this is done by attaching attributes to a function.

iree.abi : JSON encoded description of the function's calling convention.

V1 ABI

This is the default ABI supported by IREE VM invocations. It attempts to provide a default calling convention that can be used without further reflection metadata but which may be enhanced with it.

It natively allows monomorphic functions to be exported where arguments and results are composed of the following types:

Value Types:

Byte aligned integer type (i8, i16, i32, i64)
Floating point value (f16, f32, f64)

Reference Types:

ND-Array buffers of Value Types:
- Simple: Packed, C-layout
- Strided: Arbitrary layout with strides (future)
String (byte arrays)
Opaque reference object

Sequence Types:

Tuples: fixed length lists where each position has its own type bound
Homogeneous list: lists of arbitrary size where a single type bound applies to all elements

The intent with these low level types is that calling conventions can be synthesized to bind arbitrary high level, domain/language specific signatures to these types, possibly by way of additional reflection metadata.

Representations:

The above are all representable with native constructs in the VM:

ValueType:
- Runtime: iree_vm_value
- Compile Time: primitive MLIR integer/floating point types
Simple ND-Array Buffer:
- Runtime: iree_hal_buffer_view
- Compile Time: tensor<>
String:
- Runtime: iree_vm_list containing i8
- Compile Time: !iree.list<i8>
Tuple:
- Runtime: iree_vm_list of variant
- Compile Time: !iree.list<?>
- Note that these are statically type erased at the boundary.
TypedList (homogeneous):
- Runtime: iree_vm_list of T
- Compile Time: !iree.list<T>

Extended Type Calling Conventions

While the above features of the native ABI may be sufficient for direct use by various programs, many programs and callers will need to represent various higher level types, consistently mapping them to the above facilities. This section describes calling conventions for various higher level types which do not map 1:1 to the above. Not all source language types are representable, and extending these calling conventions (and the fundamental types above) is demand driven.

All of these calling conventions presume that the arity of the arguments/results of the raw function matches the user-level function, meaning that the calling convention is specified per argument/result. Higher-level whole function transformations may also exist for some domains but are outside of the scope of this specification.

Structure

A Structure is a common enough entity to have a dedicated calling convention. In C-like languages, this may just be a struct. In Python, it is typically a dict with an associated schema providing a name and type bound for each of its slots. In both, its slots are of fixed arity.

In this convention, such a structure is represented as a Tuple in the native calling convention (i.e. !iree.list of variant type). The order of the elements of the tuple is the natural order of the structure, where that is either:

For a C-like system where order is determinate, it is the order of declaration.
For a name-based system (i.e. bind to dict) where no order is defined, the natural order will be the lexically sorted order of the keys.
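
The name-based case can be sketched as follows. This is a minimal illustration, not an IREE API: the helper names `struct_to_tuple` and `tuple_to_struct` are hypothetical, and it assumes Python's default string ordering is an acceptable reading of "lexically sorted".

```python
def struct_to_tuple(struct):
    """Return the slot values of a dict in lexically sorted key order."""
    return tuple(struct[key] for key in sorted(struct))


def tuple_to_struct(keys, values):
    """Rebuild a dict from a tuple, given the schema's keys (any order)."""
    ordered_keys = sorted(keys)
    if len(ordered_keys) != len(values):
        raise ValueError("slot count does not match the schema")
    return dict(zip(ordered_keys, values))


# The keys are not passed across the boundary, only the ordered values.
packed = struct_to_tuple({"scale": 2.0, "bias": 0.5})
restored = tuple_to_struct(["scale", "bias"], packed)
```

Because only the values cross the boundary, both sides need to agree on the schema (the set of keys) ahead of time.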

String

Most languages interoperate with byte arrays (i.e. the native ABI String type) by applying an encoding. Such strings are just a sequence of bytes (i.e. !iree.list<i8>).
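
A host binding might apply the encoding as sketched below. The helper names are hypothetical and UTF-8 is an assumption; the convention itself only specifies a sequence of bytes. The values here are the unsigned byte values 0..255, and how a binding maps them onto i8 list elements is left open.

```python
def encode_string(text, encoding="utf-8"):
    """Encode a host string into a list of byte values."""
    return list(text.encode(encoding))


def decode_string(byte_values, encoding="utf-8"):
    """Decode a list of byte values back into a host string."""
    return bytes(byte_values).decode(encoding)


encoded = encode_string("hi")
round_tripped = decode_string(encode_string("é"))
```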

Typed List

High level lists which all share the same type bound are represented as a TypedList in the native ABI.

NDArray of Reference Types

NDArrays of reference types are considered separately from those of value types. Internally, the code generated for them is completely different from what gets generated for numeric based arrays (i.e. has ref-counting, ownership semantics, non-POD, etc). These types are permitted for completeness, not necessarily performance: by nature they are already indirected and have overheads.

In the native ABI, these are represented as a composite tuple type (i.e. today a list since sugar for tuple is not yet defined): !iree.tuple<!iree.list<T>, !iree.list<index>>. The first element of the tuple is the list of values, packed with a C-Layout and the second element is the list of dimension sizes.
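
To make the (values, dims) pairing concrete, here is a small sketch that packs a rectangular nested Python list into that form. The function name `pack_ref_ndarray` is hypothetical, and it assumes the leaf objects are not themselves Python lists.

```python
def pack_ref_ndarray(nested):
    """Flatten a rectangular nested list into (values, dims) in C order."""
    # Derive the dimension sizes by walking down the first element of each level.
    dims = []
    level = nested
    while isinstance(level, list):
        dims.append(len(level))
        if not level:
            break
        level = level[0]

    values = []

    def visit(node, depth):
        if depth == len(dims):
            values.append(node)
            return
        if not isinstance(node, list) or len(node) != dims[depth]:
            raise ValueError("nested list is not rectangular")
        for child in node:
            visit(child, depth + 1)

    visit(nested, 0)
    return values, dims


values, dims = pack_ref_ndarray([["a", "b", "c"], ["d", "e", "f"]])
```

The last dimension varies fastest in `values`, which is what a packed C layout means here.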

Reflection

Additional reflection metadata may be encoded in a custom JSON form, providing additional typing hints for arguments and results. If present, this will be a reflection attribute with key d, containing a serialized JSON object.

The JSON object contains:

a (array): List of type records for each argument.
r (array): List of type records for each result.

Type records are one of:

A string naming a primitive type:
- i[0-9]+: Integer type with given bit width
- f[0-9]+: IEEE floating point type with given bit width
- bf16: BFloat16
JSON null: A null reference value
"unknown": An unknown/unmapped type
An array, interpreted as a tuple describing a compound type.

Compound type tuples

A compound type tuple has a type identifier as its first element, followed by type specific fields:

["ndarray", {element_type}, {rank}, {dim...}]: For unknown rank, the rank will be null and there will be no dims. Any unknown dim will be null.
["slist", {slot_type...}]: An anonymous structured list of fixed arity and slot specific types. If there are gaps in the list, empty slots will have a null type.
["stuple", {slot_type...}]: Same as slist but some languages differentiate between sequences represented as lists and those represented as tuples (read-only lists).
["sdict", ["key", {slot_type}]...]: An anonymous structure with named slots. Note that when passing these types, the keys are not passed to the function (only the slot values).
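
As an illustration of the record forms above, the following sketch builds one possible reflection object and renders its records as readable strings. The function being described, its slot names (`bias`, `scale`) and the `describe` helper are all hypothetical; only the record shapes come from this section.

```python
import json

# Hypothetical metadata: a 2-D f32 array with an unknown leading dim plus a
# named structure as arguments, and an (array, i32) tuple as the result.
reflection = {
    "a": [
        ["ndarray", "f32", 2, None, 4],
        ["sdict", ["bias", "f32"], ["scale", "f32"]],
    ],
    "r": [
        ["stuple", ["ndarray", "f32", 1, 4], "i32"],
    ],
}
serialized = json.dumps(reflection)


def describe(record):
    """Render a type record as a short human-readable string."""
    if record is None:
        return "null"
    if isinstance(record, str):
        return record
    kind, *fields = record
    if kind == "ndarray":
        element_type, rank, *dims = fields
        if rank is None:
            return "ndarray<" + describe(element_type) + ", unranked>"
        shape = "x".join("?" if dim is None else str(dim) for dim in dims)
        return "ndarray<" + describe(element_type) + ", [" + shape + "]>"
    if kind in ("slist", "stuple"):
        return kind + "(" + ", ".join(describe(slot) for slot in fields) + ")"
    if kind == "sdict":
        slots = ", ".join(key + ": " + describe(slot) for key, slot in fields)
        return "sdict(" + slots + ")"
    raise ValueError("unrecognized compound type: " + repr(kind))


decoded = json.loads(serialized)
```

JSON null round-trips to Python `None`, so unknown dims and null reference values need no special handling when decoding.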

Deprecated V0 ABIs

These will be removed as soon as the corresponding code is removed.

Generic Signature Mangling

Where possible, ABI metadata is encoded into a plain-text signature in a way that is easily transported across component boundaries and can be efficiently implemented without additional dependencies (i.e. just string manipulation).

The suggested format is manipulated via the C++ reference implementation's SignatureBuilder and SignatureParser classes (see iree/base/signature_parser.h). See the documentation and code for those classes for more details.

ABIs

Raw Function ABI

All exported functions implement the raw function ABI, which defines the metadata and calling convention for marshalling inputs and results to their underlying implementations.

Attributes:

fv = 1 (current version of the raw function ABI)
f = encoded raw function signature (see below)
fbr = result buffer allocation function name (optional)

The reflection metadata documented here augments the underlying type system such that host language bindings can interop as needed. This additional metadata is needed in most dynamic cases because the compiled assets operate on fundamental types with most characteristics type erased away (think: void* level things vs high-level ShapedBuffer level things).

Grammar

The signature is implemented in terms of the SignatureBuilder, using tagged Integers and Spans.

signature ::= 'I' length-prefixed(type-sequence)
              'R' length-prefixed(type-sequence)

type-sequence ::= (arg-result-type)*
arg-result-type ::= buffer-type
                  | ref-object-type
                  | scalar-type
                  | unrecognized-type
buffer-type ::= 'B' length-prefixed(scalar-element-type? dim*)
scalar-type ::= 'S' length-prefixed(scalar-element-type?)
scalar-element-type ::= 't' (
                    '0'  # IEEE float32 (default if not specified)
                  | '1'  # IEEE float16
                  | '2'  # IEEE float64
                  | '3'  # Google bfloat16
                  | '4'  # Signed int8
                  | '5'  # Signed int16
                  | '6'  # Signed int32
                  | '7'  # Signed int64
                  | '8'  # Unsigned int8
                  | '9'  # Unsigned int16
                  | '10' # Unsigned int32
                  | '11' # Unsigned int64
                  )
dim ::= 'd' integer  # -1 indicates a dynamic dim
ref-object-type ::= 'O' length-prefixed()  # Details TBD
unrecognized-type ::= 'U' length-prefixed()

# Lexical primitives
integer ::= -?[0-9]+
length ::= [0-9]+
# The `length` encodes the length in bytes of `production`, plus 1 for the '!'.
length-prefixed(production) ::= length '!' production
any-byte-sequence ::= <any byte sequence>
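
The grammar above can be exercised with a few lines of string manipulation. The sketch below is derived only from the productions listed here and is not the SignatureBuilder implementation; the function names and the short type names used as dictionary keys (`f32`, `i32`, ...) are hypothetical labels for the numeric codes.

```python
# Numeric codes from the scalar-element-type production.
SCALAR_CODES = {
    "f32": 0, "f16": 1, "f64": 2, "bf16": 3,
    "i8": 4, "i16": 5, "i32": 6, "i64": 7,
    "u8": 8, "u16": 9, "u32": 10, "u64": 11,
}


def length_prefixed(production):
    """Prefix with the byte length of `production`, plus 1 for the '!'."""
    return str(len(production.encode("ascii")) + 1) + "!" + production


def scalar_element_type(name):
    return "t" + str(SCALAR_CODES[name])


def buffer_type(element, dims):
    """Encode a buffer-type; a dim of -1 marks a dynamic dimension."""
    body = scalar_element_type(element) + "".join("d" + str(d) for d in dims)
    return "B" + length_prefixed(body)


def scalar_type(element):
    return "S" + length_prefixed(scalar_element_type(element))


def raw_signature(inputs, results):
    return ("I" + length_prefixed("".join(inputs))
            + "R" + length_prefixed("".join(results)))


# Inputs: f32 buffer of shape [?, 4] and an i32 scalar. Result: f32 buffer [?].
example = raw_signature(
    [buffer_type("f32", [-1, 4]), scalar_type("i32")],
    [buffer_type("f32", [-1])],
)
```

Under these assumptions `example` is `I16!B8!t0d-1d4S3!t6R9!B6!t0d-1`: each length counts the bytes of the enclosed production plus one for the `!`.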

Interpretation and Rationale

Memory layout

The astute reader will note that the above metadata is insufficient to determine the memory layout of a buffer. The reason is that any more specific details than this (contiguity, strides, alignment, etc) can only be known once the actual compute devices have been enumerated, and the resulting matrix of conversions is more dynamic than can be expressed in something as static as a function signature. The above formulation is an input to an additional runtime oracle which produces appropriate full buffer descriptions.

While the exact implementation is host-language specific, consider the following more detailed set of declarations that may exist in such a binding layer:

// Inspired heavily by the Py_buffer type.
// See: https://docs.python.org/3/c-api/buffer.html
struct BufferDescription {
  ScalarType element_type;
  // For contiguous arrays, this is the length of the underlying memory.
  // For non-contiguous, this is the size of the buffer if it were copied
  // to a contiguous representation.
  size_t len;
  // Number of dims and strides.
  size_t ndim;
  int* shape;
  int* strides;
};

// Mirrors the 'buffer-type' production in the above grammar.
struct SignatureBufferType;

// Oracle which combines signature metadata with a user-provided, materialized
// BufferDescription to derive a BufferDescription that is compatible for
// invocation. Returns an updated buffer description if the original is
// not compatible or fully specified.
// This can be used in a couple of ways:
//   a) On function invocation to determine whether a provided buffer can be
//      used as-is or needs to be converted (copied).
//   b) To provide a factory function to the host language to create a
//      compatible buffer.
optional<BufferDescription> BufferDescriptionOracle(
    DeviceContext*, SignatureBufferType, BufferDescription)
  throws UnsupportedBufferException;

The above scheme should allow host-language and device coordination with respect to buffer layout. For the moment, the responsibility to convert the buffer to a compatible memory layout is on the host-language binding. However, it is often most efficient to schedule this for execution on a device. In the future, it is anticipated that there will be a built-in pathway for scheduling such a conversion (which would allow pipelining and offload of buffer conversions).
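
One way a binding layer could realize such an oracle is sketched below in Python. It is an illustration under strong assumptions: the device is imagined to accept only packed C-layout buffers, the device context is omitted, `element_size` stands in for the ScalarType, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BufferDescription:
    element_size: int          # bytes per element (stand-in for ScalarType)
    shape: Tuple[int, ...]
    strides: Tuple[int, ...]   # in bytes, one per dimension


def c_contiguous_strides(shape, element_size):
    """Byte strides of a packed, row-major buffer of the given shape."""
    strides = []
    step = element_size
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def buffer_description_oracle(signature_dims, provided) -> Optional[BufferDescription]:
    """Return None if `provided` is usable as-is, else a compatible description.

    `signature_dims` mirrors the dims of a buffer-type (-1 means dynamic).
    """
    if len(signature_dims) != len(provided.shape):
        raise ValueError("rank does not match the signature")
    for expected, actual in zip(signature_dims, provided.shape):
        if expected != -1 and expected != actual:
            raise ValueError("shape does not match the signature")
    wanted = c_contiguous_strides(provided.shape, provided.element_size)
    if provided.strides == wanted:
        return None
    return BufferDescription(provided.element_size, provided.shape, wanted)


# A column-major 2x3 f32 buffer must be converted; a row-major one is fine.
fortran = BufferDescription(4, (2, 3), (4, 8))
converted = buffer_description_oracle((-1, 3), fortran)
as_is = buffer_description_oracle((-1, 3), BufferDescription(4, (2, 3), (12, 4)))
```

A non-None return tells the caller that a copy into the returned layout is required before invocation, matching use (a) in the comments above.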

Deferred result allocation

In general, exported functions accept pre-allocated results that should be mutated. For the simplest cases, such results can be null and retrieved upon completion of the function. This, however, puts severe limitations on the ability to pipeline. For fully specified signatures (no dynamic shapes), the BufferDescriptionOracle and the signature are sufficient to pre-allocate appropriate results, which allows chains of result-producing invocations to be pipelined.

If, however, a buffer-type is not fully specified, the compiler may emit a special result allocator function, which will be referenced in the fbr attribute. Such a function would have a signature like this:

tuple<buffer> __allocate_results(tuple<int> dynamic_dims);

Such a function takes a tuple of all dynamic buffer dims in the function input signature and returns a tuple of allocated buffers for each dynamic result. Note that it may not be possible to fully allocate results in this fashion (i.e. if the result layout is data dependent), in which case a null buffer is returned for that slot (and the host library would need to wait on the invocation to get the fully populated result).
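
For intuition only, here is what such an allocator might look like for one imagined function. Everything in it is an assumption: the function is taken to have a single f32 input of shape [?, 4], a first result of the same shape, and a second result whose shape is data dependent; `allocate_results` and the use of `bytearray` as a buffer are stand-ins.

```python
F32_BYTES = 4


def allocate_results(dynamic_dims):
    """Allocate results for a hypothetical function with one dynamic input dim."""
    (rows,) = dynamic_dims
    # First result has shape [rows, 4], so its size follows from the inputs.
    same_shape_result = bytearray(rows * 4 * F32_BYTES)
    # Second result is data dependent and cannot be allocated up front.
    data_dependent_result = None
    return (same_shape_result, data_dependent_result)


results = allocate_results((3,))
```

The `None` slot models the null buffer case: the host would have to wait for the invocation to finish before that result is available.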

A similar mechanism will need to be created at some future point for under-specified results of other (non-buffer) types.

Contiguity hinting

Commonly in some kinds of dataflows, the compiler needs to be free to internally toggle buffer contiguity (i.e. C/row-major, Fortran/col-major, etc). In many cases, such toggling does not naturally escape through the exported function boundaries, in which case, there is no ABI impact. However, it is anticipated that there is benefit to letting the toggle propagate through the exported ABI boundary, in which case, the buffer-type will likely be extended with a contiguity hint indicating the preference. When combined with the buffer description oracle and in-pipeline conversion features described above, this could yield a powerful mechanism for dynamically and efficiently managing such transitions.

Such an enhancement would almost certainly necessitate a major version bump in the ABI and would be logical to implement once the advanced features above are functional.

Structured Index Path ABI

Functions may support the SIP ABI if their input and result tuples logically map onto “structures” (nested sequence/dicts).

Attributes:

sipv = 1 (current SIP ABI version)
sip = encoded SIP signature (see below)

This ABI maps a raw, linear sequence of inputs and results onto an input and result “structure” -- which in this context refers to a nested assembly of sequences (with integer keys) and dictionaries (with string keys). Such a facility is useful for encoding input/result mappings in a way that is common in dynamic languages (such as Python).

In practice, this ABI supports the calling convention for TensorFlow, which allows functions that accept and produce nestings via the tf.nest facility. In implementing it, however, care has been taken to allow the calling convention to generalize to other similar cases.

Grammar

The signature is implemented in terms of the SignatureBuilder, using tagged Integers and Spans.

# Defines the structured value for the inputs ('I') and results ('R')
# of the function.
signature ::= 'I' length-prefixed(structured-value)
              'R' length-prefixed(structured-value)

structured-value ::= raw-fn-index | sequence | dict
raw-fn-index ::= '_' integer
sequence ::= 'S' length-prefixed( (integer-key structured-value)* )
integer-key ::= 'k' integer
dict ::= 'D' length-prefixed( (string-key structured-value)* )
string-key ::= 'K' length-prefixed( any-byte-sequence )

# Low-level lexical primitives:
integer ::= -?[0-9]+
length ::= [0-9]+
# The `length` encodes the length in bytes of `production`, plus 1 for the '!'.
length-prefixed(production) ::= length '!' production
any-byte-sequence ::= <any byte sequence>

Structured values define a tree of recursive dicts/lists, with raw-fn-index at the leaves. The interpretation is that a raw-fn-index that has been reached by traversing N expansions of the structured-value production is assigned an “index path” which is a list of the N keys that were traversed to reach it. For example, for N=0, the index path is empty. For N=1, and if an integer-key with numerical value 0 was traversed to reach the raw-fn-index, then the index path is [0].
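
The sketch below works through index paths and the corresponding encoding for a small nested structure. It follows only the grammar given here and is not the SipSignatureMangler: the function names are hypothetical, leaves are plain Python ints standing for raw function indices, and emitting dict keys in sorted order is an assumption made for determinism.

```python
def length_prefixed(production):
    """Prefix with the byte length of `production`, plus 1 for the '!'."""
    return str(len(production.encode("utf-8")) + 1) + "!" + production


def structured_value(node):
    """Encode ints as raw-fn-index, lists/tuples as sequence, dicts as dict."""
    if isinstance(node, int):
        return "_" + str(node)
    if isinstance(node, (list, tuple)):
        body = "".join(
            "k" + str(i) + structured_value(child) for i, child in enumerate(node)
        )
        return "S" + length_prefixed(body)
    if isinstance(node, dict):
        body = "".join(
            "K" + length_prefixed(key) + structured_value(child)
            for key, child in sorted(node.items())
        )
        return "D" + length_prefixed(body)
    raise TypeError("unsupported node: " + repr(node))


def sip_signature(inputs, results):
    return ("I" + length_prefixed(structured_value(inputs))
            + "R" + length_prefixed(structured_value(results)))


def index_paths(node, path=()):
    """Map each raw function index to the list of keys traversed to reach it."""
    if isinstance(node, int):
        return {node: list(path)}
    items = enumerate(node) if isinstance(node, (list, tuple)) else sorted(node.items())
    paths = {}
    for key, child in items:
        paths.update(index_paths(child, path + (key,)))
    return paths


# One positional input bound to raw index 0; the result is raw index 0 itself.
simple = sip_signature([0], 0)
# Two positional inputs, the second a dict with two named slots.
paths = index_paths([0, {"x": 1, "y": 2}])
```

Here `simple` is `I8!S5!k0_0R3!_0`, and `paths` assigns raw index 0 the path `[0]`, raw index 1 the path `[1, "x"]`, and raw index 2 the path `[1, "y"]`. A bare raw-fn-index, as in the result above, has the empty path (the N=0 case).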

.... give a few examples more, writing out various nested dicts/lists in Python-esque notation to clarify this concept ....

See the SipSignatureParser::ToStringVisitor for a canonical example of how to interpret the signature.

Implementations

C++
- SipSignatureMangler: Produces a function signature given individual input and result assignment of physical indices to nested index paths in the structure tree.
- SipSignatureParser: Parses signatures and dispatches calls to a visitor.