| # Function Signatures |
| |
| A key job of the IREE compiler and runtime is capturing function call semantics |
| from the originating system and providing mechanisms so that invocations can be |
| performed in as similar a way as possible in various target languages. In general, |
| this requires additional metadata on top of the raw characteristics of a |
| function. Where possible, this is done by attaching attributes to a function. |
| |
| - `iree.abi`: JSON-encoded description of the function's calling convention. |
| |
| ## V1 ABI |
| |
| This is the default ABI supported by IREE VM invocations. It attempts to |
| provide a default calling convention that can be used without further reflection |
| metadata but which may be enhanced with it. |
| |
| It natively allows monomorphic functions to be exported where arguments and |
| results are composed of the following types: |
| |
| ### Value Types: |
| |
| - Byte aligned integer type (i8, i16, i32, i64) |
| - Floating point value (f16, f32, f64) |
| |
| ### Reference Types: |
| |
| - ND-Array buffers of Value Types: |
| |
| - Simple: Packed, C-layout |
| - Strided: Arbitrary layout with strides (future) |
| |
| - String (byte arrays) |
| |
| - Opaque reference object |
| |
| ### Sequence Types: |
| |
| - Tuples: fixed length lists where each position has its own type bound |
| - Homogeneous list: lists of arbitrary size where a single type bound applies |
| to all elements |
| |
| The intent with these low level types is that calling conventions can be |
| synthesized to bind arbitrary high level, domain/language specific signatures to |
| these types, possibly by way of additional reflection metadata. |
| |
| ### Representations: |
| |
| The above are all representable with native constructs in the VM: |
| |
| - ValueType: |
| |
| - Runtime: |
| [`iree_vm_value`](https://github.com/google/iree/blob/main/iree/vm/value.h) |
| - Compile Time: primitive MLIR integer/floating point types |
| |
| - Simple ND-Array Buffer: |
| |
| - Runtime: |
| [`iree_hal_buffer_view`](https://github.com/google/iree/blob/main/iree/hal/buffer_view.h) |
| - Compile Time: `tensor<>` |
| |
| - String: |
| |
| - Runtime: |
| [`iree_vm_list`](https://github.com/google/iree/blob/main/iree/vm/list.h) |
| containing `i8` |
| - Compile Time: `!iree.list<i8>` |
| |
| - Tuple: |
| |
| - Runtime: |
| [`iree_vm_list`](https://github.com/google/iree/blob/main/iree/vm/list.h) |
| of variant |
| - Compile Time: `!iree.list<?>` |
| - Note that these are statically type erased at the boundary. |
| |
| - TypedList (homogeneous): |
| |
| - Runtime: |
| [`iree_vm_list`](https://github.com/google/iree/blob/main/iree/vm/list.h) |
| of `T` |
| - Compile Time: `!iree.list<T>` |
| |
| ### Extended Type Calling Conventions |
| |
| While the above features of the native ABI may be sufficient for direct use by |
| various programs, many programs and callers will need to represent various |
| higher level types, consistently mapping them to the above facilities. This |
| section describes calling conventions for various higher level types which do |
| not map 1:1 to the above. Not all source language types are representable, and |
| extending these calling conventions (and the fundamental types above) is demand |
| driven. |
| |
| All of these calling conventions presume that the arity of the arguments/results |
| of the raw function matches the user-level function, meaning that the calling |
| convention is specified per argument/result. Higher-level whole function |
| transformations may also exist for some domains but are outside of the scope of |
| this specification. |
| |
| #### Structure |
| |
| A `Structure` is a common enough entity to have a dedicated calling convention. |
| In C-like languages, this may just be a `struct`. In Python, it is typically a |
| `dict` with an associated schema providing a name and type bound for each of its |
| slots. In both, its slots are of fixed arity. |
| |
| In this convention, such a structure is represented as a `Tuple` in the native |
| calling convention (i.e. `!iree.list` of variant type). The order of the |
| elements of the tuple is the natural order of the structure, where that is |
| either: |
| |
| - For a C-like system where order is determinate, it is the order of |
| declaration. |
| - For a name-based system (e.g. binding to a `dict`) where no order is defined, |
| the natural order is the lexically sorted order of the keys. |
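
As an illustration of the name-based case, the following minimal Python sketch orders the slots of a `dict` by lexically sorted key. The helper name `structure_to_tuple` is hypothetical and not part of any IREE API, and a plain Python list stands in for the variant `!iree.list`.

```python
def structure_to_tuple(structure):
    """Return slot values in natural order for a name-based structure.

    Hypothetical helper: keys are sorted lexically, and a plain Python list
    stands in for the type-erased tuple passed across the ABI boundary.
    """
    return [structure[key] for key in sorted(structure)]


point = {"y": 2.0, "x": 1.0, "label": "origin"}
values = structure_to_tuple(point)  # slots ordered as label, x, y
```

Note that only the values cross the boundary; the caller and callee must agree on the key set out of band.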
| |
| #### String |
| |
| Most languages interoperate with byte arrays (i.e. the native ABI `String` |
| type) by applying an encoding. At the ABI level, such strings are just a |
| sequence of bytes (i.e. `!iree.list<i8>`). |
| |
| #### Typed List |
| |
| High level lists which all share the same type bound are represented as a |
| `TypedList` in the native ABI. |
| |
| #### NDArray of Reference Types |
| |
| NDArrays of reference types are considered separately from those of value types. |
| Internally, the code generated for them is completely different from what gets |
| generated for numeric based arrays (i.e. has ref-counting, ownership semantics, |
| non-POD, etc). These types are permitted for completeness, not necessarily |
| performance: by nature they are already indirected and have overheads. |
| |
| In the native ABI, these are represented as a composite tuple type (i.e. today a |
| list since sugar for tuple is not yet defined): `!iree.tuple<!iree.list<T>, |
| !iree.list<index>>`. The first element of the tuple is the list of values, |
| packed with a C-Layout and the second element is the list of dimension sizes. |
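
The packing described above can be sketched in Python as follows. This is an illustration only: `pack_ref_ndarray` is a hypothetical name, the input is assumed to be a rectangular nesting of Python lists whose leaf elements are not themselves lists, and a Python tuple of two lists stands in for the composite ABI type.

```python
def pack_ref_ndarray(nested):
    """Pack a rectangular nested list into (flat_values, dims).

    Assumption: every sub-list at a given depth has the same length, and
    leaf elements are not lists. Values are flattened in C (row-major) order.
    """
    # Walk down the first element at each level to discover the dimensions.
    dims = []
    level = nested
    while isinstance(level, list):
        dims.append(len(level))
        level = level[0] if level else None

    # Flatten one nesting level per dimension.
    flat = [nested]
    for _ in dims:
        flat = [item for sub in flat for item in sub]
    return flat, dims


values, dims = pack_ref_ndarray([["a", "b", "c"], ["d", "e", "f"]])
```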
| |
| #### Reflection |
| |
| Additional reflection metadata may be encoded in a custom JSON form, providing |
| additional typing hints for arguments and results. If present, this will be a |
| reflection attribute with key `d`, containing a serialized JSON object. |
| |
| The JSON object contains: |
| |
| - `a` (array): List of type records for each argument. |
| - `r` (array): List of type records for each result. |
| |
| Type records are one of: |
| |
| - A string naming a primitive type: |
| |
| - `i[0-9]+`: Integer type with given bit width |
| - `f[0-9]+`: IEEE floating point type with given bit width |
| - `bf16`: BFloat16 |
| |
| - JSON `null`: A null reference value |
| |
| - `"unknown"`: An unknown/unmapped type |
| |
| - An array, interpreted as a tuple describing a compound type. |
| |
| ##### Compound type tuples |
| |
| A compound type tuple has a type identifier as its first element, followed by |
| type specific fields: |
| |
| - `["ndarray", {element_type}, {rank}, {dim...}]`: For unknown rank, the |
| `rank` will be `null` and there will be no dims. Any unknown dim will be |
| `null`. |
| - `["slist", {slot_type...}]`: An anonymous structured list of fixed arity and |
| slot specific types. If there are gaps in the list, empty slots will have a |
| `null` type. |
| - `["stuple", {slot_type...}]`: Same as `slist` but some languages |
| differentiate between sequences represented as lists and those represented |
| as tuples (read-only lists). |
| - `["sdict", ["key", {slot_type}]...]`: An anonymous structure with named |
| slots. Note that when passing these types, the keys are not passed to the |
| function (only the slot values). |
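
As a hedged illustration of the record shapes listed above, the following Python sketch builds and round-trips a reflection object for an invented function. The argument and result types, and the slot names `bias` and `scale`, are made up for the example; only the `a`/`r` keys and the record layouts come from this section.

```python
import json

# Hypothetical function: takes a 2-D f32 array with an unknown first dim and
# a two-slot named structure; returns a tuple of an i32 and an array of
# unknown rank.
metadata = {
    "a": [
        ["ndarray", "f32", 2, None, 4],  # rank 2, dims (unknown, 4)
        ["sdict", ["bias", "f32"], ["scale", "f32"]],
    ],
    "r": [
        ["stuple", "i32", ["ndarray", "f32", None]],  # unknown rank, no dims
    ],
}

# Serialize to the JSON string that would be stored in the attribute.
encoded = json.dumps(metadata, separators=(",", ":"))
decoded = json.loads(encoded)
```

Python `None` maps to JSON `null`, so unknown ranks and dims survive the round trip unchanged.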
| |
| ## Deprecated V0 ABIs |
| |
| These will be removed as soon as the corresponding code is removed. |
| |
| ### Generic Signature Mangling |
| |
| Where possible, ABI metadata is encoded into a plain-text signature in a way |
| that is easily transported across component boundaries and can be efficiently |
| implemented without additional dependencies (i.e. just string manipulation). |
| |
| The suggested format is manipulated via the C++ reference implementation |
| classes `SignatureBuilder` and `SignatureParser` (see |
| `iree/base/signature_parser.h`). See the documentation and code for those |
| classes for more details. |
| |
| ### ABIs |
| |
| #### Raw Function ABI |
| |
| All exported functions implement the raw function ABI, which defines the |
| metadata and calling convention for marshalling inputs and results to their |
| underlying implementations. |
| |
| _Attributes:_ |
| |
| - `fv` = 1 (current version of the raw function ABI) |
| - `f` = encoded raw function signature (see below) |
| - `fbr` = result buffer allocation function name (optional) |
| |
| The reflection metadata documented here augments the underlying type system such |
| that host language bindings can interop as needed. This additional metadata is |
| needed in most dynamic cases because the compiled assets operate on fundamental |
| types with most characteristics type erased away (think: `void*` level things vs |
| high-level `ShapedBuffer` level things). |
| |
| ##### Grammar |
| |
| The signature is implemented in terms of the `SignatureBuilder`, using tagged |
| Integers and Spans. |
| |
| ```text |
| signature ::= 'I' length-prefixed(type-sequence) |
| 'R' length-prefixed(type-sequence) |
| |
| type-sequence ::= (arg-result-type)* |
| arg-result-type ::= buffer-type |
| | ref-object-type |
| | scalar-type |
| | unrecognized-type |
| buffer-type ::= 'B' length-prefixed(scalar-element-type? dim*) |
| scalar-type ::= 'S' length-prefixed(scalar-element-type?) |
| scalar-element-type ::= 't' ( |
| '0' # IEEE float32 (default if not specified) |
| | '1' # IEEE float16 |
| | '2' # IEEE float64 |
| | '3' # Google bfloat16 |
| | '4' # Signed int8 |
| | '5' # Signed int16 |
| | '6' # Signed int32 |
| | '7' # Signed int64 |
| | '8' # Unsigned int8 |
| | '9' # Unsigned int16 |
| | '10' # Unsigned int32 |
| | '11' # Unsigned int64 |
| ) |
| dim ::= 'd' integer # -1 indicates a dynamic dim |
| ref-object-type ::= 'O' length-prefixed() # Details TBD |
| unrecognized-type ::= 'U' length-prefixed() |
| |
| # Lexical primitives |
| integer ::= -?[0-9]+ |
| length ::= [0-9]+ |
| # The `length` encodes the length in bytes of `production`, plus 1 for the '!'. |
| length-prefixed(production) ::= length '!' production |
| any-byte-sequence ::= <any byte sequence> |
| ``` |
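
To make the grammar concrete, here is a minimal Python sketch of an encoder for it. The function names and the string labels such as `"f32"` and `"u8"` are hypothetical conveniences; only the tag characters, type codes, and length-prefix rule are taken from the grammar above. It is not the `SignatureBuilder` implementation.

```python
# Hypothetical labels mapped to the numeric codes of `scalar-element-type`.
SCALAR_CODES = {
    "f32": 0, "f16": 1, "f64": 2, "bf16": 3,
    "i8": 4, "i16": 5, "i32": 6, "i64": 7,
    "u8": 8, "u16": 9, "u32": 10, "u64": 11,
}


def length_prefixed(production):
    # The length counts the bytes of the production plus 1 for the '!'.
    return f"{len(production.encode('utf-8')) + 1}!{production}"


def scalar(element_type):
    return "S" + length_prefixed(f"t{SCALAR_CODES[element_type]}")


def buffer(element_type, dims):
    # A dim of -1 marks a dynamic dimension.
    body = f"t{SCALAR_CODES[element_type]}" + "".join(f"d{d}" for d in dims)
    return "B" + length_prefixed(body)


def signature(inputs, results):
    return ("I" + length_prefixed("".join(inputs))
            + "R" + length_prefixed("".join(results)))


encoded = signature([buffer("i32", [10, -1])], [scalar("f16")])
```

For one `int32` buffer input of shape 10 x dynamic and one `float16` scalar result, this sketch produces `I12!B9!t6d10d-1R6!S3!t1`.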
| |
| ##### Interpretation and Rationale |
| |
| ###### Memory layout |
| |
| The astute reader will note that the above metadata is insufficient to determine |
| the memory layout of a buffer. The reason is that any more specific details than |
| this (contiguity, strides, alignment, etc) can actually only be known once the |
| actual compute devices have been enumerated and the resulting matrix of |
| conversions is more dynamic than can be expressed in something as static as a |
| function signature. The above formulation is an input to an additional runtime |
| oracle which produces appropriate full buffer descriptions. |
| |
| While the exact implementation is host-language specific, consider the following |
| more detailed set of declarations that may exist in such a binding layer: |
| |
| ```c++ |
| // Inspired heavily by the Py_buffer type. |
| // See: https://docs.python.org/3/c-api/buffer.html |
| struct BufferDescription { |
| ScalarType element_type; |
| // For contiguous arrays, this is the length of the underlying memory. |
| // For non-contiguous, this is the size of the buffer if it were copied |
| // to a contiguous representation. |
| size_t len; |
| // Number of dims and strides. |
| size_t ndim; |
| int* shape; |
| int* strides; |
| }; |
| |
| // Mirrors the 'buffer-type' production in the above grammar. |
| struct SignatureBufferType; |
| |
| // Oracle which combines signature metadata with a user-provided, materialized |
| // BufferDescription to derive a BufferDescription that is compatible for |
| // invocation. Returns an updated buffer description if the original is |
| // not compatible or fully specified. |
| // This can be used in a couple of ways: |
| // a) On function invocation to determine whether a provided buffer can be |
| // used as-is or needs to be converted (copied). |
| // b) To provide a factory function to the host language to create a |
| // compatible buffer. |
| std::optional<BufferDescription> BufferDescriptionOracle( |
| DeviceContext*, SignatureBufferType, BufferDescription); |
| // May throw UnsupportedBufferException. |
| ``` |
| |
| The above scheme should allow host-language and device coordination with respect |
| to buffer layout. For the moment, the responsibility to convert the buffer to a |
| compatible memory layout is on the host-language binding. However, it is often |
| most efficient to schedule this conversion for execution on a device. In the future, it |
| is anticipated that there will be a built-in pathway for scheduling such a |
| conversion (which would allow pipelining and offload of buffer conversions). |
| |
| ###### Deferred result allocation |
| |
| In general, exported functions accept pre-allocated results that should be |
| mutated. For the simplest cases, such results can be `null` and retrieved upon |
| completion of the function. This, however, puts severe limitations on the |
| ability to pipeline. For fully specified signatures (no dynamic shapes), the |
| `BufferDescriptionOracle` and the signature are sufficient to pre-allocate |
| appropriate results, which allows chains of result-producing invocations to be |
| pipelined. |
| |
| If, however, a `buffer-type` is not fully specified, the compiler may emit a |
| special _result allocator_ function, which will be referenced in the `fbr` |
| attribute. Such a function would have a signature like this: |
| |
| ```c++ |
| tuple<buffer> __allocate_results(tuple<int> dynamic_dims); |
| ``` |
| |
| Such a function takes a tuple of all dynamic buffer dims in the function input |
| signature and returns a tuple of allocated buffers for each dynamic result. Note |
| that it may not be possible to fully allocate results in this fashion (i.e. if |
| the result layout is data dependent), in which case a null buffer is returned |
| for that slot (and the host library would need to await on the invocation to get |
| the fully populated result). |
| |
| A similar mechanism will need to be created at some future point for |
| under-specified results of other (non-buffer) types. |
| |
| ###### Contiguity hinting |
| |
| Commonly in some kinds of dataflows, the compiler needs to be free to internally |
| toggle buffer contiguity (i.e. C/row-major, Fortran/col-major, etc). In many |
| cases, such toggling does not naturally escape through the exported function |
| boundaries, in which case, there is no ABI impact. However, it is anticipated |
| that there is benefit to letting the toggle propagate through the exported ABI |
| boundary, in which case, the `buffer-type` will likely be extended with a |
| contiguity hint indicating the preference. When combined with the buffer |
| description oracle and in-pipeline conversion features described above, this |
| could yield a powerful mechanism for dynamically and efficiently managing such |
| transitions. |
| |
| Such an enhancement would almost certainly necessitate a major version bump in |
| the ABI and would be logical to implement once the advanced features above are |
| functional. |
| |
| #### Structured Index Path ABI |
| |
| Functions may support the SIP ABI if their input and result tuples logically map |
| onto "structures" (nested sequence/dicts). |
| |
| _Attributes:_ |
| |
| - `sipv` = 1 (current SIP ABI version) |
| - `sip` = encoded SIP signature (see below) |
| |
| This ABI maps a raw, linear sequence of inputs and results onto an input and |
| result "structure" -- which in this context refers to a nested assembly of |
| sequences (with integer keys) and dictionaries (with string keys). Such a |
| facility is useful for encoding input/result mappings in a way that is common in |
| dynamic languages (such as Python). |
| |
| In practice, this ABI supports the calling convention for TensorFlow, which |
| allows functions that accept and produce nestings via the |
| [`tf.nest`](https://www.tensorflow.org/api_docs/python/tf/nest) facility. In |
| implementing it, however, care has been taken to allow the calling convention to |
| generalize to other similar cases. |
| |
| ##### Grammar |
| |
| The signature is implemented in terms of the `SignatureBuilder`, using tagged |
| Integers and Spans. |
| |
| ```text |
| # Defines the structured value for the inputs ('I') and results ('R') |
| # of the function. |
| signature ::= 'I' length-prefixed(structured-value) |
| 'R' length-prefixed(structured-value) |
| |
| structured-value ::= raw-fn-index | sequence | dict |
| raw-fn-index ::= '_' integer |
| sequence ::= 'S' length-prefixed( (integer-key structured-value)* ) |
| integer-key ::= 'k' integer |
| dict ::= 'D' length-prefixed( (string-key structured-value)* ) |
| string-key ::= 'K' length-prefixed( any-byte-sequence ) |
| |
| # Low-level lexical primitives: |
| integer ::= -?[0-9]+ |
| length ::= [0-9]+ |
| # The `length` encodes the length in bytes of `production`, plus 1 for the '!'. |
| length-prefixed(production) ::= length '!' production |
| any-byte-sequence ::= <any byte sequence> |
| ``` |
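
The grammar can be exercised with a small Python sketch that encodes a nested structure whose leaves are raw function indices. This is a reading of the grammar only, not the `SipSignatureMangler` implementation: the function names are hypothetical, and emitting dict keys in sorted order is an assumption made here for determinism.

```python
def length_prefixed(production):
    # The length counts the bytes of the production plus 1 for the '!'.
    return f"{len(production.encode('utf-8')) + 1}!{production}"


def encode_structured_value(value):
    if isinstance(value, int):  # raw-fn-index leaf
        return f"_{value}"
    if isinstance(value, (list, tuple)):  # sequence with integer keys
        body = "".join(
            f"k{key}" + encode_structured_value(child)
            for key, child in enumerate(value))
        return "S" + length_prefixed(body)
    if isinstance(value, dict):  # dict with string keys
        body = "".join(
            "K" + length_prefixed(key) + encode_structured_value(child)
            for key, child in sorted(value.items()))  # assumed key order
        return "D" + length_prefixed(body)
    raise TypeError(f"unsupported structured value: {value!r}")


def encode_sip_signature(inputs, results):
    return ("I" + length_prefixed(encode_structured_value(inputs))
            + "R" + length_prefixed(encode_structured_value(results)))


encoded = encode_sip_signature([0, {"a": 1}], 0)
```

Under these assumptions, inputs shaped like `[_0, {"a": _1}]` and a single bare result yield `I20!S16!k0_0k1D7!K2!a_1R3!_0`.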
| |
| Structured values define a tree of recursive dicts/lists, with `raw-fn-index` at |
| the leaves. The interpretation is that a raw-fn-index that has been reached by |
| traversing N expansions of the structured-value production is assigned an "index |
| path" which is a list of the N keys that were traversed to reach it. For |
| example, for N=0, the index path is empty. For N=1, and if an integer-key with |
| numerical value 0 was traversed to reach the raw-fn-index, then the index path |
| is [0]. |
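
The index-path interpretation above can be sketched in Python, using plain lists and dicts with integer leaves standing in for `raw-fn-index` values. The function name `index_paths` is hypothetical and the nested structure is invented for illustration.

```python
def index_paths(value, path=()):
    """Yield (raw_fn_index, index_path) pairs for every leaf."""
    if isinstance(value, int):  # leaf: a raw function index
        yield value, list(path)
    elif isinstance(value, (list, tuple)):  # sequence: integer keys
        for key, child in enumerate(value):
            yield from index_paths(child, path + (key,))
    elif isinstance(value, dict):  # dict: string keys
        for key, child in value.items():
            yield from index_paths(child, path + (key,))
    else:
        raise TypeError(f"unsupported structured value: {value!r}")


# A list whose second element is a dict that itself contains a list.
paths = dict(index_paths([0, {"a": 1, "b": [2]}]))
bare = dict(index_paths(7))  # a bare leaf has an empty index path
```

Here raw index 0 gets the path `[0]`, raw index 1 gets `[1, "a"]`, and raw index 2 gets `[1, "b", 0]`.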
| |
| .... give a few examples more, writing out various nested dicts/lists in |
| Python-esque notation to clarify this concept .... |
| |
| See the `SipSignatureParser::ToStringVisitor` for a canonical example of how to |
| interpret the signature. |
| |
| ##### Implementations |
| |
| - C++ |
| |
| - `SipSignatureMangler`: Produces a function signature given individual |
| input and result assignment of physical indices to nested index paths in |
| the structure tree. |
| - `SipSignatureParser`: Parses signatures and dispatches calls to a |
| visitor. |