Begin implementation of the 'raw' function ABI.

* Adds signature support for it.
* Adds documentation.
* This is all very draft-y and open to further discussion.

PiperOrigin-RevId: 285480154
diff --git a/docs/function_abi.md b/docs/function_abi.md
index 1c0c8b5..0296a6a 100644
--- a/docs/function_abi.md
+++ b/docs/function_abi.md
@@ -24,10 +24,170 @@
 
 ## ABIs
 
+### Raw Function ABI
+
+All exported functions implement the raw function ABI, which defines the
+metadata and calling convention for marshalling inputs and results to their
+underlying implementations.
+
+*Attributes:*
+
+*   `fv` = 1 (current version of the raw function ABI)
+*   `f` = encoded raw function signature (see below)
+*   `fbr` = result buffer allocation function name (optional)
+
+The reflection metadata documented here augments the underlying type system such
+that host language bindings can interop as needed. This additional metadata is
+needed in most dynamic cases because the compiled assets operate on fundamental
+types with most characteristics type erased away (think: `void*` level things vs
+high-level `ShapedBuffer` level things).
+
+#### Grammar
+
+The signature is implemented in terms of the SignatureBuilder, using tagged
+Integer and Spans.
+
+```text
+signature ::= 'I' length-prefixed(type-sequence)
+              'R' length-prefixed(type-sequence)
+
+type-sequence ::= (arg-result-type)*
+arg-result-type ::= buffer-type | ref-object-type
+buffer-type ::= 'B' length-prefixed(scalar-type? dim*)
+scalar-type ::= 't' (
+                    '0'  # IEEE float32 (default if not specified)
+                  | '1'  # IEEE float16
+                  | '2'  # IEEE float64
+                  | '3'  # Google bfloat16
+                  | '4'  # Signed int8
+                  | '5'  # Signed int16
+                  | '6'  # Signed int32
+                  | '7'  # Signed int64
+                  | '8'  # Unsigned int8
+                  | '9'  # Unsigned int16
+                  | '10' # Unsigned int32
+                  | '11' # Unsigned int64
+                  )
+dim :: = 'd' integer  # -1 indicates a dynamic dim
+ref-object-type ::= 'O' length-prefixed()  # Details TBD
+
+
+# Lexical primitives
+integer ::= -?[0-9]+
+length ::= [0-9]+
+# The `length` encodes the length in bytes of `production`, plus 1 for the '!'.
+length-prefixed(production) ::= length '!' production
+any-byte-sequence ::= <any byte sequence>
+```
+
+#### Interpretation and Rationale
+
+##### Memory layout
+
+The astute reader will note that the above metadata is insufficient to determine
+the memory layout of a buffer. The reason is that any more specific details than
+this (contiguity, strides, alignment, etc) can actually only be known once the
+actual compute devices have been enumerated and the resulting matrix of
+conversions is more dynamic than can be expressed in something as static as a
+function signature. The above formulation is an input to an additional runtime
+oracle which produces appropriate full buffer descriptions.
+
+While the exact implementation is host-language specific, consider the following
+more detailed set of declarations that may exist in such a binding layer:
+
+```c++
+// Inspired heavily by the Py_buffer type.
+// See: https://docs.python.org/3/c-api/buffer.html
+struct BufferDescription {
+  ScalarType element_type;
+  // For contiguous arrays, this is is the length of the underlying memory.
+  // For non-contiguous, this is the size of the buffer if it were copied
+  // to a contiguous representation.
+  size_t len;
+  // Number of dims and strides.
+  size_t ndim;
+  int* shape;
+  int* strides;
+};
+
+// Mirrors the 'buffer-type' production in the above grammar.
+struct SignatureBufferType;
+
+// Oracle which combines signature metadata with a user-provided, materialized
+// BufferDescription to derive a BufferDescription that is compatible for
+// invocation. Returns an updated buffer description if the original is
+// not compatible or fully specified.
+// This can be used in a couple of ways:
+//   a) On function invocation to determine whether a provided buffer can be
+//      used as-is or needs to be converted (copied).
+//   b) To provide a factory function to the host language to create a
+//      compatible buffer.
+optional<BufferDescription> BufferDescriptionOracle(
+    DeviceContext*, SignatureBufferType, BufferDescription)
+  throws UnsupportedBufferException;
+```
+
+The above scheme should allow host-language and device coordination with respect
+to buffer layout. For the moment, the responsibility to convert the buffer to a
+compatible memory layout is on the host-language binding. However, often it is
+the most efficient to schedule this for execution on a device. In the future, it
+is anticipated that there will be a built-in pathway for scheduling such a
+conversion (which would allow pipelinining and offload of buffer conversions).
+
+##### Deferred result allocation
+
+In general, exported functions accept pre-allocated results that should be
+mutated. For the simplest cases, such results can be `null` and retrieved upon
+completion of the function. This, however, puts severe limitations on the
+ability to pipeline. For fully specified signatures (no dynamic shapes), the
+`BufferDescriptionOracle` and the signature is sufficient to pre-allocate
+appropriate results, which allows chains of result-producing invocations to be
+pipelined.
+
+If, however, a `buffer-type` is not fully specified, the compiler may emit a
+special *result allocator* function, which will be referenced in the `fbr`
+attribute. Such a function would have a signature like this:
+
+```c++
+tuple<buffer> __allocate_results(tuple<int> dynamic_dims);
+```
+
+Such a function takes a tuple of all dynamic buffer dims in the function input
+signature and returns a tuple of allocated buffers for each dynamic result. Note
+that it may not be possible to fully allocate results in this fashion (i.e. if
+the result layout is data dependent), in which case a null buffer is returned
+for that slot (and the host library would need to await on the invocation to get
+the fully populated result).
+
+A similar mechanism will need to be created at some future point for
+under-specified results of other (non-buffer) types.
+
+##### Contiguity hinting
+
+Commonly in some kinds of dataflows, the compiler needs to be free to internally
+toggle buffer continuity (i.e. C/row-major, Fortran/col-major, etc). In many
+cases, such toggling does not naturally escape through the exported function
+boundaries, in which case, there is no ABI impact. However, it is anticipated
+that there is benefit to letting the toggle propagate through the exported ABI
+boundary, in which case, the `buffer-type` will likely be extended with a
+contiguity hint indicating the preference. When combined with the buffer
+description oracle and in-pipeline conversion features described above, this
+could yield a powerful mechanism for dynamically and efficiently managing such
+transitions.
+
+Such an enhancement would almost certainly necessitate a major version bump in
+the ABI and would be logical to implement once the advanced features above are
+functional.
+
 ### Structured Index Path ABI
 
-*   `abi` = `sip`
-*   Current `abiv` version = 1
+Functions may support the SIP ABI if their input and result tuples logically map
+onto "structures" (nested sequence/dicts).
+
+*Attributes:*
+
+*   `sipv` = 1 (current SIP ABI version)
+*   `sip` = encoded SIP signature (see below)
 
 This ABI maps a raw, linear sequence of inputs and results onto an input and
 result "structure" -- which in this context refers to a nested assembly of