commit | e8f4948d5c62beea27a109161b697d9088d5b3bd | |
---|---|---|
author | Han-Chung Wang <hanhan0912@gmail.com> | Fri Apr 19 17:24:28 2024 -0700 |
committer | GitHub <noreply@github.com> | Fri Apr 19 17:24:28 2024 -0700 |
tree | d2f96f292054b8ce93c3ca19ec15bade05718541 | |
parent | 78005ef66ecb75f984d1d24e46bcaf6d01ed7317 | |
[DT] Teach encoding about padding. (#17077)

The revision has four major commits. They are expected to land together because all of the pieces are needed to make it work.

## Make encodings be able to carry padding semantics

https://github.com/openxla/iree/pull/17077/commits/7967faf4d2b442f7f1229d3801dbc95678ebe051 introduces a `round_dims_to` integer array on encodings. It holds the padding values for the M, N, and K dimensions. This tells both the host and the device which value each dimension should be aligned to. Eventually we should have a better way of propagating the information between the host side and the device side. The revision is a step towards that goal.

The commit adds an option to the SetEncoding pass. If `padFactor` is set, `round_dims_to` is filled with the values, and the pass generates only set_encoding ops, not `iree_linalg_ext.upper_bound_tile_size` and `tensor.pad` ops.

## Teach Pack/UnPack Materialization Patterns about the new field

https://github.com/openxla/iree/pull/17077/commits/2652c028b586b984dca237cfc6cd53a1ffa5235e teaches the materialization patterns to handle the new field. If the field is set, the inner tile sizes can't be greater than the corresponding `round_dims_to` values. Otherwise, the actual buffer size could mismatch.

## Teach stream.tensor.sizeof to take encoding into account

https://github.com/openxla/iree/pull/17077/commits/365dc413250675ae8e49cfcaccc3e4341d1d3432 teaches stream.tensor.sizeof to calculate proper sizes based on encodings. The encodings have `role`, `indexing_maps`, and `round_dims_to`, so the op can look at `role` and `indexing_maps` to infer the contraction dimensions, and then pad each dimension so that it is aligned with the corresponding value in `round_dims_to`.
E.g.,

```mlir
#map = affine_map<(d0, d1, d2) -> (d0, d2)>
#map1 = affine_map<(d0, d1, d2) -> (d2, d1)>
#map2 = affine_map<(d0, d1, d2) -> (d0, d1)>
util.func public @sizeof_lhs_encoding_dynamic(%arg0: index, %arg1: index) -> index {
  %0 = stream.tensor.sizeof tensor<?x?xf32, #iree_linalg_ext.encoding<
      role = LHS, element_types = [f32, f32, f32],
      original_type = tensor<?x?xf32>,
      user_indexing_maps = [#map, #map1, #map2],
      round_dims_to = 4, 8, 16>>{%arg0, %arg1} : index
  util.return %0 : index
}
// CHECK-LABEL: @sizeof_lhs_encoding_dynamic
// CHECK-DAG:     %[[C4:.+]] = arith.constant 4 : index
// CHECK-DAG:     %[[C16:.+]] = arith.constant 16 : index
// CHECK:         %[[CEIL_DIV_D0:.+]] = arith.ceildivui %arg0, %[[C4]]
// CHECK:         %[[PAD_D0:.+]] = arith.muli %[[CEIL_DIV_D0]], %[[C4]]
// CHECK:         %[[CEIL_DIV_D1:.+]] = arith.ceildivui %arg1, %[[C16]]
// CHECK:         %[[PAD_D1:.+]] = arith.muli %[[CEIL_DIV_D1]], %[[C16]]
// CHECK:         %[[T0:.+]] = arith.muli %[[PAD_D0]], %[[C4]]
// CHECK:         %[[T1:.+]] = arith.muli %[[T0]], %[[PAD_D1]]
// CHECK:         return %[[T1]]
```

## Add e2e tests

https://github.com/openxla/iree/pull/17077/commits/5ca4a4b7a5a672e25d257cc9443c030735c305e5 adds a new test suite with `--iree-global-opt-enable-early-materialization=false`, so we have enough e2e test coverage for the new path.
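For readers who want to check the `stream.tensor.sizeof` arithmetic by hand, the computation that the CHECK lines above verify can be sketched in Python. This is only an illustration of the rounding math for the LHS operand, not IREE code: the function names `round_up` and `padded_sizeof_lhs` are hypothetical, and the 4-byte element size is an assumption that matches `f32` in the example.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (ceildiv followed by multiply)."""
    return -(-value // multiple) * multiple


def padded_sizeof_lhs(m: int, k: int, round_dims_to, element_bytes: int = 4) -> int:
    """Byte size of an LHS (M x K) tensor after padding.

    round_dims_to is ordered (M, N, K), as in the commit message.
    The LHS indexing map (d0, d1, d2) -> (d0, d2) selects M and K,
    so the N entry is unused here.
    element_bytes=4 is an assumption matching f32.
    """
    round_m, _round_n, round_k = round_dims_to
    padded_m = round_up(m, round_m)
    padded_k = round_up(k, round_k)
    return padded_m * element_bytes * padded_k


# A 5x17 f32 LHS with round_dims_to = (4, 8, 16) pads to 8x32.
print(padded_sizeof_lhs(5, 17, (4, 8, 16)))  # 8 * 4 * 32 = 1024
```

Dimensions that are already aligned are left unchanged, so a 4x16 LHS with the same `round_dims_to` keeps its original size of 256 bytes.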
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled down on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback on any communication channels!
See our website for more information.
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.