| commit | 94550280e13a69952d3934aafae9c010ec0b9d01 | |
|---|---|---|
| author | Andrzej Warzyński <andrzej.warzynski@arm.com> | Mon Nov 27 19:57:49 2023 +0000 |
| committer | GitHub <noreply@github.com> | Mon Nov 27 11:57:49 2023 -0800 |
| tree | 137f82246303797e5921bb6cba4253ce6bf8cdcd | |
| parent | aee955b92938eca4f37d3100d0ca32bc377573e9 | |
[CPU][SVE] Update default tiles sizes for matmul ops (#15650)

The default tile sizes for SVE for matmuls, [8, 32, 16], are basically a copy of one of the existing configurations. In the case of SVE, that configuration leads to overly aggressive unrolling and poor performance due to register spilling. Hence the need to update.

This patch updates the default tile sizes for SVE to [8, 16, 1]. The middle dimension corresponds to vector sizes after vectorisation (that's also the dimension that's configured to be scalable). As the base vector register size for SVE is 128 bits, there will be (depending on the element size):

* (16 x vscale) elements per vector register for i8,
* (16 / 2 x vscale) elements per vector register for i16,
* (16 / 4 x vscale) elements per vector register for i32,
* (...)

So, effectively, 16 is the lowest number that can be used to avoid under-utilisation of vector registers (i.e. a lower number might be fine for wider elements, but not for i8).

As for the remaining tile sizes, those were determined experimentally by benchmarking `linalg.matmul` for:

* square matrices (1020x1020, 1021x1021, 1024x1024),
* both i8 and f32 element types,
* input tensors with static and dynamic shapes.

In all of the above cases, the new configuration improves performance.

It's probably worth pointing out that during compilation, IREE will reduce the leading tile size from 8 to 6. So, effectively, the tile sizes will be [6, 16, 1].
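
The per-register element counts quoted in the commit message follow from dividing the 128-bit base register width by the element width. The arithmetic can be sketched in Python as below. This is only an illustration: the names `SVE_BASE_VECTOR_BITS`, `elements_per_register` and `min_full_width_tile` are hypothetical and are not part of IREE.

```python
# Base (minimum) SVE vector register width in bits, as stated in the commit message.
SVE_BASE_VECTOR_BITS = 128


def elements_per_register(element_bits: int, vscale: int = 1) -> int:
    """Return how many elements of `element_bits` fit in one SVE register.

    `vscale` is the runtime multiplier of the base register width
    (vscale = 1 corresponds to a 128-bit implementation).
    """
    if element_bits <= 0 or SVE_BASE_VECTOR_BITS % element_bits != 0:
        raise ValueError(f"unsupported element width: {element_bits}")
    if vscale < 1:
        raise ValueError(f"vscale must be >= 1, got {vscale}")
    return (SVE_BASE_VECTOR_BITS // element_bits) * vscale


def min_full_width_tile(element_bit_widths) -> int:
    """Smallest base tile size (the factor multiplied by vscale) that fills
    a whole register for every element width in `element_bit_widths`."""
    return max(elements_per_register(bits) for bits in element_bit_widths)


# Reproduce the counts listed in the commit message for vscale = 1.
for name, bits in [("i8", 8), ("i16", 16), ("i32", 32)]:
    print(f"{name}: {elements_per_register(bits)} x vscale elements per register")

# The narrowest element type (i8) dictates the tile size.
print(min_full_width_tile([8, 16, 32]))  # 16
```

Under these assumptions the narrowest element type, i8, determines the result, which is consistent with the choice of 16 for the scalable middle tile dimension.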
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback through any of our communication channels!
See our website for more information.
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.