| commit | da030739b320c3549fadb298fc40b12bf411c0d3 | [log] [tgz] |
|---|---|---|
| author | Ben Vanik &lt;ben.vanik@gmail.com&gt; | Wed Nov 09 01:03:36 2022 +0000 |
| committer | GitHub &lt;noreply@github.com&gt; | Wed Nov 09 01:03:36 2022 +0000 |
| tree | 00a252f445683a095359ddb19e79abdc88b4f390 | |
| parent | e4369610e7d7574361bfac1c460fb110f46cbc3a | [diff] |
Changing default bytecode dispatch away from computed goto. (#11090)

Computed goto is a fairly marginal benefit for very heavy scalar VMVX workloads (~10-20%) but adds 20-30KB to the binary size, and most users care more about binary size. The code path is left in place so we can turn it back on if needed as workloads change.

```
ben@noxa-pc:/mnt/d/Dev/iree$ ../iree-build-wsl/runtime/src/iree/vm/bytecode_module_benchmark_switch --benchmark_min_time=10
2022-11-08T15:25:08-08:00
Running ../iree-build-wsl/runtime/src/iree/vm/bytecode_module_benchmark_switch
Run on (64 X 3700 MHz CPU s)
Load Average: 0.52, 0.58, 0.59
---------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
---------------------------------------------------------------------------------
BM_ModuleCreate                               1665 ns         1664 ns      8452830
BM_ModuleCreateState                          67.7 ns         67.7 ns    206451613
BM_FullModuleInit                             1698 ns         1699 ns      8296296
BM_EmptyFuncReference                         1.19 ns         1.19 ns   1000000000
BM_EmptyFuncBytecode                          55.2 ns         55.2 ns    256000000
BM_CallInternalFuncReference                  1.53 ns         1.53 ns   1000000000
BM_CallInternalFuncBytecode                   43.0 ns         43.0 ns    324637700
BM_CallImportedFuncBytecode                   48.0 ns         48.0 ns    291856680
BM_LoopSumReference/100000                    2.18 ns         2.19 ns   1000000000
BM_LoopSumBytecode/100000                     5.27 ns         5.27 ns   1000000000
BM_BufferReduceReference/100000               2.09 ns         2.09 ns   1000000000
BM_BufferReduceBytecode/100000                14.2 ns         14.2 ns    984700000
BM_BufferReduceBytecodeUnrolled/100000        12.6 ns         12.6 ns   1000000000
```

```
ben@noxa-pc:/mnt/d/Dev/iree$ ../iree-build-wsl/runtime/src/iree/vm/bytecode_module_benchmark_goto --benchmark_min_time=10
2022-11-08T15:28:05-08:00
Running ../iree-build-wsl/runtime/src/iree/vm/bytecode_module_benchmark_goto
Run on (64 X 3700 MHz CPU s)
Load Average: 0.52, 0.58, 0.59
---------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
---------------------------------------------------------------------------------
BM_ModuleCreate                               1638 ns         1638 ns      8615385
BM_ModuleCreateState                          47.0 ns         47.0 ns    297674419
BM_FullModuleInit                             1677 ns         1677 ns      8373832
BM_EmptyFuncReference                         1.19 ns         1.19 ns   1000000000
BM_EmptyFuncBytecode                          54.0 ns         54.0 ns    258213256
BM_CallInternalFuncReference                  1.53 ns         1.53 ns   1000000000
BM_CallInternalFuncBytecode                   41.9 ns         41.9 ns    334328360
BM_CallImportedFuncBytecode                   48.0 ns         48.0 ns    294252880
BM_LoopSumReference/100000                    2.19 ns         2.19 ns   1000000000
BM_LoopSumBytecode/100000                     4.64 ns         4.64 ns   1000000000
BM_BufferReduceReference/100000               2.09 ns         2.09 ns   1000000000
BM_BufferReduceBytecode/100000                10.1 ns         10.1 ns   1000000000
BM_BufferReduceBytecodeUnrolled/100000        10.1 ns         10.1 ns   1000000000
```
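For readers unfamiliar with the trade-off being measured, here is a minimal sketch contrasting the two dispatch strategies. This is not IREE's actual dispatch code; the `tiny_vm_run_*` functions and `OP_*` opcodes are hypothetical and exist only to illustrate a plain C `switch` loop (the new default) versus GCC/Clang computed goto (the code path kept around for opt-in use).

```c
// Illustrative sketch only -- not IREE's bytecode dispatcher.
// Shows switch-based dispatch vs. computed-goto dispatch for a toy stack VM.
#include <stdint.h>
#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

// Switch-based dispatch: one shared dispatch branch per loop iteration.
// Smaller code, at the cost of funneling every opcode through the same branch.
static void tiny_vm_run_switch(const uint8_t* pc) {
  int64_t stack[16];
  int sp = 0;
  for (;;) {
    switch (*pc++) {
      case OP_PUSH:  stack[sp++] = *pc++; break;
      case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
      case OP_PRINT: printf("%lld\n", (long long)stack[sp - 1]); break;
      case OP_HALT:  return;
    }
  }
}

#if defined(__GNUC__) || defined(__clang__)
// Computed-goto dispatch (GCC/Clang extension): each handler ends with its own
// indirect jump through a label table, duplicating the dispatch branch per
// opcode. That tends to help heavily interpreted scalar workloads but grows
// the dispatch code -- the size-vs-speed trade-off the commit message cites.
static void tiny_vm_run_goto(const uint8_t* pc) {
  static void* kDispatch[] = {&&l_push, &&l_add, &&l_print, &&l_halt};
  int64_t stack[16];
  int sp = 0;
#define NEXT() goto *kDispatch[*pc++]
  NEXT();
l_push:  stack[sp++] = *pc++; NEXT();
l_add:   sp--; stack[sp - 1] += stack[sp]; NEXT();
l_print: printf("%lld\n", (long long)stack[sp - 1]); NEXT();
l_halt:  return;
#undef NEXT
}
#endif  // __GNUC__ || __clang__

int main(void) {
  // Bytecode for: push 2, push 3, add, print, halt.
  const uint8_t program[] = {OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT};
  tiny_vm_run_switch(program);
#if defined(__GNUC__) || defined(__clang__)
  tiny_vm_run_goto(program);
#endif
  return 0;
}
```

Both loops compute the same result; the difference is purely in how the next opcode handler is reached, which is why the benchmark deltas above show up mainly in the interpretation-heavy `BM_LoopSum*` and `BM_BufferReduce*` cases.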
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback through any of our communication channels!
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.