tree 10e29ff4ca6604b4bcf79f0ba06ab1ac06e37f93
parent ef0f1a405783c9cf673487a61fda8fca8e83e8c1
author bjacob <benoitjacob@google.com> 1699977489 -0500
committer GitHub <noreply@github.com> 1699977489 -0500
gpgsig -----BEGIN PGP SIGNATURE-----
 
 wsBcBAABCAAQBQJlU5kRCRBK7hj4Ov3rIwAAUhQIAEOQlS4zZQJMnYZDiunVZfZ0
 LITUx8SIqxLfJOOiTuCxg3QH8mqVQUnhLF1qTHL04dr6XbwQB6F8HLgbA6cBLHh+
 stgskbIQlpYOc4G8PVz9LmLMOG2rZsaAHkv7A6xXcuExvPr5Mb31qOFu1V2S6VBk
 EddhZs9EMfpYtOH3OjC+LZYuZgmcy0ajcGuQZyDXNO/8K1fiVwDMTHefVoHG44Rk
 U7jfTr87A+zhq7HorpHLfOY9ZnxTI/J4Lil65KouWiqfrVIfIhnuHU9pczBGNd+z
 K2HNyiIaKqdRws7GGfK+gmdj3bCv/5Uw28W0V8E9uKXU2q3ILGpznJ5zRT2pJsI=
 =fxxF
 -----END PGP SIGNATURE-----
 

ukernel test improvements (#15542)

* Consistently compare with/without skipping of intermediate roundings.
A catch is that the ukernel may fall back to a generic code path (and
the test consistently exercises that fallback, even when a non-fallback
path is also available and tested). The generic code paths ("tile
functions") never skipped intermediate roundings, even when the flag
allowed it, so the test code had to retry on error, which was
complicated. This PR simply adds generic tile functions that skip
intermediate roundings, which makes the test code simpler. Concretely,
I needed that for #15543, where I'm adding bf16-accumulator tile
functions that skip intermediate roundings.
* I also had to update `iree-e2e-matmul-test` to switch to skipping
intermediate roundings. Unlike the ukernels' own tests, which really
must test both flavors, `iree-e2e-matmul-test` tests end-to-end what
the compiler produces, and that skips intermediate roundings, at least
by default. That could be overridden with
`--iree-llvmcpu-skip-intermediate-roundings=false`, but we don't
currently test that in e2e matmul tests.
* Generate better random test input values. Some were too large: when
we generate random bfloat16 values to accumulate into bfloat16, they had
better be very small, as we don't want accumulators to grow to the point
where they would start rounding. That's OK because bfloat16 kernels use
bfloat16 arithmetic instructions, not bit hacks, so correctness is
sufficiently tested on very small values. Conversely, for int8/int16
test inputs we were generating a very narrow range, potentially missing
important coverage as some of our int kernels are starting to do
evil bit hacks (#15525).
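
For illustration only, here is a minimal Python sketch of the two flavors
compared above, assuming "with intermediate roundings" means rounding the
accumulator back to bfloat16 after every multiply-add, and "skipping" means
accumulating in wider precision and rounding once at the end. The helper
names are hypothetical and are not the ukernel's actual tile functions;
NaN/Inf handling is omitted, and Python floats stand in for the wider
accumulator.

```python
import struct


def round_to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even), as a float.

    Sketch only: finite inputs assumed, no NaN/Inf special-casing.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest even on the 16 low bits that bfloat16 drops.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot_with_intermediate_roundings(lhs, rhs, acc):
    # The accumulator is rounded back to bfloat16 after every step.
    for a, b in zip(lhs, rhs):
        acc = round_to_bf16(acc + a * b)
    return acc


def dot_skipping_intermediate_roundings(lhs, rhs, acc):
    # Accumulate in wider precision, round to bfloat16 once at the end.
    wide = acc
    for a, b in zip(lhs, rhs):
        wide += a * b
    return round_to_bf16(wide)


ones = [1.0] * 4

# Very small values: the accumulator never needs rounding, both flavors agree.
small_with = dot_with_intermediate_roundings(ones, ones, 0.0)
small_skip = dot_skipping_intermediate_roundings(ones, ones, 0.0)

# Large accumulator: 257 is not representable in bfloat16, so the two
# flavors diverge once the accumulator starts rounding.
large_with = dot_with_intermediate_roundings(ones, ones, 256.0)
large_skip = dot_skipping_intermediate_roundings(ones, ones, 256.0)
print(small_with, small_skip, large_with, large_skip)
```

With very small inputs both flavors agree exactly, which is why small
random bfloat16 values keep the test comparison simple. Once the
accumulator reaches 256, each `+ 1.0` is rounded away in the first flavor
but survives in the second.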