Disable external test suite on ROCm while flaky. (#16705)
Seeing flaky failures in different tests each time, like these:
*
https://github.com/openxla/iree/actions/runs/8195423846/job/22413939662#step:9:47
*
https://github.com/openxla/iree/actions/runs/8195232491/job/22413293699?pr=16703#step:9:47
*
https://github.com/openxla/iree/actions/runs/8196441429/job/22416930820?pr=16704#step:9:47
Reducing parallelism helped with somewhat similar errors before, but I
suspect there are some real issues in either the compiler, runtime, ROCm
driver, or runner machine. Probably need to debug the code/machine more
directly.
diff --git a/.github/workflows/pkgci_regression_test_amdgpu_rocm.yml b/.github/workflows/pkgci_regression_test_amdgpu_rocm.yml
index 729b2e4..44de820 100644
--- a/.github/workflows/pkgci_regression_test_amdgpu_rocm.yml
+++ b/.github/workflows/pkgci_regression_test_amdgpu_rocm.yml
@@ -53,22 +53,24 @@
-rA -s -m "plat_rdna3_rocm and presubmit" \
experimental/regression_suite
- # Out of tree tests
- - name: Checking out external TestSuite repository
- uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3 # v3.5.0
- with:
- repository: nod-ai/SHARK-TestSuite
- ref: 00053290a576ae39aecb6fb2757b58fcd3b143f2
- path: SHARK-TestSuite
- submodules: false
- - name: Installing external TestSuite Python requirements
- run: |
- source ${VENV_DIR}/bin/activate
- python -m pip install -r SHARK-TestSuite/iree_tests/requirements.txt
- - name: Run external TestSuite tests
- env:
- IREE_TEST_CONFIG_FILES: experimental/regression_suite/external_test_suite/config_gpu_rocm_rdna3.json
- run: |
- source ${VENV_DIR}/bin/activate
- export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/rocm/lib:/opt/rocm/hip/lib
- pytest SHARK-TestSuite/iree_tests -n 4 -rpfE --timeout=60
+ # Disabled while flaky. Might be replaced with HIP tests instead?
+
+ # # Out of tree tests
+ # - name: Checking out external TestSuite repository
+ # uses: actions/checkout@8f4b7f84864484a7bf31766abe9204da3cbe65b3 # v3.5.0
+ # with:
+ # repository: nod-ai/SHARK-TestSuite
+ # ref: 00053290a576ae39aecb6fb2757b58fcd3b143f2
+ # path: SHARK-TestSuite
+ # submodules: false
+ # - name: Installing external TestSuite Python requirements
+ # run: |
+ # source ${VENV_DIR}/bin/activate
+ # python -m pip install -r SHARK-TestSuite/iree_tests/requirements.txt
+ # - name: Run external TestSuite tests
+ # env:
+ # IREE_TEST_CONFIG_FILES: experimental/regression_suite/external_test_suite/config_gpu_rocm_rdna3.json
+ # run: |
+ # source ${VENV_DIR}/bin/activate
+ # export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/rocm/lib:/opt/rocm/hip/lib
+ # pytest SHARK-TestSuite/iree_tests -n 4 -rpfE --timeout=60