Merge pull request #6382 from hanhanW:main-to-google

PiperOrigin-RevId: 382568990
diff --git a/README.md b/README.md
index f625236..b8067a6 100644
--- a/README.md
+++ b/README.md
@@ -35,16 +35,23 @@
 
 ## Build Status
 
-CI System | Build System  | Platform | Architecture    | Component            | Status
-:-------: | :-----------: | :------: | :-------------: | :------------------: | :----:
-Kokoro    | Bazel         | Linux    | x86             | Core                 | [![kokoro_status_bazel_linux_x86_core](https://storage.googleapis.com/iree-oss-build-badges/bazel/linux/x86-swiftshader/core/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/bazel/linux/x86-swiftshader/core/main_result.html)
-Kokoro    | CMake & Bazel | Linux    | x86-swiftshader | Integrations         | [![kokoro_status_cmake-bazel_linux_x86-swiftshader_integrations](https://storage.googleapis.com/iree-oss-build-badges/cmake_bazel/linux/x86-swiftshader/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake_bazel/linux/x86-swiftshader/main_result.html)
-Kokoro    | CMake & Bazel | Linux    | x86-turing      | Integrations         | [![kokoro_status_cmake-bazel_linux_x86-turing_integrations](https://storage.googleapis.com/iree-oss-build-badges/cmake_bazel/linux/x86-turing/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake_bazel/linux/x86-turing/main_result.html)
-Kokoro    | CMake         | Linux    | x86-swiftshader | Core + Bindings      | [![kokoro_status_cmake_linux_x86-swiftshader](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader/main_result.html)
-Kokoro    | CMake         | Linux    | x86-swiftshader-asan | Core + Bindings      | [![kokoro_status_cmake_linux_x86-swiftshader-asan](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader-asan/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader-asan/main_result.html)
-Kokoro    | CMake         | Linux    | x86-turing      | Core + Bindings      | [![kokoro_status_cmake_linux_x86-turing](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-turing/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-turing/main_result.html)
-Kokoro    | CMake         | Android  | arm64-v8a       | Runtime (build only) | [![kokoro_status_cmake_android_arm64-v8a](https://storage.googleapis.com/iree-oss-build-badges/cmake/android/arm64-v8a/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/android/arm64-v8a/main_result.html)
-BuildKite | CMake         | Android  | arm64-v8a       | Runtime              | [![buildkite-status-cmake-android-arm](https://badge.buildkite.com/a73df0ba9f4aa132650dd6676bc1e6c20d3d99ed6b24db2179.svg?branch=main)](https://buildkite.com/iree/iree-android-arm64-v8a/builds?branch=main)
+
+CI System | Build System  | Platform   | Architecture         | Configuration / Component    | Status
+:-------: | :-----------: | :--------: | :------------------: | :--------------------------: | :----:
+Kokoro    | Bazel         | Linux      | x86-64               |                              | [![kokoro_status_bazel/linux/x86-swiftshader/core](https://storage.googleapis.com/iree-oss-build-badges/bazel/linux/x86-swiftshader/core/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/bazel/linux/x86-swiftshader/core/main_result.html)
+Kokoro    | CMake & Bazel | Linux      | x86-64 (swiftshader) | Integrations                 | [![kokoro status cmake-bazel/linux/x86-swiftshader](https://storage.googleapis.com/iree-oss-build-badges/cmake-bazel/linux/x86-swiftshader/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake-bazel/linux/x86-swiftshader/main_result.html)
+Kokoro    | CMake & Bazel | Linux      | x86-64 (turing)      | Integrations                 | [![kokoro status cmake-bazel/linux/x86-turing](https://storage.googleapis.com/iree-oss-build-badges/cmake-bazel/linux/x86-turing/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake-bazel/linux/x86-turing/main_result.html)
+Kokoro    | CMake         | Linux      | x86-64 (swiftshader) |                              | [![kokoro status cmake/linux/x86-swiftshader](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader/main_result.html)
+Kokoro    | CMake         | Linux      | x86-64 (swiftshader) | asan                         | [![kokoro status cmake/linux/x86-swiftshader-asan](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader-asan/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-swiftshader-asan/main_result.html)
+Kokoro    | CMake         | Linux      | x86-64 (turing)      |                              | [![kokoro status cmake/linux/x86-turing](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-turing/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/x86-turing/main_result.html)
+Kokoro    | CMake         | Android    | arm64-v8a            | Runtime (build only)         | [![kokoro status cmake/android/arm64-v8a](https://storage.googleapis.com/iree-oss-build-badges/cmake/android/arm64-v8a/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/android/arm64-v8a/main_result.html)
+Kokoro    | CMake         | Bare Metal | risc-v-32            | Runtime                      | [![kokoro status cmake/baremetal/riscv32](https://storage.googleapis.com/iree-oss-build-badges/cmake/baremetal/riscv32/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/baremetal/riscv32/main_result.html)
+Kokoro    | CMake         | Linux      | risc-v-64            | Runtime                      | [![kokoro status cmake/linux/riscv64](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/riscv64/main_status.svg)](https://storage.googleapis.com/iree-oss-build-badges/cmake/linux/riscv64/main_result.html)
+BuildKite | CMake         | Android    | arm64-v8a            | Runtime                      | [![buildkite status iree-android-arm64-v8a](https://badge.buildkite.com/a73df0ba9f4aa132650dd6676bc1e6c20d3d99ed6b24db2179.svg?branch=main)](https://buildkite.com/iree/iree-android-arm64-v8a)
+BuildKite | CMake         | Android    | arm64-v8a            | Runtime Benchmarks (Mako)    | [![buildkite status iree-android-benchmark](https://badge.buildkite.com/fd75cea669ec0614d810fefdcbe4a088cb6c312b4729023254.svg?branch=main)](https://buildkite.com/iree/iree-android-benchmark)
+BuildKite | CMake         | Android    | arm64-v8a            | Runtime Benchmarks           | [![buildkite status iree-benchmark](https://badge.buildkite.com/62e504b93171f4a19e5c46f8b9a99eb5dba050666640fbc21b.svg?branch=main)](https://buildkite.com/iree/iree-benchmark)
+BuildKite | CMake         | Linux      | x86-64               | Tracing + Standalone Runtime | [![buildkite status iree-build-configurations](https://badge.buildkite.com/3bc03ad54a6b785b3fdd0dd3d67fd93ed22ef2b538cb34adc3.svg?branch=main)](https://buildkite.com/iree/iree-build-configurations)
+
 
 ## Presentations and Talks
 
diff --git a/build_tools/cmake/build_riscv.sh b/build_tools/cmake/build_riscv.sh
index 138483f..f890c39 100755
--- a/build_tools/cmake/build_riscv.sh
+++ b/build_tools/cmake/build_riscv.sh
@@ -71,7 +71,7 @@
 
 if [[ "${RISCV_CONFIG?}" == "rv64" ]]; then
   args+=(
-    -DRISCV_TOOLCHAIN_ROOT="${RISCV_TOOLCHAIN_ROOT?}"
+    -DRISCV_TOOLCHAIN_ROOT="${RISCV_RV64_LINUX_TOOLCHAIN_ROOT?}"
   )
 elif [[ "${RISCV_CONFIG?}" == "rv32-baremetal" ]]; then
   args+=(
diff --git a/build_tools/docker/cmake-riscv/Dockerfile b/build_tools/docker/cmake-riscv/Dockerfile
index d845c0d..3d07dd8 100644
--- a/build_tools/docker/cmake-riscv/Dockerfile
+++ b/build_tools/docker/cmake-riscv/Dockerfile
@@ -19,5 +19,7 @@
 COPY --from=install-riscv "/usr/src/toolchain_iree" "/usr/src/toolchain_iree"
 COPY --from=install-riscv "/usr/src/toolchain_iree_rv32imf" "/usr/src/toolchain_iree_rv32imf"
 COPY --from=install-riscv "/usr/src/qemu-riscv" "/usr/src/qemu-riscv"
-ENV RISCV_TOOLCHAIN_ROOT="/usr/src/toolchain_iree"
+ENV RISCV_RV64_LINUX_TOOLCHAIN_ROOT="/usr/src/toolchain_iree"
 ENV RISCV_RV32_NEWLIB_TOOLCHAIN_ROOT="/usr/src/toolchain_iree_rv32imf"
+ENV QEMU_RV64_BIN="/usr/src/qemu-riscv/qemu-riscv64"
+ENV QEMU_RV32_BIN="/usr/src/qemu-riscv/qemu-riscv32"
diff --git a/build_tools/docker/prod_digests.txt b/build_tools/docker/prod_digests.txt
index 6920797..f5e1f59 100644
--- a/build_tools/docker/prod_digests.txt
+++ b/build_tools/docker/prod_digests.txt
@@ -14,6 +14,6 @@
 gcr.io/iree-oss/cmake-bazel-frontends-vulkan@sha256:e99fd07a48e2b1a00200b3b600ff00878d413045cb7809fe73dac4c36fa4825a
 gcr.io/iree-oss/cmake-bazel-frontends-nvidia@sha256:71eeb44ba014ee043ae2adeeb6458bc281444ee6f295b5ba7e4337a69a95f7df
 gcr.io/iree-oss/cmake-bazel-frontends-swiftshader@sha256:4e018bd74c630f89f86b700a47b6a6792c8f97e337870af69a000e578a3ca688
-gcr.io/iree-oss/cmake-riscv@sha256:2ec67bdb5094323e1df45b88c088aeda435655a6543b6ee42db4a41adde3048d
+gcr.io/iree-oss/cmake-riscv@sha256:95489593bc9b0cd325ce9c1a32b47389c01b174a5b8190a16d937d2e8828d384
 gcr.io/iree-oss/cmake-bazel-frontends-android@sha256:1392e3a27cddbdc597817168fb61e125bbdcbfd9076eff9d70bd8012b0a0c5ba
 gcr.io/iree-oss/samples@sha256:be5465585706b620d6c722caa6237eafdfaa8dd11ce20db0981b979f2d3387b3
diff --git a/build_tools/kokoro/gcp_ubuntu/cmake-bazel/linux/x86-swiftshader/build.sh b/build_tools/kokoro/gcp_ubuntu/cmake-bazel/linux/x86-swiftshader/build.sh
index 054ef2c..f74194c 100755
--- a/build_tools/kokoro/gcp_ubuntu/cmake-bazel/linux/x86-swiftshader/build.sh
+++ b/build_tools/kokoro/gcp_ubuntu/cmake-bazel/linux/x86-swiftshader/build.sh
@@ -31,11 +31,14 @@
 echo "Initializing submodules"
 ./scripts/git/submodule_versions.py init
 
-# BUILD the integrations binaries with Bazel
+# BUILD the integrations binaries with Bazel and run any lit tests
 pushd integrations/tensorflow
 BAZEL_CMD=(bazel --noworkspace_rc --bazelrc=build_tools/bazel/iree-tf.bazelrc)
 BAZEL_BINDIR="$(${BAZEL_CMD[@]?} info bazel-bin)"
-"${BAZEL_CMD[@]?}" build --config=generic_clang //iree_tf_compiler:all
+"${BAZEL_CMD[@]?}" query //iree_tf_compiler/... | \
+   xargs "${BAZEL_CMD[@]?}" test --config=generic_clang \
+      --test_tag_filters="-nokokoro" \
+      --build_tag_filters="-nokokoro"
 popd
 
 CMAKE_BUILD_DIR="$HOME/iree/build/tf"
diff --git a/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/build_kokoro.sh b/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/build_kokoro.sh
index f724e70..5bc9f57 100755
--- a/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/build_kokoro.sh
+++ b/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/build_kokoro.sh
@@ -24,7 +24,7 @@
 docker_setup
 
 docker run "${DOCKER_RUN_ARGS[@]?}" \
-  gcr.io/iree-oss/cmake-riscv@sha256:2ec67bdb5094323e1df45b88c088aeda435655a6543b6ee42db4a41adde3048d \
+  gcr.io/iree-oss/cmake-riscv@sha256:95489593bc9b0cd325ce9c1a32b47389c01b174a5b8190a16d937d2e8828d384 \
   build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/build.sh
 
 # Kokoro will rsync this entire directory back to the executor orchestrating the
diff --git a/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/test.sh b/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/test.sh
index 0dfeedd..5f87dd8 100755
--- a/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/test.sh
+++ b/build_tools/kokoro/gcp_ubuntu/cmake/baremetal/riscv32/test.sh
@@ -14,22 +14,20 @@
 # Print the UTC time when set -x is on
 export PS4='[$(date -u "+%T %Z")] '
 
-# Docker image has the QEMU installed at /usr/src/qemu-riscv.
 # Run the embedded_library module loader and simple_embedding under QEMU.
-
 echo "Test elf_module_test_binary"
 pushd "${BUILD_RISCV_DIR?}/iree/hal/local/elf" > /dev/null
-/usr/src/qemu-riscv/qemu-riscv32 -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV32_BIN?}" -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 elf_module_test_binary
 popd > /dev/null
 
 echo "Test simple_embedding binaries"
 pushd "${BUILD_RISCV_DIR?}/iree/samples/simple_embedding" > /dev/null
 
-/usr/src/qemu-riscv/qemu-riscv32 -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV32_BIN?}" -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 simple_embedding_embedded_sync
 
-/usr/src/qemu-riscv/qemu-riscv32 -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV32_BIN?}" -cpu rv32,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 simple_embedding_vmvx_sync
 
 popd > /dev/null
diff --git a/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/build_kokoro.sh b/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/build_kokoro.sh
index d0ae447..3ee0877 100755
--- a/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/build_kokoro.sh
+++ b/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/build_kokoro.sh
@@ -24,7 +24,7 @@
 docker_setup
 
 docker run "${DOCKER_RUN_ARGS[@]?}" \
-  gcr.io/iree-oss/cmake-riscv@sha256:2ec67bdb5094323e1df45b88c088aeda435655a6543b6ee42db4a41adde3048d \
+  gcr.io/iree-oss/cmake-riscv@sha256:95489593bc9b0cd325ce9c1a32b47389c01b174a5b8190a16d937d2e8828d384 \
   build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/build.sh
 
 # Kokoro will rsync this entire directory back to the executor orchestrating the
diff --git a/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/test.sh b/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/test.sh
index 8e68788..2925954 100755
--- a/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/test.sh
+++ b/build_tools/kokoro/gcp_ubuntu/cmake/linux/riscv64/test.sh
@@ -14,19 +14,21 @@
 # Print the UTC time when set -x is on
 export PS4='[$(date -u "+%T %Z")] '
 
-# Docker image has the QEMU installed at /usr/src/qemu-riscv.
-# Run the simple_embedding binaries under QEMU.
+# Environment variable used by the emulator and iree-translate for the
+# dylib-llvm-aot bytecode codegen.
+export RISCV_TOOLCHAIN_ROOT=${RISCV_RV64_LINUX_TOOLCHAIN_ROOT?}
 
+# Run the binaries under QEMU.
 echo "Test simple_embedding binaries"
 pushd "${BUILD_RISCV_DIR?}/iree/samples/simple_embedding" > /dev/null
 
-/usr/src/qemu-riscv/qemu-riscv64 -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV64_BIN?}" -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 -L "${RISCV_TOOLCHAIN_ROOT?}/sysroot" simple_embedding_dylib
 
-/usr/src/qemu-riscv/qemu-riscv64 -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV64_BIN?}" -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 -L "${RISCV_TOOLCHAIN_ROOT?}/sysroot" simple_embedding_embedded_sync
 
-/usr/src/qemu-riscv/qemu-riscv64 -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+"${QEMU_RV64_BIN?}" -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
 -L "${RISCV_TOOLCHAIN_ROOT?}/sysroot" simple_embedding_vmvx_sync
 
 popd > /dev/null
@@ -43,7 +45,7 @@
   "${ROOT_DIR?}/iree/tools/test/iree-run-module.mlir" \
   -o "${BUILD_RISCV_DIR?}/iree-run-module-llvm_aot.vmfb"
 
-IREE_RUN_OUT=$(/usr/src/qemu-riscv/qemu-riscv64 -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
+IREE_RUN_OUT=$("${QEMU_RV64_BIN?}" -cpu rv64,x-v=true,x-k=true,vlen=256,elen=64,vext_spec=v1.0 \
     -L "${RISCV_TOOLCHAIN_ROOT?}/sysroot" \
     "${BUILD_RISCV_DIR?}/iree/tools/iree-run-module" --driver=dylib \
     --module_file="${BUILD_RISCV_DIR?}/iree-run-module-llvm_aot.vmfb" \
diff --git a/docs/developers/design_docs/cuda_backend.md b/docs/developers/design_docs/cuda_backend.md
index ae05fff..be8bc04 100644
--- a/docs/developers/design_docs/cuda_backend.md
+++ b/docs/developers/design_docs/cuda_backend.md
@@ -59,10 +59,9 @@
 
 ### IREE flow
 
-IREE's [`target independent codegen`][codegen-passes] converts the compiler input to Linalg on Tensors. Afterward IREE will call the LinalgToNVVM codegen passes.
-Note that IREE had a legacy mode generating Linalg on Buffers. It is not supported by this path.
+IREE's [`target independent codegen`][codegen-passes] converts the compiler input to Linalg on Tensors. Afterward, IREE calls the LinalgToLLVMGPU codegen passes.
 
-Once we get into LinalgToNNVM passes we first do bufferize to generate Linalg on Buffers. Then we apply MLIR generic passes to  convert linalg to SCF dialect and then SCF to Standard dialect. After that we convert Standard dialect to LLVM+NVVM dialect.
+Once we get into the LinalgToLLVMGPU passes, we first bufferize to generate Linalg on Buffers. Then we apply generic MLIR passes to convert Linalg to the SCF dialect and then SCF to the Standard dialect. After that, we convert the Standard dialect to the LLVM+NVVM dialect.
 
 ## Example
 
@@ -75,7 +74,7 @@
 ```
 
 ```shell
-# First translate into a VM bytecode module using linalg on tensors path.
+# First translate into a VM bytecode module.
 $ ../iree-build/iree/tools/iree-translate \
  -iree-input-type=mhlo \
  -iree-mlir-to-vm-bytecode-module \
diff --git a/integrations/tensorflow/build_tools/overlay/mlir-hlo/BUILD.bazel b/integrations/tensorflow/build_tools/overlay/mlir-hlo/BUILD.bazel
index cd2a36f..afeede3 100644
--- a/integrations/tensorflow/build_tools/overlay/mlir-hlo/BUILD.bazel
+++ b/integrations/tensorflow/build_tools/overlay/mlir-hlo/BUILD.bazel
@@ -27,6 +27,7 @@
         "legalize_gather_to_torch_index_select",
         "legalize_to_linalg",
         "lhlo",
+        "map_lmhlo_to_scalar_op",
         "materialize_broadcasts",
         "mhlo_to_mhlo_lowering_patterns",
         "unfuse_batch_norm",
diff --git a/integrations/tensorflow/iree_tf_compiler/BUILD b/integrations/tensorflow/iree_tf_compiler/BUILD
index 68533e9..4e94220 100644
--- a/integrations/tensorflow/iree_tf_compiler/BUILD
+++ b/integrations/tensorflow/iree_tf_compiler/BUILD
@@ -36,6 +36,9 @@
         "@iree//iree/compiler/Dialect/Flow/IR",
         "@iree//iree/compiler/Dialect/HAL/IR",
         "@iree//iree/compiler/Dialect/IREE/IR",
+        "@iree//iree/compiler/InputConversion/Common",
+        "@iree//iree/compiler/InputConversion/MHLO",
+        "@iree//iree/compiler/InputConversion/TOSA",
         "@iree//iree/tools:init_xla_dialects",
         "@llvm-project//llvm:Support",
         "@llvm-project//mlir:IR",
diff --git a/integrations/tensorflow/iree_tf_compiler/MHLO/test/CMakeLists.txt b/integrations/tensorflow/iree_tf_compiler/MHLO/test/CMakeLists.txt
deleted file mode 100644
index b5b4752..0000000
--- a/integrations/tensorflow/iree_tf_compiler/MHLO/test/CMakeLists.txt
+++ /dev/null
@@ -1,18 +0,0 @@
-# Copyright 2021 The IREE Authors
-#
-# Licensed under the Apache License v2.0 with LLVM Exceptions.
-# See https://llvm.org/LICENSE.txt for license information.
-# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-
-iree_add_all_subdirs()
-
-file(GLOB _GLOB_X_MLIR LIST_DIRECTORIES false RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} CONFIGURE_DEPENDS *.mlir)
-iree_lit_test_suite(
-  NAME
-    lit
-  SRCS
-    "${_GLOB_X_MLIR}"
-  DATA
-    iree::tools::IreeFileCheck
-    iree_tf_compiler_iree-tf-opt
-)
diff --git a/integrations/tensorflow/iree_tf_compiler/TF/test/CMakeLists.txt b/integrations/tensorflow/iree_tf_compiler/TF/test/CMakeLists.txt
deleted file mode 100644
index 8bb7325..0000000
--- a/integrations/tensorflow/iree_tf_compiler/TF/test/CMakeLists.txt
+++ /dev/null
@@ -1,18 +0,0 @@
-# Copyright 2020 The IREE Authors
-#
-# Licensed under the Apache License v2.0 with LLVM Exceptions.
-# See https://llvm.org/LICENSE.txt for license information.
-# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-
-iree_add_all_subdirs()
-
-file(GLOB _GLOB_X_MLIR LIST_DIRECTORIES false RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} CONFIGURE_DEPENDS *.mlir)
-iree_lit_test_suite(
-  NAME
-    lit
-  SRCS
-    "${_GLOB_X_MLIR}"
-  DATA
-    iree::tools::IreeFileCheck
-    iree_tf_compiler_iree-tf-opt
-)
diff --git a/integrations/tensorflow/iree_tf_compiler/TFL/test/CMakeLists.txt b/integrations/tensorflow/iree_tf_compiler/TFL/test/CMakeLists.txt
deleted file mode 100644
index defb904..0000000
--- a/integrations/tensorflow/iree_tf_compiler/TFL/test/CMakeLists.txt
+++ /dev/null
@@ -1,18 +0,0 @@
-# Copyright 2021 The IREE Authors
-#
-# Licensed under the Apache License v2.0 with LLVM Exceptions.
-# See https://llvm.org/LICENSE.txt for license information.
-# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-
-iree_add_all_subdirs()
-
-file(GLOB _GLOB_X_MLIR LIST_DIRECTORIES false RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} CONFIGURE_DEPENDS *.mlir)
-iree_lit_test_suite(
-  NAME
-    lit
-  SRCS
-    "${_GLOB_X_MLIR}"
-  DATA
-    iree::tools::IreeFileCheck
-    iree_tf_compiler_iree-opt-tflite
-)
diff --git a/integrations/tensorflow/iree_tf_compiler/TFL/test/import/BUILD b/integrations/tensorflow/iree_tf_compiler/TFL/test/import/BUILD
index 4f9a349..b8ab690 100644
--- a/integrations/tensorflow/iree_tf_compiler/TFL/test/import/BUILD
+++ b/integrations/tensorflow/iree_tf_compiler/TFL/test/import/BUILD
@@ -27,9 +27,10 @@
         "@iree//iree/tools:IreeFileCheck",
     ],
     driver = "@iree//iree/tools:run_lit.sh",
-    # TODO: These tests are failing internally and not running in OSS.
+    # TODO: These tests have never passed.
     tags = [
         "manual",
+        "nokokoro",
         "notap",
     ],
 )
diff --git a/integrations/tensorflow/iree_tf_compiler/TFL/test/import/CMakeLists.txt b/integrations/tensorflow/iree_tf_compiler/TFL/test/import/CMakeLists.txt
deleted file mode 100644
index 9fa5e4c..0000000
--- a/integrations/tensorflow/iree_tf_compiler/TFL/test/import/CMakeLists.txt
+++ /dev/null
@@ -1,20 +0,0 @@
-# Copyright 2021 The IREE Authors
-#
-# Licensed under the Apache License v2.0 with LLVM Exceptions.
-# See https://llvm.org/LICENSE.txt for license information.
-# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-
-iree_add_all_subdirs()
-
-file(GLOB _GLOB_X_MLIR LIST_DIRECTORIES false RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} CONFIGURE_DEPENDS *.mlir)
-file(GLOB _GLOB_X_TFLITE LIST_DIRECTORIES false RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} CONFIGURE_DEPENDS *.tflite)
-iree_lit_test_suite(
-  NAME
-    lit
-  SRCS
-    "${_GLOB_X_MLIR}"
-  DATA
-    "${_GLOB_X_TFLITE}"
-    iree::tools::IreeFileCheck
-    iree_tf_compiler_iree-import-tflite
-)
diff --git a/integrations/tensorflow/iree_tf_compiler/iree-tf-opt-main.cpp b/integrations/tensorflow/iree_tf_compiler/iree-tf-opt-main.cpp
index 014a2ca..3c79799 100644
--- a/integrations/tensorflow/iree_tf_compiler/iree-tf-opt-main.cpp
+++ b/integrations/tensorflow/iree_tf_compiler/iree-tf-opt-main.cpp
@@ -13,6 +13,9 @@
 #include "iree/compiler/Dialect/Flow/IR/FlowDialect.h"
 #include "iree/compiler/Dialect/HAL/IR/HALDialect.h"
 #include "iree/compiler/Dialect/IREE/IR/IREEDialect.h"
+#include "iree/compiler/InputConversion/Common/Passes.h"
+#include "iree/compiler/InputConversion/MHLO/Passes.h"
+#include "iree/compiler/InputConversion/TOSA/Passes.h"
 #include "iree/tools/init_xla_dialects.h"
 #include "iree_tf_compiler/MHLO/Passes.h"
 #include "iree_tf_compiler/TF/Passes.h"
@@ -33,6 +36,12 @@
                   mlir::iree_compiler::IREE::HAL::HALDialect,
                   mlir::iree_compiler::IREEDialect>();
 
+  // Select IREE input passes.
+  mlir::iree_compiler::registerCommonInputConversionPasses();
+  mlir::iree_compiler::registerMHLOConversionPasses();
+  mlir::iree_compiler::registerTOSAConversionPasses();
+
+  // TensorFlow integration passes.
   mlir::RegisterAllTensorFlowDialects(registry);
   mlir::iree_integrations::TF::registerAllPasses();
   mlir::iree_integrations::MHLO::registerAllPasses();
diff --git a/iree/compiler/Codegen/Common/BUILD b/iree/compiler/Codegen/Common/BUILD
index afd1b47..6e6b5c7 100644
--- a/iree/compiler/Codegen/Common/BUILD
+++ b/iree/compiler/Codegen/Common/BUILD
@@ -50,6 +50,7 @@
         "//iree/compiler/Dialect/Flow/IR",
         "//iree/compiler/Dialect/HAL/IR",
         "//iree/compiler/Dialect/IREE/IR",
+        "//iree/compiler/Dialect/LinalgExt/IR",
         "//iree/compiler/Dialect/Shape/IR",
         "@llvm-project//llvm:Support",
         "@llvm-project//mlir:Affine",
diff --git a/iree/compiler/Codegen/Common/CMakeLists.txt b/iree/compiler/Codegen/Common/CMakeLists.txt
index ae91a28..8466201 100644
--- a/iree/compiler/Codegen/Common/CMakeLists.txt
+++ b/iree/compiler/Codegen/Common/CMakeLists.txt
@@ -58,6 +58,7 @@
     iree::compiler::Dialect::Flow::IR
     iree::compiler::Dialect::HAL::IR
     iree::compiler::Dialect::IREE::IR
+    iree::compiler::Dialect::LinalgExt::IR
     iree::compiler::Dialect::Shape::IR
   PUBLIC
 )
diff --git a/iree/compiler/Codegen/Common/LinalgBufferizePass.cpp b/iree/compiler/Codegen/Common/LinalgBufferizePass.cpp
index 7a2d0e0..ad130d0 100644
--- a/iree/compiler/Codegen/Common/LinalgBufferizePass.cpp
+++ b/iree/compiler/Codegen/Common/LinalgBufferizePass.cpp
@@ -46,6 +46,7 @@
 #include "iree/compiler/Dialect/HAL/IR/HALOps.h"
 #include "iree/compiler/Dialect/IREE/IR/IREEDialect.h"
 #include "iree/compiler/Dialect/IREE/IR/IREEOps.h"
+#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.h"
 #include "iree/compiler/Dialect/Shape/IR/ShapeOps.h"
 #include "llvm/ADT/EquivalenceClasses.h"
 #include "llvm/ADT/TypeSwitch.h"
@@ -380,6 +381,26 @@
   return tiedOperands;
 }
 
+static LogicalResult analyseLinalgExtOps(linalg_ext::LinalgExtOp op,
+                                         BufferizationPlan &plan) {
+  if (!op.hasTensorSemantics()) return success();
+  // TODO(hanchung): Revisit if we can tie together op.getOutputOperands() with
+  // the corresponding op.getInputOperands(). For now we have only a few
+  // LinalgExt ops and no use case for this, so we ignore it.
+  // Note: this is what should be done for LinalgOps, except for what is done
+  // for operand fusion today.
+  for (auto input : op.getInputOperands()) {
+    plan.insert(input->get());
+  }
+  for (auto output : op.getOutputOperands()) {
+    plan.insert(output->get());
+  }
+  for (auto result : op->getResults()) {
+    plan.insert(result);
+  }
+  return success();
+}
+
 /// Adds the corresponding `outs` and result tensors of the linalg op into the
 /// same equivalence class.
 static LogicalResult analyseLinalgOps(linalg::LinalgOp linalgOp,
@@ -580,6 +601,10 @@
         .Case<linalg::LinalgOp>([&](linalg::LinalgOp linalgOp) {
           return analyseLinalgOps(linalgOp, plan);
         })
+        .Case<linalg_ext::LinalgExtOp>(
+            [&](linalg_ext::LinalgExtOp linalgExtOp) {
+              return analyseLinalgExtOps(linalgExtOp, plan);
+            })
         .Case<linalg::TensorCollapseShapeOp, linalg::TensorExpandShapeOp>(
             [&](auto reshapeOp) {
               return analyseSingleOperandResultOp(reshapeOp.src(),
@@ -910,7 +935,8 @@
     resultBuffer =
         TypeSwitch<Operation *, Value>(op)
             .Case<scf::IfOp, scf::ForOp, linalg::LinalgOp,
-                  tensor::InsertSliceOp, vector::TransferWriteOp>(
+                  linalg_ext::LinalgExtOp, tensor::InsertSliceOp,
+                  vector::TransferWriteOp>(
                 [&](auto op) { return resultBuffer; })
             .Case<linalg::TensorCollapseShapeOp, linalg::TensorExpandShapeOp>(
                 [&](auto reshapeOp) {
@@ -1123,9 +1149,10 @@
 /// Generic conversion pattern that matches any linalg::LinalgOp. This avoids
 /// template instantiating one pattern for each linalg::LinalgOp. The method
 /// expects all operands and results have already been mapped to memrefs.
+template <typename OpTy>
 static LogicalResult convertAnyLinalgOp(
-    OpBuilder &b, linalg::LinalgOp op, BlockAndValueMapping &bvm,
-    BufferizationPlan &plan, WorkgroupMemoryAllocationFn allocationFn) {
+    OpBuilder &b, OpTy op, BlockAndValueMapping &bvm, BufferizationPlan &plan,
+    WorkgroupMemoryAllocationFn allocationFn) {
   // Skip linalg ops inserted by this pass.
   if (op.hasBufferSemantics()) return success();
 
@@ -1539,12 +1566,12 @@
           }
           return convertPadTensorOp(b, padTensorOp, bvm);
         })
-        .Case<linalg::LinalgOp>([&](linalg::LinalgOp linalgOp) {
-          if (failed(getOrAllocateResultBuffers(b, linalgOp.getOperation(), bvm,
-                                                plan, allocationFn))) {
+        .Case<linalg::LinalgOp, linalg_ext::LinalgExtOp>([&](auto op) {
+          if (failed(
+                  getOrAllocateResultBuffers(b, op, bvm, plan, allocationFn))) {
             return failure();
           }
-          return convertAnyLinalgOp(b, linalgOp, bvm, plan, allocationFn);
+          return convertAnyLinalgOp(b, op, bvm, plan, allocationFn);
         })
         .Case<tensor::InsertSliceOp>(
             [&](tensor::InsertSliceOp subTensorInsertOp) {
diff --git a/iree/compiler/Codegen/Common/test/linalg_bufferize.mlir b/iree/compiler/Codegen/Common/test/linalg_bufferize.mlir
index e21945f..07257b6 100644
--- a/iree/compiler/Codegen/Common/test/linalg_bufferize.mlir
+++ b/iree/compiler/Codegen/Common/test/linalg_bufferize.mlir
@@ -2384,3 +2384,23 @@
 //       CHECK:       scf.if
 //   CHECK-DAG:         memref.store %[[V1]], %[[INOUT]][%[[P1]]]
 //   CHECK-DAG:         memref.store %[[V2]], %[[INOUT]][%[[ARG1]]]
+
+// -----
+
+func @linalg_ext_sort_1d() {
+  %c0 = constant 0 : index
+  %0 = hal.interface.binding.subspan @io::@rw[%c0] : !flow.dispatch.tensor<readwrite:128xi32>
+  %1 = flow.dispatch.tensor.load %0, offsets = [], sizes = [], strides = [] : !flow.dispatch.tensor<readwrite:128xi32> -> tensor<128xi32>
+  %2 = linalg_ext.sort {dimension = 0 : i64} outs(%1 : tensor<128xi32>) {
+  ^bb0(%arg0: i32, %arg1: i32):  // no predecessors
+    %3 = cmpi sgt, %arg0, %arg1 : i32
+    linalg_ext.yield %3 : i1
+  } -> tensor<128xi32>
+  flow.dispatch.tensor.store %2, %0, offsets = [], sizes = [], strides = [] : tensor<128xi32> -> !flow.dispatch.tensor<readwrite:128xi32>
+  return
+}
+// CHECK-LABEL: func @linalg_ext_sort_1d()
+//   CHECK-DAG:   %[[INOUT:.+]] = hal.interface.binding.subspan @io::@rw
+//       CHECK:   linalg_ext.sort
+//  CHECK-SAME:     dimension = 0 : i64
+//  CHECK-SAME:     outs(%[[INOUT]] : memref<128xi32>)
diff --git a/iree/compiler/Dialect/Flow/IR/FlowOps.cpp b/iree/compiler/Dialect/Flow/IR/FlowOps.cpp
index ecd0b1b..6f8333c 100644
--- a/iree/compiler/Dialect/Flow/IR/FlowOps.cpp
+++ b/iree/compiler/Dialect/Flow/IR/FlowOps.cpp
@@ -638,6 +638,18 @@
 }
 
 //===----------------------------------------------------------------------===//
+// flow.dispatch.tensor.store
+//===----------------------------------------------------------------------===//
+
+void DispatchTensorStoreOp::build(OpBuilder &builder, OperationState &state,
+                                  Value value, Value target,
+                                  ArrayRef<NamedAttribute> attributes) {
+  build(builder, state, ArrayRef<Type>(), value, target, ArrayRef<Value>(),
+        ArrayRef<Value>(), ArrayRef<Value>(), builder.getI64ArrayAttr({}),
+        builder.getI64ArrayAttr({}), builder.getI64ArrayAttr({}));
+}
+
+//===----------------------------------------------------------------------===//
 // flow.dispatch.workgroups
 //===----------------------------------------------------------------------===//
 
diff --git a/iree/compiler/Dialect/Flow/IR/FlowOps.td b/iree/compiler/Dialect/Flow/IR/FlowOps.td
index 9d84998..3b14102 100644
--- a/iree/compiler/Dialect/Flow/IR/FlowOps.td
+++ b/iree/compiler/Dialect/Flow/IR/FlowOps.td
@@ -543,6 +543,13 @@
     attr-dict `:` type($value) `->` type($target)
   }];
 
+  let builders = [
+    // Builder for tensor.store with empty offsets, sizes and strides operands.
+    // This is used to store an entire tensor.
+    OpBuilder<(ins "Value":$value, "Value":$target,
+      CArg<"ArrayRef<NamedAttribute>", "{}">:$attrs)>,
+  ];
+
   let extraClassDeclaration = [{
     /// Return the expected rank of each of the`static_offsets`, `static_sizes`
     /// and `static_strides` attributes.
diff --git a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.h b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.h
index f67a982..35584d8 100644
--- a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.h
+++ b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.h
@@ -7,6 +7,8 @@
 #ifndef IREE_COMPILER_DIALECT_LINALGEXT_IR_LINALGEXTINTERFACES_H_
 #define IREE_COMPILER_DIALECT_LINALGEXT_IR_LINALGEXTINTERFACES_H_
 
+#include "mlir/IR/BlockAndValueMapping.h"
+#include "mlir/IR/Builders.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/OpDefinition.h"
 #include "mlir/Support/LLVM.h"
diff --git a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.td b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.td
index 19424d2..77d0ddf 100644
--- a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.td
+++ b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtInterfaces.td
@@ -438,6 +438,30 @@
             return opOperand->get().getType().template isa<RankedTensorType>();
           });
       }]
+    >,
+    //===------------------------------------------------------------------===//
+    // Other static interface methods.
+    //===------------------------------------------------------------------===//
+    InterfaceMethod<
+      /*desc=*/[{
+        Clone the current operation with the given location and operands. This
+        is used to abstract away the optional underlying region creation. This
+        does not change the balance between input, output_buffer and
+        init_tensors operands.
+      }],
+      /*retTy=*/"Operation *",
+      /*methodName=*/"clone",
+      (ins "OpBuilder &":$b, "Location":$loc, "TypeRange":$resultTypes,
+           "ValueRange":$operands),
+      [{
+        BlockAndValueMapping bvm;
+        OperationState state(
+          loc, ConcreteOp::getOperationName(), operands, resultTypes,
+          $_op->getAttrs());
+        for (Region &r : $_op->getRegions())
+          r.cloneInto(state.addRegion(), bvm);
+        return b.createOperation(state);
+      }]
     >
   ];
 
diff --git a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.cpp b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.cpp
index 491ac1f..8c1d62e 100644
--- a/iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.cpp
+++ b/iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.cpp
@@ -165,14 +165,18 @@
 }
 
 static LogicalResult verifySortOp(SortOp op) {
-  Block &block = op.region().front();
-  size_t numInputs = op.getNumInputs();
-  if (block.getNumArguments() != 2 * numInputs) {
-    return op.emitOpError("region block should have ")
-           << 2 * numInputs << " arguments";
+  if (op.getNumInputs()) {
+    return op.emitOpError("does not expect to take any inputs");
   }
 
-  int rank = op.getRank(op.getInputOperand(0));
+  Block &block = op.region().front();
+  size_t numOutputs = op.getNumOutputs();
+  if (block.getNumArguments() != 2 * numOutputs) {
+    return op.emitOpError("region block should have ")
+           << 2 * numOutputs << " arguments";
+  }
+
+  int rank = op.getRank(op.getOutputOperand(0));
   if (rank > 1 && !op.dimensionAttr()) {
     return op.emitOpError("dimension must be specified if rank > 1");
   }
diff --git a/iree/compiler/Dialect/LinalgExt/IR/test/invalid.mlir b/iree/compiler/Dialect/LinalgExt/IR/test/invalid.mlir
index a125011..1d6d6b1 100644
--- a/iree/compiler/Dialect/LinalgExt/IR/test/invalid.mlir
+++ b/iree/compiler/Dialect/LinalgExt/IR/test/invalid.mlir
@@ -3,7 +3,7 @@
 func @sort_invalid_dimension(%arg0: tensor<128xi32>) -> tensor<128xi32> {
   // expected-error @+1 {{dimension must be within (0, 1]}}
   %0 = linalg_ext.sort {dimension = 1 : i64}
-    ins(%arg0 : tensor<128xi32>) {
+    outs(%arg0 : tensor<128xi32>) {
   ^bb0(%arg1: i32, %arg2: i32):  // no predecessors
     %1 = cmpi sgt, %arg1, %arg2 : i32
     linalg_ext.yield %1 : i1
@@ -16,7 +16,7 @@
 func @sort_without_dimension(%arg0: tensor<3x4xi32>) -> tensor<3x4xi32> {
   // expected-error @+1 {{dimension must be specified if rank > 1}}
   %0 = linalg_ext.sort
-    ins(%arg0 : tensor<3x4xi32>) {
+    outs(%arg0 : tensor<3x4xi32>) {
   ^bb0(%arg1: i32, %arg2: i32):  // no predecessors
     %1 = cmpi sgt, %arg1, %arg2 : i32
     linalg_ext.yield %1 : i1
diff --git a/iree/compiler/Dialect/LinalgExt/IR/test/roundtrip.mlir b/iree/compiler/Dialect/LinalgExt/IR/test/roundtrip.mlir
index 8116cea..70f22a7 100644
--- a/iree/compiler/Dialect/LinalgExt/IR/test/roundtrip.mlir
+++ b/iree/compiler/Dialect/LinalgExt/IR/test/roundtrip.mlir
@@ -2,11 +2,11 @@
 
 // CHECK-LABEL: func @sort_tensor
 // CHECK:         linalg_ext.sort
-// CHECK-SAME:      ins({{.*}})
+// CHECK-SAME:      outs({{.*}})
 // CHECK:           linalg_ext.yield
 func @sort_tensor(%arg0: tensor<128xi32>) -> tensor<128xi32> {
   %0 = linalg_ext.sort
-    ins(%arg0 : tensor<128xi32>) {
+    outs(%arg0 : tensor<128xi32>) {
   ^bb0(%arg1: i32, %arg2: i32):  // no predecessors
     %1 = cmpi sgt, %arg1, %arg2 : i32
     linalg_ext.yield %1 : i1
@@ -16,14 +16,13 @@
 
 // CHECK-LABEL: func @sort_memref
 // CHECK:         linalg_ext.sort
-// CHECK-SAME:      ins({{.*}}) outs({{.*}})
+// CHECK-SAME:      outs({{.*}})
 // CHECK:           linalg_ext.yield
-func @sort_memref(%arg0: memref<128xi32>, %arg1: memref<128xi32>) {
+func @sort_memref(%arg0: memref<128xi32>) {
   linalg_ext.sort {dimension = 0 : i64}
-    ins(%arg0 : memref<128xi32>)
-    outs(%arg1 : memref<128xi32>) {
-  ^bb0(%arg2: i32, %arg3: i32):  // no predecessors
-    %0 = cmpi sgt, %arg2, %arg3 : i32
+    outs(%arg0 : memref<128xi32>) {
+  ^bb0(%arg1: i32, %arg2: i32):  // no predecessors
+    %0 = cmpi sgt, %arg1, %arg2 : i32
     linalg_ext.yield %0 : i1
   }
   return
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/BUILD b/iree/compiler/Dialect/LinalgExt/Transforms/BUILD
new file mode 100644
index 0000000..96e3752
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/BUILD
@@ -0,0 +1,53 @@
+# Copyright 2021 The IREE Authors
+#
+# Licensed under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+load("//build_tools/bazel:tblgen.bzl", "gentbl_cc_library")
+
+package(
+    default_visibility = ["//visibility:public"],
+    features = ["layering_check"],
+    licenses = ["notice"],  # Apache 2.0
+)
+
+gentbl_cc_library(
+    name = "PassesIncGen",
+    tbl_outs = [
+        (
+            ["-gen-pass-decls"],
+            "Passes.h.inc",
+        ),
+    ],
+    tblgen = "@llvm-project//mlir:mlir-tblgen",
+    td_file = "Passes.td",
+    td_srcs = [
+        "@llvm-project//mlir:PassBaseTdFiles",
+    ],
+)
+
+cc_library(
+    name = "Transforms",
+    srcs = [
+        "ConvertToLoops.cpp",
+        "PassDetail.h",
+        "Passes.cpp",
+    ],
+    hdrs = [
+        "Passes.h",
+        "Passes.h.inc",
+    ],
+    deps = [
+        ":PassesIncGen",
+        "//iree/compiler/Dialect/LinalgExt/IR",
+        "@llvm-project//llvm:Support",
+        "@llvm-project//mlir:IR",
+        "@llvm-project//mlir:MemRefDialect",
+        "@llvm-project//mlir:Pass",
+        "@llvm-project//mlir:SCFDialect",
+        "@llvm-project//mlir:StandardOps",
+        "@llvm-project//mlir:Support",
+        "@llvm-project//mlir:Transforms",
+    ],
+)
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/CMakeLists.txt b/iree/compiler/Dialect/LinalgExt/Transforms/CMakeLists.txt
new file mode 100644
index 0000000..12511a4
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/CMakeLists.txt
@@ -0,0 +1,45 @@
+################################################################################
+# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from           #
+# iree/compiler/Dialect/LinalgExt/Transforms/BUILD                             #
+#                                                                              #
+# Use iree_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary   #
+# CMake-only content.                                                          #
+#                                                                              #
+# To disable autogeneration for this file entirely, delete this header.        #
+################################################################################
+
+iree_add_all_subdirs()
+
+iree_tablegen_library(
+  NAME
+    PassesIncGen
+  TD_FILE
+    "Passes.td"
+  OUTS
+    -gen-pass-decls Passes.h.inc
+)
+
+iree_cc_library(
+  NAME
+    Transforms
+  HDRS
+    "Passes.h"
+    "Passes.h.inc"
+  SRCS
+    "ConvertToLoops.cpp"
+    "PassDetail.h"
+    "Passes.cpp"
+  DEPS
+    LLVMSupport
+    MLIRIR
+    MLIRMemRef
+    MLIRPass
+    MLIRSCF
+    MLIRStandard
+    MLIRSupport
+    MLIRTransforms
+    iree::compiler::Dialect::LinalgExt::IR
+  PUBLIC
+)
+
+### BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_BELOW_THIS_LINE ###
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/ConvertToLoops.cpp b/iree/compiler/Dialect/LinalgExt/Transforms/ConvertToLoops.cpp
new file mode 100644
index 0000000..9f57649
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/ConvertToLoops.cpp
@@ -0,0 +1,154 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtDialect.h"
+#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.h"
+#include "iree/compiler/Dialect/LinalgExt/Transforms/PassDetail.h"
+#include "iree/compiler/Dialect/LinalgExt/Transforms/Passes.h"
+#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallVector.h"
+#include "mlir/Dialect/MemRef/IR/MemRef.h"
+#include "mlir/Dialect/SCF/SCF.h"
+#include "mlir/Dialect/StandardOps/IR/Ops.h"
+#include "mlir/IR/BuiltinTypes.h"
+#include "mlir/Pass/Pass.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace mlir {
+namespace iree_compiler {
+namespace linalg_ext {
+namespace {
+
+struct BubbleSortConversion : public OpRewritePattern<linalg_ext::SortOp> {
+  using OpRewritePattern<linalg_ext::SortOp>::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(linalg_ext::SortOp op,
+                                PatternRewriter& rewriter) const final {
+    if (!op.hasBufferSemantics()) return failure();
+
+    auto arg0 = op.getOutputOperand(0);
+    Location loc = op.getLoc();
+    SmallVector<Value, 4> lbs, ubs, steps;
+    for (auto en : llvm::enumerate(op.getShape(arg0))) {
+      if (ShapedType::isDynamic(en.value())) {
+        ubs.push_back(
+            rewriter.create<memref::DimOp>(loc, arg0->get(), en.index()));
+      } else {
+        ubs.push_back(rewriter.create<ConstantIndexOp>(loc, en.value()));
+      }
+    }
+    Value zero = rewriter.create<ConstantIndexOp>(loc, 0);
+    Value one = rewriter.create<ConstantIndexOp>(loc, 1);
+    lbs.append(op.getRank(arg0), zero);
+    steps.append(op.getRank(arg0), one);
+
+    uint64_t sortDim = 0;
+    if (op.dimensionAttr()) sortDim = op.dimension().getValue();
+    Value ub = rewriter.create<SubIOp>(loc, ubs[sortDim], one);
+    mlir::scf::buildLoopNest(
+        rewriter, loc, lbs, ubs, steps, ValueRange{},
+        [&](OpBuilder& b, Location loc, ValueRange ivs, ValueRange iters) {
+          SmallVector<Value> indices, sortBlkArgs;
+          indices.append(ivs.begin(), ivs.end());
+          // Bubble sort innermost loop.
+          auto scfFor = b.create<scf::ForOp>(
+              loc, zero, ub, one, iters,
+              [&](OpBuilder& b, Location loc, Value iv, ValueRange iters) {
+                SmallVector<Value> indices(ivs);
+                Value ivPlusOne = b.create<AddIOp>(loc, iv, one);
+                for (auto output : op.getOutputOperands()) {
+                  indices[sortDim] = iv;
+                  sortBlkArgs.push_back(
+                      b.create<memref::LoadOp>(loc, output->get(), indices));
+                  indices[sortDim] = ivPlusOne;
+                  sortBlkArgs.push_back(
+                      b.create<memref::LoadOp>(loc, output->get(), indices));
+                }
+                // A block must end with a terminator. This op will be erased
+                // later.
+                b.create<scf::YieldOp>(loc);
+              });
+
+          Region& region = scfFor.region();
+          rewriter.mergeBlockBefore(&op.region().front(),
+                                    region.front().getTerminator(),
+                                    sortBlkArgs);
+          rewriter.eraseOp(region.front().getTerminator());
+
+          // The erasure of the op happens later, so we cannot use the
+          // .getTerminator() method here.
+          auto linalgExtYieldOp = llvm::to_vector<4>(
+              region.front().getOps<linalg_ext::YieldOp>())[0];
+          Value cond = linalgExtYieldOp.getOperand(0);
+          rewriter.replaceOp(linalgExtYieldOp, {});
+
+          b.setInsertionPointToEnd(&region.front());
+          b.create<scf::IfOp>(
+              loc, TypeRange{}, cond,
+              [&](OpBuilder& b, Location loc) {
+                // Swap the pairs if true.
+                SmallVector<Value> indices(ivs.begin(), ivs.end());
+                Value ivPlusOne =
+                    b.create<AddIOp>(loc, scfFor.getInductionVar(), one);
+                for (int i = 0, e = op.getNumOutputs(); i < e; ++i) {
+                  Value v1 = sortBlkArgs[i * 2];
+                  Value v2 = sortBlkArgs[i * 2 + 1];
+                  indices[sortDim] = scfFor.getInductionVar();
+                  b.create<memref::StoreOp>(
+                      loc, v2, op.getOutputOperand(i)->get(), indices);
+                  indices[sortDim] = ivPlusOne;
+                  b.create<memref::StoreOp>(
+                      loc, v1, op.getOutputOperand(i)->get(), indices);
+                }
+                b.create<scf::YieldOp>(loc);
+              },
+              [&](OpBuilder& b, Location loc) {
+                // Do not swap the pairs if false.
+                b.create<scf::YieldOp>(loc);
+              });
+
+          b.create<scf::YieldOp>(loc);
+          return scf::ValueVector();
+        });
+    rewriter.eraseOp(op);
+    return success();
+  }
+};
+
+struct LinalgExtToLoopsPass
+    : public LinalgExtToLoopsBase<LinalgExtToLoopsPass> {
+  void getDependentDialects(DialectRegistry& registry) const override {
+    registry
+        .insert<StandardOpsDialect, memref::MemRefDialect, scf::SCFDialect>();
+  }
+
+  void runOnOperation() override {
+    MLIRContext* context = &getContext();
+
+    OwningRewritePatternList patterns(context);
+    patterns.insert<BubbleSortConversion>(context);
+
+    ConversionTarget target(getContext());
+    target.addLegalDialect<memref::MemRefDialect, StandardOpsDialect,
+                           scf::SCFDialect>();
+    target.addIllegalOp<linalg_ext::SortOp>();
+
+    if (failed(applyPartialConversion(getOperation(), target,
+                                      std::move(patterns)))) {
+      signalPassFailure();
+    }
+  }
+};
+}  // namespace
+
+std::unique_ptr<OperationPass<FuncOp>> createLinalgExtToLoopsPass() {
+  return std::make_unique<LinalgExtToLoopsPass>();
+}
+
+}  // namespace linalg_ext
+}  // namespace iree_compiler
+}  // namespace mlir
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/PassDetail.h b/iree/compiler/Dialect/LinalgExt/Transforms/PassDetail.h
new file mode 100644
index 0000000..c2f87e9
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/PassDetail.h
@@ -0,0 +1,17 @@
+#ifndef IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASS_DETAIL_H_
+#define IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASS_DETAIL_H_
+
+#include "mlir/Pass/Pass.h"
+
+namespace mlir {
+namespace iree_compiler {
+namespace linalg_ext {
+
+#define GEN_PASS_CLASSES
+#include "iree/compiler/Dialect/LinalgExt/Transforms/Passes.h.inc"  // IWYU pragma: keep
+
+}  // namespace linalg_ext
+}  // namespace iree_compiler
+}  // namespace mlir
+
+#endif  // IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASS_DETAIL_H_
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/Passes.cpp b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.cpp
new file mode 100644
index 0000000..1030e5a
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.cpp
@@ -0,0 +1,26 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#include "iree/compiler/Dialect/LinalgExt/Transforms/Passes.h"
+
+#include "mlir/Pass/Pass.h"
+#include "mlir/Pass/PassRegistry.h"
+#include "mlir/Transforms/Passes.h"
+
+namespace mlir {
+namespace iree_compiler {
+namespace linalg_ext {
+
+namespace {
+#define GEN_PASS_REGISTRATION
+#include "iree/compiler/Dialect/LinalgExt/Transforms/Passes.h.inc"  // IWYU pragma: export
+}  // namespace
+
+void registerLinalgExtPasses() { registerPasses(); }
+
+}  // namespace linalg_ext
+}  // namespace iree_compiler
+}  // namespace mlir
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/Passes.h b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.h
new file mode 100644
index 0000000..df5ed5a
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.h
@@ -0,0 +1,24 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#ifndef IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASSES_H_
+#define IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASSES_H_
+
+#include "mlir/Pass/Pass.h"
+
+namespace mlir {
+namespace iree_compiler {
+namespace linalg_ext {
+
+std::unique_ptr<OperationPass<FuncOp>> createLinalgExtToLoopsPass();
+
+void registerLinalgExtPasses();
+
+}  // namespace linalg_ext
+}  // namespace iree_compiler
+}  // namespace mlir
+
+#endif  // IREE_COMPILER_DIALECT_LINALGEXT_TRANSFORMS_PASSES_H_
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/Passes.td b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.td
new file mode 100644
index 0000000..2320a39
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/Passes.td
@@ -0,0 +1,18 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#ifndef IREE_DIALECT_LINALGEXT_PASSES
+#define IREE_DIALECT_LINALGEXT_PASSES
+
+include "mlir/Pass/PassBase.td"
+
+def LinalgExtToLoops :
+    Pass<"iree-linalg-ext-to-loops", "FuncOp"> {
+  let summary = "Convert LinalgExt ops to loops.";
+  let constructor = "mlir::iree_compiler::linalg_ext::createLinalgExtToLoopsPass()";
+}
+
+#endif  // IREE_DIALECT_LINALGEXT_PASSES
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/test/BUILD b/iree/compiler/Dialect/LinalgExt/Transforms/test/BUILD
new file mode 100644
index 0000000..a2aef95
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/test/BUILD
@@ -0,0 +1,28 @@
+# Copyright 2021 The IREE Authors
+#
+# Licensed under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+load("//iree:lit_test.bzl", "iree_lit_test_suite")
+load("//build_tools/bazel:enforce_glob.bzl", "enforce_glob")
+
+package(
+    default_visibility = ["//visibility:public"],
+    features = ["layering_check"],
+    licenses = ["notice"],  # Apache 2.0
+)
+
+iree_lit_test_suite(
+    name = "lit",
+    srcs = enforce_glob(
+        [
+            "convert_to_loops.mlir",
+        ],
+        include = ["*.mlir"],
+    ),
+    data = [
+        "//iree/tools:IreeFileCheck",
+        "//iree/tools:iree-opt",
+    ],
+)
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/test/CMakeLists.txt b/iree/compiler/Dialect/LinalgExt/Transforms/test/CMakeLists.txt
new file mode 100644
index 0000000..931e320
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/test/CMakeLists.txt
@@ -0,0 +1,23 @@
+################################################################################
+# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from           #
+# iree/compiler/Dialect/LinalgExt/Transforms/test/BUILD                        #
+#                                                                              #
+# Use iree_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary   #
+# CMake-only content.                                                          #
+#                                                                              #
+# To disable autogeneration for this file entirely, delete this header.        #
+################################################################################
+
+iree_add_all_subdirs()
+
+iree_lit_test_suite(
+  NAME
+    lit
+  SRCS
+    "convert_to_loops.mlir"
+  DATA
+    iree::tools::IreeFileCheck
+    iree::tools::iree-opt
+)
+
+### BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_BELOW_THIS_LINE ###
diff --git a/iree/compiler/Dialect/LinalgExt/Transforms/test/convert_to_loops.mlir b/iree/compiler/Dialect/LinalgExt/Transforms/test/convert_to_loops.mlir
new file mode 100644
index 0000000..7af12e8
--- /dev/null
+++ b/iree/compiler/Dialect/LinalgExt/Transforms/test/convert_to_loops.mlir
@@ -0,0 +1,93 @@
+// RUN: iree-opt -split-input-file -iree-linalg-ext-to-loops %s | IreeFileCheck %s
+
+func @sort_1d(%arg0: memref<128xi32>) {
+  linalg_ext.sort {dimension = 0 : i64}
+    outs(%arg0 : memref<128xi32>) {
+  ^bb0(%arg2: i32, %arg3: i32):  // no predecessors
+    %0 = cmpi sgt, %arg2, %arg3 : i32
+    linalg_ext.yield %0 : i1
+  }
+  return
+}
+// CHECK-LABEL: func @sort_1d
+// CHECK-SAME:    %[[BUF:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C128:.+]] = constant 128 : index
+// CHECK-DAG:     %[[C0:.+]] = constant 0 : index
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK-DAG:     %[[UB:.+]] = subi %[[C128]], %[[C1]] : index
+// CHECK:         scf.for %[[ARG1:.+]] = %[[C0]] to %[[C128]] step %[[C1]]
+// CHECK:           scf.for %[[ARG2:.+]] = %[[C0]] to %[[UB]] step %[[C1]]
+// CHECK:             %[[T1:.+]] = addi %[[ARG2]], %[[C1]] : index
+// CHECK:             %[[V1:.+]] = memref.load %[[BUF]][%[[ARG2]]]
+// CHECK:             %[[V2:.+]] = memref.load %[[BUF]][%[[T1]]]
+// CHECK:             %[[COND:.+]] = cmpi sgt, %[[V1]], %[[V2]] : i32
+// CHECK:             scf.if %[[COND]] {
+// CHECK:               %[[T2:.+]] = addi %[[ARG2]], %[[C1]] : index
+// CHECK:               memref.store %[[V2]], %[[BUF]][%[[ARG2]]]
+// CHECK:               memref.store %[[V1]], %[[BUF]][%[[T2]]]
+// CHECK:             }
+
+// -----
+
+func @sort_2d(%arg0: memref<16x32xi32>) {
+  linalg_ext.sort {dimension = 0 : i64}
+    outs(%arg0 : memref<16x32xi32>) {
+  ^bb0(%arg2: i32, %arg3: i32):  // no predecessors
+    %0 = cmpi sgt, %arg2, %arg3 : i32
+    linalg_ext.yield %0 : i1
+  }
+  return
+}
+// CHECK-LABEL: func @sort_2d
+// CHECK-SAME:    %[[BUF:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C16:.+]] = constant 16 : index
+// CHECK-DAG:     %[[C32:.+]] = constant 32 : index
+// CHECK-DAG:     %[[C0:.+]] = constant 0 : index
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK-DAG:     %[[UB:.+]] = subi %[[C16]], %[[C1]] : index
+// CHECK:         scf.for %[[ARG1:.+]] = %[[C0]] to %[[C16]] step %[[C1]]
+// CHECK:           scf.for %[[ARG2:.+]] = %[[C0]] to %[[C32]] step %[[C1]]
+// CHECK:             scf.for %[[ARG3:.+]] = %[[C0]] to %[[UB]] step %[[C1]]
+// CHECK:               %[[T1:.+]] = addi %[[ARG3]], %[[C1]] : index
+// CHECK:               %[[V1:.+]] = memref.load %[[BUF]][%[[ARG3]], %[[ARG2]]]
+// CHECK:               %[[V2:.+]] = memref.load %[[BUF]][%[[T1]], %[[ARG2]]]
+// CHECK:               %[[COND:.+]] = cmpi sgt, %[[V1]], %[[V2]] : i32
+// CHECK:               scf.if %[[COND]] {
+// CHECK:                 %[[T2:.+]] = addi %[[ARG3]], %[[C1]] : index
+// CHECK:                 memref.store %[[V2]], %[[BUF]][%[[ARG3]], %[[ARG2]]]
+// CHECK:                 memref.store %[[V1]], %[[BUF]][%[[T2]], %[[ARG2]]]
+// CHECK:               }
+
+// -----
+
+func @sort_multi(%arg0: memref<128xf32>, %arg1: memref<128xi32>) {
+  linalg_ext.sort
+    outs(%arg0, %arg1 : memref<128xf32>, memref<128xi32>) {
+  ^bb0(%arg2: f32, %arg3: f32, %arg4: i32, %arg5: i32):  // no predecessors
+    %0 = cmpf ogt, %arg2, %arg3 : f32
+    linalg_ext.yield %0 : i1
+  }
+  return
+}
+// CHECK-LABEL: func @sort_multi
+// CHECK-SAME:    %[[BUF1:[a-zA-Z0-9]+]]
+// CHECK-SAME:    %[[BUF2:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C128:.+]] = constant 128 : index
+// CHECK-DAG:     %[[C0:.+]] = constant 0 : index
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK-DAG:     %[[UB:.+]] = subi %[[C128]], %[[C1]] : index
+// CHECK:         scf.for %[[ARG1:.+]] = %[[C0]] to %[[C128]] step %[[C1]]
+// CHECK:           scf.for %[[ARG2:.+]] = %[[C0]] to %[[UB]] step %[[C1]]
+// CHECK:             %[[T1:.+]] = addi %[[ARG2]], %[[C1]] : index
+// CHECK:             %[[V1:.+]] = memref.load %[[BUF1]][%[[ARG2]]]
+// CHECK:             %[[V2:.+]] = memref.load %[[BUF1]][%[[T1]]]
+// CHECK:             %[[V3:.+]] = memref.load %[[BUF2]][%[[ARG2]]]
+// CHECK:             %[[V4:.+]] = memref.load %[[BUF2]][%[[T1]]]
+// CHECK:             %[[COND:.+]] = cmpf ogt, %[[V1]], %[[V2]] : f32
+// CHECK:             scf.if %[[COND]] {
+// CHECK:               %[[T2:.+]] = addi %[[ARG2]], %[[C1]] : index
+// CHECK:               memref.store %[[V2]], %[[BUF1]][%[[ARG2]]]
+// CHECK:               memref.store %[[V1]], %[[BUF1]][%[[T2]]]
+// CHECK:               memref.store %[[V4]], %[[BUF2]][%[[ARG2]]]
+// CHECK:               memref.store %[[V3]], %[[BUF2]][%[[T2]]]
+// CHECK:             }
diff --git a/iree/compiler/InputConversion/Common/BUILD b/iree/compiler/InputConversion/Common/BUILD
new file mode 100644
index 0000000..bf08663
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/BUILD
@@ -0,0 +1,62 @@
+# Copyright 2021 The IREE Authors
+#
+# Licensed under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+load("//build_tools/bazel:tblgen.bzl", "gentbl_cc_library")
+
+package(
+    default_visibility = ["//visibility:public"],
+    features = ["layering_check"],
+    licenses = ["notice"],  # Apache 2.0
+)
+
+gentbl_cc_library(
+    name = "PassesIncGen",
+    tbl_outs = [
+        (
+            ["-gen-pass-decls"],
+            "Passes.h.inc",
+        ),
+    ],
+    tblgen = "@llvm-project//mlir:mlir-tblgen",
+    td_file = "Passes.td",
+    td_srcs = [
+        "@llvm-project//mlir:PassBaseTdFiles",
+    ],
+)
+
+cc_library(
+    name = "PassHeaders",
+    hdrs = [
+        "PassDetail.h",
+        "Passes.h",
+        "Passes.h.inc",
+    ],
+    deps = [
+        ":PassesIncGen",
+        "@llvm-project//mlir:Pass",
+        "@llvm-project//mlir:Transforms",
+    ],
+)
+
+cc_library(
+    name = "Common",
+    srcs = [
+        "Passes.cpp",
+        "TopLevelSCFToCFG.cpp",
+    ],
+    hdrs = [
+        "Passes.h",
+    ],
+    deps = [
+        ":PassHeaders",
+        ":PassesIncGen",
+        "@llvm-project//mlir:LinalgOps",
+        "@llvm-project//mlir:Pass",
+        "@llvm-project//mlir:SCFDialect",
+        "@llvm-project//mlir:SCFToStandard",
+        "@llvm-project//mlir:Transforms",
+    ],
+)
diff --git a/iree/compiler/InputConversion/Common/CMakeLists.txt b/iree/compiler/InputConversion/Common/CMakeLists.txt
new file mode 100644
index 0000000..484de62
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/CMakeLists.txt
@@ -0,0 +1,58 @@
+################################################################################
+# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from           #
+# iree/compiler/InputConversion/Common/BUILD                                   #
+#                                                                              #
+# Use iree_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary   #
+# CMake-only content.                                                          #
+#                                                                              #
+# To disable autogeneration for this file entirely, delete this header.        #
+################################################################################
+
+iree_add_all_subdirs()
+
+iree_tablegen_library(
+  NAME
+    PassesIncGen
+  TD_FILE
+    "Passes.td"
+  OUTS
+    -gen-pass-decls Passes.h.inc
+)
+
+iree_cc_library(
+  NAME
+    PassHeaders
+  HDRS
+    "PassDetail.h"
+    "Passes.h"
+    "Passes.h.inc"
+  DEPS
+    MLIRPass
+    MLIRTransforms
+  PUBLIC
+)
+
+iree_cc_library(
+  NAME
+    Common
+  HDRS
+    "Passes.h"
+  SRCS
+    "Passes.cpp"
+    "TopLevelSCFToCFG.cpp"
+  DEPS
+    ::PassHeaders
+    MLIRLinalg
+    MLIRPass
+    MLIRSCF
+    MLIRSCFToStandard
+    MLIRTransforms
+  PUBLIC
+)
+
+### BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_BELOW_THIS_LINE ###
+# TODO: For some reason, these dependencies are not being added automatically.
+add_dependencies(
+  iree_compiler_InputConversion_Common_PassHeaders
+  iree_compiler_InputConversion_Common_PassesIncGen
+)
diff --git a/iree/compiler/InputConversion/Common/PassDetail.h b/iree/compiler/InputConversion/Common/PassDetail.h
new file mode 100644
index 0000000..71990b3
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/PassDetail.h
@@ -0,0 +1,21 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#ifndef IREE_COMPILER_INPUTCONVERSION_COMMON_PASSDETAIL_H_
+#define IREE_COMPILER_INPUTCONVERSION_COMMON_PASSDETAIL_H_
+
+#include "mlir/Pass/Pass.h"
+
+namespace mlir {
+namespace iree_compiler {
+
+#define GEN_PASS_CLASSES
+#include "iree/compiler/InputConversion/Common/Passes.h.inc"
+
+}  // namespace iree_compiler
+}  // namespace mlir
+
+#endif  // IREE_COMPILER_INPUTCONVERSION_COMMON_PASSDETAIL_H_
diff --git a/iree/compiler/InputConversion/Common/Passes.cpp b/iree/compiler/InputConversion/Common/Passes.cpp
new file mode 100644
index 0000000..cb52f5b
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/Passes.cpp
@@ -0,0 +1,27 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#include "iree/compiler/InputConversion/Common/Passes.h"
+
+#include "mlir/Pass/PassOptions.h"
+#include "mlir/Pass/PassRegistry.h"
+#include "mlir/Transforms/Passes.h"
+
+namespace mlir {
+namespace iree_compiler {
+
+namespace {
+#define GEN_PASS_REGISTRATION
+#include "iree/compiler/InputConversion/Common/Passes.h.inc"  // IWYU pragma: export
+}  // namespace
+
+void registerCommonInputConversionPasses() {
+  // Generated.
+  registerPasses();
+}
+
+}  // namespace iree_compiler
+}  // namespace mlir
diff --git a/iree/compiler/InputConversion/Common/Passes.h b/iree/compiler/InputConversion/Common/Passes.h
new file mode 100644
index 0000000..5b7fc65
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/Passes.h
@@ -0,0 +1,30 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#ifndef IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES_H_
+#define IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES_H_
+
+#include "mlir/Pass/Pass.h"
+
+namespace mlir {
+namespace iree_compiler {
+
+//===----------------------------------------------------------------------===//
+// SCF to CFG conversion
+//===----------------------------------------------------------------------===//
+
+std::unique_ptr<OperationPass<FuncOp>> createTopLevelSCFToCFGPass();
+
+//===----------------------------------------------------------------------===//
+// Register all Passes
+//===----------------------------------------------------------------------===//
+
+void registerCommonInputConversionPasses();
+
+}  // namespace iree_compiler
+}  // namespace mlir
+
+#endif  // IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES_H_
diff --git a/iree/compiler/InputConversion/Common/Passes.td b/iree/compiler/InputConversion/Common/Passes.td
new file mode 100644
index 0000000..256f14f
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/Passes.td
@@ -0,0 +1,18 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#ifndef IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES
+#define IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES
+
+include "mlir/Pass/PassBase.td"
+
+def TopLevelSCFToCFG :
+    Pass<"iree-top-level-scf-to-cfg", "FuncOp"> {
+  let summary = "Converts non-nested SCF constructs to CFG (not traversing into opaque operations).";
+  let constructor = "mlir::iree_compiler::createTopLevelSCFToCFGPass()";
+}
+
+#endif // IREE_COMPILER_INPUTCONVERSION_COMMON_PASSES
diff --git a/iree/compiler/InputConversion/Common/TopLevelSCFToCFG.cpp b/iree/compiler/InputConversion/Common/TopLevelSCFToCFG.cpp
new file mode 100644
index 0000000..1e29649
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/TopLevelSCFToCFG.cpp
@@ -0,0 +1,54 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#include "iree/compiler/InputConversion/Common/PassDetail.h"
+#include "iree/compiler/InputConversion/Common/Passes.h"
+#include "mlir/Conversion/SCFToStandard/SCFToStandard.h"
+#include "mlir/Dialect/Linalg/IR/LinalgOps.h"
+#include "mlir/Dialect/SCF/SCF.h"
+#include "mlir/Pass/Pass.h"
+#include "mlir/Pass/PassManager.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace mlir {
+namespace iree_compiler {
+
+namespace {
+
+struct TopLevelSCFToCFGPass
+    : public TopLevelSCFToCFGBase<TopLevelSCFToCFGPass> {
+  void runOnOperation() override;
+};
+
+}  // namespace
+
+void TopLevelSCFToCFGPass::runOnOperation() {
+  RewritePatternSet patterns(&getContext());
+  populateLoopToStdConversionPatterns(patterns);
+  // Configure conversion to lower out scf.for, scf.if, scf.parallel and
+  // scf.while. Anything else is fine.
+  ConversionTarget target(getContext());
+  target.addIllegalOp<scf::ForOp, scf::IfOp, scf::ParallelOp, scf::WhileOp>();
+  target.markUnknownOpDynamicallyLegal([](Operation *) { return true; });
+
+  // For nested, opaque ops that we support, mark them recursively legal.
+  // Otherwise, SCF ops nested within them would also be lowered by this pass.
+  // It would be nice to be able to set this for the whole dialect, but
+  // upstream does not support that yet.
+  target.addLegalOp<linalg::GenericOp>();
+  target.markOpRecursivelyLegal<linalg::GenericOp>();
+
+  if (failed(
+          applyPartialConversion(getOperation(), target, std::move(patterns))))
+    signalPassFailure();
+}
+
+std::unique_ptr<OperationPass<FuncOp>> createTopLevelSCFToCFGPass() {
+  return std::make_unique<TopLevelSCFToCFGPass>();
+}
+
+}  // namespace iree_compiler
+}  // namespace mlir
diff --git a/iree/compiler/InputConversion/Common/test/BUILD b/iree/compiler/InputConversion/Common/test/BUILD
new file mode 100644
index 0000000..d9e87bd
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/test/BUILD
@@ -0,0 +1,30 @@
+# Copyright 2021 The IREE Authors
+#
+# Licensed under the Apache License v2.0 with LLVM Exceptions.
+# See https://llvm.org/LICENSE.txt for license information.
+# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+# Tests for common transforms.
+
+load("//iree:lit_test.bzl", "iree_lit_test_suite")
+load("//build_tools/bazel:enforce_glob.bzl", "enforce_glob")
+
+package(
+    default_visibility = ["//visibility:public"],
+    features = ["layering_check"],
+    licenses = ["notice"],  # Apache 2.0
+)
+
+iree_lit_test_suite(
+    name = "lit",
+    srcs = enforce_glob(
+        [
+            "top_level_scf_to_cfg.mlir",
+        ],
+        include = ["*.mlir"],
+    ),
+    data = [
+        "//iree/tools:IreeFileCheck",
+        "//iree/tools:iree-opt",
+    ],
+)
diff --git a/iree/compiler/InputConversion/Common/test/CMakeLists.txt b/iree/compiler/InputConversion/Common/test/CMakeLists.txt
new file mode 100644
index 0000000..ab43294
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/test/CMakeLists.txt
@@ -0,0 +1,23 @@
+################################################################################
+# Autogenerated by build_tools/bazel_to_cmake/bazel_to_cmake.py from           #
+# iree/compiler/InputConversion/Common/test/BUILD                              #
+#                                                                              #
+# Use iree_cmake_extra_content from iree/build_defs.oss.bzl to add arbitrary   #
+# CMake-only content.                                                          #
+#                                                                              #
+# To disable autogeneration for this file entirely, delete this header.        #
+################################################################################
+
+iree_add_all_subdirs()
+
+iree_lit_test_suite(
+  NAME
+    lit
+  SRCS
+    "top_level_scf_to_cfg.mlir"
+  DATA
+    iree::tools::IreeFileCheck
+    iree::tools::iree-opt
+)
+
+### BAZEL_TO_CMAKE_PRESERVES_ALL_CONTENT_BELOW_THIS_LINE ###
diff --git a/iree/compiler/InputConversion/Common/test/top_level_scf_to_cfg.mlir b/iree/compiler/InputConversion/Common/test/top_level_scf_to_cfg.mlir
new file mode 100644
index 0000000..c06f57c
--- /dev/null
+++ b/iree/compiler/InputConversion/Common/test/top_level_scf_to_cfg.mlir
@@ -0,0 +1,45 @@
+// RUN: iree-opt -split-input-file -iree-top-level-scf-to-cfg %s | IreeFileCheck %s
+
+// CHECK-LABEL: @generic_nested_for
+// Although not recommended, SCF constructs do sometimes end up embedded in
+// linalg.generic. This sample was reduced from a lowering of tf.pow.
+// The normal -convert-scf-to-std pass would produce an illegal linalg op
+// (multiple basic blocks); -iree-top-level-scf-to-cfg should not touch it.
+#map = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
+func @generic_nested_for(%arg0: tensor<?x?x?x?xi32>, %arg1: tensor<?x?x?x?xi32>, %out0: tensor<?x?x?x?xi32>) -> tensor<?x?x?x?xi32> {
+  %c0 = constant 0 : index
+  %c1 = constant 1 : index
+  %c6 = constant 6 : index
+  %c-1_i32 = constant -1 : i32
+  %c0_i32 = constant 0 : i32
+  %c1_i32 = constant 1 : i32
+  %c2_i32 = constant 2 : i32
+  // CHECK: linalg.generic
+  // CHECK: scf.for
+  // CHECK: linalg.yield
+  %0 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel", "parallel", "parallel", "parallel"]}
+    ins(%arg0, %arg1 : tensor<?x?x?x?xi32>, tensor<?x?x?x?xi32>) outs(%out0 : tensor<?x?x?x?xi32>) {
+  ^bb0(%arg2: i32, %arg3: i32, %arg4: i32):  // no predecessors
+    %18:3 = scf.for %arg5 = %c0 to %c6 step %c1 iter_args(%arg6 = %c1_i32, %arg7 = %arg2, %arg8 = %arg3) -> (i32, i32, i32) {
+      %28 = and %arg8, %c1_i32 : i32
+      %29 = cmpi eq, %28, %c1_i32 : i32
+      %30 = muli %arg6, %arg7 : i32
+      %31 = select %29, %30, %arg6 : i32
+      %32 = muli %arg7, %arg7 : i32
+      %33 = shift_right_unsigned %arg8, %c1_i32 : i32
+      scf.yield %31, %32, %33 : i32, i32, i32
+    }
+    %19 = remi_signed %arg3, %c2_i32 : i32
+    %20 = cmpi eq, %19, %c0_i32 : i32
+    %21 = cmpi slt, %arg3, %c0_i32 : i32
+    %22 = cmpi eq, %arg2, %c1_i32 : i32
+    %23 = cmpi eq, %arg2, %c-1_i32 : i32
+    %24 = select %22, %c1_i32, %c0_i32 : i32
+    %25 = select %20, %c1_i32, %c-1_i32 : i32
+    %26 = select %23, %25, %24 : i32
+    %27 = select %21, %26, %18#0 : i32
+    linalg.yield %27 : i32
+  } -> tensor<?x?x?x?xi32>
+
+  return %0 : tensor<?x?x?x?xi32>
+}
diff --git a/iree/compiler/InputConversion/MHLO/BUILD b/iree/compiler/InputConversion/MHLO/BUILD
index 3b3062b..23c3f33 100644
--- a/iree/compiler/InputConversion/MHLO/BUILD
+++ b/iree/compiler/InputConversion/MHLO/BUILD
@@ -46,6 +46,7 @@
     name = "MHLO",
     srcs = [
         "BroadcastingToLinalgPatterns.cpp",
+        "ConvertAndDistributeMHLOToLinalgExt.cpp",
         "ConvertComplexToReal.cpp",
         "ConvertMHLOToFlow.cpp",
         "ConvertMHLOToFlow.h",
@@ -63,7 +64,9 @@
         ":PassesIncGen",
         "//iree/compiler/Dialect/Flow/IR",
         "//iree/compiler/Dialect/Flow/Transforms",
+        "//iree/compiler/Dialect/LinalgExt/IR",
         "//iree/compiler/Dialect/Shape/IR",
+        "//iree/compiler/InputConversion/Common",
         "@llvm-project//llvm:Support",
         "@llvm-project//mlir:Affine",
         "@llvm-project//mlir:ComplexDialect",
@@ -85,6 +88,7 @@
         "@mlir-hlo//:hlo",
         "@mlir-hlo//:legalize_gather_to_torch_index_select",
         "@mlir-hlo//:legalize_to_linalg",
+        "@mlir-hlo//:map_lmhlo_to_scalar_op",
         "@mlir-hlo//:materialize_broadcasts",
         "@mlir-hlo//:mhlo_to_mhlo_lowering_patterns",
         "@mlir-hlo//:unfuse_batch_norm",
diff --git a/iree/compiler/InputConversion/MHLO/CMakeLists.txt b/iree/compiler/InputConversion/MHLO/CMakeLists.txt
index 9203fae..3bedf58 100644
--- a/iree/compiler/InputConversion/MHLO/CMakeLists.txt
+++ b/iree/compiler/InputConversion/MHLO/CMakeLists.txt
@@ -40,6 +40,7 @@
     "Passes.h"
   SRCS
     "BroadcastingToLinalgPatterns.cpp"
+    "ConvertAndDistributeMHLOToLinalgExt.cpp"
     "ConvertComplexToReal.cpp"
     "ConvertMHLOToFlow.cpp"
     "ConvertMHLOToFlow.h"
@@ -69,7 +70,9 @@
     MLIRTransforms
     iree::compiler::Dialect::Flow::IR
     iree::compiler::Dialect::Flow::Transforms
+    iree::compiler::Dialect::LinalgExt::IR
     iree::compiler::Dialect::Shape::IR
+    iree::compiler::InputConversion::Common
     tensorflow::mlir_hlo
   PUBLIC
 )
diff --git a/iree/compiler/InputConversion/MHLO/ConvertAndDistributeMHLOToLinalgExt.cpp b/iree/compiler/InputConversion/MHLO/ConvertAndDistributeMHLOToLinalgExt.cpp
new file mode 100644
index 0000000..ad01857
--- /dev/null
+++ b/iree/compiler/InputConversion/MHLO/ConvertAndDistributeMHLOToLinalgExt.cpp
@@ -0,0 +1,191 @@
+// Copyright 2021 The IREE Authors
+//
+// Licensed under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+#include "iree/compiler/Dialect/Flow/IR/FlowDialect.h"
+#include "iree/compiler/Dialect/Flow/IR/FlowOps.h"
+#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtDialect.h"
+#include "iree/compiler/Dialect/LinalgExt/IR/LinalgExtOps.h"
+#include "iree/compiler/InputConversion/MHLO/PassDetail.h"
+#include "iree/compiler/InputConversion/MHLO/Passes.h"
+#include "iree/compiler/InputConversion/MHLO/Rewriters.h"
+#include "mlir-hlo/Dialect/mhlo/IR/hlo_ops.h"
+#include "mlir-hlo/Dialect/mhlo/transforms/map_lmhlo_to_scalar_op.h"
+#include "mlir/IR/BuiltinOps.h"
+#include "mlir/IR/BuiltinTypes.h"
+#include "mlir/IR/Matchers.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+namespace mlir {
+namespace iree_compiler {
+
+static bool isInBodyOfLinalgExtOps(Operation *op) {
+  auto parent_op = op->getParentRegion()->getParentOp();
+  return parent_op->getDialect() ==
+         parent_op->getContext()
+             ->getLoadedDialect<linalg_ext::LinalgExtDialect>();
+}
+
+namespace {
+
+//===----------------------------------------------------------------------===//
+// Base classes.
+//===----------------------------------------------------------------------===//
+
+template <typename Derived, typename OpTy>
+struct ConvertToLinalgExtPattern : public OpConversionPattern<OpTy> {
+  using OpConversionPattern<OpTy>::OpConversionPattern;
+
+  LogicalResult matchAndRewrite(
+      OpTy op, ArrayRef<Value> args,
+      ConversionPatternRewriter &rewriter) const final {
+    Value one = rewriter.create<ConstantIndexOp>(op.getLoc(), 1);
+    SmallVector<Value> workload(3, one);
+    auto dispatchOp = rewriter.create<IREE::Flow::DispatchWorkgroupsOp>(
+        op.getLoc(), workload, op->getResultTypes(),
+        /*result_dims=*/ValueRange{},
+        /*operands=*/args,
+        /*operand_dims=*/ValueRange{},
+        /*tied_operands=*/Derived::getTiedResultOperandIndices(args));
+    {
+      OpBuilder::InsertionGuard guard(rewriter);
+      rewriter.setInsertionPointToStart(&dispatchOp.getRegion().front());
+      if (failed(Derived::lowerMHLOOp(dispatchOp, op, args, rewriter))) {
+        return failure();
+      }
+      rewriter.create<IREE::Flow::ReturnOp>(op.getLoc());
+    }
+    rewriter.replaceOp(op, dispatchOp.getResults());
+    return success();
+  }
+};
+
+//===----------------------------------------------------------------------===//
+// Region operations lowering.
+//===----------------------------------------------------------------------===//
+
+template <typename OpTy>
+struct LinalgExtRegionHLOOpConversion : public OpConversionPattern<OpTy> {
+  using OpConversionPattern<OpTy>::OpConversionPattern;
+  LogicalResult matchAndRewrite(
+      OpTy op, ArrayRef<Value> args,
+      ConversionPatternRewriter &rewriter) const final {
+    if (!isInBodyOfLinalgExtOps(op)) return failure();
+    if (!op.getResult().getType().template isa<TensorType>()) return failure();
+    if (llvm::all_of(args, [](Value arg) {
+          return arg.getType().template isa<TensorType>();
+        })) {
+      return failure();
+    }
+    Value result = lmhlo::HloOpToStdScalarOp::map<OpTy>(
+        op, getElementTypeOrSelf(op.getType()), args, &rewriter);
+    rewriter.replaceOp(op, result);
+    return success();
+  }
+};
+
+struct LinalgExtRegionReturnOpConversion
+    : public OpConversionPattern<mhlo::ReturnOp> {
+  using OpConversionPattern<mhlo::ReturnOp>::OpConversionPattern;
+  LogicalResult matchAndRewrite(
+      mhlo::ReturnOp op, ArrayRef<Value> args,
+      ConversionPatternRewriter &rewriter) const final {
+    if (!isInBodyOfLinalgExtOps(op)) return failure();
+    rewriter.replaceOpWithNewOp<linalg_ext::YieldOp>(op, args);
+    return success();
+  }
+};
+
+//===----------------------------------------------------------------------===//
+// SortOp
+//===----------------------------------------------------------------------===//
+
+struct SortOpConversion
+    : public ConvertToLinalgExtPattern<SortOpConversion, mhlo::SortOp> {
+  using ConvertToLinalgExtPattern<SortOpConversion,
+                                  mhlo::SortOp>::ConvertToLinalgExtPattern;
+
+  static SmallVector<int64_t> getTiedResultOperandIndices(
+      ArrayRef<Value> args) {
+    return llvm::to_vector<4>(llvm::seq<int64_t>(0, args.size()));
+  }
+
+  static LogicalResult lowerMHLOOp(IREE::Flow::DispatchWorkgroupsOp dispatchOp,
+                                   mhlo::SortOp op, ArrayRef<Value> args,
+                                   ConversionPatternRewriter &rewriter) {
+    auto blockArgs = dispatchOp.getClosureBodyRegion().getArguments();
+    SmallVector<Value> initValues;
+    ImplicitLocOpBuilder b(op.getLoc(), rewriter);
+    for (auto it : llvm::zip(args, blockArgs)) {
+      auto argTy = std::get<0>(it).getType().cast<RankedTensorType>();
+      auto blockArg = std::get<1>(it);
+      initValues.push_back(
+          b.create<IREE::Flow::DispatchTensorLoadOp>(argTy, blockArg));
+    }
+
+    auto sortOp = b.create<linalg_ext::SortOp>(op.getResultTypes(),
+                                               /*inputs=*/ValueRange{},
+                                               initValues, op.dimensionAttr());
+    rewriter.inlineRegionBefore(op.comparator(), sortOp.region(),
+                                sortOp.region().begin());
+    Region &region = sortOp.region();
+    Block &block = region.front();
+    TypeConverter::SignatureConversion signature_converter(
+        block.getNumArguments());
+    for (auto en : llvm::enumerate(block.getArguments())) {
+      signature_converter.addInputs(en.index(),
+                                    getElementTypeOrSelf(en.value().getType()));
+    }
+    rewriter.applySignatureConversion(&region, signature_converter);
+
+    for (auto it : llvm::zip(sortOp.getResults(), blockArgs)) {
+      auto value = std::get<0>(it);
+      auto target = std::get<1>(it);
+      b.create<IREE::Flow::DispatchTensorStoreOp>(value, target);
+    }
+
+    return success();
+  }
+};
+
+struct ConvertAndDistributeMHLOToLinalgExtPass
+    : public ConvertAndDistributeMHLOToLinalgExtBase<
+          ConvertAndDistributeMHLOToLinalgExtPass> {
+  void getDependentDialects(DialectRegistry &registry) const override {
+    registry.insert<linalg_ext::LinalgExtDialect, IREE::Flow::FlowDialect,
+                    StandardOpsDialect, tensor::TensorDialect>();
+  }
+
+  void runOnOperation() override {
+    OwningRewritePatternList patterns(&getContext());
+    MLIRContext *context = &getContext();
+
+    patterns.insert<SortOpConversion>(context);
+    patterns.insert<LinalgExtRegionHLOOpConversion<mhlo::CompareOp>,
+                    LinalgExtRegionReturnOpConversion>(context,
+                                                       PatternBenefit(1000));
+
+    ConversionTarget target(getContext());
+    target
+        .addLegalDialect<linalg_ext::LinalgExtDialect, IREE::Flow::FlowDialect,
+                         StandardOpsDialect, tensor::TensorDialect>();
+    target.addIllegalOp<mhlo::SortOp>();
+
+    if (failed(applyPartialConversion(getOperation(), target,
+                                      std::move(patterns)))) {
+      signalPassFailure();
+    }
+  }
+};
+}  // namespace
+
+std::unique_ptr<OperationPass<FuncOp>>
+createConvertAndDistributeMHLOToLinalgExtPass() {
+  return std::make_unique<ConvertAndDistributeMHLOToLinalgExtPass>();
+}
+
+}  // namespace iree_compiler
+}  // namespace mlir
diff --git a/iree/compiler/InputConversion/MHLO/Passes.cpp b/iree/compiler/InputConversion/MHLO/Passes.cpp
index 7a57a4e..2866a4e 100644
--- a/iree/compiler/InputConversion/MHLO/Passes.cpp
+++ b/iree/compiler/InputConversion/MHLO/Passes.cpp
@@ -7,7 +7,7 @@
 #include "iree/compiler/InputConversion/MHLO/Passes.h"
 
 #include "iree/compiler/Dialect/Flow/Transforms/Passes.h"
-#include "mlir/Conversion/SCFToStandard/SCFToStandard.h"
+#include "iree/compiler/InputConversion/Common/Passes.h"
 #include "mlir/Conversion/ShapeToStandard/ShapeToStandard.h"
 #include "mlir/Dialect/Shape/Transforms/Passes.h"
 #include "mlir/Pass/PassOptions.h"
@@ -31,8 +31,7 @@
   // Currently we don't handle SCF ops well and have to convert them all to CFG.
   // In the future it would be nice if we could have all of flow be both scf
   // and cfg compatible.
-  // TODO: Currently recurses into SCF in Linalg generic - with hilarity.
-  passManager.addNestedPass<FuncOp>(mlir::createLowerToCFGPass());
+  passManager.addNestedPass<FuncOp>(createTopLevelSCFToCFGPass());
 
   // Various shape functions may have been materialized in the `shape.shape_of`
   // style of treating shapes as tensors. We prefer to legalize these to
diff --git a/iree/compiler/InputConversion/MHLO/Passes.h b/iree/compiler/InputConversion/MHLO/Passes.h
index 8acb2aa..5761952 100644
--- a/iree/compiler/InputConversion/MHLO/Passes.h
+++ b/iree/compiler/InputConversion/MHLO/Passes.h
@@ -34,6 +34,10 @@
 /// Creates XLA-HLO to Linalg on tensors transformation pass.
 std::unique_ptr<OperationPass<FuncOp>> createMHLOToLinalgOnTensorsPass();
 
+/// Creates XLA-HLO to LinalgExt and Flow transformation pass.
+std::unique_ptr<OperationPass<FuncOp>>
+createConvertAndDistributeMHLOToLinalgExtPass();
+
 /// Creates XLA-HLO preprocessing transformation pass. In this pass we should
 /// have all mhlo -> mhlo transformations that are shared between all
 /// backends.
diff --git a/iree/compiler/InputConversion/MHLO/Passes.td b/iree/compiler/InputConversion/MHLO/Passes.td
index e7ce2ff..a08fa08 100644
--- a/iree/compiler/InputConversion/MHLO/Passes.td
+++ b/iree/compiler/InputConversion/MHLO/Passes.td
@@ -15,6 +15,14 @@
   let constructor = "mlir::iree_compiler::createMHLOToLinalgOnTensorsPass()";
 }
 
+def ConvertAndDistributeMHLOToLinalgExt
+    : Pass<"iree-mhlo-to-linalg-ext", "FuncOp"> {
+  let summary =
+      "Convert from XLA-HLO ops to LinalgExt ops and distribute to Flow ops";
+  let constructor =
+      "mlir::iree_compiler::createConvertAndDistributeMHLOToLinalgExtPass()";
+}
+
 def LegalizeInputTypes :
     Pass<"iree-mhlo-legalize-input-types", "ModuleOp"> {
   let summary = "Legalizes input types to ones supported by the IREE flow dialect";
diff --git a/iree/compiler/InputConversion/MHLO/test/BUILD b/iree/compiler/InputConversion/MHLO/test/BUILD
index 67fa943..c3f4485 100644
--- a/iree/compiler/InputConversion/MHLO/test/BUILD
+++ b/iree/compiler/InputConversion/MHLO/test/BUILD
@@ -20,6 +20,7 @@
     srcs = enforce_glob(
         [
             "broadcasting.mlir",
+            "convert_and_distribute_mhlo_to_linalg_ext.mlir",
             "convert_complex_to_real.mlir",
             "dynamic_shape.mlir",
             "fft.mlir",
diff --git a/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt b/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt
index af885e2..9f3a453 100644
--- a/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt
+++ b/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt
@@ -15,6 +15,7 @@
     lit
   SRCS
     "broadcasting.mlir"
+    "convert_and_distribute_mhlo_to_linalg_ext.mlir"
     "convert_complex_to_real.mlir"
     "dynamic_shape.mlir"
     "fft.mlir"
diff --git a/iree/compiler/InputConversion/MHLO/test/convert_and_distribute_mhlo_to_linalg_ext.mlir b/iree/compiler/InputConversion/MHLO/test/convert_and_distribute_mhlo_to_linalg_ext.mlir
new file mode 100644
index 0000000..7902e19
--- /dev/null
+++ b/iree/compiler/InputConversion/MHLO/test/convert_and_distribute_mhlo_to_linalg_ext.mlir
@@ -0,0 +1,82 @@
+// RUN: iree-opt -split-input-file -iree-mhlo-to-linalg-ext %s | IreeFileCheck %s
+
+func @sort_1d(%arg0: tensor<128xi32>) -> (tensor<128xi32>) {
+  %0 = "mhlo.sort"(%arg0) ( {
+  ^bb0(%arg2: tensor<i32>, %arg3: tensor<i32>):  // no predecessors
+    %1 = "mhlo.compare"(%arg2, %arg3) {comparison_direction = "GT"} : (tensor<i32>, tensor<i32>) -> tensor<i1>
+    "mhlo.return"(%1) : (tensor<i1>) -> ()
+  }) {dimension = 0 : i64, is_stable = false} : (tensor<128xi32>) -> (tensor<128xi32>)
+  return %0 : tensor<128xi32>
+}
+// CHECK-LABEL: func @sort_1d
+// CHECK:         %[[ARG0:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK:         %[[RES:.+]] = flow.dispatch.workgroups
+// CHECK-SAME:      [%[[C1]], %[[C1]], %[[C1]]](%[[ARG0]]) : (tensor<128xi32>) -> %[[ARG0]]
+// CHECK:           %[[ARG1:.+]]: !flow.dispatch.tensor<readwrite:128xi32>
+// CHECK:           %[[IN:.+]] = flow.dispatch.tensor.load %[[ARG1]]
+// CHECK:           %[[SORT:.+]] = linalg_ext.sort
+// CHECK-SAME:        dimension = 0 : i64
+// CHECK-SAME:        outs(%[[IN]] : tensor<128xi32>)
+// CHECK:          ^bb0(%[[ARG2:.+]]: i32, %[[ARG3:.+]]: i32)
+// CHECK:            %[[CMP:.+]] = cmpi sgt, %[[ARG2]], %[[ARG3]]
+// CHECK:            linalg_ext.yield %[[CMP]]
+// CHECK:          flow.dispatch.tensor.store %[[SORT]], %[[ARG1]]
+// CHECK:        return %[[RES]]
+
+// -----
+
+func @sort_2d(%arg0: tensor<16x32xi32>) -> (tensor<16x32xi32>) {
+  %0 = "mhlo.sort"(%arg0) ( {
+  ^bb0(%arg2: tensor<i32>, %arg3: tensor<i32>):  // no predecessors
+    %1 = "mhlo.compare"(%arg2, %arg3) {comparison_direction = "GT"} : (tensor<i32>, tensor<i32>) -> tensor<i1>
+    "mhlo.return"(%1) : (tensor<i1>) -> ()
+  }) {dimension = 0 : i64, is_stable = false} : (tensor<16x32xi32>) -> (tensor<16x32xi32>)
+  return %0 : tensor<16x32xi32>
+}
+// CHECK-LABEL: func @sort_2d
+// CHECK:         %[[ARG0:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK:         %[[RES:.+]] = flow.dispatch.workgroups
+// CHECK-SAME:      [%[[C1]], %[[C1]], %[[C1]]](%[[ARG0]]) : (tensor<16x32xi32>) -> %[[ARG0]]
+// CHECK:           %[[ARG1:.+]]: !flow.dispatch.tensor<readwrite:16x32xi32>
+// CHECK:           %[[IN:.+]] = flow.dispatch.tensor.load %[[ARG1]]
+// CHECK:           %[[SORT:.+]] = linalg_ext.sort
+// CHECK-SAME:        dimension = 0 : i64
+// CHECK-SAME:        outs(%[[IN]] : tensor<16x32xi32>)
+// CHECK:          ^bb0(%[[ARG2:.+]]: i32, %[[ARG3:.+]]: i32)
+// CHECK:            %[[CMP:.+]] = cmpi sgt, %[[ARG2]], %[[ARG3]]
+// CHECK:            linalg_ext.yield %[[CMP]]
+// CHECK:          flow.dispatch.tensor.store %[[SORT]], %[[ARG1]]
+// CHECK:        return %[[RES]]
+
+// -----
+
+func @topk(%arg0: tensor<128xi32>, %arg1: tensor<128xi32>) -> (tensor<128xi32>) {
+  %0:2 = "mhlo.sort"(%arg0, %arg1) ( {
+  ^bb0(%arg2: tensor<i32>, %arg3: tensor<i32>, %arg4: tensor<i32>, %arg5: tensor<i32>):  // no predecessors
+    %1 = "mhlo.compare"(%arg2, %arg3) {comparison_direction = "GT"} : (tensor<i32>, tensor<i32>) -> tensor<i1>
+    "mhlo.return"(%1) : (tensor<i1>) -> ()
+  }) {dimension = 0 : i64, is_stable = false} : (tensor<128xi32>, tensor<128xi32>) -> (tensor<128xi32>, tensor<128xi32>)
+  return %0#0 : tensor<128xi32>
+}
+// CHECK-LABEL: func @topk
+// CHECK:         %[[ARG0:[a-zA-Z0-9]+]]
+// CHECK:         %[[ARG1:[a-zA-Z0-9]+]]
+// CHECK-DAG:     %[[C1:.+]] = constant 1 : index
+// CHECK:         %[[RES:.+]]:2 = flow.dispatch.workgroups
+// CHECK-SAME:      [%[[C1]], %[[C1]], %[[C1]]](%[[ARG0]], %[[ARG1]])
+// CHECK-SAME:    : (tensor<128xi32>, tensor<128xi32>) -> (%[[ARG0]], %[[ARG1]])
+// CHECK:           %[[ARG2:[a-zA-Z0-9]+]]: !flow.dispatch.tensor<readwrite:128xi32>
+// CHECK:           %[[ARG3:[a-zA-Z0-9]+]]: !flow.dispatch.tensor<readwrite:128xi32>
+// CHECK:           %[[IN1:.+]] = flow.dispatch.tensor.load %[[ARG2]]
+// CHECK:           %[[IN2:.+]] = flow.dispatch.tensor.load %[[ARG3]]
+// CHECK:           %[[SORT:.+]]:2 = linalg_ext.sort
+// CHECK-SAME:        dimension = 0 : i64
+// CHECK-SAME:        outs(%[[IN1]], %[[IN2]] : tensor<128xi32>, tensor<128xi32>)
+// CHECK:          ^bb0(%[[ARG4:.+]]: i32, %[[ARG5:.+]]: i32, %{{.*}}: i32, %{{.*}}: i32)
+// CHECK:            %[[CMP:.+]] = cmpi sgt, %[[ARG4]], %[[ARG5]]
+// CHECK:            linalg_ext.yield %[[CMP]]
+// CHECK:          flow.dispatch.tensor.store %[[SORT]]#0, %[[ARG2]]
+// CHECK:          flow.dispatch.tensor.store %[[SORT]]#1, %[[ARG3]]
+// CHECK:        return %[[RES]]#0
diff --git a/iree/compiler/InputConversion/TOSA/BUILD b/iree/compiler/InputConversion/TOSA/BUILD
index b410131..50bc1e5 100644
--- a/iree/compiler/InputConversion/TOSA/BUILD
+++ b/iree/compiler/InputConversion/TOSA/BUILD
@@ -54,6 +54,7 @@
         ":PassHeaders",
         ":PassesIncGen",
         "//iree/compiler/Dialect/Flow/Transforms",
+        "//iree/compiler/InputConversion/Common",
         "@llvm-project//mlir:Pass",
         "@llvm-project//mlir:SCFToStandard",
         "@llvm-project//mlir:TosaDialect",
diff --git a/iree/compiler/InputConversion/TOSA/CMakeLists.txt b/iree/compiler/InputConversion/TOSA/CMakeLists.txt
index c3b37b5..504540e 100644
--- a/iree/compiler/InputConversion/TOSA/CMakeLists.txt
+++ b/iree/compiler/InputConversion/TOSA/CMakeLists.txt
@@ -50,6 +50,7 @@
     MLIRTosaToStandard
     MLIRTransforms
     iree::compiler::Dialect::Flow::Transforms
+    iree::compiler::InputConversion::Common
   PUBLIC
 )
 
diff --git a/iree/compiler/InputConversion/TOSA/Passes.cpp b/iree/compiler/InputConversion/TOSA/Passes.cpp
index 2011cc1..b3951d3 100644
--- a/iree/compiler/InputConversion/TOSA/Passes.cpp
+++ b/iree/compiler/InputConversion/TOSA/Passes.cpp
@@ -7,7 +7,7 @@
 #include "iree/compiler/InputConversion/TOSA/Passes.h"
 
 #include "iree/compiler/Dialect/Flow/Transforms/Passes.h"
-#include "mlir/Conversion/SCFToStandard/SCFToStandard.h"
+#include "iree/compiler/InputConversion/Common/Passes.h"
 #include "mlir/Conversion/TosaToLinalg/TosaToLinalg.h"
 #include "mlir/Conversion/TosaToSCF/TosaToSCF.h"
 #include "mlir/Conversion/TosaToStandard/TosaToStandard.h"
@@ -30,13 +30,11 @@
 
 // Prepare TOSA for use as an input to the Flow dialect.
 void buildTOSAInputConversionPassPipeline(OpPassManager &passManager) {
-  passManager.addNestedPass<FuncOp>(tosa::createTosaToSCF());
-
   // Currently we don't handle SCF ops well and have to convert them all to CFG.
   // In the future it would be nice if we could have all of flow be both scf
   // and cfg compatible.
-  // TODO: Currently recurses into SCF in Linalg generic - with hilarity.
-  passManager.addNestedPass<FuncOp>(mlir::createLowerToCFGPass());
+  passManager.addNestedPass<FuncOp>(tosa::createTosaToSCF());
+  passManager.addNestedPass<FuncOp>(createTopLevelSCFToCFGPass());
 
   // Now that control flow has been lowered, promote and extract_element
   // to tensor loads. This will be done again later once everything that can
diff --git a/iree/tools/BUILD b/iree/tools/BUILD
index ae9fc3e..612ef75 100644
--- a/iree/tools/BUILD
+++ b/iree/tools/BUILD
@@ -107,6 +107,7 @@
         "//iree/compiler/Dialect/IREE/IR",
         "//iree/compiler/Dialect/IREE/Transforms",
         "//iree/compiler/Dialect/LinalgExt/IR",
+        "//iree/compiler/Dialect/LinalgExt/Transforms",
         "//iree/compiler/Dialect/Modules/VMVX/IR:VMVXDialect",
         "//iree/compiler/Dialect/Modules/VMVX/Transforms",
         "//iree/compiler/Dialect/Shape/IR",
@@ -115,6 +116,7 @@
         "//iree/compiler/Dialect/VM/IR",
         "//iree/compiler/Dialect/VM/Transforms",
         "//iree/compiler/Dialect/Vulkan/IR",
+        "//iree/compiler/InputConversion/Common",
         "//iree/compiler/InputConversion/MHLO",
         "//iree/compiler/InputConversion/TOSA",
         "//iree/compiler/Translation:IREEVM",
diff --git a/iree/tools/CMakeLists.txt b/iree/tools/CMakeLists.txt
index 38f7127..2bee9c0 100644
--- a/iree/tools/CMakeLists.txt
+++ b/iree/tools/CMakeLists.txt
@@ -209,6 +209,7 @@
       iree::compiler::Dialect::IREE::IR
       iree::compiler::Dialect::IREE::Transforms
       iree::compiler::Dialect::LinalgExt::IR
+      iree::compiler::Dialect::LinalgExt::Transforms
       iree::compiler::Dialect::Modules::VMVX::IR::VMVXDialect
       iree::compiler::Dialect::Modules::VMVX::Transforms
       iree::compiler::Dialect::Shape::IR
@@ -217,6 +218,7 @@
       iree::compiler::Dialect::VM::IR
       iree::compiler::Dialect::VM::Transforms
       iree::compiler::Dialect::Vulkan::IR
+      iree::compiler::InputConversion::Common
       iree::compiler::InputConversion::MHLO
       iree::compiler::InputConversion::TOSA
       iree::compiler::Translation::IREEVM
diff --git a/iree/tools/init_iree_passes.h b/iree/tools/init_iree_passes.h
index d9b3bc2..20bd549 100644
--- a/iree/tools/init_iree_passes.h
+++ b/iree/tools/init_iree_passes.h
@@ -19,10 +19,12 @@
 #include "iree/compiler/Dialect/Flow/Transforms/Passes.h"
 #include "iree/compiler/Dialect/HAL/Transforms/Passes.h"
 #include "iree/compiler/Dialect/IREE/Transforms/Passes.h"
+#include "iree/compiler/Dialect/LinalgExt/Transforms/Passes.h"
 #include "iree/compiler/Dialect/Modules/VMVX/Transforms/Passes.h"
 #include "iree/compiler/Dialect/Shape/Transforms/Passes.h"
 #include "iree/compiler/Dialect/VM/Analysis/TestPasses.h"
 #include "iree/compiler/Dialect/VM/Transforms/Passes.h"
+#include "iree/compiler/InputConversion/Common/Passes.h"
 #include "iree/compiler/InputConversion/MHLO/Passes.h"
 #include "iree/compiler/InputConversion/TOSA/Passes.h"
 #include "iree/compiler/Translation/IREEVM.h"
@@ -38,9 +40,11 @@
   IREE::TFLite::registerPasses();
   IREE::TFLite::registerTransformPassPipeline();
 
+  registerCommonInputConversionPasses();
   registerMHLOConversionPasses();
   registerTOSAConversionPasses();
 
+  linalg_ext::registerLinalgExtPasses();
   IREE::Flow::registerFlowPasses();
   IREE::HAL::registerHALPasses();
   IREE::registerTransformPasses();