{
  "commit": "cd5347c06974e4d4d1b65f07eff210e2ebd6280d",
  "tree": "aa8645c5170170a06c8eeda0a76fcba8ed19eb73",
  "parents": [
    "16d47a4cc446faff323f1b1e0725070a602d9261"
  ],
  "author": {
    "name": "Okwan Kwon",
    "email": "okwan@google.com",
    "time": "Fri Feb 17 08:44:52 2023 -0800"
  },
  "committer": {
    "name": "GitHub",
    "email": "noreply@github.com",
    "time": "Fri Feb 17 08:44:52 2023 -0800"
  },
  "message": "Support mhlo collective ops (#11988)\n\nAdd support for the following mhlo collective ops:\n- mhlo.replica_id\n- mhlo.all_gather\n- mhlo.all_reduce\n- mhlo.reduce_scatter\n\nSince NCCL only supports splitting and concatenation on dim 0 for\nall_gather and reduce_scatter, transposes are inserted when the\nsplit/concat dimension is not 0.\n\nTo make the implementation simple and incremental, several stages are\nplanned as follows:\n\nStage 1 (The current PR):\n\nIt assumes a deterministic order of collective operations with a 1:1\nmapping from replica_id to rank. This means that there is a single\nreplica group in the mhlo operation and all replicas participate in the\ncollective operation. Since the order is deterministic and all ranks are\ninvolved in the communication, we can simply use the default channel for\ncommunication. In the MHLO ops, `use_global_device_ids` is set to use\nthe flattened IDs.\n\nNote that the MHLO collective ops have multiple strategies to interpret\nthe `replica_groups` attribute, such as `flattened_ids`, `cross_replica`,\n`cross_partition`, and `cross_replica_and_partition`. (See\nhttps://github.com/openxla/stablehlo/blob/main/docs/spec.md#parallel-execution\nfor more details of the strategies.)\n\nStage 2:\n\nSupports multiple channels. This allows us to have multiple replica\ngroups for collective operations.\n\nStage 3:\n\nSupports `partition_id` and the other strategies: `cross_replica`,\n`cross_partition`, and `cross_replica_and_partition`.\n\nStage 4:\n\nSupports `all_to_all` and `collective_permute`. This would need to\nsupport the NCCL group markers to support multiple collective ops in\nparallel since the ops are composite and will need to be lowered into\nthe existing collective ops.\n\nStage 5:\n\nPJRT integration and model-level testing.",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "456e6c1a62f6ff006669cc6c423e2df4fe09b57f",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowBase.td",
      "new_id": "1f6e1b6d3f54ac5b0476e88f25aa30a42121422c",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowBase.td"
    },
    {
      "type": "modify",
      "old_id": "244dbd405566f91534150878753fb8d36ca2d037",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowOps.cpp",
      "new_id": "46596ffadc61a5dce1a051c06dbfca36257e5597",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowOps.cpp"
    },
    {
      "type": "modify",
      "old_id": "5e1acabb4bc63b65bf9a80d0f635c8a63819ed37",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowOps.td",
      "new_id": "e6f8c29fb81c492b4070801e36a8ba0963ebbf1e",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Flow/IR/FlowOps.td"
    },
    {
      "type": "modify",
      "old_id": "863b855071352a04ff955d9aee378a194f7bfd2e",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Flow/Transforms/CollapseReductionDims.cpp",
      "new_id": "7161888ac2c4c7509b7fa4d05d07b67548bd7c8d",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Flow/Transforms/CollapseReductionDims.cpp"
    },
    {
      "type": "modify",
      "old_id": "2631e4e1d647fa0d557b67ba578aaf33a40d5d6f",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Flow/Transforms/test/collapse_reduction.mlir",
      "new_id": "25484ec4a631e3c93c9235a2daea2222464f3f3e",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Flow/Transforms/test/collapse_reduction.mlir"
    },
    {
      "type": "modify",
      "old_id": "3a9fb2fcfa707472be14db433ebec9c76aabc8db",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/Patterns.cpp",
      "new_id": "5565bdf94604d936b39c338f1221f2fbab3d0f7e",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/Patterns.cpp"
    },
    {
      "type": "modify",
      "old_id": "c027a47acebd1a0ba5963a25435aecc6d91fd107",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/test/BUILD",
      "new_id": "b6a902b614e5e85158988bdc9086d360caa16594",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/test/BUILD"
    },
    {
      "type": "modify",
      "old_id": "82ea14850a7503a3248e9a216ef525974be211b2",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/test/CMakeLists.txt",
      "new_id": "c038d006d8741640863d31b98cec1a3db34771a0",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/test/CMakeLists.txt"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "9e800f615c13010e7bd1c3d040cf885094c3b5e6",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Stream/Conversion/FlowToStream/test/collective_ops.mlir"
    },
    {
      "type": "modify",
      "old_id": "baa36a5fe882f5bd11c4b99690b9c8e2a2e59ffe",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Dialect/Stream/Transforms/ConvertToStream.cpp",
      "new_id": "0b0acc86dd76090fba9e52b723cabae2b3cd3f12",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Dialect/Stream/Transforms/ConvertToStream.cpp"
    },
    {
      "type": "modify",
      "old_id": "4127a848a76b0c5d81d22c62f892d68ee33d797b",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/BUILD",
      "new_id": "2de86f406f03faaafb67ab1064593ac3254fd390",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/BUILD"
    },
    {
      "type": "modify",
      "old_id": "c713b78cf00f048d3e1604536da7be0b9a063ea6",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/CMakeLists.txt",
      "new_id": "26d37ce04f6bb455d2e4f231f114b7ecf9324cce",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/CMakeLists.txt"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "94607d9800692dcddbaadc9777b4dbd910dc3882",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/ConvertCollectiveOps.cpp"
    },
    {
      "type": "modify",
      "old_id": "03aa7810650796af1a85efa6bcd30b40df5c326a",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/MHLOToLinalgOnTensors.cpp",
      "new_id": "ea51f00d475689de952abbac88e0174bacfc5ceb",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/MHLOToLinalgOnTensors.cpp"
    },
    {
      "type": "modify",
      "old_id": "1855004910c7b42670b30bbc322dd6b69212385a",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/Passes.td",
      "new_id": "984b92a7c4147d4250482a6f7a523a514091753f",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/Passes.td"
    },
    {
      "type": "modify",
      "old_id": "ac996f0cdfdc62bbeecfa2175a509b7a360be7af",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/Rewriters.h",
      "new_id": "e839f38c843c136fcc47bdbc438cdeba0409c60a",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/Rewriters.h"
    },
    {
      "type": "modify",
      "old_id": "99e5b4fcf23e20db2d4f4e4073e305c02db2d929",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/test/BUILD",
      "new_id": "5ac112d8978f49f4263e44a22aac5046dadc0902",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/test/BUILD"
    },
    {
      "type": "modify",
      "old_id": "8404a205f7116bfd5911c734adb4dcdfbfc5d926",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt",
      "new_id": "8b9aa57630049c25da2e25127134e841812169c9",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/test/CMakeLists.txt"
    },
    {
      "type": "add",
      "old_id": "0000000000000000000000000000000000000000",
      "old_mode": 0,
      "old_path": "/dev/null",
      "new_id": "a93557d95346a18f1d3ebcc07f543b00e3046785",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/InputConversion/MHLO/test/convert_collective_ops.mlir"
    },
    {
      "type": "modify",
      "old_id": "c88700e6ae0d19a097dd7856dcf4d8d2dea002e1",
      "old_mode": 33188,
      "old_path": "runtime/src/iree/hal/command_buffer.h",
      "new_id": "6e5fd65248af7981821caa0efa26cc4a58cfcce0",
      "new_mode": 33188,
      "new_path": "runtime/src/iree/hal/command_buffer.h"
    },
    {
      "type": "modify",
      "old_id": "ddb3c8b419d3040c1ef423a105aaf2ae0fcf1ecd",
      "old_mode": 33188,
      "old_path": "runtime/src/iree/hal/drivers/cuda/cuda_device.c",
      "new_id": "ef0a82c596fa3925943b9409ac726c0419171c54",
      "new_mode": 33188,
      "new_path": "runtime/src/iree/hal/drivers/cuda/cuda_device.c"
    }
  ]
}
