)]}'
{
  "commit": "b21fa2a333fcdb1dd26238f2f2fde4ee49c1cd50",
  "tree": "bad1a0224bbb8a505afe5c62f556b0887e3f4719",
  "parents": [
    "d7c7b98e5de3ae574b87516b81ff489e0f49e7ee"
  ],
  "author": {
    "name": "Cheng-Yu Tsai",
    "email": "127166739+Charlie-Tsai1123@users.noreply.github.com",
    "time": "Sun May 31 23:49:01 2026 +0800"
  },
  "committer": {
    "name": "GitHub",
    "email": "noreply@github.com",
    "time": "Sun May 31 15:49:01 2026 +0000"
  },
  "message": "[CUDA] Add sm_121/Blackwell to known target (#24523)\n\n## Summary\nAdd initial CUDA known target support for `sm_121` / Blackwell NVIDIA\nGB10.\n\nThe CUDA execution limits are based on local cudaDeviceProp results from\nan sm_121 device. Existing NVIDIA MMA ops are reused as a conservative\nbaseline until Blackwell-specific MMA intrinsics are modeled.\n\nRelated to #24477.\n#24477 reports that IREE does not currently recognize newer Blackwell\nCUDA targets such as `sm_120`. This PR addresses the same\ntarget-enablement path for `sm_121`, which is the Blackwell target I can\nvalidate locally on NVIDIA GB10.\nIt intentionally does not add `sm_120` support because I do not have\n`sm_120` hardware to confirm the device limits or runtime behavior.\n\n## Testing\nTested locally on NVIDIA GB10 / `sm_121`. `sm_121` requires PTX 8.8.\nUsing `+ptx88` compiles successfully.\n\nCompiled and ran a local abs.mlir smoke test:\n```\n../iree-build/tools/iree-compile abs.mlir \\\n  --iree-hal-target-device\u003dcuda \\\n  --iree-cuda-target\u003dsm_121 \\\n  --iree-cuda-target-features\u003d+ptx88 \\\n  -o abs_cuda.vmfb\n\n../iree-build/tools/iree-run-module \\\n  --device\u003dcuda \\\n  --module\u003dabs_cuda.vmfb \\\n  --function\u003dabs \\\n  --input\u003d4xf32\u003d-1,-2,3,-4\n```\n\nResults:\n```\n4xf32\u003d1 2 3 4\n```\n\nCompiled and ran a local matmul.mlir smoke test:\n```\n../iree-build/tools/iree-compile matmul.mlir \\\n  --iree-hal-target-device\u003dcuda \\\n  --iree-cuda-target\u003dsm_121 \\\n  --iree-cuda-target-features\u003d+ptx88 \\\n  -o matmul_cuda.vmfb\n\n../iree-build/tools/iree-run-module \\\n  --device\u003dcuda \\\n  --module\u003dmatmul_cuda.vmfb \\\n  --function\u003dmatmul \\\n  --input\u003d128x256xf16\u003d1 \\\n  --input\u003d256x128xf16\u003d1\n```\nResult: 128x128xf32 values are 256 as expected.\n\n---------\n\nSigned-off-by: Charlie-Tsai1123 \u003ccharlie1123tsai@gmail.com\u003e",
  "tree_diff": [
    {
      "type": "modify",
      "old_id": "aa83c677604528bd03069718e432ca93fb391821",
      "old_mode": 33188,
      "old_path": "compiler/plugins/target/CUDA/test/target_device_features.mlir",
      "new_id": "5cc399c2dbefa62c932af7c1cc98811fb7e0c8e3",
      "new_mode": 33188,
      "new_path": "compiler/plugins/target/CUDA/test/target_device_features.mlir"
    },
    {
      "type": "modify",
      "old_id": "fc48dd55d659cf740ca3397fb1d28e437dd5811f",
      "old_mode": 33188,
      "old_path": "compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/KnownTargets.cpp",
      "new_id": "8894df984129c1b50940979da0dc46378cca111d",
      "new_mode": 33188,
      "new_path": "compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/KnownTargets.cpp"
    }
  ]
}
