[VectorDistribute] Do not handle bit extend during matmul configuration (#21798)

Matmul configuration should not look through bit extends that sit outside
the contraction. For the contraction to target intrinsics, the bit extend
must be fused into it. This matches what TileAndFuse does and prevents
problems with cases like:

```
%lhs = linalg.generic {
  %ext = arith.extf %in
  %scaled = arith.mulf %ext, %scale
}

...

// mma
linalg.generic %lhs, %rhs
```

Here the scaling in the producer prevents us from targeting intrinsics,
but vector distribute currently targets them anyway.

Fixes: https://github.com/iree-org/iree/issues/21434
diff --git a/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp b/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp
index ad629b0..dae9254 100644
--- a/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp
+++ b/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp
@@ -1130,15 +1130,6 @@
   Type rhsElemType = getElementTypeOrSelf(rhs);
   Type initElemType = getElementTypeOrSelf(init);
 
-  if (auto lhsOp = lhs.getDefiningOp<linalg::GenericOp>()) {
-    if (IREE::LinalgExt::isBitExtendOp(lhsOp))
-      lhsElemType = getElementTypeOrSelf(lhsOp.getDpsInputs()[0]);
-  }
-  if (auto rhsOp = rhs.getDefiningOp<linalg::GenericOp>()) {
-    if (IREE::LinalgExt::isBitExtendOp(rhsOp))
-      rhsElemType = getElementTypeOrSelf(rhsOp.getDpsInputs()[0]);
-  }
-
   SmallVector<int64_t> batchDims;
   for (int64_t batchDim : contractionDims->batch) {
     if (ShapedType::isStatic(bounds[batchDim])) {