opensecura / 3p / openxla / iree / HEAD
597629e
[Codegen] Limit async scope in pipelining (#24350)
by Lukas Sommer
· 2 hours ago
main
98ddf0c
[VectorExt] Change `to_layout` `shared_memory_conversion` (#24377)
by Lukas Sommer
· 3 hours ago
c0f5d4b
[Codegen] Move remaining pipelines to `iree_codegen` attrs (#24398)
by Jakub Kuderski
· 8 hours ago
8cc05f5
[CPU] Improve tiling config for elementwise ops with dynamic shapes. (#24383)
by Han-Chung Wang
· 10 hours ago
16191ce
Revert "[Codegen] Enable DMA by default for F16/BF16 Gemm on gfx950 (#24373)" (#24395)
by Zhewen Yu
· 10 hours ago
0fe0ca8
[Torch][LinalgExt] Support GQA in torch.hop_flex_attention lowering (#24313)
by Keshav Vinayak Jha
· 12 hours ago
d62d69b
[Compiler] Use Repeated<T> for repeated value ranges. NFC. (#24392)
by Jakub Kuderski
· 13 hours ago
c4da71c
[Codegen][CPU] Add a type-polymorphic generic-scalar MMA fallback. (#24389)
by Benoit Jacob
· 13 hours ago
fac9d3d
[INTEGRATION] Bump llvm to 0f3ca6bb9 (#24390)
by Alan Li
· 14 hours ago
f4fb944
[ROCM] Workaround LLVM #194924 partial-unroll regression (#24379)
by Alan Li
· 15 hours ago
1c14508
[Codegen] Canonicalize transfer_{read,write} vector<1xT> (#24382)
by Erick Ochoa Lopez
· 21 hours ago
4030534
[INTEGRATION] Bump llvm to 3ed76d05a78d (#24376)
by Alan Li
· 35 hours ago
latest-snapshot
ac49ab6
[ROCm] Drop deprecated --iree-hip flag aliases (#24381)
by Jakub Kuderski
· 2 days ago
4f99043
Reapply "[Codegen] Enable DMA by default for F16/BF16 Gemm on gfx950 (#24117)" (#24235) (#24373)
by Zhewen Yu
· 2 days ago
5fbfe29
[LinalgExt] Fix attention NaN for fully-masked rows (#24178)
by Keshav Vinayak Jha
· 2 days ago
d2f4f44
Bump iree-org/torch-mlir@46cbd27f7c (#24380)
by Keshav Vinayak Jha
· 2 days ago
9092be0
[InputConversion] Lower AtenArgmax/AtenArgmin to iree_linalg_ext.arg_compare (#24291)
by Bangtian Liu
· 2 days ago
6efc2ca
Generalize FoldMaskedTransferRaw and add FoldTransferReadOfEmptyTensor (#24301)
by Erick Ochoa Lopez
· 2 days ago
ca7e063
[VectorDistribute] Lower and distribute `async_dma` (#24299)
by Lukas Sommer
· 2 days ago
c6525dd
[Codegen] Duplicate operations in tile size analysis (#24246)
by Lukas Sommer
· 2 days ago
2608330
[IREEGPU] Bufferize `async_dma` (#24300)
by Lukas Sommer
· 2 days ago
2725310
[AMDGPU] Roll-up of AMDGPU HAL improvements for CDNA support (#24359)
by Ben Vanik
· 2 days ago
a1b8b72
[test][nfc] Add regression tests about strided vector.gather back. (#24370)
by Han-Chung Wang
· 2 days ago
f9562fe
[HAL] Preserve tensor import effects when folding (#24364)
by Ben Vanik
· 2 days ago
915c6ea
[HAL] Add executable global lookup buffers (#24336)
by Ben Vanik
· 2 days ago
b63db90
[LLVMGPU] Add TileAndFuse fallback for iree_linalg_ext.arg_compare (#24347)
by Bangtian Liu
· 2 days ago
b014947
[INTEGRATION] bump llvm @ 8be29edc2 (#24363)
by Alan Li
· 3 days ago
020f6be
[Codegen][CPU] Lower data-tiled inner_tiled in VirtualVectorLoweringPass. (#24358)
by Benoit Jacob
· 3 days ago
4f6cd98
[DispatchCreation][LinalgExt] Add OnlineAttentionOp support in dispatch formation and reshape fusion (#24068)
by Keshav Vinayak Jha
· 3 days ago
d7eed39
[LinalgExt] Fix attention index remap after unit-dim folding (#24349)
by Keshav Vinayak Jha
· 3 days ago
76a8215
Bump dawidd6/action-download-artifact from 20 to 21 in the github-actions group (#24360)
by dependabot[bot]
· 3 days ago
18a49cb
[Codegen] NFC: Lift InnerTiledOp unroll pattern to Codegen. (#24357)
by Benoit Jacob
· 4 days ago
2ba8b6f
Revert "[LLVMGPU] Fall back to scalar lowering for tiny attention shapes (#24239)" (#24356)
by Nirvedh Meshram
· 4 days ago
f2b0897
[Codegen] Fix iree-compile --debug crash on CPU/GPU codegen pass options (#24190)
by Han-Chung Wang
· 4 days ago
e623d00
[INTEGRATION] Bump llvm to f306525759 (#24354)
by Alan Li
· 4 days ago
d097bad
[DispatchCreation] Prevent fill->scatter cloning (#24214)
by Ian Wood
· 4 days ago
1192630
[Codegen] NFC: Lift InnerTiledOp lower & drop-unit-dim patterns to Codegen. (#24351)
by Benoit Jacob
· 4 days ago
d358e81
[Codegen][CPU] Lower inner_tiled to llvm.call_intrinsic. (#24345)
by Benoit Jacob
· 4 days ago
d4e04f7
compiler/plugins/input/TOSA: fix: TOSA arith lowering must handle apply scale introduced by linalg lowering (#24121)
by Florian Walbroel
· 4 days ago
81f4dec
[LLVMGPU] Fall back to scalar lowering for tiny attention shapes (#24239)
by Keshav Vinayak Jha
· 4 days ago
fdf1392
[DispatchCreation] Allow fusion of multi-result producers (#24169)
by Keshav Vinayak Jha
· 4 days ago
0935008
[DispatchCreation] Tighten scatter-skip predicate in CollapseDimensions (#24334)
by Vivian Zhang
· 6 days ago
f4d1908
[Codegen][DMA] Fix unaligned swizzle offset computation in gather-to-lds lowering (#24241)
by Zhewen Yu
· 6 days ago
7098bdf
[LLVMGPU][nfc] Modernize the rest of LLVMGPU pipeline tests. (#24341)
by Han-Chung Wang
· 7 days ago
316c1c1
[LLVMGPU][nfc] Modernize vector distribution pipeline tests. (#24340)
by Han-Chung Wang
· 7 days ago
8fc32e0
[DispatchCreation] Refactor and add low-parallelism split reduction parameter set (#24293)
by Vivian Zhang
· 7 days ago
3be9dc6
Refactor vector.multi_reduction into flattening, unrolling, and lowering passes. (#24183)
by Erick Ochoa Lopez
· 7 days ago
d13374f
Bump llvm to llvm-project@88e5eeb292f (#24339)
by Nirvedh Meshram
· 7 days ago
dd5a6e3
[Codegen] Support pack/unpack/linalg generic transpose in CombineLayoutTransformation (#24273)
by Muzammiluddin Syed
· 7 days ago
c40c7a3
[Codegen][CPU] Teach the lowering strategy about inner_tiled. (#24328)
by Benoit Jacob
· 7 days ago
174808a
[LinalgExt] Fix ArgCompareOp::generateResultTileValue for producer fusion (#24317)
by Bangtian Liu
· 7 days ago
83a30bb
Update CODEOWNERS for spreading review responsibility (#24332)
by Han-Chung Wang
· 7 days ago
404b958
[Codegen] NFC: Lift DataTiledMMA inner_tiled lowering helpers into MMAUtils. (#24326)
by Benoit Jacob
· 7 days ago
0a73681
[Codegen][CPU] Fix RHS indexing map in materialize-encoding inner_tiled lowering. (#24325)
by Benoit Jacob
· 7 days ago
01c52eb
[DispatchCreation] Fuse scalar reductions with their parallel consumers (#24166)
by Abhishek Varma
· 7 days ago
64031dd
Reapply "[Codegen] Use local binders for optimization flags in codegen (#24220)" (#24333)
by Han-Chung Wang
· 7 days ago
e6139f6
[CI] Ease contention on self hosted machines (#24316)
by Erick Ochoa Lopez
· 7 days ago
d055923
Bump stablehlo to stablehlo@806a6844dfd92cca (#24330)
by Nirvedh Meshram
· 8 days ago
7247601
[LLVMGPU][ROCDL] Add pass to group global loads for better instruction scheduling (#24247)
by Max191
· 8 days ago
ce12fef
Bump iree-org/torch-mlir@d2768f876d (#24320)
by Rob Suderman
· 8 days ago
967b794
[CPU] Add ContiguousMemrefGather1DToConditionalLoads vector lowering. (#24327)
by Han-Chung Wang
· 8 days ago
fcbd569
Bump LLVM to llvm-project@6f1e6e47bdf (#24314)
by Nirvedh Meshram
· 8 days ago
9f7a14e
[CI] Update iree-test-suite ref (#24304)
by Erick Ochoa Lopez
· 8 days ago
5872dc2
[Codegen][CPU] Pick inner-tiled unroll factors from a register budget. (#24303)
by Benoit Jacob
· 8 days ago
0380544
[IREEGPU] Define and expand `subgroup_scan` (#24188)
by Lukas Sommer
· 8 days ago
a79bb7b
[LLVMGPU] Remove unused `--iree-codegen-llvmgpu-use-unaligned-gemm-vector-distribution` flag (#24308)
by Vivian Zhang
· 8 days ago
dfe8134
[HAL/AMDGPU] Initial host-side AMDGPU HAL implementation (#24298)
by Ben Vanik
· 8 days ago
b42f44c
[HAL/AMDGPU] Use status matcher in notification test
by Ben Vanik
· 9 days ago
bf7d2b5
[HAL/AMDGPU] Disable queue upload rings by default
by Ben Vanik
· 9 days ago
3ac4164
[HAL/AMDGPU] Remove dynamic binding slot sidecars
by Ben Vanik
· 12 days ago
8d437bd
[HAL/AMDGPU] Specialize all-dynamic dispatch replay
by Ben Vanik
· 12 days ago
237efe6
[HAL/AMDGPU] Compact dynamic binding pointer replay
by Ben Vanik
· 12 days ago
34c5ba3
[HAL/AMDGPU] Bake dynamic binding slots into command buffers
by Ben Vanik
· 13 days ago
2fc0cac
[HAL/AMDGPU] Sample counter ranges on a profile queue
by Ben Vanik
· 13 days ago
85117aa
[HAL/AMDGPU] Fix patched-template profile metadata
by Ben Vanik
· 14 days ago
6ebc9c9
[HAL/AMDGPU] Add queue-range counter profiling
by Ben Vanik
· 10 days ago
d664e03
[HAL/AMDGPU] Initialize queue upload rings
by Ben Vanik
· 10 days ago
7953fbd
[HAL/AMDGPU] Track upload ring reclaim positions
by Ben Vanik
· 10 days ago
593764a
[HAL/AMDGPU] Add queue upload ring primitive
by Ben Vanik
· 10 days ago
4871281
[HAL/AMDGPU] Record mixed dynamic kernarg templates
by Ben Vanik
· 10 days ago
c6caae6
[HAL/AMDGPU] Test external buffer fail-loud contracts
by Ben Vanik
· 10 days ago
f60c162
[HAL/AMDGPU] Centralize physical topology edge selection
by Ben Vanik
· 10 days ago
1d27820
[HAL/AMDGPU] Split device metrics source sampling
by Ben Vanik
· 10 days ago
fc1dcfe
[HAL/AMDGPU] Abstract profile device clock sampling
by Ben Vanik
· 10 days ago
f8505e9
[HAL] Extract profile event ring utility
by Ben Vanik
· 10 days ago
b383a52
[HAL/AMDGPU] Stage generated inputs through coarse memory
by Ben Vanik
· 10 days ago
14c82d9
[HAL/AMDGPU] Document command-buffer fence policy
by Ben Vanik
· 10 days ago
c156d33
[HAL/AMDGPU] Mark grant-required peer memory
by Ben Vanik
· 10 days ago
55eea4a
[HAL/AMDGPU] Separate SVM facts from peer flags
by Ben Vanik
· 10 days ago
9b634df
[HAL/AMDGPU] Split kernarg benchmark counters
by Ben Vanik
· 10 days ago
fb363f6
[HAL/AMDGPU] Report prepublished kernarg replay counters
by Ben Vanik
· 10 days ago
75fbea9
[HAL/AMDGPU] Record prepublished kernarg totals
by Ben Vanik
· 10 days ago
a81bd5e
[HAL/AMDGPU] Document profiling and replay workflows
by Ben Vanik
· 14 days ago
2a36353
[HAL/AMDGPU] Split device-library target selection
by Ben Vanik
· 10 days ago
b905b7d
[HAL/AMDGPU] Test executable target inference
by Ben Vanik
· 10 days ago
a809ed5
[HAL/AMDGPU] Name ISA commonality agents
by Ben Vanik
· 10 days ago
cdf2f71
[HAL/AMDGPU] Model target feature support
by Ben Vanik
· 10 days ago
dfe4644
[HAL/AMDGPU] Table vendor packet capabilities
by Ben Vanik
· 10 days ago
00760ee
[HAL/AMDGPU] Split physical-device capability policy
by Ben Vanik
· 10 days ago
40a269e
[HAL/AMDGPU] Name prepublished kernarg storage
by Ben Vanik
· 10 days ago