opensecura / 3p / openxla / iree / HEAD
278e63a
Move gcc dangling-reference check to GCC>=13. (#19772)
by Scott Todd
· 4 hours ago
main
a430695
[GPU] Add pattern to fuse tensor.collapse_shape into forall producer (#19295)
by Max191
· 7 hours ago
ba30557
[runtime][hip] Cast IREE_[HOST|DEVICE]_MAX_SIZE to iree_host_size_t type (#19766)
by Han-Chung Wang
· 7 hours ago
03c5a0f
[LLVMGPUVectorDistribute] Refactor vector.contract distribute (#19631)
by Manupa Karunaratne
· 11 hours ago
6933c39
[Codegen] add mi308x target (#19756)
by Bangtian Liu
· 18 hours ago
26dcb8e
Fixes for CMake 3.31 policy changes. (#19759)
by Scott Todd
· 18 hours ago
cac390d
Fix or work around gcc-14 warnings/errors. (#19758)
by Scott Todd
· 18 hours ago
525389c
[LLVMGPU] Enable forall distr on the gpuvectorization pass pipeline (#19753)
by Prashant Kumar
· 19 hours ago
8c7eeca
[DispatchCreation] Enable Rope computation fusion with attention. (#19745)
by MaheshRavishankar
· 21 hours ago
eb21715
Bump StableHLO to openxla/stablehlo@23d7f60. (#19754)
by Scott Todd
· 26 hours ago
1cd62fd
[DT][GPU] Permute cross-thread dims of TileSwizzle to outermost (#19734)
by Max191
· 26 hours ago
3e15a5a
[Global Opt] Add option to generalize matmul ops (#19741)
by Ian Wood
· 27 hours ago
b47fbdf
[LinalgExt] Update scatter to allow dropping unit dims (#19704)
by Ian Wood
· 28 hours ago
4c0ba9c
[DispatchCreation] Add constant expression hoisting (#19750)
by Quinn Dawkins
· 29 hours ago
21d5db3
[Codegen] Use linearize_index op when swapping slice and expand (#19730)
by Max191
· 29 hours ago
e154e8b
[hip] Enable caching in the hip async allocator (#19667)
by Andrew Woloszyn
· 30 hours ago
2e21a9a
[LLVMGPU] Enable scf.forall distr. on vectorDistribute Pipeline (#19420)
by Prashant Kumar
· 31 hours ago
38ca3be
[GPU] Add SwapExpandShapeWithSlice pattern to loop fusion pass (#19729)
by Max191
· 31 hours ago
4a7af87
Integrate llvm 1_20_2025 (#19740)
by Nirvedh Meshram
· 2 days ago
8ba420d
Bump sarisia/actions-status-discord from 1.15.1 to 1.15.2 in the github-actions group (#19744)
by dependabot[bot]
· 2 days ago
88887e7
Add ubuntu-24.04-arm runtime and runtime_tracing CI jobs. (#19724)
by Scott Todd
· 2 days ago
4c600ca
[Codegen] Refactor RemoveSingleIteratinLoop to use ValueBoundsOpInte… (#19678)
by Krzysztof Drewniak
· 2 days ago
9aae362
[Codegen] Sprinkle in PropagateDispatchSizeBounds passes (#19677)
by Krzysztof Drewniak
· 2 days ago
a64d713
Implement ValueBoundsOpInterface on HAL ID/count ops, util.assume.int (#19676)
by Krzysztof Drewniak
· 2 days ago
db129e5
Update torch-mlir to llvm/torch-mlir@f42c7e4 (#19736)
by zjgarvey
· 2 days ago
be8e3d2
Convert barriers into copies during allocation (#19735)
by Rob Suderman
· 5 days ago
latest-snapshot
6052a1d
Bump to llvm/llvm-project@3f1486f (#19683)
by MaheshRavishankar
· 5 days ago
b4c7de6
[Python] Enable building Python bindings as editable wheels, document it (#19716)
by Krzysztof Drewniak
· 5 days ago
b08d152
[HAL] Use util.assume.int for memref alignments (#19691)
by Krzysztof Drewniak
· 5 days ago
4d3f06a
[VectorDistribution] Clone vector.step on layout conflict (#19732)
by Kunwar Grover
· 5 days ago
c5bc37f
[Util] Fix OptimizeIntArithmetic pattern failure condition (#19731)
by Kunwar Grover
· 5 days ago
e1f010c
[Dispach] Clone chain of ops into dispatch (#19723)
by Ian Wood
· 5 days ago
f31cc72
Update resource placement and transfer for barrier operations (#19725)
by Rob Suderman
· 5 days ago
75c9e86
[GPU] Avoid fusing slices of already tiled ops (#19404)
by Max191
· 5 days ago
c1cc4cc
[LLVMGPU] Add pass to distribute undistributed copies to threads (#19715)
by Quinn Dawkins
· 5 days ago
dde5992
[Codegen] Allow memref type propagation through collapse_shape (#19400)
by Max191
· 5 days ago
36c2353
[Codegen] Push up the extract slice op (#19680)
by Prashant Kumar
· 6 days ago
08b44e2
[hip] Try again to fix the semaphore busy loop. (#19712)
by Andrew Woloszyn
· 6 days ago
6f33cd4
[Stream] Specialize encoding for TensorPhaseOp that have result_encoding (#19707)
by Han-Chung Wang
· 6 days ago
3e8c81c
Remove legacy sync path (#19714)
by Rob Suderman
· 6 days ago
f6f6388
[Codegen] Add workgroups reordering to distribute using forall (#19681)
by Prashant Kumar
· 6 days ago
5ee9b27
Clean up encoding-related code. NFC. (#19717)
by Jakub Kuderski
· 7 days ago
3032df2
Fix newlines in markdown mermaid.js diagrams. (#19657)
by Scott Todd
· 7 days ago
c285d58
Copy sample code into samples/dynamic_shapes/README.md. (#19699)
by Scott Todd
· 7 days ago
3c95042
Re-enable MI250 workflows. (#19705)
by saienduri
· 8 days ago
27e7a90
[DT][Encoding] Use layouts to calculate storage size when it is present. (#19686)
by Han-Chung Wang
· 8 days ago
a953763
Temporarily Disable MI250 workflow due to machine outage (#19702)
by Akansha Bansal
· 8 days ago
c320935
Bump dawidd6/action-download-artifact from 3.1.4 to 7 in the github-actions group (#19692)
by dependabot[bot]
· 8 days ago
6fd0fd0
[LinalgExt] Implement PartialReductionOpInterface for OnlineAttentionOp (#19684)
by Kunwar Grover
· 8 days ago
3c963dd
Update PyTorch sample notebooks using latest iree-turbine code. (#19658)
by Scott Todd
· 8 days ago
01c9f14
[LLVMGPUVectorDistribute] Add support for inter-subgroup multi_reduction (#19596)
by Manupa Karunaratne
· 9 days ago
21b0101
[GPU] Disable prefetching for loops with no computation (#19695)
by Nirvedh Meshram
· 9 days ago
8d1d867
[GPU] Add thread tile size inference for scatter (#19694)
by Quinn Dawkins
· 9 days ago
158c636
Revert "Increase default threshold of TileLargeTensor pass (#19671)" (#19693)
by Nirvedh Meshram
· 9 days ago
3e34e03
Bump the github-actions group with 8 updates (#19689)
by dependabot[bot]
· 9 days ago
3978ce6
Increase default threshold of TileLargeTensor pass (#19671)
by Nirvedh Meshram
· 9 days ago
2452b22
[Codegen][GPU] Let integer range optimization narrow GPU computations to i32 (#19473)
by Krzysztof Drewniak
· 9 days ago
2b29155
Update GH actions with Dependabot (#19663)
by Marius Brehler
· 9 days ago
9b35412
Run on schedule in iree-org only (#19685)
by Marius Brehler
· 9 days ago
d90c505
Reshape propagation to enable broadcast(transpose) -> attention(q, kt, vt) fusion. (#19661)
by MaheshRavishankar
· 9 days ago
cac7a96
Update IREE test suite to use iree-org/iree-test-suites@c47d13c (#19617)
by MaheshRavishankar
· 9 days ago
40c19e3
Better support multidevice placement with `stream.async.barrier` (#19651)
by Rob Suderman
· 9 days ago
88d5f59
Update PkgCI test_amd to use MI300x conductor cluster (#19517)
by yamiyysu
· 9 days ago
ae50c5e
[DOCS] Update VectorExt::NestedLayoutAttr docs (#19246)
by Manupa Karunaratne
· 9 days ago
1441caa
Enable macOS Tracy CI build. (#19668)
by Scott Todd
· 9 days ago
a583b25
[GPU] Teach GPUApplyTilingLevel PartialReduction tiling (#19682)
by Kunwar Grover
· 10 days ago
9f93691
[LLVMGPU] Use LLVMGPUDistribute for small input scatters (#19670)
by Quinn Dawkins
· 12 days ago
f7a2157
Remove Upcasting schedule from TileAndFuse (#19669)
by Nirvedh Meshram
· 12 days ago
039b8b4
Using tracy::GetQueue instead of the sketchy static variable reference. (#19653)
by Ben Vanik
· 12 days ago
1d91bec
Supporting file descriptors in iree_io_stream_open. (#19665)
by Ben Vanik
· 12 days ago
106371d
Bump torch-mlir to f92c587cb6150e73078f32cf847dc3892be16f93 (#19659)
by jinchen
· 12 days ago
a88555c
Add macOS workflow running on M1 (#19656)
by Marius Brehler
· 12 days ago
e64cb12
Increase strictness of global isel use for ROCM (#19247)
by Tres
· 12 days ago
2aca091
[Codegen][Nearly NFC] Move PropagateDispatchSizeBounds to Common/ (#19650)
by Krzysztof Drewniak
· 12 days ago
6245db1
[Stream] Attach layouts to tensor ops in encoding specialization pass. (#19649)
by Han-Chung Wang
· 13 days ago
c793f90
[i1] Implement `packed_storage` layout encoding attribute (#19354)
by lialan
· 13 days ago
801e2c1
Expand runtime_tracing job to include Windows and macOS. (#19655)
by Scott Todd
· 13 days ago
7d21c5d
Revert (2nd) of "Propagate reshapes through generics with reduction" (#19647)
by MaheshRavishankar
· 13 days ago
b3ff1ed
Rename `unroll_{m,n,k}` to `intrinsics_{m,n,k}` (#19652)
by Benoit Jacob
· 13 days ago
6d6bd6e
[runtime] Fix runtime tracing compile failure on gcc (#19642)
by Ian Wood
· 13 days ago
bb1c561
Erase all address spaces and get inlined ukernels (#19646)
by Benoit Jacob
· 13 days ago
a7bac5d
[Flow] Fix dispatch naming for dynamic shaped fusions (#19439)
by Quinn Dawkins
· 13 days ago
9055c9d
[hip] Fix race in the cleanup of queue read operations. (#19645)
by Andrew Woloszyn
· 13 days ago
82e37d6
Fix (cross) compiling for 32-bit targets (#19644)
by Marius Brehler
· 13 days ago
02d145e
[Stream] Implement SpecializeEncodings pass (1/n) (#19502)
by Han-Chung Wang
· 14 days ago
74f8d3c
[LinalgExt] Scatter fusion by expansion 3/3 (#19588)
by Ian Wood
· 14 days ago
2347d9f
Supporting (and renaming) IREE_HAL_WHOLE_BUFFER in binding table resolve. (#19640)
by Ben Vanik
· 14 days ago
126f0ac
Add docs for updating release git tags manually. (#19637)
by Scott Todd
· 14 days ago
af416b3
Bump version to 3.2.0 after releasing 3.1.0. (#19638)
by Scott Todd
· 2 weeks ago
c484058
[GPU] Add barriers when resolving GPUMappedForall to fix race condition (#19635)
by Nirvedh Meshram
· 2 weeks ago
9b4906e
[DispatchCreation] Drop fusion restriction for stride != 1 conv (#19634)
by Quinn Dawkins
· 2 weeks ago
c75b686
[GPU][Codegen] Allowing mfma for narrow problem config sizes (#19615)
by Zhuoran Yin
· 2 weeks ago
7b9aa28
When dumping intermediates, dump how to reproduce the `.optimized.ll` (#19633)
by Benoit Jacob
· 2 weeks ago
be75a30
Update minor Python versions used to build packages (#19632)
by Marius Brehler
· 2 weeks ago
fb21dd6
Adding experimental Tracy API for TLS-less event recording. (#19625)
by Ben Vanik
· 2 weeks ago
a5c3879
Reapply "Propagate reshapes through generics with reduction… (#18968)
by Ian Wood
· 2 weeks ago
80cbf6b
[GPU] Add a pass to convert accumulating GEMMs to GEMMs (#19587)
by Nirvedh Meshram
· 2 weeks ago
550d88e
[GPU] Add lowering configuration logic for scatter (#19624)
by Quinn Dawkins
· 2 weeks ago
349026b
Add explicit tolerances to SDXL benchmark test times. (#19628)
by Scott Todd
· 2 weeks ago
9a83239
[GPU] Add chained reshape support for scf.forall expand destination pattern (#19597)
by Nirvedh Meshram
· 2 weeks ago