RAM
d850660872
[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)
* reset decoder_block_shape_q buffer
* refactor GetBlockShapeAndSplitKVBlock Kernel and cudagraph padding batch
* update decode_max_tile_size
* fix pre-commit
* update block_multihead_attn_backend
* update flash attn backend
* update MLA Attention
* update XPU Attention
* update gcu, iluvatar model runner
* Update MTP
* fix MTP bug
2025-07-31 00:09:31 +08:00