RAM
d850660872
[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989)
* reset decoder_block_shape_q buffer
* refactor GetBlockShapeAndSplitKVBlock Kernel and cudagraph padding batch
* update decode_max_tile_size
* fix pre-commit
* update block_multihead_attn_backend
* update flash attn backend
* update MLA Attention
* update XPU Attention
* update gcu, iluvatar model runner
* Update MTP
* fix MTP bug
2025-07-31 00:09:31 +08:00