Files
FastDeploy/custom_ops/gpu_ops/append_attn
RAM d97aab25bc [Excutor] Fixed the issue of CUDA graph execution failure caused by different branches during decoding (#3223) (#3512)
* Completely resolve the decode chunking issue

* update C8 and C4 kernel

* fix problem

* fix with pre-commit

* retain branch for mtp

Co-authored-by: Jundong Liu <61149469+littledgg@users.noreply.github.com>
2025-08-21 20:58:47 +08:00