Commit Graph

4 Commits

Author SHA1 Message Date
ltd0924
3792345c3a [LLM] update function name (#2985)
* [LLM] update function name
2025-07-24 15:03:40 +08:00
Zero Rains
89a485b69f [Feature] Support using prefix-caching + cudagraph for inference (#2924)
* fix the bug in cudagraph+prefix-caching but still have some bug with profile

Change-Id: Ibf2ba3f2e3b08641d03f4b1391d7c862c3efa397

* add the signal to make sure cache manager launched

* fix judge condition

* reomove useless control

* update control stream

* update

* fix xpu

* change the do_profile flag

* update

* add new threads to init cache_manager

---------

Co-authored-by: RAM <gstian5555@outlook.com>
2025-07-22 00:59:45 -07:00
Zero Rains
25698d56d1 polish code with new pre-commit rule (#2923) 2025-07-19 23:19:27 +08:00
liddk1121
1b54a2831e Adapt for iluvatar gpu (#2684) 2025-07-07 16:53:14 +08:00