chen
823a47e64a
[Feature] Support return logprob of generated tokens (#2784)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner
* only cuda support logprob
* get_worker() check platform
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-10 15:47:42 +08:00
Sunny-bot1
1107e08cd9
[Feature 2.0.2] support top_k_top_p sampling (#2789)
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* change func name
2025-07-09 21:01:51 -07:00
gaoziyuan
337d76f094
[sync fix] (#2759)
* add rl qwen model support
* fix
* fix
* add_commit_config
* fix
2025-07-08 19:29:23 +08:00
gaoziyuan
ae2f78184d
【Sync develop】 add commit info (#2755)
* add rl qwen model support
* fix
* fix
* add_commit_config
2025-07-08 17:02:50 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00