chen
|
823a47e64a
|
[Feature] Support return logprob of generated tokens (#2784)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner
* only cuda support logprob
* get_worker() check platform
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-07-10 15:47:42 +08:00 |
|
gaoziyuan
|
39d2a1de46
|
fix num_blocks_local when small size model in TP2 running mode (#2793)
|
2025-07-10 13:44:56 +08:00 |
|
Jiang-Jia-Jun
|
1fe37cb7e8
|
[BugFix] Fix vocab size error for ernie model
|
2025-07-09 22:33:04 +08:00 |
|
gaoziyuan
|
6851489425
|
[Sync] Release/2.0.1 (#2745)
* add rl qwen model support
* fix
* fix
|
2025-07-08 14:38:18 +08:00 |
|
Jiang-Jia-Jun
|
05c670e593
|
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
|
2025-07-03 15:43:53 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|