Jiang-Jia-Jun
e421d51001
[Feature] Support include_stop_str_in_output ( #2919 )
...
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
v2.0.2
2025-07-18 19:43:19 +08:00
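The commit above exposes `include_stop_str_in_output` through the serving API. Below is a minimal client-side sketch, assuming an OpenAI-compatible /v1/chat/completions endpoint and that the flag is forwarded via `extra_body`; the server address and model name are placeholders, not values from the commit.

```python
# Minimal sketch: ask the server to keep the matched stop string in the output.
# Assumes an OpenAI-compatible FastDeploy server on localhost:8000; the
# extra_body passthrough of include_stop_str_in_output is an assumption.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

resp = client.chat.completions.create(
    model="default",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello, then stop."}],
    stop=["\n"],
    extra_body={"include_stop_str_in_output": True},
)
print(resp.choices[0].message.content)
```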
sg263
c71d955e9c
[Trace] fix opentelemetry not working in uvicorn ( #2907 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
* fix opentelemetry-bootstrap
* fix opentelemetry-bootstrap
* fix opentelemetry not working in uvicorn
* remove useless import
* move conf to env
* fix useless commit
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 23:16:29 +08:00
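The OpenTelemetry commits above wire tracing into the FastAPI app served under uvicorn. A generic sketch of that pattern follows, using the standard opentelemetry-sdk and opentelemetry-instrumentation-fastapi packages; this is not FastDeploy's actual wiring, and the console exporter and /ping route are illustrative only.

```python
# Generic sketch: instrument a FastAPI app so spans are emitted when it runs
# under uvicorn. Exporter (console) and route are illustrative only.
from fastapi import FastAPI
import uvicorn
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))

app = FastAPI()

@app.get("/ping")
def ping():
    return {"status": "ok"}

FastAPIInstrumentor.instrument_app(app)  # one span per incoming request

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```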
gaoziyuan
2d2468ae72
fix config get ( #2883 )
2025-07-17 15:03:26 +08:00
sg263
7deac64233
[Bug Fix] fix opentelemetry-bootstrap ( #2875 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
* fix opentelemetry-bootstrap
* fix opentelemetry-bootstrap
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:51:02 +08:00
sg263
5a5f17cf97
fix: put opentelemetry-instrumentation-fastapi in requirements ( #2874 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:41:53 +08:00
sg263
0d61c65de1
[Trace] Support trace log ( #2864 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
2025-07-16 15:35:44 +08:00
Jiang-Jia-Jun
e5de28bff2
Update setup.py
2025-07-15 10:11:26 +08:00
AIbin
b9eede57b6
cherry-pick PR #2820 to release/2.0.2 ( #2839 )
2025-07-14 17:05:56 +08:00
lddfym
94e1a895e3
fix spelling error ( #2826 )
...
* fix spelling error
* fix scheduler reset error
2025-07-14 13:13:08 +08:00
zhenwenDang
87203ec87b
Fix incorrect finish_reason returned after enabling "top_logprobs supports passing 0 and fix max_completion_tokens" ( #2815 )
...
* /v1/chat/completions endpoint now supports max_completion_tokens and fixes the return value of finish_reason
* top_logprobs supports passing 0
2025-07-11 16:53:12 +08:00
Sunny-bot1
4596dd7248
[FIX 2.0.2] fix top_p/top_k default values ( #2810 )
...
* fix topp topk default value
* update topk
2025-07-11 16:12:02 +08:00
lddfym
ec986642df
Global scheduler supports configuring hot updates ( #2812 )
2025-07-11 13:39:30 +08:00
chen
94691bcd90
fix enable_logprob not in rl_config ( #2808 )
2025-07-11 11:52:48 +08:00
Sunny-bot1
4025ea7e5b
[FIX 2.0.2] Topk topp sampling fix ( #2805 )
...
* fix topk-topp
* fix
2025-07-10 06:15:03 -07:00
lizexu123
e681e1e719
[BugFix] fix RMSNorm rms_norm_eps ( #2804 )
2025-07-10 05:39:02 -07:00
chen
823a47e64a
[Feature] Support return logprob of generated tokens ( #2784 )
...
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner
* only cuda support logprob
* get_worker() check platform
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-10 15:47:42 +08:00
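Commit #2784 above adds per-token logprobs to the online chat API. A hedged client-side sketch, assuming the standard OpenAI-style `logprobs` / `top_logprobs` fields are honored; server address and model name are placeholders, and per the commit body only CUDA platforms support this.

```python
# Hedged sketch: request logprobs of generated tokens from an OpenAI-compatible
# /v1/chat/completions endpoint. Address and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

resp = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "2 + 2 = ?"}],
    logprobs=True,    # return the logprob of each generated token
    top_logprobs=5,   # and the 5 most likely alternatives per position
    max_tokens=8,
)
for tok in resp.choices[0].logprobs.content:
    print(tok.token, tok.logprob)
```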
gaoziyuan
39d2a1de46
fix num_blocks_local for small models in TP2 running mode ( #2793 )
2025-07-10 13:44:56 +08:00
Sunny-bot1
1107e08cd9
[Feature 2.0.2] support top_k_top_p sampling ( #2789 )
...
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* change func name
2025-07-09 21:01:51 -07:00
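Commit #2789 above adds combined top-k / top-p sampling parameters to the API. As an illustration of what that filter does (not the kernel added in the commit), here is a small NumPy sketch that keeps only tokens inside both the top-k set and the top-p nucleus before renormalizing.

```python
# Illustrative sketch of combined top-k / top-p filtering over a logits vector.
import numpy as np

def top_k_top_p_filter(logits: np.ndarray, top_k: int = 20, top_p: float = 0.95) -> np.ndarray:
    """Return a probability vector restricted to the top-k / top-p token set."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]                  # most likely tokens first
    keep = np.zeros_like(probs, dtype=bool)
    keep[order[:top_k]] = True                       # top-k cut

    cumulative = np.cumsum(probs[order])
    nucleus = order[: max(1, int(np.searchsorted(cumulative, top_p)) + 1)]
    keep &= np.isin(np.arange(len(probs)), nucleus)  # intersect with the top-p nucleus

    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()

rng = np.random.default_rng(0)
next_token = rng.choice(32, p=top_k_top_p_filter(rng.normal(size=32)))
print(next_token)
```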
Jiang-Jia-Jun
1fe37cb7e8
[BugFix] Fix vocab size error for ernie model
2025-07-09 22:33:04 +08:00
gaoziyuan
337d76f094
[sync fix] ( #2759 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
* fix
2025-07-08 19:29:23 +08:00
gaoziyuan
ae2f78184d
[Sync develop] add commit info ( #2755 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
2025-07-08 17:02:50 +08:00
gaoziyuan
6851489425
[Sync] Release/2.0.1 ( #2745 )
...
* add rl qwen model support
* fix
* fix
2025-07-08 14:38:18 +08:00
Jiang-Jia-Jun
ea787d8f62
fix bug. ( #2718 ) ( #2720 )
...
Co-authored-by: Ting <wtmlon@foxmail.com>
2025-07-05 09:00:01 +08:00
Ting
90ef28d982
make spec token map lazy ( #2715 )
2025-07-05 00:14:54 +08:00
YuBaoku
b37585e693
[BugFix] fix paddle_git_commit_id error ( #2714 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
* [BugFix] fix paddle_git_commit_id error
2025-07-04 22:16:37 +08:00
lizexu123
9cb08e71e8
add support for QwQ enable_thinking ( #2706 )
...
* add support QWQ enable_thinking
* add stream=True
* fix stream=true
* fix qwen
---------
Co-authored-by: lizexu <lizexu@baidu.com>
2025-07-04 20:55:23 +08:00
YuBaoku
dacc46f04c
[CI] Add validation for MTP and CUDAGraph ( #2710 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
2025-07-04 18:13:54 +08:00
Jiang-Jia-Jun
09ded7715f
Update mkdocs.yml
2025-07-04 17:55:52 +08:00
LQX
11cfdf5d89
Add XPU CI, test=model ( #2701 )
...
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
2025-07-04 16:16:06 +08:00
GoldPancake
e7fa57ebae
Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue ( #2707 )
...
* fix mtp eh_proj layer
* fix mtp update_cfg function
* fix docstring
* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9
[feature] add fd whl version info ( #2698 )
2025-07-04 14:12:42 +08:00
ltd0924
87e638498c
[RL] update reschedule finish reason ( #2709 )
2025-07-04 13:47:36 +08:00
freeliuzc
667547be59
support chunk_prefill in MTP ( #2705 )
2025-07-04 11:55:48 +08:00
LiqinruiG
b38823bc66
modify reasoning_output docs ( #2696 )
2025-07-04 11:30:02 +08:00
Divano
050d9658a5
Update requirements.txt
2025-07-04 09:53:03 +08:00
Divano
be5cabaf80
add quick benchmark ( #2703 )
...
Test scripts do not need to go through CI
2025-07-04 09:32:36 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated ( #2695 )
...
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd
[Bug] fix logger format ( #2689 )
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79
[doc] update docs ( #2690 )
2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd
Update dynamic_weight_manager.py
2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code ( #2679 )
...
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00
Update README.md
2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117
Update README.md
2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22
Update gh-pages.yml
2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992
Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0
Update Dockerfile.gpu
2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572
add --force-reinstall --no-cache-dir when installing fastdeploy*.whl with pip ( #2682 )
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571
Update gh-pages.yml ( #2680 )
2025-07-02 17:36:18 +08:00
handiz
b8a8a19689
add wint2 performance ( #2673 )
2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f
Update nvidia_gpu.md
2025-07-02 16:54:14 +08:00