Commit Graph

2640 Commits

Author SHA1 Message Date
Jiang-Jia-Jun
e421d51001 [Feature] Support include_stop_str_in_output (#2919)
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
v2.0.2
2025-07-18 19:43:19 +08:00
sg263
c71d955e9c [Trace] fix opentelemetry cannot work in uvicorn (#2907)
* add opentelemetry

* add opentelemetry

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* fix opentelemetry-instrumentation-fastapi

* fix annotation

* fix opentelemetry-bootstrap

* fix opentelemetry-bootstrap

* fix opentelemetry can not work in uvicorn

* remove useless import

* move conf to env

* fix useless commit

---------

Co-authored-by: shige <shige@baidu.com>
2025-07-17 23:16:29 +08:00
gaoziyuan
2d2468ae72 fix config get (#2883) 2025-07-17 15:03:26 +08:00
sg263
7deac64233 [Bug Fix] fix opentelemetry-bootstrap (#2875)
* add opentelemetry

* add opentelemetry

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* fix opentelemetry-instrumentation-fastapi

* fix annotation

* fix opentelemetry-bootstrap

* fix opentelemetry-bootstrap

---------

Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:51:02 +08:00
sg263
5a5f17cf97 fix: put opentelemetry-instrumentation-fastapi in requirements (#2874)
* add opentelemetry

* add opentelemetry

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* fix opentelemetry-instrumentation-fastapi

* fix annotation

---------

Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:41:53 +08:00
sg263
0d61c65de1 [Trace] Support trace log (#2864)
* add opentelemetry

* add opentelemetry

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* add opentelemetry on dequeue
2025-07-16 15:35:44 +08:00
Jiang-Jia-Jun
e5de28bff2 Update setup.py 2025-07-15 10:11:26 +08:00
AIbin
b9eede57b6 cp PR#2820 to release/2.0.2 (#2839) 2025-07-14 17:05:56 +08:00
lddfym
94e1a895e3 fix spelling error (#2826)
* fix spelling error

* fix scheduler reset error
2025-07-14 13:13:08 +08:00
zhenwenDang
87203ec87b Fix the incorrect finish_reason returned after enabling "top_logprobs supports passing 0 and fix max_completion_tokens" (#2815)
* /v1/chat/completions endpoint now supports max_completion_tokens and fixes the return value of finish_reason

* top_logprobs supports passing 0
2025-07-11 16:53:12 +08:00
Sunny-bot1
4596dd7248 [FIX 2.0.2] fix topp topk default value (#2810)
* fix topp topk default value

* update topk
2025-07-11 16:12:02 +08:00
lddfym
ec986642df Global scheduler supports configuring hot updates (#2812) 2025-07-11 13:39:30 +08:00
chen
94691bcd90 fix enable_logprob not in rl_config (#2808) 2025-07-11 11:52:48 +08:00
Sunny-bot1
4025ea7e5b [FIX 2.0.2] Topk topp sampling fix (#2805)
* fix topk-topp

* fix
2025-07-10 06:15:03 -07:00
lizexu123
e681e1e719 [BugFix] fix RMSNorm rms_norm_esp (#2804) 2025-07-10 05:39:02 -07:00
chen
823a47e64a [Feature] Support return logprob of generated tokens (#2784)
* online chat support logprobs

* check xpu

* check vl_gpu_model_runner

* only cuda support logprob

* get_worker() check platform

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-10 15:47:42 +08:00
gaoziyuan
39d2a1de46 fix num_blocks_local when running a small model in TP2 mode (#2793)
Sunny-bot1
1107e08cd9 [Feature 2.0.2] support top_k_top_p sampling (#2789)
* support top_k_top_p sampling

* fix

* add api param

* add api para

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* change func name
2025-07-09 21:01:51 -07:00
Jiang-Jia-Jun
1fe37cb7e8 [BugFix] Fix vocab size error for ernie model 2025-07-09 22:33:04 +08:00
gaoziyuan
337d76f094 [sync fix] (#2759)
* add rl qwen model support

* fix

* fix

* add_commit_config

* fix
2025-07-08 19:29:23 +08:00
gaoziyuan
ae2f78184d 【Sync develop】 add commit info (#2755)
* add rl qwen model support

* fix

* fix

* add_commit_config
2025-07-08 17:02:50 +08:00
gaoziyuan
6851489425 【Sync】Release/2.0.1 (#2745)
* add rl qwen model support

* fix

* fix
2025-07-08 14:38:18 +08:00
Jiang-Jia-Jun
ea787d8f62 fix bug. (#2718) (#2720)
Co-authored-by: Ting <wtmlon@foxmail.com>
2025-07-05 09:00:01 +08:00
Ting
90ef28d982 spec token map lazy. (#2715)
2025-07-05 00:14:54 +08:00
YuBaoku
b37585e693 [BugFix] fix paddle_git_commit_id error (#2714)
* set git identity to avoid merge failure in CI

* add ci cases

* [CI] Add validation for MTP and CUDAGraph

* [BugFix] fix paddle_git_commit_id error
2025-07-04 22:16:37 +08:00
lizexu123
9cb08e71e8 add support QWQ enable_thinking (#2706)
* add support QWQ enable_thinking

* add stream=True

* fix stream=true

* fix qwen

---------

Co-authored-by: lizexu <lizexu@baidu.com>
2025-07-04 20:55:23 +08:00
YuBaoku
dacc46f04c [CI] Add validation for MTP and CUDAGraph (#2710)
* set git identity to avoid merge failure in CI

* add ci cases

* [CI] Add validation for MTP and CUDAGraph
2025-07-04 18:13:54 +08:00
Jiang-Jia-Jun
09ded7715f Update mkdocs.yml 2025-07-04 17:55:52 +08:00
LQX
11cfdf5d89 Add XPU CI, test=model (#2701)
* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model

* Add XPU CI, test=model
2025-07-04 16:16:06 +08:00
GoldPancake
e7fa57ebae Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)
* fix mtp eh_proj layer

* fix mtp update_cfg function

* fix docstring

* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9 [feature]add fd whl version info (#2698) 2025-07-04 14:12:42 +08:00
ltd0924
87e638498c [RL] update reschedule finish reason (#2709) 2025-07-04 13:47:36 +08:00
freeliuzc
667547be59 support chunk_prefill in MTP (#2705) 2025-07-04 11:55:48 +08:00
LiqinruiG
b38823bc66 modify reasoning_output docs (#2696) 2025-07-04 11:30:02 +08:00
Divano
050d9658a5 Update requirements.txt 2025-07-04 09:53:03 +08:00
Divano
be5cabaf80 add quick benchmark (#2703)
The test script does not need to pass CI
2025-07-04 09:32:36 +08:00
Yuanle Liu
240bdac2a4 [feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd [Bug] fix logger format (#2689)
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79 [doc] update docs (#2690) 2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd Update dynamic_weight_manager.py 2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593 [Sync] Update to latest code (#2679)
* [Sync] Update to latest code

* Add new code files

* Add new code files

* update code

* Try to fix build.sh

* Try to fix build.sh

* Update code

* Update requirements.txt

* Update code

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00 Update README.md 2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117 Update README.md 2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22 Update gh-pages.yml 2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992 Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0 Update Dockerfile.gpu 2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572 add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl (#2682)
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571 Update gh-pages.yml (#2680) 2025-07-02 17:36:18 +08:00
handiz
b8a8a19689 add wint2 performance (#2673) 2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f Update nvidia_gpu.md 2025-07-02 16:54:14 +08:00