Copilot
5cec66adb8
[Docs] Update environment variable documentation to sync with the latest code ( #5713 )
...
* Initial plan
* Update environment variable documentation to match the latest code
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-23 19:49:20 +08:00
Copilot
e9f5397bc9
[Docs] Update parameters documentation with latest code defaults and new parameters ( #5709 )
...
* Initial plan
* Update parameters documentation with correct default values and new parameters
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-23 17:31:44 +08:00
Divano
c1aa66df02
Revert "[Optim] Remove limitation of number of kvcache blocks ( #5612 )" ( #5702 )
...
This reverts commit 9da89a374b .
2025-12-23 15:41:33 +08:00
Jiang-Jia-Jun
9da89a374b
[Optim] Remove limitation of number of kvcache blocks ( #5612 )
...
* [Optim] Remove limitation of number of kvcache blocks
* Update fastdeploy/envs.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/worker/iluvatar_worker.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Add docs
* Update fastdeploy/worker/worker_process.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix ci case
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-23 11:18:29 +08:00
yzwu
ac013803f3
[Iluvatar] Support V1_KVCACHE_SCHEDULER and paddleocr-vl rope mode ( #5555 )
2025-12-18 02:14:25 -08:00
xiaolei373
a30b4da260
[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 ( #5458 )
2025-12-16 16:36:09 +08:00
Echo-Nie
e1347be4d9
[Docs] Fix nvidia_gpu.md, add sm80 in precompiled ( #5462 )
...
* Update supported GPU architectures in installation guide
* Update supported architectures in GPU installation guide
* Update GPU architectures support in installation guide
2025-12-11 14:41:50 +08:00
qwes5s5
d79438bb86
add detoken switch ( #5463 )
2025-12-10 21:44:02 +08:00
Jiang-Jia-Jun
3bdd54ef6e
Disable unsupported feature in multi-node deployment docs
2025-12-10 20:23:19 +08:00
lizexu123
95eab9f9ee
[Feature] support stop_token_ids ( #5399 )
...
* support stop_token_ids
* fix
* delete chinese
* support both
* delete print
2025-12-09 17:49:12 +08:00
Juncai
80efe98f8d
[PD Disaggregation] Add timestamp for analyzing splitwise deployment ( #5317 )
...
* Add timestamp for analyzing splitwise deployment
* up
* up
* up
* up
* up
* up
* fix format
* fix
2025-12-08 10:08:44 +08:00
SunLei
3697110599
[Docs] update FAQ with logprobs MQ limits and deprecation ( #5368 )
...
* [doc] update FAQ with logprobs MQ limits and deprecation
* [doc] update FAQ with logprobs MQ limits and deprecation
* update faq
2025-12-04 15:57:04 +08:00
Daci
83dbc4e5dd
[Feature] Guided Decoding add LLguidance backend ( #5124 )
...
* llguidance
* add requirements_guided_decoding.txt and doc
* fix test_guidance_*.py
* fix test_guidance_*.py && mv
* fix llguidance choice
* test_guidance_*
* rm lazy loader
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-03 20:23:57 +08:00
Jiang-Jia-Jun
0eb799a324
Update installation requirements for Kunlunxin XPU
2025-12-03 10:04:29 +08:00
Jiang-Jia-Jun
335ae0f4a4
Update installation requirements for Kunlunxin XPU
2025-12-03 10:04:17 +08:00
lizexu123
c563eca791
[Feature] support reward model ( #5301 )
...
* Your commit message here
* add test
* update develop
* support reward
* support enable_chunk_prefill
* support concurrency
* support convert is reward
* update test
* delete print
* fix enable_thinking
* add document
* fix place
* fix test
* fix
* support enable_prefix_caching
* add no-enable_prefix-caching test
* fix
* support enable_prefix_caching
* delete print
* fix document
* fix
* fix test
* fix document and delete chinese
* update
* enable_thinking
* fix test
2025-12-02 14:55:31 +08:00
CSWYF3634076
051b82b4c8
[Docs] add qwen25-vl docs ( #5243 )
...
* [Docs] add qwen25-vl docs
* [Docs] add qwen25-vl docs
* [Docs] add qwen25-vl docs
2025-11-27 15:05:57 +08:00
LiqinruiG
df427ba06d
[Docs] add request params ( #5207 )
...
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [fix] add more logger info: max_tokens
* [Docs] add request params
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-26 15:04:22 +08:00
Yonghua Li
cead6b26fa
[Metrics] Update time_to_first_token to include tokenization & queue time, and remove redundant metrics ( #4993 )
...
* [update] update time_to_first_tokens to include queue time, and remove first_token_latency and infer_latency
* [doc] update docs
* [ci] fix test
* [chore] delete redundant code
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-11-26 14:42:17 +08:00
ApplEOFDiscord
287751f19d
[Docs] add docs of base64 or local file mm inputs ( #5193 )
2025-11-26 14:41:43 +08:00
qw86972190
f5c1066245
[XPU]Update documentation ( #5180 )
...
* [XPU]Update document
* [XPU]Update documentation
2025-11-24 14:00:51 +08:00
qw86972190
857d152464
[XPU][Docs]Update document ( #5091 )
2025-11-19 10:20:14 +08:00
FocusLuo
c2c1942db9
[INTEL_HPU] [CI] enabled fastdeploy PR testing ( #4596 )
...
* [INTEL HPU] added hpu ci work flow support
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] added run ci hpu test scripts
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] enabled HPU ernie test case
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] updated Intel Gaudi Readme with Warmup disable cmdline
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* Modify paddlepaddle installation command
Updated paddlepaddle installation command to use a specific index URL.
* Update run_ci_hpu.sh
* Rename json directory to nlohmann_json
Rename extracted json directory to nlohmann_json.
* Update ci_hpu.yml
* Set pip global index URL to Tsinghua mirror
* Update CI workflow to use self-hosted runner and paths
* Update Docker image in CI workflow
* Modify HPU installation URLs in run_ci_hpu.sh
Updated the installation URL for paddle_intel_hpu and added paddlenlp_ops installation.
* Fix paddle_intel_hpu installation URL
Corrected the URL for paddle_intel_hpu wheel installation.
---------
Signed-off-by: Luo, Focus <focus.luo@intel.com >
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-17 19:24:41 +08:00
Jiang-Jia-Jun
d41cf643f8
Update nvidia_gpu.md
2025-11-14 18:26:08 +08:00
Jiang-Jia-Jun
692d69229b
Update nvidia_gpu.md
2025-11-14 18:17:32 +08:00
Echo-Nie
ee1ea43e36
[Docs] Fix broken commitID ( #5008 )
...
* fix commitID
* Update nvidia_gpu.md
2025-11-14 10:39:41 +08:00
Juncai
36822fa49c
[PD Disaggregation] remove splitwise deployment on single node and refine the code ( #4891 )
...
* remove splitwise deployment on single node and refine the code
* up
* up
* up
* add test
* up
2025-11-14 09:56:53 +08:00
Echo-Nie
a5e949d9d0
[Feature] Enhance build script, add pre_wheel logic ( #4729 )
...
* Enhance build script, add pre_wheel logic
Updated copyright year and added precompiled wheel installation logic.
* update the nvidia_gpu.md, add pre_wheel description
* fix zh .md
* update the url, automatically detect CUDA and SM
* Fix GPU architecture string formatting in build.sh
* Change default for FD_USE_PRECOMPILED to 0
* fix build.sh
* add ./dist, pre-wheel path
* simplify the process, just save the whl
* del pre_wheel dir
* fix function name, extract_ops_from_precompiled_wheel
* fix docs
* add default commitID in docs
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:03:52 +08:00
Jiang-Jia-Jun
8329338d37
Update nvidia_gpu.md
2025-11-13 10:25:22 +08:00
Jiang-Jia-Jun
c8140326fa
Update nvidia_gpu.md
2025-11-12 20:50:09 +08:00
JYChen
a1218076dc
remove load default_v1 since it is already the default ( #4980 )
2025-11-12 16:49:48 +08:00
yzwu
08b96baa4a
[Iluvatar][Doc] Add ERNIE-4.5-VL-28B-A3B-Thinking doc ( #4955 )
2025-11-11 19:15:19 +08:00
Lucas
5280b9e0b4
[XPU] fix xpu deployment md ( #4941 )
2025-11-11 14:39:52 +08:00
yinwei
215cda2f80
[XPU][Doc]Update XPU release2.3 note ( #4939 )
...
* update doc
* update
* update
* update
2025-11-11 11:57:49 +08:00
LiqinruiG
75294bcfb1
[Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction ( #4944 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-11 11:40:52 +08:00
yzwu
3707af7a4f
[Iluvatar] add vl into ci and support v1 loader ( #4774 )
2025-11-11 10:50:17 +08:00
LiqinruiG
3f74281496
[Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction ( #4937 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-11 10:43:44 +08:00
yangjianfengo1
d7f14dba8b
update docx ( #4938 )
...
Co-authored-by: root <root@yq02-inf-sci-k8s-a100-aa2ni5-0018.yq02.baidu.com >
2025-11-11 10:28:46 +08:00
Yuanle Liu
3dc0ffa46d
[TSP] Support qwen3 moe tsp + cudagraph ( #4871 )
...
* support qwen3_moe tsp mode
* fix
* fix
* update
* update
* update
* fix
* support external_rmsnorm
* update
* fix
2025-11-10 23:37:51 +08:00
chen
927bd74075
[Docs] add doc for glm ( #4933 )
...
* add doc for glm
* del v1 loader
* delete mtp
2025-11-10 21:21:33 +08:00
LiqinruiG
aa79e6185a
[Docs] Improve reasoning_out docs ( #4901 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-10 19:20:38 +08:00
qw86972190
07b21d241d
[XPU]Update documentation ( #4917 )
...
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
2025-11-10 19:11:42 +08:00
LiqinruiG
90b0936ae9
[Docs] add api-key usage instructions ( #4902 )
...
* [Docs] add api-key usage instructions
* [Docs] add api-key usage instructions
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-10 13:39:39 +08:00
zhuzixuan
8a9e7b53af
[Docs]Supplement the English and Chinese user documentation for Tool calling ( #4895 )
...
* Write tool calling documentation, v1.0
* Write tool calling documentation, v1.0
* Write tool calling documentation, v1.0
* tool calling doc, v1.1
* tool calling doc, v1.1
* tool calling doc, v1.1
* tool calling doc, v1.1
2025-11-08 20:05:14 +08:00
ddchenhao66
72d5ee9a7c
[XPU] modify 424B model deployment parameter ( #4888 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-11-07 17:34:37 +08:00
ming1753
cba185f1fe
[Feature] Optim PaddleOCR-VL ( #4873 )
...
* [Feature] Optim PaddleOCR-VL
* fix bug
2025-11-07 14:56:44 +08:00
Ding
6c316286c1
fix: correct typo in nvidia_gpu.md ( #4848 )
2025-11-06 16:03:02 +08:00
Juncai
08ca0f6aea
[Feature] [PD] add simple router and refine splitwise deployment ( #4709 )
...
* add simple router and refine splitwise deployment
* fix
2025-11-06 14:56:02 +08:00
Jiang-Jia-Jun
aec1a84886
[Doc] Update docs for v2.3.0rc0 ( #4828 )
...
* [Doc] Update docs for v2.3.0rc0
* [Doc] Update docs for v2.3.0rc0
* [Doc] Update docs for v2.3.0rc0
* Update README_CN.md
* Add deployment guide link for FastDeploy v2.3-rc0
Updated release note for FastDeploy v2.3-rc0 to include deployment guide link.
* Add Deployment Guide link for FastDeploy v2.3-rc0
Updated the news section to include a link to the Deployment Guide for FastDeploy v2.3-rc0.
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
2025-11-05 19:45:53 +08:00
chen
1c3ca48128
[Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs ( #4769 )
2025-11-05 10:43:25 +08:00