YuBaoku
71bbedaf50
[Cherry-Pick][BugFix][CI] fix vl moe( #4867 ) ( #4869 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [CI] update paddlepaddle_gpu==3.2.1 and fix rollout_model test logic
* [Cherry-Pick][BugFix][CI] fix vl moe(#4867 )
2025-11-07 00:03:36 +08:00
YuBaoku
7cee8030af
[CI] Disable unstable test jobs and cases ( #4799 )
...
[CI] Disable unstable test jobs and cases
2025-11-05 10:28:53 +08:00
YuBaoku
8632b778f5
[CI] update paddlepaddle_gpu==3.2.1 and fix rollout_model test logic ( #4738 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
2025-11-02 21:30:23 +08:00
RAM
00d0da0c18
[Graph Optimization] Add the CUDAGraph usage switch for Draft Model ( #4669 )
...
* add draft model using cudagraph switch
* set default as false
* capture draft model in ci
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-10-31 17:34:09 +08:00
YuBaoku
e1ac90d787
[CI] Revert test_rollout_model directory change ( #4626 )
2025-10-28 20:14:00 +08:00
YuBaoku
b2c6c41447
[CI] Relocate server test cases from ci_use directory to e2e ( #4608 )
2025-10-28 11:37:30 +08:00
CSWYF3634076
acd331780c
[V1 loader] Qwen25 VL support v1 loader and torch style safetensors load ( #4388 )
...
* [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix
* [Docs]offine infer add apply_chat_template add_generation_prompt parameter
* [Model]qwen2.5VL support --use-cudagraph
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v2
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v3
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v4
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v5
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v6
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v7
* qwen25vl v1 loader
* qwen25vl v1 loader v2
* qwen25vl v1 loader v3
* qwen25vl v1 loader fix tp2 weight PySafeSlice
* qwen25vl v1 loader no test
* qwen25vl v1 loader add unit test
* qwen25vl v1 loader add unit test v2
* qwen25vl v1 loader add torch unit test v3
* qwen25vl v1 loader add torch unit test v4
* qwen25vl v1 loader add torch unit test v5
* qwen25vl v1 loader add torch unit test v6
2025-10-27 10:54:15 +08:00
kxz2002
327fa4c255
[DataProcessor] add reasoning_tokens into usage info ( #4520 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
2025-10-25 16:57:58 +08:00
Sunny-bot1
4ffe41a747
WINT4/WINT8 dense gemm default use Machete ( #4451 )
2025-10-23 17:57:59 +08:00
luukunn
bbf06b9ff7
[BugFix]Fix finish reason ( #4543 )
...
* fix finish reason
* add unit test
* add unit test
* fix unie test
* fix unit test
2025-10-23 14:04:43 +08:00
RAM
775edcc09a
[Executor] Default use CUDAGraph ( #3594 )
...
* add start intercept
* Adjustment GraphOptConfig
* pre-commit
* default use cudagraph
* set default value
* default use cuda graph
* pre-commit
* fix test case bug
* disable rl
* fix moba attention
* only support gpu
* Temporarily disable PD Disaggregation
* set max_num_seqs of test case as 1
* set max_num_seqs and temperature
* fix max_num_batched_tokens bug
* close cuda graph
* success run wint2
* profile run with max_num_batched_tokens
* 1.add c++ memchecker 2.success run wint2
* updatee a800 yaml
* update docs
* 1. delete check 2. fix plas attn test case
* default use use_unique_memory_pool
* add try-except for warmup
* ban mtp, mm, rl
* fix test case mock
* fix ci bug
* fix form_model_get_output_topp0 bug
* fix ci bug
* refine deepseek ci
* refine code
* Disable PD
* fix sot yaml
2025-10-21 14:25:45 +08:00
YuBaoku
70a29ec49e
[CI] update ernie-4_5-vl baseline ( #4495 )
...
* [CI] update ernie-4_5-vl baseline
* [CI] update Qwen2.5-VL-7B-Instruct baseline
2025-10-21 10:18:29 +08:00
kxz2002
b5b993e48e
【feature】support n parameter ( #4273 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support n parameter
* pre-commit check
* pre-commit check
* restore format_and_add_data
* update n_param
* bug fix index - str to int
* bug fix del child_task
* bug fix metrics
* add debug info
* add debug info2
* remove debug info
* change connecting symbol to '-'
* bugfix change connecting symbol
* bugfix change connecting symbol2
* unit tests fix
* unit test fix2
* unittest add param n=2
* n param add unit tests and adapt to echo
* pre-commit fix
* resolve review
* adjust stop reason
* add unittest for _create_chat_completion_choice
* modify unittest
* solve confict
* solve conflict
* resolve conflict
---------
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com >
Co-authored-by: gaoziyuan <m13689897706@163.com >
2025-10-17 20:51:59 +08:00
YuBaoku
01510876ab
[CI] Fix partial instability issues ( #4461 )
2025-10-17 14:17:06 +08:00
chen
db82e9a022
[BugFix]Fix wfp8afp8 triton moe group_topk renormalized=True ( #4449 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix group_topk renormalized=True
* check test
2025-10-16 23:17:48 +08:00
AIbin
6938df9c23
【Fix CI Bug】Fix ci bug ( #4413 )
...
* Support DSK-v3.2 model
* Support DSK-v3.2
* Support DSK-v3.2
* Support DSK-3.2
* fix CI bug
* fix_CI_BUG
* update ci bug
2025-10-15 14:19:04 +08:00
AIbin
368049673b
Add DeepSeek model end-to-end CI ( #4360 )
...
Add DeepSeek-v3-5layers end-to-end CI
2025-10-11 08:33:37 +08:00
K11OntheBoat
4515ad21e9
Support limit thinking lengths ( #4069 )
...
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com ”>
2025-09-25 19:55:56 +08:00
CSWYF3634076
5ff10c8ced
[Model] Qwen2.5VL support --use-cudagraph and unit testing ( #4087 )
...
* [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix
* [Docs]offine infer add apply_chat_template add_generation_prompt parameter
* [Model]qwen2.5VL support --use-cudagraph
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v2
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v3
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v4
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v5
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v6
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v7
2025-09-24 19:45:01 +08:00
chen
7c1fd19f0f
[OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 ( #4238 )
2025-09-24 16:39:51 +08:00
bukejiyu
62d1c48363
[v1 loader]code style ( #4204 )
...
* code style
* update
2025-09-23 19:36:00 +08:00
luukunn
ee9d8a840a
[fix]Modify follow-up push parameters and Modify the verification method for thinking length ( #4086 )
...
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* add completion_token_ids
* add logger
* fix reasoning_max_tokens ParameterError
* add unittest
* add unittest
* add unittest
* add unittest
* add unittest
* add unit test
2025-09-19 14:26:01 +08:00
chen
4859f40b20
[Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) ( #4051 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-11 20:08:09 +08:00
chen
637d96c6ae
[Feature] Support zai-org/GLM-4.5-Air BF16 model ( #3928 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support glm45_air
2025-09-10 19:36:10 +08:00
qwes5s5
17169a14f2
[metrics] Add serveral observability metrics ( #3868 )
...
* Add several observability metrics
* [wenxin-tools-584] 【可观测性】支持查看本节点的并发数、剩余block_size、排队请求数等信息
* adjust some metrics and md files
* trigger ci
* adjust ci file
* trigger ci
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-08 14:13:13 +08:00
Zhang Yulong
349aa6348b
add cache queue port ( #3904 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add cache queue port
* add cache queue port
* add cache queue port
2025-09-05 21:17:06 +08:00
李泳桦
88297240e7
[feat] completion api supports passing input token ids in either prompt or prompt_token_ids ( #3311 )
...
* [feat] completion api supports passing input token ids in either `prompt` or `prompt_token_ids`
* [fix] update comment
* [fix] fix type error
* [test] add a unittest file for serving api test
* [test] try to fix ci error
* [chore] rename test function names
* [test] try to fix ci error
* [test] try to fix ci error
* [test] add tests for qwen
2025-08-29 14:19:42 +08:00
Yuanle Liu
cbce94a00e
rename ernie_xxx to ernie4_5_xxx ( #3621 )
...
* rename ernie_xxx to ernie4_5_xxx
* ci fix
2025-08-26 19:29:27 +08:00
Sunny-bot1
c68c3c4b8b
[Feature] bad words support v1 scheduler and specifiy token ids ( #3608 )
...
* support bad_words_token_ids
* docs
* fix test
* fix
* bad words support kvcache v1 and token ids
* fix
2025-08-25 20:14:51 -07:00
YUNSHEN XIE
cb166053ba
fix test name ( #3493 )
...
* fix test name
* update
* update
* fix
* fix
* update
* update
* update
* update
* update
* fix
* update
2025-08-22 23:43:47 +08:00