28 Commits

Author SHA1 Message Date
YuBaoku
7cee8030af [CI] Disable unstable test jobs and cases (#4799)
[CI] Disable unstable test jobs and cases
2025-11-05 10:28:53 +08:00
ApplEOFDiscord
52a6e0be41 [Cherry-Pick] add mm token usage (#4648)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [Feature] add mm token usage (#4570)

* add mm token usage

* fix unit test

* fix unit test

* fix unit test

* fix model path

* fix unit test

* fix unit test

* fix unit test

* remove uncomment

* change var name

* fix code style

* fix code style

* fix code style

* fix code style

* fix unit test

* update doc

* update doc

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-10-30 09:58:07 +08:00
SunLei
2a9ed72533 feat: add support for API usage with multimodal models (#4548)
* feat: add support for API usage with multimodal models

* completion_tokens contains num_image_tokens

* remove test_request.py

* fix: paddle.device.is_compiled_with_cuda()

* fix test_unstream_without_logprobs
2025-10-28 20:23:46 +08:00
Zhang Yulong
b4014834a9 Extend sleep time to 10 seconds in switch_service (#4618)
Increase sleep duration before switching services.
2025-10-28 15:19:21 +08:00
kxz2002
327fa4c255 [DataProcessor] add reasoning_tokens into usage info (#4520)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* add reasoning_tokens into usage info initial commit

* add unit tests

* modify unit test

* modify and add unit tests

* fix unit test

* move steam usage to processor

* modify processor

* modify test_logprobs

* modify test_logprobs.py

* modify stream reasoning tokens accumulation

* fix unit test
2025-10-25 16:57:58 +08:00
RAM
8a02ab43a8 [FDConfig]Turn on the CUDAGraph + RL switch (#4508)
* Turn on the CUDAGraph + RL switch

* reduce max_num_seqs and number of request
2025-10-23 11:08:07 +08:00
Divano
2b53c4d684 【CI】Add test cases for n parameter and streaming validation (#4503)
* add repitation early stop cases

* add repitation early stop cases

* add structure test openai

* add n parameters case
2025-10-21 15:33:29 +08:00
RAM
775edcc09a [Executor] Default use CUDAGraph (#3594)
* add start intercept

* Adjustment GraphOptConfig

* pre-commit

* default use cudagraph

* set default value

* default use cuda graph

* pre-commit

* fix test case bug

* disable rl

* fix moba attention

* only support gpu

* Temporarily disable PD Disaggregation

* set max_num_seqs of test case as 1

* set max_num_seqs and temperature

* fix max_num_batched_tokens bug

* close cuda graph

* success run wint2

* profile run with max_num_batched_tokens

* 1.add c++ memchecker 2.success run wint2

* updatee a800 yaml

* update docs

* 1. delete check 2. fix plas attn test case

* default use use_unique_memory_pool

* add try-except for warmup

* ban mtp, mm, rl

* fix test case mock

* fix ci bug

* fix form_model_get_output_topp0 bug

* fix ci bug

* refine deepseek ci

* refine code

* Disable PD

* fix sot yaml
2025-10-21 14:25:45 +08:00
kxz2002
b5b993e48e 【feature】support n parameter (#4273)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support n parameter

* pre-commit check

* pre-commit check

* restore format_and_add_data

* update n_param

* bug fix index - str to int

* bug fix del child_task

* bug fix metrics

* add debug info

* add debug info2

* remove debug info

* change connecting symbol to '-'

* bugfix change connecting symbol

* bugfix change connecting symbol2

* unit tests fix

* unit test fix2

* unittest add param n=2

* n param add unit tests and adapt to echo

* pre-commit fix

* resolve review

* adjust stop reason

* add unittest for _create_chat_completion_choice

* modify unittest

* solve confict

* solve conflict

* resolve conflict

---------

Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com>
Co-authored-by: gaoziyuan <m13689897706@163.com>
2025-10-17 20:51:59 +08:00
Ryan
49cea8fb1c [SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)
* rm inplace info && to(gpu)

* update append_attention

* unpin paddle version

* add full_cuda_graph=False

* add blank line

---------

Co-authored-by: SigureMo <sigure.qaq@gmail.com>
2025-10-17 10:57:55 +08:00
LiqinruiG
4251ac5e95 【Fix】 remove text_after_process & raw_prediction (#4421)
* remove text_after_process &  raw_prediction

* remove text_after_process &  raw_prediction
2025-10-16 19:00:18 +08:00
yinwei
20c7b741f4 [XPU] Support W4A8C8-TP4-300B Model (#4068)
* support w4a8

* delete ep block attn

* delete moe_topk_select

* update note

* update

* delte useless info

* update

* add some note

* fix some format

* update scale info

* add ans baseline

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-10-10 15:41:32 +08:00
co63oc
c4830ef24c fix typos (#4176)
* fix typos

* fix
2025-09-22 14:27:17 +08:00
Divano
0b62648924 test xly ci 2025-09-22 14:13:00 +08:00
Divano
8e49d99009 Addcase (#4112)
logprob 没跑,不影响,增加校验openai 异常情况下 错误输出格式字段的case
2025-09-16 16:12:14 +08:00
xiaolei373
9ac539471d [format] Valid para format error info (#4035)
* feat(log):add_request_and_response_log

* 报错信息与OpenAI对齐
2025-09-12 19:05:17 +08:00
Zhang Yulong
2359c8d21c update ci (#3962)
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-09-09 10:09:13 +08:00
Zhang Yulong
349aa6348b add cache queue port (#3904)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add cache queue port

* add cache queue port

* add cache queue port
2025-09-05 21:17:06 +08:00
Divano
17731a8acd add concurrency cases (#3689) 2025-08-28 18:30:19 +08:00
ltd0924
3d92fb09f7 [BugFix] fix parameter is 0 (#3592)
* Update engine_client.py

* fix

* Update common_engine.py
2025-08-28 09:52:36 +08:00
xjkmfa
afb9f327ef 【CI case】for echo finish_reason text_after_process and raw_prediction check (#3630)
* Add ci case for min token and max token

* 【CI case】include total_tokens in the last packet of completion interface stream output

* echo&finish_reason&text_after_process&raw_prediction check

* echo&finish_reason&text_after_process&raw_prediction check

* echo&finish_reason&text_after_process&raw_prediction check

* echo&finish_reason&text_after_process&raw_prediction check

* echo&finish_reason&text_after_process&raw_prediction check

---------

Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-27 15:21:16 +08:00
Ryan
a5b4866ff1 [CudaGraph][SOT] Add unit tests for splitting the static graph into piecewise graphs that support cuda_graph (#3590)
* add unitest

* change sot_warmup_sizes

* wtf; add missed commit
2025-08-26 11:25:04 +08:00
chen
9cab3f47ff [Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)
* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing

* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs

* delete some code

* code check

* code check and add doc

* fix tokenizer.decoder(-1), return 'Invalid Token'

* add ci for temp_scaled and top_p logprobs

* check test

* check seq len time shape

* logprob clip inf

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
2025-08-25 14:11:49 +08:00
YuBaoku
7821534ff5 [CI] add sot test (#3579)
* [CI] add sot test

* [CI] add sot test
2025-08-25 10:14:50 +08:00
Zhang Yulong
3cc182236a update ci (#3519)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-21 20:05:50 +08:00
Yzc216
466cbb5a99 [Feature] Models api (#3073)
* add v1/models interface related

* add model parameters

* default model verification

* unit test

* check model err_msg

* unit test

* type annotation

* model parameter in response

* modify document description

* modify document description

* unit test

* verification

* verification update

* model_name

* pre-commit

* update test case

* update test case

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/entrypoints/openai/serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-21 17:02:56 +08:00
Zhang Yulong
b7eee3aec1 Update CI (#3474)
* update CI cases

* update CI cases

* update CI cases

* update CI cases

* Merge upstream/develop and resolve directory rename conflict

* Merge upstream/develop and resolve directory rename conflict

* Merge upstream/develop and resolve directory rename conflict

* update deploy

* update deploy

* update deploy

* update deploy

* update deploy
2025-08-21 16:49:20 +08:00
YUNSHEN XIE
3a6058e445 Add stable ci (#3460)
* add stable ci

* fix

* update

* fix

* rename tests dir;fix stable ci bug

* add timeout limit

* update
2025-08-20 08:57:17 +08:00