* [INTEL HPU] added HPU CI workflow support
Signed-off-by: Luo, Focus <focus.luo@intel.com>
* [INTEL HPU] added run_ci_hpu test scripts
Signed-off-by: Luo, Focus <focus.luo@intel.com>
* [INTEL HPU] enabled HPU ERNIE test case
Signed-off-by: Luo, Focus <focus.luo@intel.com>
* [INTEL HPU] updated Intel Gaudi README with the warmup-disable command line
Signed-off-by: Luo, Focus <focus.luo@intel.com>
* Modify paddlepaddle installation command
Updated paddlepaddle installation command to use a specific index URL.
* Update run_ci_hpu.sh
* Rename json directory to nlohmann_json
Rename extracted json directory to nlohmann_json.
* Update ci_hpu.yml
* Set pip global index URL to Tsinghua mirror
* Update CI workflow to use self-hosted runner and paths
* Update Docker image in CI workflow
* Modify HPU installation URLs in run_ci_hpu.sh
Updated the installation URL for paddle_intel_hpu and added paddlenlp_ops installation.
* Fix paddle_intel_hpu installation URL
Corrected the URL for paddle_intel_hpu wheel installation.
---------
Signed-off-by: Luo, Focus <focus.luo@intel.com>
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
* Ignore markdown and text files in CI workflow
* Change GPU_ID to XPU_ID in run_ci_xpu.sh
* Change GPU_ID to XPU_ID in test configuration
* Change GPU_ID to XPU_ID for service port calculation
* Change GPU_ID to XPU_ID for device identification
* Change GPU_ID to XPU_ID in test_ep function
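The GPU_ID→XPU_ID changes above all reduce to deriving the device and port from one environment variable. A minimal sketch of that pattern, assuming the XPU_ID variable named in the commits (the base port is illustrative):

```python
import os

# The CI runner assigns this job one accelerator index via XPU_ID (assumed env var).
xpu_id = int(os.environ.get("XPU_ID", "0"))

# Pin every process the test launches to that one device.
os.environ["XPU_VISIBLE_DEVICES"] = str(xpu_id)

# Derive the service port from the same index so parallel jobs on one host
# don't collide (base port is an assumption for illustration).
service_http_port = 8188 + xpu_id
```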
* Update run_w4a8.py
* Redirect stop_processes output to kill.log
Redirect output of stop_processes to kill.log to capture logs.
* Log server output for failed test cases
Added logging of server.log for failed tests.
* Add '-s' option to pytest commands in run_ci_xpu.sh
* Refactor assertion to validate multiple keywords
Updated assertion to check for multiple keywords in response.
* Fix 'assertany' typo to 'assert any' in run_45vl.py
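The multi-keyword assertions these commits describe reduce to an any() over the expected strings; a minimal sketch (function and variable names are illustrative, not the repo's):

```python
def assert_response_contains_any(response: str, keywords: list[str]) -> None:
    # Pass if at least one expected keyword appears in the model's reply,
    # instead of requiring one exact base response.
    assert any(kw in response for kw in keywords), (
        f"none of {keywords} found in response: {response!r}"
    )
```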
* Change CI EP test from offline to online
* Add EP all2all CI changes, moving from offline to online
* Change env var in EP all2all CI test
* Add expected response for EP8TP8 all2all
* Adapt to CI refactoring and support dual-concurrent code execution
* Adapt to CI refactoring and support dual-concurrent execution, part two
* Explicitly specify the port
* change the startup method of all2all
* Modify the command of all2all
* Update assertion to check multiple keywords
* Update assertion to check multiple keywords
* Update run_w4a8.py
* Update run_w4a8.py
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
* Refactor GPU ID logic in CI workflow
Updated GPU ID assignment logic and removed unused port calculations.
* Refactor GPU device and port configuration
* Update engine_worker_queue_port calculation logic
* Refactor XPU_VISIBLE_DEVICES export logic
* Adjust service port based on GPU ID
* Adjust service HTTP port based on GPU ID
* Adjust service_http_port based on GPU_ID
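The port refactors above apply the same device-indexed offset to every service the test starts; a sketch under the assumption that each device index gets its own block of ports (bases and stride are invented for illustration, not the CI scripts' real values):

```python
import os

# Each CI job owns one device index (assumed to arrive via GPU_ID).
device_id = int(os.environ.get("GPU_ID", "0"))

# Give every job its own block of ports so neighbors never overlap.
stride = 100  # assumed gap between jobs
service_http_port = 8188 + device_id * stride
engine_worker_queue_port = 6677 + device_id * stride  # base value is an assumption
```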
* Add import for os module in run_45T.py
* Update run_45vl.py
* Import os module in run_w4a8.py
Added import for os module to use environment variables.
* Remove duplicate import of os module
* Remove duplicate import of os module
* Update run_45T.py
* Update run_w4a8.py
* fix bug
* fix bug
* Update run_w4a8.py
* Fix directory change command in run_ci_xpu.sh
* Refactor test_45t by commenting out responses
Comment out base response variables and update assertion.
* Update run_w4a8.py
* Fix assertion syntax in run_45T.py
* Update XPU_VISIBLE_DEVICES and model parameters
* Update base response and adjust max tokens
* Implement process cleanup in CI workflow
Add process cleanup commands to prevent port conflicts
* Remove process cleanup commands from CI workflow
Removed the old process cleanup commands that had been added to prevent port conflicts.
* Enhance CI script with health checks and logging
Updated the CI script to include health checks and logging for the VL model testing process.
* Add test for OpenAI chat completions
* Refactor chat completion user message structure
* Fix variable name for exit code in CI script
* Update text prompt to Chinese for artifact question
* Update service port and response assertions in tests
* Refactor assertion for response content comparison
* Update run_45vl.py
* Change service HTTP port from 8123 to 8188
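A sketch of the kind of OpenAI-client test these commits add, pointed at the locally served endpoint (port 8188 comes from the commit above; the model name, api_key, prompt, and expected keywords are placeholders):

```python
from openai import OpenAI

def test_chat_completion():
    # FastDeploy serves an OpenAI-compatible API; point the client at it.
    client = OpenAI(base_url="http://localhost:8188/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="default",  # placeholder; the CI config supplies the real model
        messages=[{"role": "user", "content": "介绍一下北京的文物古迹"}],  # hypothetical prompt
        max_tokens=64,
    )
    content = resp.choices[0].message.content
    # Accept any of several expected keywords rather than one exact string.
    assert any(kw in content for kw in ["故宫", "长城"]), content  # hypothetical keywords
```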
* add xpu ci case
* Add xDeepEP download and build steps
Download and build xDeepEP before running tests.
* Fix formatting and add missing sleep command
* Update Docker image version in CI workflow
* Modify run_ci_xpu.sh for log cleanup and error handling
Clean up log files before running tests and output worker log on failure.
* Enhance test_ep.py with process management and assertions
Refactor test function to include process cleanup and assertions.
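The process-management refactor pairs a best-effort cleanup with the response assertions; a sketch (the pkill pattern is an assumption, and the kill.log redirect mirrors the earlier commits):

```python
import subprocess

def stop_processes(pattern: str = "fastdeploy") -> None:
    # Best-effort kill of leftover servers so their ports are free for the
    # next case; capture the output in kill.log as the earlier commits do.
    with open("kill.log", "a") as log:
        subprocess.run(["pkill", "-f", pattern], check=False, stdout=log, stderr=log)
```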
* Replace test_fastdeploy_llm with test_fd_ep
* Fix conditional statement in run_ci_xpu.sh
* Update test_ep.py for string handling and formatting
Fix string encoding issues and improve readability.
* Rename test_ep.py to run_ep.py
* Change test script from test_ep.py to run_ep.py
* add start intercept
* Adjust GraphOptConfig
* pre-commit
* Use cudagraph by default
* set default value
* Use CUDA graph by default
* pre-commit
* fix test case bug
* disable rl
* Fix MoBA attention
* only support gpu
* Temporarily disable PD Disaggregation
* set max_num_seqs of test case as 1
* set max_num_seqs and temperature
* fix max_num_batched_tokens bug
* Disable CUDA graph
* Successfully run wint2
* profile run with max_num_batched_tokens
* 1. Add C++ memchecker 2. Successfully run wint2
* Update A800 YAML
* update docs
* 1. delete check 2. fix plas attn test case
* Enable use_unique_memory_pool by default
* add try-except for warmup
* Disable MTP, MM, and RL
* fix test case mock
* fix ci bug
* fix form_model_get_output_topp0 bug
* fix ci bug
* refine deepseek ci
* refine code
* Disable PD
* fix sot yaml
* Update assertions for response content in test_45t
fix XPU CI bug
* Comment out base_response print statement
Comment out the print statement for base_response.
* Refactor assertion for clarity in run_45T.py
* Add blank line before main function call
* [BugFix] Fix qwen2.5vl enable_thinking=true and image_patch_id bugs
* [Docs] Offline inference: add apply_chat_template add_generation_prompt parameter
* [Model] qwen2.5VL support --use-cudagraph
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v2
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v3
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v4
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v5
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v6
* [Model] qwen2.5VL support --use-cudagraph buffer and qwenvl test v7
* add glm45_air logprob test
* add GLM rollout model and PretrainedModel for RL
* add GLM rollout model and test
* check
* delete cudagraph in glm45
* add UT for glm rollout model
* revert glm UT
* change xpu ci model
* change xpu ci model
* change xpu ci model
* change xpu ci model
* Update model path and XPU settings in run_ci_xpu.sh
* Increase health check timeout to 10 minutes
Increased the timeout duration for health checks from 5 minutes to 10 minutes in two places.
* Implement test for OpenAI chat completion
Add a test function for the OpenAI client chat response.
* Change script to use pytest for running tests
* Update health check timeout to 15 minutes
Increase the timeout for health checks from 10 minutes to 15 minutes.
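The health-check commits poll the server until it answers or the deadline passes; a sketch of that loop (the 15-minute deadline matches the commit above, while the endpoint path and poll interval are assumptions):

```python
import time
import requests

def wait_for_health(url: str, timeout_s: int = 15 * 60) -> None:
    # Poll until the health endpoint returns 200 or the deadline expires.
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            if requests.get(url, timeout=5).status_code == 200:
                return
        except requests.RequestException:
            pass  # server not up yet; keep waiting
        time.sleep(10)
    raise TimeoutError(f"{url} did not become healthy within {timeout_s}s")
```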
* Add pytest installation to CI script
* Modify base response in test_45t function
Updated the base response message for the test.
* Add V0 and V1 mode test echo statements
---------
Co-authored-by: root <root@yq01-inf-hic-k8s-a100-aa24-0591.yq01.baidu.com>
* Add several observability metrics
* [wenxin-tools-584] [Observability] Support viewing this node's concurrency, remaining block_size, queued request count, and other information
* adjust some metrics and md files
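A sketch of what the added node-level metrics could look like with prometheus_client (the metric names are invented; the quantities, concurrency, remaining blocks, and queued requests, come from the commit above):

```python
from prometheus_client import Gauge

# Hypothetical gauges for the node-level quantities the commit lists.
concurrent_requests = Gauge(
    "fastdeploy_concurrent_requests", "Requests currently being served on this node"
)
available_blocks = Gauge(
    "fastdeploy_available_blocks", "Remaining KV-cache blocks on this node"
)
queued_requests = Gauge(
    "fastdeploy_queued_requests", "Requests waiting in this node's scheduling queue"
)
```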
* trigger ci
* adjust ci file
* trigger ci
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>