Commit Graph

3910 Commits

Author SHA1 Message Date
YuBaoku
0b0f2e320e [CI] Unified diff coverage upload logic (#5127)
* [CI] fix diff_coverage_report upload
2025-11-21 10:50:57 +08:00
kevin
7454480e07 [Feature] support bos download retry (#5137)
* support bos download retry

* update code

* update code
2025-11-21 10:18:32 +08:00
Yonghua Li
43097a512a [BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132)
* [fix] fix v1 scheduler profile run for append attention in prefill node

* [fix] skip send_signal if kv signal not inited for gpu and xpu

* [fix] extend fix to flash_attn & mla_attn

* [fix] fix v1 pd run in ipc transfer protocol

* [ci] add test for v1 pd profile run using ipc transfer protocol

* [style] fix code style check

* [style] fix code style again

* [fix] fix profile run

* [update] remove --num-gpu-blocks-override in example script

* [chore] rename forward_meta is_profiling to is_dummy_or_profile_run
2025-11-20 21:39:22 +08:00
Juncai
01c30f6b87 Fix schedule error in splitwise deployment (#5149) 2025-11-20 21:18:10 +08:00
Jundong Liu
147b2e5eb0 [BugFix] Fix zero workspace returned by CUB size query under CUDA Graph in MoE dispatch (#5087)
* fix bug about CubKeyValueSorter::run

* pre-commit and add comment

* pre-commit

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix precommit

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-11-20 20:00:29 +08:00
Ryan
0857099191 mv import (#5146) 2025-11-20 19:25:56 +08:00
Jiaxin Sui
c3994750b1 [CI][XPU] Add XPU chunked_prefill && prefix_caching case (#5139)
2025-11-20 18:51:50 +08:00
周周周
385fe6dade [Others] clean code (#5133) 2025-11-20 18:44:08 +08:00
Yuanle Liu
7ac25935c7 [Optimization] default compile rdma, reduce cudagraph buffer size in mm, fix some config bug (#5121)
* default compile rdma, reduce cudagraph buffer size in mm, fix some config logic

* update

* update

* fix bug

* enhance rdma compile

* fix
2025-11-20 17:19:47 +08:00
周周周
6fa34102e8 [Others]get_block_shape_and_split_kv_block clean code (#5123) 2025-11-20 16:40:04 +08:00
yangjianfengo1
af715db763 [Scheduler] Support chunk prefill for video input (#5107)
* add video chunk prefill

* add vit_merge=True for test_tokenizer_client.py

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-11-20 16:29:13 +08:00
Neil Zhu
0edda75a56 [Metax] optimize cutlass moe and flash attention backend (#5128) 2025-11-20 16:12:35 +08:00
freeliuzc
f1e36ff2f7 [Speculative Decoding][MTP]Support stop_seqs and pd-split mode (#5029)
* support multi_stop_seqs in speculative decoding

* support mtp tp with ep split

* fix custom op register

* fix spec stop_seqs params
2025-11-20 15:26:01 +08:00
plusNew001
3e3558f492 [HPU][CI]Hpu ci update (#5116)
* Update Docker image in CI workflow

* Update pip configuration and uninstall packages

Set pip global index URL to Tsinghua mirror and uninstall PaddleCustomDevice and fastdeploy.
2025-11-20 14:12:52 +08:00
YuBaoku
e021048318 [CI] Temporarily lock paddlepaddle-gpu as of 20251118 (#5136) 2025-11-20 11:15:08 +08:00
kevin
109d48e456 [Feature] support async download features (#5003)
* support async download features

* add test case

* update code
2025-11-19 22:23:36 +08:00
Sunny-bot1
bde97e09f7 support dynamic activation quant for w4afp8 (#5117) 2025-11-19 21:11:16 +08:00
YuBaoku
2716da4220 [CI] Add workflow to auto-remove skip-ci labels after new commits (#5129)
* [CI] Add CI to automatically remove skip-ci labels after new commits
2025-11-19 19:22:06 +08:00
LiqinruiG
a5cd7c9039 [BugFix] rollback max_tokens and min_tokens when continue to infer (#5082)
* [BugFix] rollback  max_tokens and min_tokens when continue to infer

* [BugFix] rollback  max_tokens and min_tokens when continue to infer

* [fix] add more logger info:  max_tokens

---------

Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-19 18:43:42 +08:00
Sunny-bot1
43f0c7557e [Feature] Add an unquantized option for MoE and Dense quant type (#4813) 2025-11-19 16:24:03 +08:00
chen
9ff418db73 check METAX_GPU (#5114) 2025-11-19 16:02:21 +08:00
tianlef
de43577a7c [Docs] add ebvlthinking yaml (#5120) 2025-11-19 15:27:46 +08:00
megemini
3c8c0f0d6c 【Hackathon 9th No.109】[CppExtension] [XPU] Support build Custom OP in setuptools 80+ -part (#5106)
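* [CppExtension] Add compatibility support for modern Python packaging methods

* [CppExtension] Remove the error-exit logic from the build script

* [CppExtension] Remove the modern Python packaging compatibility code, keeping only the traditional packaging approach

* [CppExtension] Restore modern Python packaging compatibility support and optimize the directory detection logic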
2025-11-19 13:33:39 +08:00
Zhang Yulong
be9541a97b [CI] add metrics case (#5115)
* add case

* add case
2025-11-19 11:50:12 +08:00
YuBaoku
24e9e2d9c8 [CI]Exclude abstract methods and irrelevant backend files (#5031) 2025-11-19 11:48:28 +08:00
bukejiyu
a82f25ea7b [RL]Resolve shape mismatch problems in RL-related modules (#5032)
* RL fix

* update
2025-11-19 11:12:48 +08:00
Winters Montagne
4694ed2a43 [CI]【Hackathon 9th Sprint No.31】No.31 functional module fastdeploy/input/ernie4_5_processor.py: supplementary unit tests (#5097)
* Add unit tests for ernie4_5_processor

* update

* update
2025-11-19 10:51:02 +08:00
qw86972190
857d152464 [XPU][Docs]Update document (#5091)
2025-11-19 10:20:14 +08:00
Daci
eab8384da6 [Feature] ThreadPoolExecutor async fill_token_bitmask (#5083)
* ThreadPoolExecutor async fill_token_bitmask

* ThreadPoolExecutor async fill_token_bitmask logging

* fix test_guided_decoding

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* add fill_bitmask_parallel_batch_size ENV

* FD_FILL_BITMASK_BATCH fastdeploy.envs

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-19 10:04:16 +08:00
K11OntheBoat
4a7739ec0b Fix dummy run when use PD Disaggregation with EP inference. (#5112)
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>
2025-11-18 21:09:30 +08:00
plusNew001
7fdc920a01 [HPU][CI]Update Docker image in CI workflow (#5108) 2025-11-18 20:43:19 +08:00
kxz2002
97189079b9 [BugFix] unify max_tokens (#4968)
* unify max tokens

* modify and add unit test

* modify and add unit test

* modify and add unit tests

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-11-18 20:01:33 +08:00
xiaolei373
3d7f1a843e [Docs]fix_cli_docs (#5109)
2025-11-18 17:56:12 +08:00
周周周
6584ee90e8 [unitest]clean code (#5094) 2025-11-18 17:21:35 +08:00
lizhenyun01
d11235333e format flash_mask_attn 2025-11-18 17:18:12 +08:00
lizhenyun01
cd2c4df64a format flash_mask_attn 2025-11-18 17:18:12 +08:00
YuBaoku
d0b3bec585 Revert "[CI] Temporarily lock paddlepaddle-gpu as of 20251112 (#5017)" (#5098)
This reverts commit 91d34c2e35.
2025-11-18 14:17:09 +08:00
yzwu
d5d0602859 [Iluvatar][CI] disable compiling cudaLaunch API (#5100) 2025-11-18 14:15:31 +08:00
MingkunZhang
a36c958c66 [Metax] support default_v1 loader based #4988 (#5001)
2025-11-18 09:44:30 +08:00
YuBaoku
5d7516dc8c [CI] Enable check_pr_template in CI rerun (#5093)
* [CI] Drop checklist section in PR template check

* [CI] Enable check_pr_template in CI rerun
2025-11-17 22:34:38 +08:00
Echo-Nie
abc9fd31c7 【Hackathon 9th No.76】supplementary unit test for XGrammarChecker (#4075)
* supplementary unit test for XGrammarChecker

* mock the xgrammer,torch
2025-11-17 22:05:53 +08:00
chen
d58c1db8a0 [Feature][OP] Append Attn Support CUDA-PDL (#5072) 2025-11-17 20:47:33 +08:00
FocusLuo
c2c1942db9 [INTEL_HPU] [CI] enabled fastdeploy PR testing (#4596)
* [INTEL HPU] added hpu ci work flow support

Signed-off-by: Luo, Focus <focus.luo@intel.com>

* [INTEL HPU] added run ci hpu test scripts

Signed-off-by: Luo, Focus <focus.luo@intel.com>

* [INTEL HPU] enabled HPU ernie test case

Signed-off-by: Luo, Focus <focus.luo@intel.com>

* [INTEL HPU] updated Intel Gaudi Readme with Warmup disable cmdline

Signed-off-by: Luo, Focus <focus.luo@intel.com>

* Modify paddlepaddle installation command

Updated paddlepaddle installation command to use a specific index URL.

* Update run_ci_hpu.sh

* Rename json directory to nlohmann_json

Rename extracted json directory to nlohmann_json.

* Update ci_hpu.yml

* Set pip global index URL to Tsinghua mirror

* Update CI workflow to use self-hosted runner and paths

* Update Docker image in CI workflow

* Modify HPU installation URLs in run_ci_hpu.sh

Updated the installation URL for paddle_intel_hpu and added paddlenlp_ops installation.

* Fix paddle_intel_hpu installation URL

Corrected the URL for paddle_intel_hpu wheel installation.

---------

Signed-off-by: Luo, Focus <focus.luo@intel.com>
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
2025-11-17 19:24:41 +08:00
周周周
b23e684b67 revert group size 3 (#5079) 2025-11-17 18:54:13 +08:00
SunLei
d9f64adb0e fix: Fix block allocation issue when MTP and logprobs are enabled (#5077)
2025-11-17 17:50:07 +08:00
Sunny-bot1
8a4ddb29df Revert "[BugFix] Revert skip capture (#5023)" (#5080) 2025-11-17 16:14:55 +08:00
plusNew001
7f94d77e08 [XPU][CI] fix ci case bug (#5084)
* Ignore markdown and text files in CI workflow

* Change GPU_ID to XPU_ID in run_ci_xpu.sh

* Change GPU_ID to XPU_ID in test configuration

* Change GPU_ID to XPU_ID for service port calculation

* Change GPU_ID to XPU_ID for device identification

* Change GPU_ID to XPU_ID in test_ep function

* Update run_w4a8.py

* Redirect stop_processes output to kill.log

Redirect output of stop_processes to kill.log to capture logs.

* Log server output for failed test cases

Added logging of server.log for failed tests.

* Add '-s' option to pytest commands in run_ci_xpu.sh

* Refactor assertion to validate multiple keywords

Updated assertion to check for multiple keywords in response.

* Fix assertany to assert any in run_45vl.py
2025-11-17 16:01:27 +08:00
fmiao2372
74f33efdbf [Intel HPU] fix bugs caused by other commits (#5074)
* [Intel HPU] fix bugs caused by other commits

* update code by copilot
2025-11-17 15:28:55 +08:00
LiqinruiG
33f96ff93a [BugFix] rollback max_tokens and min_tokens when continue to infer (#5052)
Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-17 14:31:26 +08:00
Winters Montagne
ff26158f20 Add unit tests for triton_utils_v2 (#5073) 2025-11-17 11:46:38 +08:00