YuBaoku
5d7516dc8c
[CI] Enable check_pr_template in CI rerun ( #5093 )
...
* [CI] Drop checklist section in PR template check
* [CI] Enable check_pr_template in CI rerun
2025-11-17 22:34:38 +08:00
Echo-Nie
abc9fd31c7
【Hackathon 9th No.76】supplementary unit test for XGrammarChecker ( #4075 )
...
* supplementary unit test for XGrammarChecker
* mock xgrammar and torch
2025-11-17 22:05:53 +08:00
chen
d58c1db8a0
[Feature][OP] Append Attn Support CUDA-PDL ( #5072 )
2025-11-17 20:47:33 +08:00
FocusLuo
c2c1942db9
[INTEL_HPU] [CI] enabled fastdeploy PR testing ( #4596 )
...
* [INTEL HPU] added hpu ci work flow support
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] added run ci hpu test scripts
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] enabled HPU ernie test case
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* [INTEL HPU] updated Intel Gaudi Readme with Warmup disable cmdline
Signed-off-by: Luo, Focus <focus.luo@intel.com >
* Modify paddlepaddle installation command
Updated paddlepaddle installation command to use a specific index URL.
* Update run_ci_hpu.sh
* Rename json directory to nlohmann_json
Rename extracted json directory to nlohmann_json.
* Update ci_hpu.yml
* Set pip global index URL to Tsinghua mirror
* Update CI workflow to use self-hosted runner and paths
* Update Docker image in CI workflow
* Modify HPU installation URLs in run_ci_hpu.sh
Updated the installation URL for paddle_intel_hpu and added paddlenlp_ops installation.
* Fix paddle_intel_hpu installation URL
Corrected the URL for paddle_intel_hpu wheel installation.
---------
Signed-off-by: Luo, Focus <focus.luo@intel.com >
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-17 19:24:41 +08:00
周周周
b23e684b67
revert group size 3 ( #5079 )
2025-11-17 18:54:13 +08:00
SunLei
d9f64adb0e
fix: Fix block allocation issue when MTP and logprobs are enabled ( #5077 )
2025-11-17 17:50:07 +08:00
Sunny-bot1
8a4ddb29df
Revert "[BugFix] Revert skip capture ( #5023 )" ( #5080 )
2025-11-17 16:14:55 +08:00
plusNew001
7f94d77e08
[XPU][CI] fix ci case bug ( #5084 )
...
* Ignore markdown and text files in CI workflow
* Change GPU_ID to XPU_ID in run_ci_xpu.sh
* Change GPU_ID to XPU_ID in test configuration
* Change GPU_ID to XPU_ID for service port calculation
* Change GPU_ID to XPU_ID for device identification
* Change GPU_ID to XPU_ID in test_ep function
* Update run_w4a8.py
* Redirect stop_processes output to kill.log
Redirect output of stop_processes to kill.log to capture logs.
* Log server output for failed test cases
Added logging of server.log for failed tests.
* Add '-s' option to pytest commands in run_ci_xpu.sh
* Refactor assertion to validate multiple keywords
Updated assertion to check for multiple keywords in response.
* Fix assertany to assert any in run_45vl.py
2025-11-17 16:01:27 +08:00
fmiao2372
74f33efdbf
[Intel HPU] fix bugs caused by other commits ( #5074 )
...
* [Intel HPU] fix bugs caused by other commits
* update code by copilot
2025-11-17 15:28:55 +08:00
LiqinruiG
33f96ff93a
[BugFix] rollback max_tokens and min_tokens when continue to infer ( #5052 )
...
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-17 14:31:26 +08:00
Winters Montagne
ff26158f20
Add unit tests for triton_utils_v2 ( #5073 )
2025-11-17 11:46:38 +08:00
megemini
c35e540c18
【Hackathon 9th No.109】[CppExtension] Support build Custom OP in setuptools 80+ ( #4977 )
...
* Add compatibility support for modern Python packaging methods
* [CppExtension] Optimize build script logic and update .gitignore
2025-11-17 11:46:27 +08:00
Winters Montagne
02c83d65db
[CI]【Hackathon 9th Sprint No.13】NO.13 Supplementary unit tests for functional module fastdeploy/model_executor/ops/triton_ops/triton_utils.py ( #5035 )
...
* Add unit tests for triton_utils.py
* update name
* update
* update
* update
2025-11-17 11:43:31 +08:00
qwes5s5
36216e62f0
[Log] Add trace log and add loggingInstrumentor tool ( #4692 )
...
* add trace logger and trace print
* trigger ci
* fix unittest
* translate notes and add copyright
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-17 11:08:57 +08:00
zhouchong
5444af6ff6
[APIServer] metrics use port the same as api_port ( #5016 )
...
* metrics use port the same as api_port
* Be tolerant to tests that monkeypatch/partially mock args.
* Reduce code redundancy
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-17 10:42:45 +08:00
xiaozude
68f638f8b9
[Metax] support default_v1 loader and quant_config is None for triton moe ( #5030 )
2025-11-17 10:38:00 +08:00
yangjianfengo1
3afb717995
【Fix】fix deepep dispatch ( #5036 )
...
* fix dispatch
* fix dispatch
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
2025-11-17 10:34:01 +08:00
yzwu
3b80a799ab
[Iluvatar][CI] Fix moe_expert_dispatch cannot support dequant_scale ( #5012 )
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-17 10:18:42 +08:00
fmiao2372
e43a5fc055
[Intel HPU] enable level 1 prefix caching and fix some bugs ( #4971 )
...
* [Intel HPU] enable prefix caching and dense tp moe ep and fix some bugs
* update code by copilot
* remove dense tp and moe ep code
2025-11-14 19:42:50 +08:00
plusNew001
0e819cd596
[CI][XPU] Optimize CI logs and variable names ( #5025 )
...
* Ignore markdown and text files in CI workflow
* Change GPU_ID to XPU_ID in run_ci_xpu.sh
* Change GPU_ID to XPU_ID in test configuration
* Change GPU_ID to XPU_ID for service port calculation
* Change GPU_ID to XPU_ID for device identification
* Change GPU_ID to XPU_ID in test_ep function
* Update run_w4a8.py
* Redirect stop_processes output to kill.log
Redirect output of stop_processes to kill.log to capture logs.
* Log server output for failed test cases
Added logging of server.log for failed tests.
* Add '-s' option to pytest commands in run_ci_xpu.sh
2025-11-14 19:35:35 +08:00
Jiang-Jia-Jun
d41cf643f8
Update nvidia_gpu.md
2025-11-14 18:26:08 +08:00
Jiang-Jia-Jun
692d69229b
Update nvidia_gpu.md
2025-11-14 18:17:32 +08:00
Daci
5fc12eddfe
[Optimization] xgrammar async compile, multi thread, speed up ( #4835 )
...
* xgrammar async compile, multi thread, speed up
* fix test_sampler.py & pre-commit err
* add redis version check && fix request.llm_engine_recv_req_timestamp
* xgrammar prefill & decode & v0
* fix test_gpu_prompt_logprobs.py
* add test_guided_decoding.py
* Update fastdeploy/scheduler/splitwise_scheduler.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix torch xgrammar unittest env
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-14 18:05:26 +08:00
Winters Montagne
b925533051
add test_process_video.py ( #5011 )
2025-11-14 17:23:30 +08:00
chen
544ea9cbc2
check max_logprobs ( #5018 )
2025-11-14 17:18:06 +08:00
Sunny-bot1
249feca65a
[BugFix] Revert skip capture ( #5023 )
...
* Revert "[BugFix][Metax] Fix metax compile issue in get_block_shape_and_split_kv_block (#5000)"
This reverts commit 05da8e34c0.
* Revert "skip DtoH capture (#4988)"
This reverts commit 5b24013d46.
2025-11-13 23:52:51 -08:00
周周周
51b1f13547
[Executor]move batch_id_per_token ( #4853 )
2025-11-14 15:38:48 +08:00
周周周
c0a4393d72
[ATTENTION] unit test ( #4962 )
2025-11-14 13:45:53 +08:00
YuBaoku
91d34c2e35
[CI] Temporarily lock paddlepaddle-gpu as of 20251112 ( #5017 )
2025-11-14 11:55:25 +08:00
Echo-Nie
ee1ea43e36
[Docs] Fix broken commitID ( #5008 )
...
* fix commitID
* Update nvidia_gpu.md
2025-11-14 10:39:41 +08:00
essos
191a597d9f
[CI]【Hackathon 9th Sprint No.56】NO.56 Supplementary unit tests for functional module fastdeploy/multimodal/utils.py ( #4954 )
...
* update test utils
* update test utils code
* update test file name
2025-11-14 10:37:27 +08:00
Juncai
36822fa49c
[PD Disaggregation] remove splitwise deployment on single node and refine the code ( #4891 )
...
* remove splitwise deployment on single node and refine the code
* up
* up
* up
* add test
* up
2025-11-14 09:56:53 +08:00
kxz2002
9703108c28
[BugFix] adjust max_tokens and min_tokens when continue to generate tokens ( #5010 )
...
* fix max and min tokens initial commit
* fix double subtraction
* add unit tests
2025-11-13 23:52:54 +08:00
carryyu
6c3d1da62f
fix conflicts
2025-11-13 20:30:29 +08:00
yangjianfengo1
ae7bee8122
【New Feature】W4afp8 supports per group quantization ( #4987 )
...
* w4afp8 supports per group
* code style
* fix transpose
* revert fast hadamard
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:17:27 +08:00
Echo-Nie
a5e949d9d0
[Feature] Enhance build script, add pre_wheel logic ( #4729 )
...
* Enhance build script, add pre_wheel logic
Updated copyright year and added precompiled wheel installation logic.
* update the nvidia_gpu.md, add pre_wheel description
* fix zh .md
* update the url, automatically detect CUDA and SM
* Fix GPU architecture string formatting in build.sh
* Change default for FD_USE_PRECOMPILED to 0
* fix build.sh
* add ./dist, pre-wheel path
* simplify the process, just save the whl
* del pre_wheel dir
* fix function name, extract_ops_from_precompiled_wheel
* fix docs
* add default commitID in docs
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:03:52 +08:00
Sunny-bot1
05da8e34c0
[BugFix][Metax] Fix metax compile issue in get_block_shape_and_split_kv_block ( #5000 )
...
* fix metax compile
* fix
2025-11-13 00:55:06 -08:00
zccjjj
88da9d9788
[XPU] [CI] Change CI ep test from offline to online ( #4885 )
...
* change CI ep test from offline to online
* add ep all2all ci's changes, from offline to online
* change env var in ep-all2all ci test
* add expected response for ep8tp8 all2all
* Adapt to CI refactoring and support dual-concurrent code execution
* Adapt to CI refactoring and support dual-concurrent, second
* Explicitly specify the #port
* change the startup method of all2all
* Modify the command of all2all
* Update assertion to check multiple keywords
* Update assertion to check multiple keywords
* Update run_w4a8.py
* Update run_w4a8.py
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 16:15:45 +08:00
bukejiyu
4a0d881e15
update ( #4985 )
2025-11-13 15:58:01 +08:00
周周周
6c4ebc5fee
[worker_process.py]modify some var name ( #4749 )
2025-11-13 14:21:27 +08:00
Yonghua Li
6c5ab727c1
[BugFix] fix num_requests_running after clear_data ( #4927 )
...
* [BugFix] fix num_requests_running after clear_data
* [fix] fix tasks_list and stop flags not cleared when _free_blocks failed
2025-11-13 13:50:21 +08:00
Sunny-bot1
5b24013d46
skip DtoH capture ( #4988 )
2025-11-13 10:57:44 +08:00
Jiang-Jia-Jun
8329338d37
Update nvidia_gpu.md
2025-11-13 10:25:22 +08:00
ltd0924
303c986cc7
[FDConfig] add block number verification ( #4983 )
...
* Update config.py
* fix
* update unit test
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com >
2025-11-13 09:48:44 +08:00
YuBaoku
1c0b0b08b7
[CI] set DG_NVCC_OVERRIDE_CPP_STANDARD in test_quantized_linear ( #4995 )
2025-11-12 23:03:21 +08:00
Yuanle Liu
2272160faf
fix mtp tsp ( #4990 )
2025-11-12 22:05:19 +08:00
ming1753
3148dbca06
[BugFix] fix VL fp8 bug when moe token_num is 0 ( #4928 )
...
* [BugFix] fix VL fp8 bug when moe token_num is 0
* fix bug
* format
* fix bug
2025-11-12 21:19:36 +08:00
Jiang-Jia-Jun
c8140326fa
Update nvidia_gpu.md
2025-11-12 20:50:09 +08:00
bukejiyu
f0189292df
[CI] fix test_model_cache ( #4982 )
...
* ci
* update
2025-11-12 20:26:49 +08:00
qwes5s5
a2d06118e1
[Logprobs]Support prompt_logprobs and max_logprobs ( #4897 )
...
* add prompt logprobs
* trigger ci
* fix unit test
* Update fastdeploy/config.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/llm.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix max_logprobs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 19:29:48 +08:00