Commit Graph

3965 Commits

Author SHA1 Message Date
YuBaoku
f133ce501c [CI] disable test_cuda_graph_dynamic_subgraph.py in unit_test 2025-12-11 14:20:53 +08:00
freeliuzc
6715196924 fix attention bug in spec decoding (#5480) 2025-12-10 12:56:13 +08:00
Yuanle Liu
c5973c2087 fix limit_thinking bug (#5477) 2025-12-10 11:50:13 +08:00
lzy
f08fb25cfe [Others] Maintain the mtp branch temporarily. (#5447) 2025-12-09 19:41:33 +08:00
kevin
9b5b08cb72 [Cherry-Pick][BugFix] Fix async download(#5349) (#5347)
* fix mm to_dict bug

* pd support async download

* update code

* update test case

* update log

* Revert "update log"

This reverts commit 6e883150cd.

* update code

* fix mtp bug
2025-12-05 18:59:36 +08:00
lzy
cae2c1ccf5 supports mtp split_kv_attn (#5344) 2025-12-03 13:33:26 +08:00
lzy
04b2c43806 [Optimization] 1.fix tp+ep moe_forward; 2.set max_prefill_batch=env.MAX_PREFILL_NUM (#5316) 2025-12-02 13:03:55 +08:00
kevin
f1e1f5da57 fix mm to_dict bug (#5299) 2025-11-29 20:49:36 +08:00
Yuanle Liu
b99064432e Update load_weight_utils.py (#5285) 2025-11-28 13:39:59 +08:00
Jiaxin Sui
89ed1a9e84 [Cherry-pick][XPU][CI] Set pip index URL to Tsinghua mirror (#5277) (#5281)
* Set pip global index URL to Tsinghua mirror

* Update Docker image tag in CI workflow
2025-11-28 10:13:41 +08:00
lizhenyun01
fd1313cdb4 [Cherry-Pick][Feature] support flash_mask_attention backend(#5134) (#5256)
* [Feature] support flash_mask_attention backend

* fix unittest

* clean code
2025-11-28 10:13:00 +08:00
Yuanle Liu
9b0c65ba57 Add method to disable sequence parallel MoE if needed (#5268) 2025-11-27 16:28:24 +08:00
kevin
69b4d058ad cp_fix_bug (#5253) 2025-11-27 15:15:49 +08:00
SunLei
3d74a4baf6 [Cherry-Pick] MTP split draft_tokens into standalone post-processing path(#5205) (#5231)
* refactor(mtp): split draft_tokens into standalone post-processing path for MTP + logprobs

* Restore Request.__repr__ implementation

* ci

* add envs

* fix unittest
2025-11-27 11:23:38 +08:00
freeliuzc
bdcc952eeb fix pd-split first step bug (#5246) 2025-11-26 18:02:32 +08:00
xiaoxiaohehe001
710753377f [Cherry-Pick] Fix eplb noaux(#5239) (#5240)
* fix eplb noaux

* fix eplb noaux
2025-11-26 17:51:10 +08:00
YuBaoku
49be443d02 [Cherry-Pick][CI] Add check trigger and logic(#5191) (#5227)
* [CI] Add Cherry-Pick PR check logic

(cherry picked from commit 8bc2e13fdafaf339ebe518d7e5dccdbd8ff3fc2d)

* [Cherry-Pick][CI] Add check trigger and logic
2025-11-26 13:20:38 +08:00
kevin
e0c7ebff29 [BugFix][Cherry Pick] fix ds type bug (#5220)
* fix ds type bug

* update code
2025-11-25 20:37:09 +08:00
freeliuzc
a11d17cee9 [Speculative Decoding][Cherry Pick]Update extract_mtp_weight script and optimize config (#5213)
* update extract_mtp_model

* modify config usage
2025-11-25 14:42:55 +08:00
freeliuzc
e581b7d7d9 fix kernel output extract (#5212) 2025-11-25 14:25:20 +08:00
Echo-Nie
a418d7b60b [CI] Add Unittest (#5187)
* add test

* Delete tests/model_executor/test_w4afp8.py

* Rename test_utils.py to test_tool_parsers_utils.py

* add test

* add test

* fix platforms

* Delete tests/cache_manager/test_platforms.py

* don't change

Removed copyright notice and license information.
2025-11-25 11:00:34 +08:00
Jiang-Jia-Jun
717da50b40 Update pull_request_template.md 2025-11-25 10:32:19 +08:00
Jiang-Jia-Jun
86d6ee90be Update pull request template warning message 2025-11-25 10:11:33 +08:00
Jiang-Jia-Jun
6b111ef900 Update pull_request_template.md 2025-11-25 10:10:17 +08:00
zccjjj
ea3bc5b4ca [XPU] Fix the error in MoeExpertFFN operator when valid_token_num=0 (#5196) 2025-11-25 10:07:20 +08:00
chenjian
09b47c7111 [Bug fix] Send first token in D instance (#5199)
* [Bug fix] Send first token in D instance

* fix
2025-11-24 23:42:20 +08:00
YuBaoku
95b39317a9 [CI] Update redis download source (#5198)
2025-11-24 21:14:59 +08:00
Yuanle Liu
f69e0839f7 dummy import fd (#5192) 2025-11-24 20:23:07 +08:00
kevin
8e4e3ff510 [Feature] support eplb in api_server (#4782)
* support eplb in api_server

* update code

* add eplb test case

* update eplb

* support tp+dp eplb

* update test case

* update code

* update code

* fix bug

* update copilot review

* update test case name
2025-11-24 20:22:29 +08:00
xiaozude
d5bd64336a [Metax] support ENABLE_V1_KVCACHE_SCHEDULER (#5163) 2025-11-24 19:19:49 +08:00
xiaoxiaohehe001
e150a418d4 support moe offline quant (#5142) 2025-11-24 18:59:18 +08:00
Jiaxin Sui
5ff93d4998 [XPU][CI] change VL model to 28B-VL-thinking (#5169)
* Enhance run_ci_xpu.sh with caching and prefill options

* Update model path and configuration in run_ci_xpu.sh

* Add '北朝' keyword to assertion in run_45vl.py

* Enhance process termination logic in run_ci_xpu.sh

* Set timeout for CI_XPU job to 60 minutes

* Remove extra newline in stop_processes function
2025-11-24 16:50:18 +08:00
Juncai
af03da5127 [BugFix] fix release block ids (#5184)
* fix release block ids

* up
2025-11-24 16:48:09 +08:00
xunyoyo
7bac016c77 [CI] 【Hackathon 9th Sprint No.18】NO.18 Add supplementary unit tests for functional modules (#5064)
* Add unit tests for DeepEP buffer functionality

This file contains unit tests for the DeepEP buffer helpers and runners, including various test cases for buffer allocation, cleanup, and dispatching processes.

* Refactor DeepEP tests to use scoped stubs

* Add licensing information to test_ep.py

Added licensing information to the test file.
2025-11-24 15:52:34 +08:00
xiaoxiaohehe001
95f3c8c641 [Fix] Fix eplb bug and support fp8 load weight (#5178)
* fix eplb part2

* fix eplb part2

* fix eplb part2
2025-11-24 15:31:37 +08:00
qw86972190
f5c1066245 [XPU]Update documentation (#5180)
* [XPU]Update document

* [XPU]Update documentation
2025-11-24 14:00:51 +08:00
YuBaoku
98f1ab46a9 [CI] add output for last_token in test_streaming_with_stop_str (#5170)
2025-11-24 10:49:17 +08:00
Nyakku Shigure
b9e76f1a7a [Coverage] Ignore new custom ops stub file in codecov (#5177)
2025-11-23 22:33:28 +08:00
周周周
e297406263 [Others] unittest tests/layers/test_attention_layer.py (#5174) 2025-11-23 22:21:01 +08:00
YuBaoku
5daa8d0686 [CI] fix coverage_report in daily test (#5175)
* [CI] fix coverage_report in daily test
2025-11-23 21:48:11 +08:00
megemini
c06cfe2447 【Hackathon 9th No.109】[CppExtension] Add the fastdeploy_ops directory to package_data to support modern packaging (#5156)
---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-22 01:32:06 +08:00
kevin
cceaba1c8d [Feature] remove to_numpy (#5162)
* remove to_numpy

* update code

* update name

* update code

* update code

* update code
2025-11-21 21:54:26 +08:00
kevin
c068a4f642 [Feature] dyc8 support prefixcache (#5125)
* dyc8 support prefixcache

* fix cache_trans test case

* update code
2025-11-21 19:46:26 +08:00
GoldPancake
ab3a2e45ff fix mtp reschedule (#5165) 2025-11-21 19:31:35 +08:00
chenjian
3ea1b44a58 [Optimization] Improve perf for fd response token with internal adapter (#4992)
* [Optimize] Improve perf for fd response token with internal adapter

* fix

* fix bug

* fix ci

* fix ci

* fix ci

* fix ci
2025-11-21 19:02:03 +08:00
Yuanle Liu
5bcf79d780 [BugFix] fix num of rdma_comm_ports check (#5168)
* fix num of rdma_comm_ports check

* update

* update

* update
2025-11-21 18:31:14 +08:00
Jiang-Jia-Jun
d2298dcb0c [Polish] Simplify __repr__ method in Request class (#5153)
Remove detailed string representation for Request class.
2025-11-21 17:21:06 +08:00
xiaoxiaohehe001
6471dade4a [Fix] Fix noaux ep test (#5161)
* support noaux eplb

* noaux_eplb

* noaux_eplb

* noaux_eplb

* noaux_eplb
2025-11-21 16:36:41 +08:00
Juncai
f9b0545a7f [PD Disaggregation] [Refine] Refine splitwise deployment (#5151)
* Refine splitwise deployment

* up
2025-11-21 15:30:24 +08:00
freeliuzc
2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)
* support static cachekv c8 quantization in mtp mode

* optimize memory allocation
2025-11-21 15:10:13 +08:00