Commit Graph

7 Commits

Author SHA1 Message Date
kesmeey
ac731653b3 [CI] [Hackathon 9th Sprint No.12] Add unit tests for functional module fastdeploy/spec_decode/mtp.py (#5533)
* Add unit tests for MTPProposer class in spec_decode/mtp.py

* fix: remove non-existent QuantizationConfig import in test_mtp_proposer

* fix: add logprobs_mode attribute to FakeModelConfig

* fix: fix test failures in test_mtp_proposer - fix Mock setup, remove arrival_time, add missing keys

* fix: add seq_lens_this_time initialization and kv_cache init before insert_tasks_v1

* fix: check pos_emb_type attribute existence before assertion

* test: add minimal coverage for mtp cache type, mm init, preempted

* test: fix cache_type_branches unsupported platform on 12

* test: refine MTPProposer tests for cache type, requests and chunked prefill

* chore: remove stray spec_decode copy
2025-12-17 20:09:45 +08:00
Longzhi Wang
5cd17fd662 [Models] Add forward_meta to moe models' forward function (#5138)
* [Models] Add forward_meta to moe models' forward function

* fix missing param

* fix

* fix

* fix forward_meta

* fix test and remove chunked MoE related in config

* fix test

* fix

* fix
2025-12-04 13:26:58 +08:00
Sunny-bot1
35bd2afab3 [Benchmark] Add GEMM & MoE kernel bench (#4809)
2025-11-12 11:56:40 +08:00
YuanRisheng
a2ec2c4152 [FDConfig] Remove max_model_len in FDConfig (#4350)
* modify max_model_len

* fix unittest

* fix unittest

---------

Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
2025-10-11 14:04:17 +08:00
YuanRisheng
24180fba0a [FDConfig] Remove splitwise_role and engine_worker_queue_port in FDConfig (#4147)
* remove splitwise_role and engine_worker_queue_port

* fix xpu

* fix xpu

* fix xpu

* fix unittest

* resolve conflict
2025-09-19 17:01:52 +08:00
YuanRisheng
2e9e53ff7e [FDConfig] Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)
* remove max_num_batched_tokens in parallel config

* remove max_num_seqs

* update test case

* fix test

* fix

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-17 10:43:35 +08:00
YuanRisheng
6566e29807 Add loader test for mtp (#3724)
* add test for mtp

* fix unittest

* fix
2025-09-01 10:55:49 +08:00