lzy
690bcb8e50
[Optimization] 1.fix tp+ep moe_forward; 2.set max_prefill_batch=env.MAX_PREFILL_NUM ( #5315 )
2025-12-03 13:33:15 +08:00
Sunny-bot1
3629db4129
[Quantization] Support w4afp8 MoE dynamic quantization ( #5282 )
...
* support dynamic activation quant for w4afp8
* support dynamic w4afp8
* add test
* fix
* fix
---------
Co-authored-by: zhoutianzi666 <17801055074@163.com>
2025-12-02 18:56:16 +08:00
周周周
fb7f951612
[UNITEST] add test ( #5305 )
2025-12-02 17:59:01 +08:00
qw86972190
6048ea37bd
[XPU]add enable_logprob ( #5279 )
...
* [XPU]Update document
* [XPU]Update documentation
* [XPU]add enable_logprob
* Fix code style issues
* “doc”
* “docs”
* “doc”
* Fix code style via pre-commit
---------
Co-authored-by: root <root@gajl-bbc-onlinec-com-1498354.gajl.baidu.com>
2025-12-02 15:32:28 +08:00
lizexu123
c563eca791
[Feature] support reward model ( #5301 )
...
* Your commit message here
* add test
* update develop
* support reward
* support enable_chunk_prefill
* support concurrency
* support convert is reward
* update test
* delete print
* fix enable_thinking
* add document
* fix place
* fix test
* fix
* support enable_prefix_caching
* add no-enable_prefix-caching test
* fix
* support enable_prefix_caching
* delete print
* fix document
* fix
* fix test
* fix document and delete Chinese
* update
* enable_thinking
* fix test
2025-12-02 14:55:31 +08:00
K11OntheBoat
2e1680838f
[PD Disaggregation] Support PD deployment of DeepSeekv3. ( #5251 )
...
* Support deepseekv3 cache transfer for PD deploy
* clean some log info
---------
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com ”>
2025-12-02 14:11:50 +08:00
qwes5s5
117980dd4e
[LogProbs]Enable prompt logprobs output and modify data transmission method for the online interface. ( #5089 )
...
* add prompt logprobs
* Merge prompt_logprobs_tensors and prompt_logprobs
* fix param check
* trigger ci
* fix unitest
* fix logprobs bug
2025-12-02 13:49:51 +08:00
chen
aa35ce449d
[Optimization] EP empty_input_forward Remove Communication ( #5254 )
2025-12-01 21:10:40 +08:00
Juncai
0925d44f18
[PD Disaggregation] support different tp_size for prefill and decode ( #5296 )
...
* up
* up
* up
* fix
2025-12-01 17:50:20 +08:00
Longzhi Wang
add524d80c
[Feature] support chunked moe ( #4575 )
...
* [Feature] support chunked moe
* update
* update
* fix and add test
* update
* fix conflict and modify test
* fix fused_moe
* fix fused_moe
* fix docstring
* fix
* fix typo
* fix test
* fix
* fix
* fix test
* fix test
2025-12-01 15:17:18 +08:00
Jundong Liu
6f42c37359
[Deterministic] Move paddle version batch invariant pkg to Fastdeploy ( #4763 )
...
* Move batch invariant pkg to Fastdeploy
* fix problem and pre-commit
* move test
* Change testcase to FD style
* Add testcase for log_softmax
* Add testcase for mean
* Add testcase for addmm
* fix pre-commit
* API check v0.9
* move to layers and add comment about log_softmax
* Update fastdeploy/model_executor/layers/batch_invariant_ops/batch_invariant_ops.py
Version-control leftovers that survived in the original code comments should indeed be removed.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update tests/batch_invariant/test_batch_invariance_op_mean.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update tests/batch_invariant/test_batch_invariance_op_logsoftmax.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/model_executor/layers/batch_invariant_ops/batch_invariant_ops.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* change comment after copilot fix
* fix bug about addmm
* avoid global effect by enable_torch_proxy
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-01 11:25:48 +08:00
ming1753
70ec1e17c1
[Features] add audio request & fix embedding bug ( #5201 )
...
* [Features] add audio request & fix embedding bug
* fix bug
2025-12-01 11:12:17 +08:00
cmcamdy
9f4977eb74
[xpu] support mtp for xpu(mix) ( #5274 )
...
* [XPU] support kernel for mtp(base)
* [XPU] support kernel for mtp(base)
* format
* format
* format
* fix gather next token
* fix step && add test
* fix
* mv pre/post process
* add adjust batch / gather next token for mtp
* fix code style
* fix mtp kernel name
* fix mtp kernel test
* mv xpu pre/post process
* mv xpu pre/post process
* [xpu] support mtp
* fix code style
2025-12-01 11:03:14 +08:00
kevin
8aec3acc8c
fix mm type bug ( #5300 )
2025-11-29 20:48:14 +08:00
kevin
048ca60013
fix aksk check bug
2025-11-29 09:14:28 +08:00
fmiao2372
2c7683d551
[Intel HPU] change MoE weights and scales from list to tensor and add… ( #5289 )
...
* [Intel HPU] change MoE weights and scales from list to tensor and add q/k rms norm
* update doc
* move HPU_CHUNK_SIZE into envs
2025-11-28 19:17:05 +08:00
Yonghua Li
a535050b11
[FDConfig] remove engine client args, use fd_config instead ( #5217 )
...
* [refactor] remove engine client args, use fd_config instead
* [chore] update
* [fix] fix
* [fix] fix
* [chore] rename config to fd_config
* [fix] fix run_batch
* [ci] add ci case for engine client
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
2025-11-28 01:20:54 -08:00
周周周
73886204d4
[Others] clean code ( #5235 )
2025-11-28 15:40:49 +08:00
kevin
2d69d91ab8
add aksk check ( #5273 )
2025-11-28 14:28:24 +08:00
bukejiyu
1539fd6056
[BugFix]Set default OMP_NUM_THREADS=3 and fix extra GPU memory usage in DeepSeek ( #5219 )
...
* fix bug
* update
* update
* update
* fix copy
* update
2025-11-28 14:22:04 +08:00
Daci
7dc06cac6e
[BugFix] race condition [is_fetching] causing multiple fetch requests ( #5238 )
...
* RouterArgs port str -> int
* fix race condition [is_fetching] causing multiple fetch requests
* bugfix: Delete duplicate input_ids tensor creation
2025-11-28 13:41:36 +08:00
Yuanle Liu
35479b691f
[BugFix] fix tsp o_proj bias add ( #5284 )
...
* fix tsp bias add
* fix
* fix
2025-11-28 13:39:55 +08:00
lizhenyun01
aba4fc657f
[Feature] support flash_mask_attention backend ( #5134 )
...
* [Feature] support flash_mask_attention backend
* fix unittest
* clean code
2025-11-28 10:12:16 +08:00
chen
35f85baf09
[BugFix]fix v1 loader lm head fp32 ( #5270 )
2025-11-27 20:12:56 +08:00
cmcamdy
5a67a6d960
[XPU] support kernel for mtp(base) ( #4748 )
...
* [XPU] support kernel for mtp(base)
* [XPU] support kernel for mtp(base)
* format
* format
* format
* fix gather next token
* fix step && add test
* fix
* mv pre/post process
* add adjust batch / gather next token for mtp
* fix code style
* fix mtp kernel name
* fix mtp kernel test
* mv xpu pre/post process
* mv xpu pre/post process
2025-11-27 15:05:44 +08:00
fl0w2o48
e63d715fc3
[BugFix][Metrics] Fix Prometheus Multiprocess Metrics Issues and Add ZMQ Communication Metrics ( #5185 )
...
* [Feature] add metrics for ZMQ and fix multiprocess metrics
* fix test_metrics.py
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
2025-11-27 15:05:09 +08:00
Yuanle Liu
ef5aa5c03b
[BugFix] fix cuda-python requirement ( #5261 )
...
* fix cuda-python requirement
* update
* fix
2025-11-27 13:58:18 +08:00
GoldPancake
cfc5b0ccf9
[BugFix] fix mtp logprob bugs in chunk prefill ( #5244 )
...
* fix mtp logprob bugs in chunk prefill
* fix
* fix
2025-11-27 11:31:29 +08:00
SunLei
c424e08dc5
[Speculative Decoding] split draft_tokens into standalone post-processing path ( #5205 )
...
* refactor(mtp): split draft_tokens into standalone post-processing path for MTP + logprobs
* Restore Request.__repr__ implementation
* ci
* add envs
* fix unittest
2025-11-27 11:22:41 +08:00
Yuanle Liu
cb56d46694
[Optimization] Refine row parallel bias and nranks and moe all_reduce ( #5247 )
...
* rename nranks to tp_size and fix bias in v1 loader
* fix
* update
2025-11-26 05:09:09 -08:00
kevin
bf30f45738
[BugFix] fix vl performance bug ( #5181 )
...
* fix vl performance bug
* update code
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-26 21:06:52 +08:00
chen
209970836e
[BugFix] BF16 MoE Cutlass Backend Support EP ( #5242 )
2025-11-26 19:16:22 +08:00
freeliuzc
ba915e03e1
[BugFix]Fix attention mask bug in D-Node of PD-split mode ( #5245 )
2025-11-26 17:56:28 +08:00
xiaoxiaohehe001
61fc368066
[Fix] fix eplb noaux ( #5239 )
...
* fix eplb noaux
* fix eplb noaux
2025-11-26 17:50:51 +08:00
kxz2002
bc118c3d2d
fix prompt_token_ids is None in request dict ( #5241 )
2025-11-26 17:10:45 +08:00
chen
00d0ef5134
check ( #5237 )
2025-11-26 17:07:26 +08:00
freeliuzc
214942e1ae
fix kernel output extract ( #5208 )
2025-11-26 16:48:42 +08:00
Yonghua Li
cead6b26fa
[Metrics] Update time_to_first_token to include tokenization & queue time, and remove redundant metrics ( #4993 )
...
* [update] update time_to_first_tokens to include queue time, and remove first_token_latency and infer_latency
* [doc] update docs
* [ci] fix test
* [chore] delete redundant code
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
2025-11-26 14:42:17 +08:00
Daci
f25ee3a26f
[Feature] enable guided decoding ENABLE_V1_KVCACHE_SCHEDULER = 1 ( #5140 )
...
* enable guided decoding ENABLE_V1_KVCACHE_SCHEDULER = 1
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-26 10:22:35 +08:00
kxz2002
2d787590c4
[Feature] The 45VL supports prompt_token_ids + messages input. ( #5148 )
...
* support prompt_token_ids + messages
* fix bug
* refactor code structure
* support cache mm items
* refactor code structure
* delete test cases
* modify unit test
* add unit test
* add unit test
* fix append
* add check for messages
2025-11-25 23:11:44 +08:00
Yuanle Liu
66e096d509
[FDConfig] disable use_sequence_parallel_moe default ( #5222 )
...
* disable use_sequence_parallel_moe default
* update
2025-11-25 21:49:10 +08:00
kevin
df2be1cf16
[BugFix] fix mm_positions type error ( #5182 )
...
* fix mm_positions type error
* update code
* update code
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-25 19:28:18 +08:00
Yonghua Li
09379183e2
[BugFix] fix work metrics not returned by metrics api ( #4912 )
...
* [BugFix] fix work metrics not returned by metrics api
* [fix] fix conflict
* [fix] fix ci
2025-11-25 19:12:29 +08:00
freeliuzc
5c8c2d47eb
[Speculative Decoding][MTP]Update extract_mtp_weight script and optimize config ( #5183 )
...
* update extract_mtp_model
* modify config usage
2025-11-25 14:09:03 +08:00
chenjian
09b47c7111
[Bug fix] Send first token in D instance ( #5199 )
...
* [Bug fix] Send first token in D instance
* fix
2025-11-24 23:42:20 +08:00
Yuanle Liu
f69e0839f7
dummy import fd ( #5192 )
2025-11-24 20:23:07 +08:00
kevin
8e4e3ff510
[Feature] support eplb in api_server ( #4782 )
...
* support eplb in api_server
* update code
* add eplb test case
* update eplb
* support tp+dp eplb
* update test case
* update code
* update code
* fix bug
* update copilot review
* update test case name
2025-11-24 20:22:29 +08:00
xiaozude
d5bd64336a
[Metax] support ENABLE_V1_KVCACHE_SCHEDULER ( #5163 )
2025-11-24 19:19:49 +08:00
xiaoxiaohehe001
e150a418d4
support moe offline quant ( #5142 )
2025-11-24 18:59:18 +08:00
Juncai
af03da5127
[BugFix] fix release block ids ( #5184 )
...
* fix release block ids
* up
2025-11-24 16:48:09 +08:00