FastDeploy/fastdeploy/model_executor/layers at c5c43e3b3dec5bdf63c4945d77c074afc74cca6e - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

freeliuzc c5c43e3b3d fix attention bug in spec decoding (#5481)

2025-12-10 12:55:13 +08:00

fix attention bug in spec decoding (#5481)

2025-12-10 12:55:13 +08:00

[New][RL] Support Rollout Routing Replay (#5405) (#5408)

2025-12-08 10:00:35 +08:00

batch_invariant_ops

…

[New][RL] Support Rollout Routing Replay (#5405) (#5408)

2025-12-08 10:00:35 +08:00

…

[BugFix] dynamic cache kv block_wise_fp8 not need create layer.cache_k_scale (#5362)

2025-12-03 05:32:59 -08:00

[Optimization] compulte real max_logprobs in batch (#5430) (#5448)

2025-12-09 16:48:06 +08:00

__init__.py

…

activation.py

…

embeddings.py

[Feature] support Two batch overlap, mainly used in Prefill (#5078)

2025-12-05 14:58:50 +08:00

linear.py

cp pr5373 pr5379 pr5410 (#5411)

2025-12-06 00:47:01 +08:00

lm_head.py

…

mtp_linear.py

…

normalization.py

…

pooler.py

…

rotary_embedding.py

…

utils.py

[Quantization] Support w4afp8 MoE dynamic quantization (#5282)

2025-12-02 18:56:16 +08:00