Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-12-24 13:28:13 +08:00
c5c43e3b3dec5bdf63c4945d77c074afc74cca6e
FastDeploy/fastdeploy/model_executor/layers
Latest commit: c5c43e3b3d by freeliuzc, fix attention bug in spec decoding (#5481), 2025-12-10 12:55:13 +08:00
attention             fix attention bug in spec decoding (#5481)  2025-12-10 12:55:13 +08:00
backends              [New][RL] Support Rollout Routing Replay (#5405) (#5408)  2025-12-08 10:00:35 +08:00
batch_invariant_ops   …
moe                   [New][RL] Support Rollout Routing Replay (#5405) (#5408)  2025-12-08 10:00:35 +08:00
pool                  …
quantization          [BugFix] dynamic cache kv block_wise_fp8 not need create layer.cache_k_scale (#5362)  2025-12-03 05:32:59 -08:00
sample                [Optimization] compulte real max_logprobs in batch (#5430) (#5448)  2025-12-09 16:48:06 +08:00
__init__.py           …
activation.py         …
embeddings.py         [Feature] support Two batch overlap, mainly used in Prefill (#5078)  2025-12-05 14:58:50 +08:00
linear.py             cp pr5373 pr5379 pr5410 (#5411)  2025-12-06 00:47:01 +08:00
lm_head.py            …
mtp_linear.py         …
normalization.py      …
pooler.py             …
rotary_embedding.py   …
utils.py              [Quantization] Support w4afp8 MoE dynamic quantization (#5282)  2025-12-02 18:56:16 +08:00