FastDeploy/fastdeploy/model_executor at e150a418d44281f4564cf115d1027be2822bd772 - FastDeploy - 子说镜像小站

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

xiaoxiaohehe001 e150a418d4 support moe offline quant (#5142)

2025-11-24 18:59:18 +08:00

graph_optimization

[Iluvatar] add vl into ci and support v1 loader (#4774)

2025-11-11 10:50:17 +08:00

guided_decoding

[Feature] ThreadPoolExecutor async fill_token_bitmask (#5083)

2025-11-19 10:04:16 +08:00

support moe offline quant (#5142)

2025-11-24 18:59:18 +08:00

logits_processor

[Feature] support logits processors (#4515)

2025-10-29 00:08:53 +08:00

[RL] Resolve shape mismatch problems in RL-related modules (#5032)

2025-11-19 11:12:48 +08:00

support moe offline quant (#5142)

2025-11-24 18:59:18 +08:00

[CI] [Hackathon 9th Sprint No.13] No.13 functional module fastdeploy/model_executor/ops/triton_ops/triton_utils.py: add unit tests (#5035)

2025-11-17 11:43:31 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

[PD Disaggregation][XPU] Add XPU support for PD disaggregation (#5113)

2025-11-21 14:09:01 +08:00

load_weight_utils.py

[Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)

2025-11-21 15:10:13 +08:00

pre_and_post_process.py

[Speculative Decoding][MTP] Support stop_seqs and pd-split mode (#5029)

2025-11-20 15:26:01 +08:00

utils.py

[RL] Resolve shape mismatch problems in RL-related modules (#5032)

2025-11-19 11:12:48 +08:00