FastDeploy/fastdeploy/model_executor at 52eda7fdb3a3e272dd3d6e3b518a48f03af60699 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00


freeliuzc 52eda7fdb3 [Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

graph_optimization

[Executor] CUDAGraph support RL training (#3265)

2025-08-25 20:59:30 +08:00

guided_decoding

add error traceback info (#3419)

2025-08-19 19:32:04 +08:00

qkv_a_proj horizontal fusion (#3591)

2025-08-26 14:25:57 +08:00

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

fix cpu __ini__.py (#3448)

2025-08-17 12:38:54 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

[Excutor] Increase buffer size to prevent address corruption; add forward metadata debug tool (#3404)

2025-08-18 16:14:09 +08:00

load_weight_utils.py

[Features] support hugging face qwen3 dense and qwen2 model (#3574)

2025-08-26 10:54:53 +08:00

pre_and_post_process.py

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

utils.py

[Features] support hugging face qwen3 dense and qwen2 model (#3574)

2025-08-26 10:54:53 +08:00