FastDeploy/model_executor at c294fc8139af20d0f1bef24eb74964f74e8b70c6 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

Yuan Xiaolan d37331fc71 fix w4afp8_gemm_scale_permute import error on A100 (#3611)

2025-08-28 11:42:23 +08:00

graph_optimization

[CUDAGraph]Add debug func (#3616)

2025-08-26 16:43:48 +08:00

guided_decoding

rename ernie_xxx to ernie4_5_xxx (#3621)

2025-08-26 19:29:27 +08:00

fix w4afp8_gemm_scale_permute import error on A100 (#3611)

2025-08-28 11:42:23 +08:00

[V1 Loader] support weight_only (#3413)

2025-08-23 13:13:41 +08:00

check (#3639)

2025-08-27 14:32:13 +08:00

fix cpu __ini__.py (#3448)

2025-08-17 12:38:54 +08:00

__init__.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

forward_meta.py

[Excutor] Increase buffer size to prevent address corruption; add forward metadata debug tool (#3404)

2025-08-18 16:14:09 +08:00

load_weight_utils.py

[NewFeatures] support eplb (#3547)

2025-08-26 16:19:30 +08:00

pre_and_post_process.py

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

utils.py

[Precision] Support lm_head layer running in float32 (#3597)

2025-08-27 11:34:53 +08:00