FastDeploy/docs at 17b414c2df4ed1f7e74f0177bfc307ea29c384b6 - FastDeploy - 子说镜像小站


mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00


周周周 17b414c2df MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00


MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00

Revert "[Feature] block sparse attention (#3209)" (#3647)

2025-08-27 17:35:04 +08:00

[MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)

2025-08-25 17:44:20 +08:00

[Feature] bad words support v1 scheduler and specifiy token ids (#3608)

2025-08-25 20:14:51 -07:00

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00

MoE Default use triton's blockwise fp8 in TP Case (#3678)

2025-08-29 11:07:30 +08:00

benchmark.md

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

index.md

Update README (#3426)

2025-08-15 18:46:28 +08:00

offline_inference.md

rename ernie_xxx to ernie4_5_xxx (#3621)

2025-08-26 19:29:27 +08:00

parameters.md

[Precision] Support lm_head layer running in float32 (#3597)

2025-08-27 11:34:53 +08:00

requirements.txt

Sync v2.0 version of code to github repo

2025-06-29 23:29:37 +00:00

supported_models.md

[Feature] multi source download (#3005)

2025-07-24 17:42:09 +08:00