FastDeploy/fp8_gemm_with_cutlass at 7ccbcc5a62b50eae4967db2e2c924e0f091a7406

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-04 16:22:57 +08:00

Zero Rains 25698d56d1 polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

File                              Last commit                                             Date
fp8_common.h                      [LLM] First commit the llm deployment code              2025-06-09 19:20:15 +08:00
fp8_fp8_fp8_dual_gemm.cu          [LLM] First commit the llm deployment code              2025-06-09 19:20:15 +08:00
fp8_fp8_half_block_gemm.cu        Sync v2.0 version of code to github repo                2025-06-29 23:29:37 +00:00
fp8_fp8_half_cuda_core_gemm.cu    polish code with new pre-commit rule (#2923)            2025-07-19 23:19:27 +08:00
fp8_fp8_half_cuda_core_gemm.h     [LLM] First commit the llm deployment code              2025-06-09 19:20:15 +08:00
fp8_fp8_half_gemm.cu              [Optimize] Optimize tensorwise fp8 performance (#2729)  2025-07-07 20:06:28 +08:00
per_channel_fp8_fp8_half_gemm.cu  [LLM] First commit the llm deployment code              2025-06-09 19:20:15 +08:00