FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Files

freeliuzc 2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)

* support static cachekv c8 quantization in mtp mode

* optimize memory allocation

2025-11-21 15:10:13 +08:00

c++ code format (#4527)

2025-10-22 17:59:50 +08:00

2025-11-21 15:10:13 +08:00

c++ code format (#4527)

2025-10-22 17:59:50 +08:00

2025-11-20 16:12:35 +08:00

2025-09-11 17:41:16 +08:00

2025-11-17 10:34:01 +08:00

2025-11-21 14:09:01 +08:00

0001-DeepGEMM-95e81b3.patch    2025-07-03 22:33:27 +08:00
MANIFEST.in                    2025-06-09 19:20:15 +08:00
setup_ops_cpu.py               2025-07-19 23:19:27 +08:00
setup_ops.py                   2025-11-21 14:10:32 +08:00