FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Files

freeliuzc 2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)

* support static cachekv c8 quantization in mtp mode

* optimize memory allocation

2025-11-21 15:10:13 +08:00

__init__.py    2025-06-29 23:29:37 +00:00
base.py        2025-11-03 10:08:01 +08:00
mtp.py         2025-11-21 15:10:13 +08:00
ngram.py       2025-10-09 21:18:29 +08:00