FastDeploy/custom_ops/gpu_ops/append_attn
freeliuzc 2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)
* support static cachekv c8 quantization in mtp mode

* optimize memory allocation
2025-11-21 15:10:13 +08:00