FastDeploy/custom_ops/gpu_ops
freeliuzc 2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155)
* support static cachekv c8 quantization in mtp mode

* optimize memory allocation
2025-11-21 15:10:13 +08:00