FastDeploy/fastdeploy/worker/gpu_model_runner.py
freeliuzc ceafd757f0 [Speculative Decoding]Support multi-step mtp with cudagraph (#5624) (#5670)
* support multi-step mtp with cudagraph

* fix usage

* fix unit test
2025-12-23 13:18:47 +08:00

142 KiB