mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-10-04 16:22:57 +08:00
【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)
* support MLA chunk_size auto search & cuda_graph
This commit is contained in:
Binary file not shown.
Before Width: | Height: | Size: 81 KiB |
Binary file not shown.
Before Width: | Height: | Size: 81 KiB |
Reference in New Issue
Block a user