This project implements an efficient **Speculative Decoding** inference framework.
- **Ngram**
- **MTP (Multi-Token Prediction)**
  - ✅ Supported: TP Sharding
  - ✅ Supported: Shared Prefix
  - ✅ Supported: TP Sharding + PD Separation
  - ⏳ Coming Soon: EP + DP + PD Separation
  - ⏳ Coming Soon: Chunked Prefill
  - ⏳ Coming Soon: Multi-layer MTP
---
### Coming Soon

- Draft Model
- Eagle
- Hydra
- Medusa
- ...

---
## 🚀 Using Multi-Token Prediction (MTP)

For detailed theory, refer to:
📄 [DeepSeek-V3 Paper](https://arxiv.org/pdf/2412.19437)

### TP Sharding Mode

```bash
python -m fastdeploy.entrypoints.openai.api_server \
    --config ${path_to_FastDeploy}benchmarks/yaml/eb45t-32k-wint4-mtp-h100-tp4.yaml \
    --speculative-config '{"method": "mtp", "num_speculative_tokens": 1, "model": "${mtp_model_path}"}'
```
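The `--speculative-config` flag above takes a JSON string. A minimal sketch of assembling it programmatically, using only the field names shown in the launch example (the helper function and its validation are illustrative assumptions, not FastDeploy's own schema):

```python
# Build the --speculative-config JSON string used in the launch command.
# Field names mirror the example above; the range check is an assumption
# added for illustration, not FastDeploy's validation logic.
import json

def make_speculative_config(method="mtp", num_speculative_tokens=1,
                            model="/path/to/mtp_model"):
    if num_speculative_tokens < 1:
        raise ValueError("num_speculative_tokens must be >= 1")
    return json.dumps({
        "method": method,
        "num_speculative_tokens": num_speculative_tokens,
        "model": model,
    })

print(make_speculative_config())
```

Generating the string this way avoids shell-quoting mistakes when the model path contains special characters.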