diff --git a/docs/quantization/images/wint2.png b/docs/quantization/images/wint2.png
new file mode 100644
index 000000000..a117ea8af
Binary files /dev/null and b/docs/quantization/images/wint2.png differ
diff --git a/docs/quantization/wint2.md b/docs/quantization/wint2.md
index f82b7da73..e7c586632 100644
--- a/docs/quantization/wint2.md
+++ b/docs/quantization/wint2.md
@@ -4,7 +4,7 @@ Weights are compressed offline using the [CCQ (Convolutional Coding Quantization
 - **Supported Hardware**: GPU
 - **Supported Architecture**: MoE architecture
 
 This method relies on the convolution algorithm to use overlapping bits to map 2-bit values to a larger numerical representation space, so that the model weight quantization retains more information of the original data while compressing the true value to an extremely low 2-bit size. The general principle can be seen in the figure below:
-[卷积编码量化示意图](./wint2.png)
+![卷积编码量化示意图](./images/wint2.png)
 
 CCQ WINT2 is generally used in resource-constrained and low-threshold scenarios. Taking ERNIE-4.5-300B-A47B as an example, weights are compressed to 89GB, supporting single-card deployment on 141GB H20.
diff --git a/docs/zh/quantization/images/wint2.png b/docs/zh/quantization/images/wint2.png
new file mode 100644
index 000000000..a117ea8af
Binary files /dev/null and b/docs/zh/quantization/images/wint2.png differ
diff --git a/docs/zh/quantization/wint2.md b/docs/zh/quantization/wint2.md
index cc224aabb..00e55a979 100644
--- a/docs/zh/quantization/wint2.md
+++ b/docs/zh/quantization/wint2.md
@@ -5,7 +5,7 @@
 - **支持结构**:MoE结构
 
 该方法依托卷积算法利用重叠的Bit位将2Bit的数值映射到更大的数值表示空间,使得模型权重量化后既保留原始数据更多的信息,同时将真实数值压缩到极低的2Bit大小,大致原理可参考下图:
-[卷积编码量化示意图](./wint2.png)
+![卷积编码量化示意图](./images/wint2.png)
 
 CCQ WINT2一般用于资源受限的低门槛场景,以ERNIE-4.5-300B-A47B为例,将权重压缩到89GB,可支持141GB H20单卡部署。
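
The doc above describes storing each weight in 2 bits. As a toy illustration of the storage side only (not the CCQ convolutional-coding kernel itself, and not FastDeploy's actual layout — `pack_wint2`/`unpack_wint2` are hypothetical helpers), the following sketch packs 2-bit weight codes four-per-byte, the layout that yields the ~16x size reduction over FP32 quoted for WINT2:

```python
import numpy as np

def pack_wint2(codes: np.ndarray) -> np.ndarray:
    """Pack 2-bit codes (values 0..3, length divisible by 4) four per byte."""
    c = codes.reshape(-1, 4).astype(np.uint8)
    # Each output byte holds four 2-bit fields at bit offsets 0, 2, 4, 6.
    return (c[:, 0] | (c[:, 1] << 2) | (c[:, 2] << 4) | (c[:, 3] << 6)).astype(np.uint8)

def unpack_wint2(packed: np.ndarray) -> np.ndarray:
    """Recover the 2-bit codes by shifting out each field and masking."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    return ((packed[:, None] >> shifts) & 0b11).reshape(-1).astype(np.uint8)

codes = np.array([0, 1, 2, 3, 3, 2, 1, 0], dtype=np.uint8)
packed = pack_wint2(codes)
assert packed.nbytes == codes.size // 4            # 2 bits per weight on disk
assert np.array_equal(unpack_wint2(packed), codes)  # lossless round-trip
```

CCQ adds the convolutional-coding step on top of this, using overlapping bit windows so the 2-bit codes index a larger effective value space; the packing above only shows why the stored footprint approaches 2 bits per weight.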