diff --git a/docs/get_started/installation/kunlunxin_xpu.md b/docs/get_started/installation/kunlunxin_xpu.md
index e425d5ad7..51067e893 100644
--- a/docs/get_started/installation/kunlunxin_xpu.md
+++ b/docs/get_started/installation/kunlunxin_xpu.md
@@ -127,7 +127,7 @@ Deploy an OpenAI API-compatible server using FastDeploy with the following comma
 
 #### Start service
 
-**The ERNIE-4.5-300B-A47B-Paddle model is to be deployed with a configuration of 32K WINT4 utilizing 8 XPU cards (Recommended)**
+**Deploy the ERNIE-4.5-300B-A47B-Paddle model with WINT4 precision and 32K context length on 8 XPUs(Recommended)**
 
 ```bash
 python -m fastdeploy.entrypoints.openai.api_server \
@@ -140,7 +140,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.9
 ```
 
-**The ERNIE-4.5-300B-A47B-Paddle model is to be deployed with a configuration of 128K WINT4 utilizing 8 XPU cards**
+**Deploy the ERNIE-4.5-300B-A47B-Paddle model with WINT4 precision and 128K context length on 8 XPUs**
 
 ```bash
 python -m fastdeploy.entrypoints.openai.api_server \
@@ -153,7 +153,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.9
 ```
 
-**The ERNIE-4.5-300B-A47B-Paddle model is to be deployed with a configuration of 32K WINT4 utilizing 4 XPU cards**
+**Deploy the ERNIE-4.5-300B-A47B-Paddle model with WINT4 precision and 32K context length on 4 XPUs**
 
 ```bash
 export XPU_VISIBLE_DEVICES="0,1,2,3"
diff --git a/docs/zh/get_started/installation/kunlunxin_xpu.md b/docs/zh/get_started/installation/kunlunxin_xpu.md
index 479077797..ed3148613 100644
--- a/docs/zh/get_started/installation/kunlunxin_xpu.md
+++ b/docs/zh/get_started/installation/kunlunxin_xpu.md
@@ -128,7 +128,7 @@ P800 支持 ```ERNIE-4.5-300B-A47B-Paddle``` 模型采用以下配置部署(
 
 #### 启动服务
 
-**ERNIE-4.5-300B-A47B-Paddle 模型采用 32K WINT4 8 卡配置部署(推荐)**
+**基于 WINT4 精度和 32K 上下文部署 ERNIE-4.5-300B-A47B-Paddle 模型到 8 卡 P800 服务器(推荐)**
 
 ```bash
 python -m fastdeploy.entrypoints.openai.api_server \
@@ -141,7 +141,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.9
 ```
 
-**ERNIE-4.5-300B-A47B-Paddle 模型采用 128K WINT4 8 卡配置部署**
+**基于 WINT4 精度和 128K 上下文部署 ERNIE-4.5-300B-A47B-Paddle 模型到 8 卡 P800 服务器**
 
 ```bash
 python -m fastdeploy.entrypoints.openai.api_server \
@@ -154,7 +154,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
     --gpu-memory-utilization 0.9
 ```
 
-**ERNIE-4.5-300B-A47B-Paddle 模型采用 32K WINT4 4 卡配置部署**
+**基于 WINT4 精度和 32K 上下文部署 ERNIE-4.5-300B-A47B-Paddle 模型到 4 卡 P800 服务器**
 
 ```bash
 export XPU_VISIBLE_DEVICES="0,1,2,3"