diff --git a/docs/get_started/installation/kunlunxin_xpu.md b/docs/get_started/installation/kunlunxin_xpu.md index cd97c526f..9c7606714 100644 --- a/docs/get_started/installation/kunlunxin_xpu.md +++ b/docs/get_started/installation/kunlunxin_xpu.md @@ -9,7 +9,7 @@ - XPU Firmware Version: ≥ 1.31 Verified platform: -- CPU: INTEL(R) XEON(R) PLATINUM 8563C +- CPU: INTEL(R) XEON(R) PLATINUM 8563C / Hygon C86-4G 7490 64-core Processor - Memory: 2T - Disk: 4T - OS: CentOS release 7.6 (Final) @@ -43,7 +43,7 @@ python -m pip install --pre paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/ ### Install FastDeploy (**Do NOT install via PyPI source**) ```bash -python -m pip install fastdeploy-xpu==2.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/xpu-p800/ +python -m pip install fastdeploy-xpu==2.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/xpu-p800/ ``` Alternatively, you can install the latest version of FastDeploy (Not recommended) @@ -52,7 +52,7 @@ Alternatively, you can install the latest version of FastDeploy (Not recommended python -m pip install --pre fastdeploy-xpu -i https://www.paddlepaddle.org.cn/packages/nightly/xpu-p800/ ``` -### 3. Build wheel from source +## 3. Build wheel from source ### Install PaddlePaddle @@ -115,9 +115,9 @@ Currently, P800 has only validated deployment of the following models: - ERNIE-4.5-300B-A47B-Paddle 128K WINT4 (8-card) ### Offline inference - + After installing FastDeploy, you can perform offline text generation with user-provided prompts using the following code, - + ```python from fastdeploy import LLM, SamplingParams @@ -141,11 +141,11 @@ for output in outputs: Refer to [Parameters](../../parameters.md) for more configuration options. 
-## Online serving (OpenAI API-Compatible server) +### Online serving (OpenAI API-Compatible server) Deploy an OpenAI API-compatible server using FastDeploy with the following commands: -### Start service +#### Start service **ERNIE-4.5-300B-A47B-Paddle 32K WINT4 (8-card) (Recommended)** @@ -175,7 +175,7 @@ python -m fastdeploy.entrypoints.openai.api_server \ Refer to [Parameters](../../parameters.md) for more options. -### Send requests +#### Send requests Send requests using either curl or Python @@ -218,4 +218,4 @@ for chunk in response: print('\n') ``` -For detailed OpenAI protocol specifications, see [OpenAI Chat Compeltion API](https://platform.openai.com/docs/api-reference/chat/create). Differences from the standard OpenAI protocol are documented in [Deployment](../../online_serving/README.md). +For detailed OpenAI protocol specifications, see [OpenAI Chat Completion API](https://platform.openai.com/docs/api-reference/chat/create). Differences from the standard OpenAI protocol are documented in [OpenAI Protocol-Compatible API Server](../../online_serving/README.md). 
diff --git a/docs/zh/get_started/installation/kunlunxin_xpu.md b/docs/zh/get_started/installation/kunlunxin_xpu.md index f49055b62..5dcd43df1 100644 --- a/docs/zh/get_started/installation/kunlunxin_xpu.md +++ b/docs/zh/get_started/installation/kunlunxin_xpu.md @@ -9,7 +9,7 @@ - XPU 固件版本:≥ 1.31 已验证的平台: -- CPU:INTEL(R) XEON(R) PLATINUM 8563C +- CPU:INTEL(R) XEON(R) PLATINUM 8563C / Hygon C86-4G 7490 64-core Processor - 内存:2T - 磁盘:4T - OS:CentOS release 7.6 (Final) @@ -43,7 +43,7 @@ python -m pip install --pre paddlepaddle-xpu -i https://www.paddlepaddle.org.cn/ ### 安装 FastDeploy(**注意不要通过 pypi 源安装**) ```bash -python -m pip install fastdeploy-xpu==2.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/xpu-p800/ +python -m pip install fastdeploy-xpu==2.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/xpu-p800/ ``` 或者你也可以安装最新版 FastDeploy(不推荐) @@ -95,6 +95,7 @@ git checkout cd FastDeploy bash build.sh ``` + 编译后的产物在 ```FastDeploy/dist``` 目录下。 ## 验证是否安装成功 @@ -222,5 +223,4 @@ for chunk in response: print('\n') ``` -OpenAI 协议的更多说明可参考文档 [OpenAI Chat Compeltion API](https://platform.openai.com/docs/api-reference/chat/create),以及与 OpenAI 协议的区别可以参考 [服务化部署](../../online_serving/README.md)。 - +OpenAI 协议的更多说明可参考文档 [OpenAI Chat Completion API](https://platform.openai.com/docs/api-reference/chat/create),以及与 OpenAI 协议的区别可以参考 [兼容 OpenAI 协议的服务化部署](../../online_serving/README.md)。