mirror of https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-12-24 13:28:13 +08:00
Modify README
@@ -42,7 +42,7 @@
 - 🤝 **OpenAI API服务与vLLM兼容**:单命令部署,兼容[vLLM](https://github.com/vllm-project/vllm/)接口
 - 🧮 **全量化格式支持**:W8A16、W8A8、W4A16、W4A8、W2A16、FP8等
 - ⏩ **高级加速技术**:推测解码、多令牌预测(MTP)及分块预填充
-- 🖥️ **多硬件支持**:NVIDIA GPU、昆仑芯XPU、海光DCU、昇腾NPU、天数智芯GPU、燧原GCU、沐曦GPU、英特尔Gaudi等
+- 🖥️ **多硬件支持**:NVIDIA GPU、昆仑芯XPU、海光DCU、天数智芯GPU、燧原GCU、沐曦GPU、英特尔Gaudi等
 
 ## 要求
 
@@ -61,8 +61,6 @@ FastDeploy 支持在**英伟达(NVIDIA)GPU**、**昆仑芯(Kunlunxin)XPU
 - [沐曦 GPU](./docs/zh/get_started/installation/metax_gpu.md)
 - [英特尔 Gaudi](./docs/zh/get_started/installation/intel_gaudi.md)
 
-**注意:** 我们正在积极拓展硬件支持范围。目前,包括昇腾(Ascend)NPU 等其他硬件平台正在开发测试中。敬请关注更新!
-
 ## 入门指南
 
 通过我们的文档了解如何使用 FastDeploy:
@@ -40,7 +40,7 @@ English | [简体中文](README_CN.md)
 - 🤝 **OpenAI API Server and vLLM Compatible**: One-command deployment with [vLLM](https://github.com/vllm-project/vllm/) interface compatibility.
 - 🧮 **Comprehensive Quantization Format Support**: W8A16, W8A8, W4A16, W4A8, W2A16, FP8, and more.
 - ⏩ **Advanced Acceleration Techniques**: Speculative decoding, Multi-Token Prediction (MTP) and Chunked Prefill.
-- 🖥️ **Multi-Hardware Support**: NVIDIA GPU, Kunlunxin XPU, Hygon DCU, Ascend NPU, Iluvatar GPU, Enflame GCU, MetaX GPU, Intel Gaudi etc.
+- 🖥️ **Multi-Hardware Support**: NVIDIA GPU, Kunlunxin XPU, Hygon DCU, Iluvatar GPU, Enflame GCU, MetaX GPU, Intel Gaudi etc.
 
 ## Requirements
 
@@ -59,8 +59,6 @@ FastDeploy supports inference deployment on **NVIDIA GPUs**, **Kunlunxin XPUs**,
 - [MetaX GPU](./docs/get_started/installation/metax_gpu.md)
 - [Intel Gaudi](./docs/get_started/installation/intel_gaudi.md)
 
-**Note:** We are actively working on expanding hardware support. Additional hardware platforms including Ascend NPU are currently under development and testing. Stay tuned for updates!
-
 ## Get Started
 
 Learn how to use FastDeploy through our documentation:
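The diff above touches the bullet describing FastDeploy's OpenAI-compatible API server. As a minimal sketch of what that compatibility means for a client (the server URL, port, and model name here are assumptions, not values from this commit), a request body for the standard `/v1/chat/completions` endpoint can be assembled like this:

```python
import json

def build_chat_request(model, messages, temperature=0.7):
    """Assemble the JSON body an OpenAI-compatible /v1/chat/completions
    endpoint (as served by vLLM-style servers) expects."""
    return {"model": model, "messages": messages, "temperature": temperature}

# Hypothetical model name and prompt for illustration only.
payload = build_chat_request(
    "my-deployed-model",
    [{"role": "user", "content": "Hello"}],
)
body = json.dumps(payload)  # send via POST to http://<host>:<port>/v1/chat/completions
```

Because the interface follows the OpenAI schema, any generic OpenAI client pointed at the server's base URL should be able to send the same payload unchanged.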