
FastDeploy Serving Deployment
Introduction
FastDeploy provides end-to-end serving deployment built on Triton Inference Server. The underlying backend uses the FastDeploy high-performance Runtime module and integrates the FastDeploy pre- and post-processing modules, so a model can be served end to end through a simple workflow while keeping high inference performance.
Prepare the environment
Environment requirements
- Linux
- For the GPU image, NVIDIA Driver >= 470 is required (for older Tesla-series GPUs such as T4, NVIDIA Driver 418.40+, 440.33+, 450.51+, or 460.27+ is also acceptable)
Obtain Image
CPU Image
The CPU image supports serving Paddle/ONNX models on CPU only. Supported inference backends are OpenVINO, Paddle Inference, and ONNX Runtime.
docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.1-cpu-only-21.10
GPU Image
The GPU image supports serving Paddle/ONNX models on both GPU and CPU. Supported inference backends are OpenVINO, TensorRT, Paddle Inference, and ONNX Runtime.
docker pull registry.baidubce.com/paddlepaddle/fastdeploy:1.0.1-gpu-cuda11.4-trt8.4-21.10
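The choice between the two images can be scripted. The following is a minimal sketch, not part of FastDeploy: `select_fastdeploy_image` is a hypothetical helper that takes a driver version string and prints the matching image name from this page. It assumes a simplified rule (GPU image only when the driver major version is at least 470) and deliberately ignores the exception for older GPUs such as T4, so those hosts would need to pick the GPU image by hand.

```shell
# Hypothetical helper (not shipped with FastDeploy): print the image to pull
# for a given NVIDIA driver version string such as "470.82.01".
# An empty or non-numeric argument is treated as "no usable GPU driver".
select_fastdeploy_image() {
  fd_repo="registry.baidubce.com/paddlepaddle/fastdeploy"
  fd_version="${1:-}"
  fd_major="${fd_version%%.*}"   # "470.82.01" -> "470"

  case "$fd_major" in
    ''|*[!0-9]*) fd_major=0 ;;   # missing or malformed version: fall back to CPU
  esac

  # Simplified rule: the T4-style exception for older drivers is NOT handled.
  if [ "$fd_major" -ge 470 ]; then
    echo "${fd_repo}:1.0.1-gpu-cuda11.4-trt8.4-21.10"
  else
    echo "${fd_repo}:1.0.1-cpu-only-21.10"
  fi
}

# Example usage (requires Docker; nvidia-smi is absent on CPU-only hosts):
#   driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n1)"
#   docker pull "$(select_fastdeploy_image "$driver")"
```

Keeping the decision in a function that only prints a name makes it easy to check on a machine without Docker or a GPU; the actual `docker pull` stays a separate, explicit step.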
Users can also build the image themselves to suit their own needs; refer to the following documents:
Other Tutorials
- How to Prepare Serving Model Repository
- Serving Deployment Configuration for Runtime
- Demo of Serving Deployment
Serving Deployment Demo
Task | Model |
---|---|
Classification | PaddleClas |
Detection | PaddleDetection |
Detection | ultralytics/YOLOv5 |
NLP | PaddleNLP/ERNIE-3.0 |
NLP | PaddleNLP/UIE |
Speech | PaddleSpeech/PP-TTS |
OCR | PaddleOCR/PP-OCRv3 |