serving/docs/EN/compile-en.md

# FastDeploy Serving Deployment Image Compilation

How to create a FastDeploy image.

## GPU Image

The GPU images published by FastDeploy are based on version 21.10 of [Triton Inference Server](https://github.com/triton-inference-server/server). If developers need to use other CUDA versions, please refer to the [NVIDIA official website](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html) and modify the Dockerfile and build scripts accordingly.
```shell
# Enter the serving directory and execute the script to compile FastDeploy and the serving backend
cd serving
bash scripts/build.sh

# Return to the FastDeploy root directory and build the image
cd ../
docker build -t paddlepaddle/fastdeploy:0.3.0-gpu-cuda11.4-trt8.4-21.10 -f serving/Dockerfile .
```
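Once built, the image can be sanity-checked by starting it with a local model repository mounted. A minimal sketch, assuming a local `models/` directory and the default Triton HTTP/gRPC/metrics ports; the server command follows the `--model-repository` invocation shown in the model repository document:

```shell
# Launch the GPU image with a local model repository (mounted path and ports are assumptions)
docker run --gpus all -it --rm \
  -v $(pwd)/models:/models \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  paddlepaddle/fastdeploy:0.3.0-gpu-cuda11.4-trt8.4-21.10 \
  fastdeploy --model-repository=/models
```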
## CPU Image

```shell
# Enter the serving directory and execute the script to compile FastDeploy and the serving backend
cd serving
bash scripts/build.sh OFF

# Return to the FastDeploy root directory and build the image
cd ../
docker build -t paddlepaddle/fastdeploy:0.3.0-cpu-only-21.10 -f serving/Dockerfile_cpu .
```
## IPU Image

```shell
# Enter the serving directory and execute the script to compile FastDeploy and the serving backend
cd serving
bash scripts/build_fd_ipu.sh

# Return to the FastDeploy root directory and build the image
cd ../
docker build -t paddlepaddle/fastdeploy:0.3.0-ipu-only-21.10 -f serving/Dockerfile_ipu .
```
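After any of the builds above, the resulting tags can be checked locally:

```shell
# Confirm the images were created with the expected tags
docker images | grep fastdeploy
```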
serving/docs/EN/model_repository-en.md

# Model Repository

FastDeploy starts a service by deploying one or more models from a model repository. While the service is running, the deployed models can be modified following [Model Management](https://github.com/triton-inference-server/server/blob/main/docs/model_management.md), and models are served from the one or more model repositories specified at service startup.
## Repository Architecture

The model repository path is specified with the *--model-repository* option when FastDeploy starts, and multiple repositories can be loaded by repeating the option. Example:

```
$ fastdeploy --model-repository=<model-repository-path>
```
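For instance, two repositories can be served together by repeating the option (the paths below are placeholders):

```
$ fastdeploy --model-repository=/opt/models_a --model-repository=/opt/models_b
```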
The model repository must comply with the following layout:

```
<model-repository-path>/
  <model-name>/
    [config.pbtxt]
    [<output-labels-file> ...]
    <version>/
      <model-definition-file>
    <version>/
      <model-definition-file>
    ...
  <model-name>/
    [config.pbtxt]
    [<output-labels-file> ...]
    <version>/
      <model-definition-file>
    <version>/
      <model-definition-file>
    ...
  ...
```

Under the top-level `<model-repository-path>` directory there must be zero or more `<model-name>` subdirectories. Each `<model-name>` subdirectory contains the information for deploying that model: one or more numeric subdirectories holding the model versions, and a *config.pbtxt* file describing the model configuration.

A Paddle model is stored in a version subdirectory and must consist of the files `model.pdmodel` and `model.pdiparams`.
## Model Version

Each model can have one or more versions available in the repository. Numeric subdirectory names in the model directory denote version numbers; subdirectories whose names are not numeric, or that start with *0*, are ignored. A [version policy](https://github.com/triton-inference-server/server/blob/main/docs/model_configuration.md#version-policy) can be specified in the model configuration file to control which versions of a model in the model directory are launched by Triton.
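For example, a *config.pbtxt* entry that serves only the latest version could look like the following sketch, which follows the Triton version-policy syntax linked above (the choice of policy here is illustrative):

```
# Serve only the single most recent version found in the model directory
version_policy: { latest { num_versions: 1 } }
```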
## Repository Demo

The model deployed with Paddle must be an inference model exported from Paddle 2.0 or higher, with `model.pdmodel` and `model.pdiparams` files in the version directory.

Example: A minimal model repository directory for deploying Paddle models

```
<model-repository-path>/
  <model-name>/
    config.pbtxt
    1/
      model.pdmodel
      model.pdiparams

# Example:
models
└── ResNet50
    ├── 1
    │   ├── model.pdiparams
    │   └── model.pdmodel
    └── config.pbtxt
```
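For reference, a *config.pbtxt* for the ResNet50 layout above could be sketched as follows. The `backend: "fastdeploy"` value reflects FastDeploy's serving backend for Triton, while the tensor names, shapes, and `max_batch_size` are assumptions for illustration and must match the exported model:

```
name: "ResNet50"
backend: "fastdeploy"  # FastDeploy's serving backend for Triton
max_batch_size: 16     # assumed; set to what the model supports

input [
  {
    name: "inputs"  # assumed input tensor name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "save_infer_model/scale_0.tmp_1"  # assumed output tensor name
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```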
To deploy an ONNX model, a model file named `model.onnx` must be included in the version directory.

Example: A minimal model repository directory for deploying ONNX models

```
<model-repository-path>/
  <model-name>/
    config.pbtxt
    1/
      model.onnx

# Example:
models
└── ResNet50
    ├── 1
    │   └── model.onnx
    └── config.pbtxt
```