English | 简体中文

PaddleSeg Serving Deployment Demo

Launch Serving

# Download demo code
git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/examples/vision/segmentation/paddleseg/serving

# Download the PP_LiteSeg model file
wget https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_B_STDC2_cityscapes_with_argmax_infer.tgz
tar -xvf PP_LiteSeg_B_STDC2_cityscapes_with_argmax_infer.tgz

# Move the model files to models/runtime/1
mv PP_LiteSeg_B_STDC2_cityscapes_with_argmax_infer/model.pdmodel models/runtime/1/
mv PP_LiteSeg_B_STDC2_cityscapes_with_argmax_infer/model.pdiparams models/runtime/1/
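
After this step, the runtime sub-model directory should look roughly like this (a sketch based on the paths used in this example; the models/ directory also contains the other sub-models shipped with the demo):

models
└── runtime
    ├── 1
    │   ├── model.pdiparams
    │   └── model.pdmodel
    └── config.pbtxt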

# Pull the FastDeploy image; x.y.z is the FastDeploy version, e.g. 1.0.2
docker pull paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10

# Run the image. The container is named fd_serving, and the current directory is mounted as the container's /serving directory
nvidia-docker run -it --net=host --name fd_serving -v `pwd`/:/serving paddlepaddle/fastdeploy:x.y.z-gpu-cuda11.4-trt8.4-21.10  bash

# Start the service (if the CUDA_VISIBLE_DEVICES environment variable is not set, the server gets scheduling privileges over all GPU cards)
CUDA_VISIBLE_DEVICES=0 fastdeployserver --model-repository=/serving/models --backend-config=python,shm-default-byte-size=10485760

If the service launches successfully, output like the following is printed

......
I0928 04:51:15.784517 206 grpc_server.cc:4117] Started GRPCInferenceService at 0.0.0.0:8001
I0928 04:51:15.785177 206 http_server.cc:2815] Started HTTPService at 0.0.0.0:8000
I0928 04:51:15.826578 206 http_server.cc:167] Started Metrics Service at 0.0.0.0:8002
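
fastdeployserver exposes the standard Triton Inference Server endpoints (as the log above shows), so you can verify from the host that the service is ready via Triton's health endpoint, assuming the default HTTP port 8000:

curl -v localhost:8000/v2/health/ready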

Client Requests

Execute the following commands on the host machine to send a gRPC request and print the result

# Download a test image
wget https://paddleseg.bj.bcebos.com/dygraph/demo/cityscapes_demo.png

# Install client-side dependencies
python3 -m pip install tritonclient\[all\]

# Send requests
python3 paddleseg_grpc_client.py

When the request succeeds, the segmentation result is returned in JSON format and printed.
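
If you want to adapt the client, the sketch below shows the general shape of a Triton gRPC request with tritonclient. The model, input, and output tensor names ("paddleseg", "INPUT", "SEG_RESULT") are illustrative assumptions; take the real names from paddleseg_grpc_client.py and the config.pbtxt files.

# Minimal sketch of a Triton gRPC request; model/tensor names are assumptions
import numpy as np
import cv2
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient("localhost:8001")

# Read the test image and add a batch dimension (uint8, HWC layout assumed)
im = cv2.imread("cityscapes_demo.png")
data = np.expand_dims(im, axis=0)

# Declare the input tensor and fill it with the image data
inputs = [grpcclient.InferInput("INPUT", list(data.shape), "UINT8")]
inputs[0].set_data_from_numpy(data)

# Request the output tensor and run inference
outputs = [grpcclient.InferRequestedOutput("SEG_RESULT")]
response = client.infer(model_name="paddleseg", inputs=inputs, outputs=outputs)
print(response.as_numpy("SEG_RESULT"))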


Modify Configs

By default, the service runs ONNXRuntime on CPU. If you need to run it on GPU or with another inference engine, please refer to the Configs File document and modify the configs in models/runtime/config.pbtxt.
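
As an illustration of the kind of change involved, the standard Triton instance_group stanza in config.pbtxt selects the device a model instance runs on; the exact keys for choosing a FastDeploy inference backend are listed in the Configs File document.

# Sketch: run one model instance on GPU 0 (standard Triton config.pbtxt syntax)
instance_group [
  {
    count: 1
    kind: KIND_GPU
    gpus: [0]
  }
]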