TensorRT Usage Issues

1. The following log messages appear when running with TensorRT

[WARNING] fastdeploy/backends/tensorrt/trt_backend.cc(552)::CreateTrtEngineFromOnnx	Cannot build engine right now, because there's dynamic input shape exists, list as below,
[WARNING] fastdeploy/backends/tensorrt/trt_backend.cc(556)::CreateTrtEngineFromOnnx	Input 0: TensorInfo(name: image, shape: [-1, 3, 320, 320], dtype: FDDataType::FP32)
[WARNING] fastdeploy/backends/tensorrt/trt_backend.cc(556)::CreateTrtEngineFromOnnx	Input 1: TensorInfo(name: scale_factor, shape: [1, 2], dtype: FDDataType::FP32)
[WARNING] fastdeploy/backends/tensorrt/trt_backend.cc(558)::CreateTrtEngineFromOnnx	FastDeploy will build the engine while inference with input data, and will also collect the input shape range information. You should be noticed that FastDeploy will rebuild the engine while new input shape is out of the collected shape range, this may bring some time consuming problem, refer https://github.com/PaddlePaddle/FastDeploy/docs/backends/tensorrt.md for more details.
[INFO] fastdeploy/fastdeploy_runtime.cc(270)::Init	Runtime initialized with Backend::TRT in device Device::GPU.
[INFO] fastdeploy/vision/detection/ppdet/ppyoloe.cc(65)::Initialize	Detected operator multiclass_nms3 in your model, will replace it with fastdeploy::backend::MultiClassNMS(background_label=-1, keep_top_k=100, nms_eta=1, nms_threshold=0.6, score_threshold=0.025, nms_top_k=1000, normalized=1).
[WARNING] fastdeploy/backends/tensorrt/utils.cc(40)::Update	[New Shape Out of Range] input name: image, shape: [1, 3, 320, 320], The shape range before: min_shape=[-1, 3, 320, 320], max_shape=[-1, 3, 320, 320].
[WARNING] fastdeploy/backends/tensorrt/utils.cc(52)::Update	[New Shape Out of Range] The updated shape range now: min_shape=[1, 3, 320, 320], max_shape=[1, 3, 320, 320].
[WARNING] fastdeploy/backends/tensorrt/trt_backend.cc(281)::Infer	TensorRT engine will be rebuilt once shape range information changed, this may take lots of time, you can set a proper shape range before loading model to avoid rebuilding process. refer https://github.com/PaddlePaddle/FastDeploy/docs/backends/tensorrt.md for more details.
[INFO] fastdeploy/backends/tensorrt/trt_backend.cc(416)::BuildTrtEngine	Start to building TensorRT Engine...

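Most models have dynamic shapes. For example, a classification input of [-1, 3, 224, 224] means that its first dimension (the batch dimension) is dynamic, and a detection input of [-1, 3, -1, -1] means that its batch dimension, height, and width are all dynamic. TensorRT needs to know the range of these dynamic dimensions when it builds an engine, so FastDeploy handles this in one of two ways: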

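    1. Set dynamic shapes automatically. If a model contains dynamic shapes when it is loaded, FastDeploy does not create the TensorRT engine immediately; it builds the engine once it has obtained the shape of the actual input data at prediction time.
    • 1.1 For most models the shape does not change during inference, so this only defers the build to the prediction stage and has little overall impact.
    • 1.2 If the shape changes during prediction, FastDeploy keeps collecting new shapes and widens the range of the dynamic dimensions. Each time a new shape falls outside the current range, it updates the range and rebuilds the TensorRT engine. Out-of-range shapes therefore cost some rebuild time (OCR models, for example, show this behavior), but once the shape range stabilizes over repeated predictions, no further rebuilds occur.
    2. Set dynamic shapes manually. When you know that a model has dynamic shapes, set their ranges in advance to avoid rebuilding during prediction.
    • 2.1 In Python, call the RuntimeOption.set_trt_input_shape function. See the Python API documentation.
    • 2.2 In C++, call the RuntimeOption.SetTrtInputShape function. See the C++ API documentation.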
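
The difference between the two approaches can be illustrated with a small pure-Python model. This is only a sketch of the behavior described above, not FastDeploy's implementation: the `ShapeRangeTracker` class and its `builds` counter are hypothetical names introduced for illustration, and a "build" here is just a counter increment standing in for the slow TensorRT engine build.

```python
from typing import List, Optional, Sequence


class ShapeRangeTracker:
    """Toy model of shape-range collection for one input (not FastDeploy code)."""

    def __init__(self,
                 min_shape: Optional[Sequence[int]] = None,
                 max_shape: Optional[Sequence[int]] = None) -> None:
        self.min_shape: Optional[List[int]] = None
        self.max_shape: Optional[List[int]] = None
        self.builds = 0  # number of simulated engine builds
        if min_shape is not None and max_shape is not None:
            # Manual mode: the range is known up front, so build once at load time.
            self.min_shape, self.max_shape = list(min_shape), list(max_shape)
            self.builds = 1

    def _in_range(self, shape: Sequence[int]) -> bool:
        if self.min_shape is None or self.max_shape is None:
            return False  # automatic mode before the first input: nothing built yet
        return all(lo <= d <= hi
                   for lo, d, hi in zip(self.min_shape, shape, self.max_shape))

    def infer(self, shape: Sequence[int]) -> bool:
        """Feed one input shape; return True if it triggered a (re)build."""
        shape = list(shape)
        if self._in_range(shape):
            return False
        if self.min_shape is None or self.max_shape is None:
            self.min_shape, self.max_shape = shape[:], shape[:]
        else:
            # Widen the collected range so that it covers the new shape.
            self.min_shape = [min(a, b) for a, b in zip(self.min_shape, shape)]
            self.max_shape = [max(a, b) for a, b in zip(self.max_shape, shape)]
        self.builds += 1
        return True


shapes = [[1, 3, 320, 320], [1, 3, 320, 320], [4, 3, 320, 320], [2, 3, 320, 320]]

auto = ShapeRangeTracker()  # automatic collection
for s in shapes:
    auto.infer(s)

manual = ShapeRangeTracker(min_shape=[1, 3, 320, 320],
                           max_shape=[4, 3, 320, 320])  # range set in advance
for s in shapes:
    manual.infer(s)
```

In this toy run the automatic tracker builds twice (once for the first input and once when the batch size grows to 4), while the tracker with a preset range builds only once. Setting the range with RuntimeOption.set_trt_input_shape is intended to give the second behavior.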

2. Model loading and initialization take a long time on every TensorRT run

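Building a model with TensorRT takes a long time on every run. FastDeploy provides a cache mechanism that lets developers store the built model locally, so that when the code is run again, model loading and initialization complete quickly by loading the cache.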

  • In Python, call the RuntimeOption.set_trt_cache_file function. See the Python API documentation.
  • In C++, call the RuntimeOption.SetTrtCacheFile function. See the C++ API documentation.

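The interface takes a file path string. When the code runs:

  • If the given file path does not exist, the TensorRT engine is built; once the build finishes, the engine is converted to a binary stream and stored at that path.
  • If the given file path exists, the TensorRT engine build is skipped; the file is loaded directly and restored into a TensorRT engine.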
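
The load-or-build rule above can be sketched in a few lines of plain Python. This is an illustrative model under stated assumptions, not FastDeploy's code: `load_or_build_engine` is a hypothetical helper, the `build` callback stands in for the slow TensorRT build, and the byte strings stand in for serialized engines.

```python
import os
import tempfile
from typing import Callable, Tuple


def load_or_build_engine(cache_path: str,
                         build: Callable[[], bytes]) -> Tuple[bytes, bool]:
    """Return (engine_bytes, was_built) following the cache rule described above."""
    if os.path.exists(cache_path):
        # Cache hit: skip the build and restore the engine from the file.
        with open(cache_path, "rb") as f:
            return f.read(), False
    # Cache miss: build, then store the serialized engine at the given path.
    engine = build()
    with open(cache_path, "wb") as f:
        f.write(engine)
    return engine, True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.trt")  # hypothetical cache file name
    first = load_or_build_engine(path, lambda: b"fp32-engine")
    # The configuration "changes" here, but the existing file is loaded as-is.
    second = load_or_build_engine(path, lambda: b"fp16-engine")
```

Here `first` is built and written to disk, while `second` silently receives the previously cached bytes even though a different build was requested. This is the situation in which a stale cache file causes problems.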

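Because an existing cache file is loaded without rebuilding, delete the local cache file first whenever you change the model's inference configuration (for example, from Float32 to Float16) to avoid errors.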
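
One way to make it harder to forget this step is to derive the cache file name from the inference configuration, so that a changed configuration maps to a different file. The helper below is a suggestion, not something FastDeploy provides: `cache_path_for`, the `config` dictionary keys, and the `ppyoloe.onnx` file name are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def cache_path_for(model_file: str, config: dict, cache_dir: str = ".") -> str:
    """Return a cache file path that changes whenever `config` changes."""
    # Serialize with sorted keys so that equal configs always hash identically.
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()[:12]
    return str(Path(cache_dir) / f"{Path(model_file).stem}.{digest}.trt")


fp32_path = cache_path_for("ppyoloe.onnx", {"precision": "fp32"})
fp16_path = cache_path_for("ppyoloe.onnx", {"precision": "fp16"})
# The resulting string is what would be passed to RuntimeOption.set_trt_cache_file.
```

With this scheme the Float32 and Float16 configurations never share a cache file, at the cost of leaving old cache files on disk until they are cleaned up.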