Support Poros Backend (#188)

* Add poros backend * Add torch lib * Add python3 lib * set c++ 14 for poros * fixed bugs * fixed grammar bugs * fixed grammar bugs * fixed code bugs * fixed code bugs * fixed CreatePorosValue bug * Add AtType2String for Log * fixed trt_option * fixed poros.cmake path * fixed grammar bug * fixed grammar bug * fixed ambiguous reference * fixed ambiguous reference * fixed reference error * fixed include files * rm ENABLE_TRT_BACKEND in poros * update CMakeLists.txt * fixed CMakeLists.txt * Add libtorch.so in CMakeLists.txt * Fixed CMakeLists.txt * Fixed CMakeLists.txt * Fixed copy bug * Fixed copy bug * Fixed copy bug * Fixed Cmake * Fixed Cmake * debug * debug * debug * debug * debug * debug * debug utils * debug utils * copy to cpu * rm log info * test share mem * test share mem * test share mem * test multi outputs * test multi outputs * test multi outputs * test multi outputs * test multi outputs * test multi outputs * test multi outputs * time cost * time cost * fixed bug * time collect * mem copy * mem copy * rm time log * rm share mem * fixed multi inputs bug * add set_input_dtypes func * add SetInputDtypes * fixed bug * fixed bug * fixed prewarm data order * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * debug * fixed bug * Add compile func * Add compile func * Add compile func * Add is_dynamic option * Add is_dynamic option * Add is_dynamic option * Add is_dynamic option * rm infer log * add cuda11.6 poros lib * fixed bug * fixed bug * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * fixed multi outputs * rm logs * test * test * test * add test log * add test log * add test log * add test log * support cpu * support cpu * support cpu * support cpu * support member variable definition * rm useless log * fixed name * resolve conflict * resolve conflict * resolve conflict * fixed cmake * add GetInputInfos&GetOutputInfos * add GetInputInfos&GetOutputInfos * fixed bug * fixed runtime.py * add compile func * add np * deal with comments * rm to_inter func * add property
2025-10-05 16:48:03 +08:00 · 2022-10-17 15:28:12 +08:00
parent c8db2dd1ef
commit f5c94e5471
19 changed files with 1333 additions and 12 deletions
--- a/fastdeploy/runtime.h
+++ b/fastdeploy/runtime.h
@@ -38,6 +38,7 @@ enum Backend {
  ORT,  ///< ONNX Runtime, support Paddle/ONNX format model, CPU / Nvidia GPU
  TRT,  ///< TensorRT, support Paddle/ONNX format model, Nvidia GPU only
  PDINFER,  ///< Paddle Inference, support Paddle format model, CPU / Nvidia GPU
+  POROS,  ///< Poros, support TorchScript format model, CPU / Nvidia GPU
  OPENVINO,  ///< Intel OpenVINO, support Paddle/ONNX format, CPU only
  LITE,  ///< Paddle Lite, support Paddle format model, ARM CPU only
 };
@@ -47,6 +48,7 @@ enum ModelFormat {
  AUTOREC,  ///< Auto recognize the model format by model file name
  PADDLE,  ///< Model with paddlepaddle format
  ONNX,  ///< Model with ONNX format
+  TORCHSCRIPT,  ///< Model with TorchScript format
 };

 FASTDEPLOY_DECL std::ostream& operator<<(std::ostream& out,
@@ -117,6 +119,9 @@ struct FASTDEPLOY_DECL RuntimeOption {
  /// Set TensorRT as inference backend, only support GPU
  void UseTrtBackend();

+  /// Set Poros backend as inference backend, support CPU/GPU
+  void UsePorosBackend();
+
  /// Set OpenVINO as inference backend, only support CPU
  void UseOpenVINOBackend();

@@ -243,6 +248,13 @@ struct FASTDEPLOY_DECL RuntimeOption {
  size_t trt_max_batch_size = 32;
  size_t trt_max_workspace_size = 1 << 30;

+  // ======Only for Poros Backend=======
+  bool is_dynamic = false;
+  bool long_to_int = true;
+  bool use_nvidia_tf32 = false;
+  int unconst_ops_thres = -1;
+  std::string poros_file = "";
+
  std::string model_file = "";   // Path of model file
  std::string params_file = "";  // Path of parameters file, can be empty
  ModelFormat model_format = ModelFormat::AUTOREC;  // format of input model
@@ -270,6 +282,15 @@ struct FASTDEPLOY_DECL Runtime {
  bool Infer(std::vector<FDTensor>& input_tensors,
             std::vector<FDTensor>* output_tensors);

+  /** \brief Compile TorchScript Module, only for Poros backend
+   *
+   * \param[in] prewarm_tensors Prewarm data used for compilation
+   * \param[in] _option Runtime option
+   * \return true if compilation succeeded, otherwise false
+   */
+  bool Compile(std::vector<std::vector<FDTensor>>& prewarm_tensors,
+               const RuntimeOption& _option);
+
  /** \brief Get number of inputs
   */
  int NumInputs() { return backend_->NumInputs(); }
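
The hunks above add a `POROS` backend, a `TORCHSCRIPT` model format, and a group of Poros-only fields on `RuntimeOption`. The following is a minimal, self-contained sketch of how those pieces might relate. It does not use the real FastDeploy headers: the `sketch` namespace, the `backend` member and its default, and the `IsFormatSupported` helper are hypothetical and exist only for illustration. The backend/format pairings are taken from the doc comments in the diff, and treating `AUTOREC` as unsupported until it is resolved is an assumption.

```cpp
#include <string>

namespace sketch {  // stand-in types, NOT the real FastDeploy declarations

// Enumerators limited to those visible in the diff.
enum class Backend { ORT, TRT, PDINFER, POROS, OPENVINO, LITE };
enum class ModelFormat { AUTOREC, PADDLE, ONNX, TORCHSCRIPT };

struct RuntimeOption {
  Backend backend = Backend::ORT;  // assumed default, for this sketch only
  ModelFormat model_format = ModelFormat::AUTOREC;
  std::string model_file;

  // Poros-only fields; names and default values copied from the diff.
  bool is_dynamic = false;
  bool long_to_int = true;
  bool use_nvidia_tf32 = false;
  int unconst_ops_thres = -1;
  std::string poros_file;

  // Mirrors the name of the method added in the diff; the body is a guess.
  void UsePorosBackend() { backend = Backend::POROS; }
};

// Hypothetical helper: checks the model format against what each backend's
// doc comment says it supports. AUTOREC is rejected here because it has not
// yet been resolved to a concrete format (an assumption of this sketch).
inline bool IsFormatSupported(const RuntimeOption& opt) {
  const ModelFormat f = opt.model_format;
  switch (opt.backend) {
    case Backend::POROS:
      return f == ModelFormat::TORCHSCRIPT;
    case Backend::ORT:
    case Backend::TRT:
    case Backend::OPENVINO:
      return f == ModelFormat::PADDLE || f == ModelFormat::ONNX;
    case Backend::PDINFER:
    case Backend::LITE:
      return f == ModelFormat::PADDLE;
  }
  return false;
}

}  // namespace sketch
```

With these stand-ins, calling `UsePorosBackend()` on an option whose `model_format` is `TORCHSCRIPT` passes the check, while pairing Poros with `ONNX` fails it. Whether the real runtime enforces the pairing this way is not shown in this hunk.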