[Backend] TRT backend & PP-Infer backend support pinned memory (#403)

* TRT backend use pinned memory

* refine fd tensor pinned memory logic

* TRT enable pinned memory configurable

* paddle inference support pinned memory

* pinned memory pybindings

Co-authored-by: Jason <jiangjiajun@baidu.com>
Authored by Wang Xinyu, 2022-10-21 18:51:36 +08:00, committed by GitHub
parent 8dbc1f1d10
commit 43d86114d8
14 changed files with 120 additions and 18 deletions


@@ -40,6 +40,10 @@ struct FASTDEPLOY_DECL FDTensor {
// so we can skip data transfer, which may improve efficiency
Device device = Device::CPU;
// Whether the data buffer is in pinned memory, which is allocated
// with cudaMallocHost()
bool is_pinned_memory = false;
// If the external data is not on the CPU, we use this temporary
// buffer to transfer the data to the CPU in cases where we need
// to access the other devices' data