[Benchmark] Add include_h2d_d2h config for trt (#1699)

* add GPL license
* add GPL-3.0 license
* add GPL-3.0 license
* add GPL-3.0 license
* support yolov8
* add pybind for yolov8
* add yolov8 readme
* add cpp benchmark
* add cpu and gpu mem
* public part split
* add runtime mode
* fixed bugs
* add cpu_thread_nums
* deal with comments
* deal with comments
* deal with comments
* rm useless code
* add FASTDEPLOY_DECL
* add FASTDEPLOY_DECL
* fixed for windows
* mv rss to pss
* mv rss to pss
* Update utils.cc
* use thread to collect mem
* Add ResourceUsageMonitor
* rm useless code
* fixed bug
* fixed typo
* update ResourceUsageMonitor
* fixed bug
* fixed bug
* add note for ResourceUsageMonitor
* deal with comments
* add macros
* deal with comments
* deal with comments
* deal with comments
* re-lint
* rm pmap and use mem api
* rm pmap and use mem api
* add mem api
* Add PrintBenchmarkInfo func
* Add PrintBenchmarkInfo func
* Add PrintBenchmarkInfo func
* deal with comments
* fixed enable_paddle_to_trt
* add log for paddle_trt
* support ppcls benchmark
* use new trt option api
* update benchmark info
* simplify benchmark.cc
* simplify benchmark.cc
* deal with comments
* Add ppseg && ppocr benchmark
* add OCR rec img
* add ocr benchmark
* fixed trt shape
* add trt shape
* resolve conflict
* add ENABLE_BENCHMARK define
* Add ClassifyDiff
* Add Resize for ClassifyResult
* deal with comments
* add convert info script
* resolve conflict
* Add SaveBenchmarkResult func
* fixed bug
* fixed bug
* fixed bug
* add config.txt for option
* fixed bug
* fixed bug
* fixed bug
* add benchmark.sh
* mv thread_nums from 8 to 1
* deal with comments
* deal with comments
* fixed readme
* deal with comments
* add all platform shell
* Update config.arm.txt
* Update config.gpu.txt
* Update config.x86.txt
* fixed printinfo bug
* rm proxy
* add more model support
* all backend config.txt
* deal with comments
* Add MattingDiff compare
* fixed predict bug
* adjust warmup/repeat times
* add e2e/mem configs
* fixed typo
* open collect_mem
* fixed typo
* add trt cache option
* fixed bug
* fixed repeat times
* test for benchmark
* test for det benchmark
* for benchmark
* fixed for x86
* add h2d and d2h config
* rename txt file

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
2025-12-24 13:28:13 +08:00 · 2023-03-24 11:17:49 +08:00
parent cae341e6c5
commit f0235a4c4e
8 changed files with 64 additions and 8 deletions
--- a/benchmark/cpp/config/config.gpu.ort.fp32.e2e.txt
+++ b/benchmark/cpp/config/config.gpu.ort.fp32.e2e.txt
@@ -1,8 +1,8 @@
 device: gpu
 device_id: 3
 cpu_thread_nums: 1
-warmup: 20
-repeat: 100
+warmup: 200
+repeat: 1000
 backend: ort
 profile_mode: end2end
 include_h2d_d2h: false
--- a/benchmark/cpp/config/config.gpu.ort.fp32.txt
+++ b/benchmark/cpp/config/config.gpu.ort.fp32.txt
@@ -1,8 +1,8 @@
 device: gpu
 device_id: 3
 cpu_thread_nums: 1
-warmup: 20
-repeat: 100
+warmup: 200
+repeat: 1000
 backend: ort
 profile_mode: runtime
 include_h2d_d2h: false
--- a/benchmark/cpp/config/config.gpu.paddle.fp32.e2e.txt
+++ b/benchmark/cpp/config/config.gpu.paddle.fp32.e2e.txt
@@ -1,8 +1,8 @@
 device: gpu
 device_id: 3
 cpu_thread_nums: 1
-warmup: 20
-repeat: 100
+warmup: 200
+repeat: 1000
 backend: paddle
 profile_mode: end2end
 include_h2d_d2h: false
--- a/benchmark/cpp/config/config.gpu.paddle.fp32.txt
+++ b/benchmark/cpp/config/config.gpu.paddle.fp32.txt
@@ -1,8 +1,8 @@
 device: gpu
 device_id: 3
 cpu_thread_nums: 1
-warmup: 20
-repeat: 100
+warmup: 200
+repeat: 1000
 backend: paddle
 profile_mode: runtime
 include_h2d_d2h: false
--- a/benchmark/cpp/config/config.gpu.paddle_trt.fp16.h2d.txt
+++ b/benchmark/cpp/config/config.gpu.paddle_trt.fp16.h2d.txt
@@ -0,0 +1,14 @@
+device: gpu
+device_id: 3
+cpu_thread_nums: 1
+warmup: 200
+repeat: 1000
+backend: paddle_trt
+profile_mode: runtime
+include_h2d_d2h: true
+use_fp16: true
+collect_memory_info: false
+sampling_interval: 1
+precision_compare: false
+xpu_l3_cache: 0
+result_path: benchmark_gpu_paddle_trt_fp16_h2d.txt
--- a/benchmark/cpp/config/config.gpu.paddle_trt.fp32.h2d.txt
+++ b/benchmark/cpp/config/config.gpu.paddle_trt.fp32.h2d.txt
@@ -0,0 +1,14 @@
+device: gpu
+device_id: 3
+cpu_thread_nums: 1
+warmup: 200
+repeat: 1000
+backend: paddle_trt
+profile_mode: runtime
+include_h2d_d2h: true
+use_fp16: false
+collect_memory_info: false
+sampling_interval: 1
+precision_compare: false
+xpu_l3_cache: 0
+result_path: benchmark_gpu_paddle_trt_fp32_h2d.txt
--- a/benchmark/cpp/config/config.gpu.trt.fp16.h2d.txt
+++ b/benchmark/cpp/config/config.gpu.trt.fp16.h2d.txt
@@ -0,0 +1,14 @@
+device: gpu
+device_id: 3
+cpu_thread_nums: 1
+warmup: 200
+repeat: 1000
+backend: trt
+profile_mode: runtime
+include_h2d_d2h: true
+use_fp16: true
+collect_memory_info: false
+sampling_interval: 1
+precision_compare: false
+xpu_l3_cache: 0
+result_path: benchmark_gpu_trt_fp16_h2d.txt
--- a/benchmark/cpp/config/config.gpu.trt.fp32.h2d.txt
+++ b/benchmark/cpp/config/config.gpu.trt.fp32.h2d.txt
@@ -0,0 +1,14 @@
+device: gpu
+device_id: 3
+cpu_thread_nums: 1
+warmup: 200
+repeat: 1000
+backend: trt
+profile_mode: runtime
+include_h2d_d2h: true
+use_fp16: false
+collect_memory_info: false
+sampling_interval: 1
+precision_compare: false
+xpu_l3_cache: 0
+result_path: benchmark_gpu_trt_fp32_h2d.txt
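
The config files in this commit all use a flat `key: value` layout, one option per line. As a rough illustration of that layout only, here is a minimal Python sketch that reads such a file into a dictionary. The function name `parse_benchmark_config` and the type-coercion rules are assumptions made for this example; they are not taken from the benchmark's own C++ loader, which may treat values differently.

```python
from typing import Dict, Union

ConfigValue = Union[bool, int, str]


def parse_benchmark_config(text: str) -> Dict[str, ConfigValue]:
    """Parse `key: value` lines into a dict (hypothetical helper).

    Assumption for this sketch: "true"/"false" become booleans,
    all-digit values become integers, everything else stays a string.
    """
    config: Dict[str, ConfigValue] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got: {raw_line!r}")
        key, value = key.strip(), value.strip()
        if value in ("true", "false"):
            config[key] = value == "true"
        elif value.isdigit():
            config[key] = int(value)
        else:
            config[key] = value
    return config


# Contents of config.gpu.trt.fp16.h2d.txt as added in this commit.
SAMPLE = """\
device: gpu
device_id: 3
cpu_thread_nums: 1
warmup: 200
repeat: 1000
backend: trt
profile_mode: runtime
include_h2d_d2h: true
use_fp16: true
collect_memory_info: false
sampling_interval: 1
precision_compare: false
xpu_l3_cache: 0
result_path: benchmark_gpu_trt_fp16_h2d.txt
"""

config = parse_benchmark_config(SAMPLE)
```

Under these assumptions, `config["include_h2d_d2h"]` is a boolean and `config["warmup"]` an integer, which makes it easy to compare the four new `h2d` configs against the existing ones in a script.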