* norm and permute batch processing
* move cache to mat, batch processors
* get batched tensor logic, resize on cpu logic
* fix cpu compile error
* remove vector mat api
* nits
* add comments
* nits
* fix batch size
* move initial resize on cpu option to use_cuda api
* fix pybind
* processor manager pybind
* rename mat and matbatch
* move initial resize on cpu to ppcls preprocessor
---------
Co-authored-by: Jason <jiangjiajun@baidu.com>
* cvcuda resize
* cvcuda center crop
* cvcuda resize
* add a fdtensor in fdmat
* get cv mat and get tensor support gpu
* paddleclas cvcuda preprocessor
* fix compile error
* fix windows compile error
* rename reused to cached
* address comment
* remove debug code
* add comment
* add manager run
* use cuda and cuda used
* use cv cuda doc
* address comment
---------
Co-authored-by: Jason <jiangjiajun@baidu.com>
* cuda normalize and permute, cuda concat
* add use cuda option for preprocessor
* ppyoloe use cuda normalize
* ppseg use cuda normalize
* add proclib cuda in processor base
* ppcls add use cuda preprocess api
* ppcls preprocessor set gpu id
* fix pybind
* refine ppcls preprocessing use gpu logic
* fdtensor device id is -1 by default
* refine assert message
Co-authored-by: heliqi <1101791222@qq.com>
* [Backend] fix lite backend save model error
* [Backend] fixed typos
* [FlyCV] optimize the integration of FlyCV
* [cmake] disable some test options
* [cmake] disable some test options
* [FlyCV] remove unneeded warnings
* [FlyCV] remove unneeded GetMat method
* [FlyCV] optimize FlyCV codes
* [cmake] remove unneeded cmake function in examples/CMakeLists
* [cmake] support gflags for Android