3 Commits

Author SHA1 Message Date
Wang Xinyu
caa369f64a [Backend] TRT cast GPU input from int64 to int32, output from int32 to int64, and Windows support building CUDA files (#426)
* TRT cast int64 to int32

* windows cmake build cuda src

* fix windows cmake error when build cuda src

* add a notice in windows gpu build doc

* cmake add cuda std=11

* TRT cast output from int32 to int64

* nits

* trt get original input output dtype
2022-10-28 13:38:06 +08:00
Jack Zhou
9c150f0bfb Upgrade eigen func (#253)
* Add FDTensor copy and move assignment and constructor

* Upgrade the transpose to receive the output tensor same as input tensor

* Add note

* Add realloc for FDTensor

* Support output equals to input for softmax

* Remove FDTensor::Alloc
2022-09-20 10:58:07 +08:00
Jason
68523be411 Modify file structure to separate python and cpp code (#223)
Modify code structure
2022-09-14 15:44:13 +08:00