[Backend] Add KunlunXin XPU deploy support (#747)

* add xpu support

* fix docs

* update code

* update doc

* update code

* update yolov5

* update cmake

* add int64_t data support

* fix

* update download links

* add en doc

* update code

* update xpu options

* update doc

* update doc

* update doc

* update lib links

* update doc

* update code

* update lite xpu link

* update xpu lib

* update doc

* update en doc
Author: yeliang2258, committed by GitHub
Date: 2022-12-15 21:17:14 +08:00
parent 6e79df40d9
commit 5be839b322
39 changed files with 870 additions and 58 deletions


@@ -245,6 +245,34 @@ class RuntimeOption:
return
return self._option.use_gpu(device_id)
    def use_xpu(self,
                device_id=0,
                l3_workspace_size=16 * 1024 * 1024,
                locked=False,
                autotune=True,
                autotune_file="",
                precision="int16",
                adaptive_seqlen=False,
                enable_multi_stream=False):
        """Inference with XPU
        :param device_id: (int)The index of the XPU to be used for inference, default 0
        :param l3_workspace_size: (int)The size of the L3 cache workspace to allocate, at most 16MB, default 16MB
        :param locked: (bool)Whether the allocated L3 cache can be locked. If False, the L3 cache is not locked and can be shared by multiple models
        :param autotune: (bool)Whether to autotune the conv operators in the model. If True, when a conv operator of a given shape is executed for the first time, a faster algorithm is searched for automatically, speeding up subsequent conv operators of the same shape
        :param autotune_file: (str)The path of the autotune file. If set, the algorithms recorded in the file are used and autotune is not performed again
        :param precision: (str)The computation precision of multi_encoder, default "int16"
        :param adaptive_seqlen: (bool)Whether the input of multi_encoder is variable length
        :param enable_multi_stream: (bool)Whether to enable multi stream on the XPU
        """
        return self._option.use_xpu(device_id, l3_workspace_size, locked,
                                    autotune, autotune_file, precision,
                                    adaptive_seqlen, enable_multi_stream)

    def use_cpu(self):
        """Inference with CPU
        """
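To make the defaults added by this commit visible without requiring a FastDeploy install, the sketch below mirrors the `use_xpu` signature as a plain Python function that just returns the resolved settings; the real method instead forwards them to the underlying C++ `RuntimeOption` (the function body here is illustrative, not FastDeploy's implementation):

```python
# Sketch: mirrors the use_xpu signature from this commit so the default
# values can be inspected; does NOT call into FastDeploy itself.
def use_xpu(device_id=0,
            l3_workspace_size=16 * 1024 * 1024,  # 16MB, the documented maximum
            locked=False,                        # unlocked: L3 cache shareable by models
            autotune=True,                       # search faster conv algorithms on first run
            autotune_file="",                    # path to a saved autotune result, if any
            precision="int16",                   # multi_encoder computation precision
            adaptive_seqlen=False,               # variable-length multi_encoder input
            enable_multi_stream=False):
    # In FastDeploy these arguments are forwarded to the C++ RuntimeOption;
    # here we simply collect them so the resolved configuration is visible.
    return dict(device_id=device_id,
                l3_workspace_size=l3_workspace_size,
                locked=locked,
                autotune=autotune,
                autotune_file=autotune_file,
                precision=precision,
                adaptive_seqlen=adaptive_seqlen,
                enable_multi_stream=enable_multi_stream)

opts = use_xpu()
print(opts["l3_workspace_size"])  # 16777216 bytes, i.e. 16MB
```

In actual use, one would call the method on a `RuntimeOption` instance (e.g. `option.use_xpu(device_id=0)`) before constructing a model with that option, just as `use_gpu` and `use_cpu` are used in the surrounding code.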