Add uie python example and doc (#221)

* add fastdeploy.text.UIEModel

* Add uie python example

* Add one schema for cpp demo

* Add ConvertUIEResultToDict to pretty-print the uie result in python

* remove default args for SchemaNode

* Add uie example args

* Add uie python api desc

* Add infer.py usage

* truncate some example output

* Add uie schema usage

* Add uie result md

* Add uie c++ api doc
This commit is contained in:
Jack Zhou
2022-09-15 06:06:40 +08:00
committed by GitHub
parent fb0a428c3c
commit 14ba9ce6c2
11 changed files with 760 additions and 33 deletions


@@ -7,28 +7,18 @@
-- 1. The software and hardware environment meets the requirements (see [FastDeploy environment requirements](../../../../docs/environment.md))
-- 2. Download the prebuilt deployment library and sample code for your development environment (see [FastDeploy prebuilt libraries](../../../../docs/compile/prebuilt_libraries.md))
 ## Quick Start
 Taking uie-base model inference on Linux as an example, run the following commands in this directory to build and run the test.
 ```
-# UIE has not been released yet, so for now developers need to build FastDeploy themselves; the following script builds the deployment library fastdeploy-linux-x64-dev
-git clone https://github.com/PaddlePaddle/FastDeploy.git
-cd FastDeploy
-mkdir build && cd build
-cmake .. -DENABLE_ORT_BACKEND=ON \
-         -DENABLE_VISION=ON \
-         -DENABLE_PADDLE_BACKEND=ON \
-         -DENABLE_TEXT=ON \
-         -DWITH_GPU=ON \
-         -DCMAKE_INSTALL_PREFIX=${PWD}/fastdeploy-linux-x64-gpu-dev
+# Download the SDK and build the model examples (the SDK contains the examples code)
+wget https://bj.bcebos.com/fastdeploy/release/cpp/fastdeploy-linux-x64-gpu-0.2.1.tgz
+tar xvf fastdeploy-linux-x64-gpu-0.2.1.tgz
-make -j8
-make install
+# Build the model examples (the SDK contains the examples code)
-cd ../examples/text/uie/cpp
+cd fastdeploy-linux-x64-gpu-0.2.1/examples/text/uie/cpp
 mkdir build
 cd build
-cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/../../../../../build/fastdeploy-linux-x64-gpu-dev
+cmake .. -DFASTDEPLOY_INSTALL_DIR=${PWD}/../../../../../../fastdeploy-linux-x64-gpu-0.2.1
 make -j
 # Download the uie-base model and vocabulary
@@ -41,7 +31,116 @@ tar -xvfz uie-base.tgz
# GPU inference
./infer_demo uie-base 1
# Inference with the OpenVINO backend
./infer_demo uie-base 1 2
```
## Obtaining the Model
For an introduction to the UIE model, see https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/uie . After training is complete, the trained model needs to be exported as an inference model; this step can be done by following https://github.com/PaddlePaddle/PaddleNLP/tree/develop/model_zoo/uie#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2 .
Once the run completes, the result is as follows (only the NER task output is shown).
```bash
[INFO] fastdeploy/fastdeploy_runtime.cc(264)::Init Runtime initialized with Backend::PDINFER in device Device::CPU.
After init predictor
The result:
赛事名称:
text: 北京冬奥会自由式滑雪女子大跳台决赛
probability: 0.850309
start: 6
end: 23
时间:
text: 2月8日上午
probability: 0.985738
start: 0
end: 6
选手:
text: 谷爱凌
probability: 0.898155
start: 28
end: 31
```
## UIEModel C++ Interface
### SchemaNode Structure
A structure representing the target schema of the UIE model.
```c++
SchemaNode(const std::string& name,
const std::vector<SchemaNode>& children = {});
```
**Parameters**
> * **name**(str): The name of the information to extract.
> * **children**(list(SchemaNode)): Child information nodes associated with the information extracted at the current node.
### UIEModel Structure
The UIE model structure for information extraction tasks.
#### Initialization Functions
```c++
UIEModel(
const std::string& model_file, const std::string& params_file,
const std::string& vocab_file, float position_prob, size_t max_length,
const std::vector<std::string>& schema,
const fastdeploy::RuntimeOption& custom_option =
fastdeploy::RuntimeOption(),
const fastdeploy::Frontend& model_format = fastdeploy::Frontend::PADDLE);
UIEModel(
const std::string& model_file, const std::string& params_file,
const std::string& vocab_file, float position_prob, size_t max_length,
const SchemaNode& schema, const fastdeploy::RuntimeOption& custom_option =
fastdeploy::RuntimeOption(),
const fastdeploy::Frontend& model_format = fastdeploy::Frontend::PADDLE);
UIEModel(
const std::string& model_file, const std::string& params_file,
const std::string& vocab_file, float position_prob, size_t max_length,
const std::vector<SchemaNode>& schema,
const fastdeploy::RuntimeOption& custom_option =
fastdeploy::RuntimeOption(),
const fastdeploy::Frontend& model_format = fastdeploy::Frontend::PADDLE);
```
Loads and initializes the UIE model. `model_file` and `params_file` are the Paddle inference files exported from the trained model; for details, see [Model export](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2).
**Parameters**
> * **model_file**(str): Path to the model file.
> * **params_file**(str): Path to the parameters file.
> * **vocab_file**(str): Path to the vocabulary file.
> * **position_prob**(float): Position probability threshold; the model outputs the positions whose probability is greater than `position_prob`. Defaults to 0.5.
> * **max_length**(int): Maximum length of the input text; the part of the input beyond `max_length` is truncated. Defaults to 128.
> * **schema**(list(SchemaNode) | SchemaNode | list(str)): Target schema of the extraction task.
> * **custom_option**(RuntimeOption): Backend inference configuration. Defaults to `fastdeploy::RuntimeOption()`, i.e. the default configuration.
> * **model_format**(Frontend): Model format. Defaults to the Paddle format.
#### SetSchema Function
```c++
void SetSchema(const std::vector<std::string>& schema);
void SetSchema(const std::vector<SchemaNode>& schema);
void SetSchema(const SchemaNode& schema);
```
**Parameters**
> * **schema**(list(SchemaNode) | SchemaNode | list(str)): The new target schema to extract from subsequent input texts.
#### Predict Function
```c++
void Predict(
const std::vector<std::string>& texts,
std::vector<std::unordered_map<std::string, std::vector<UIEResult>>>* results);
```
**Parameters**
> * **texts**(list(str)): List of input texts.
> * **results**(list(dict)): Extraction results produced by the UIE model. For details on the UIEResult structure, see the [UIEResult description](../../../../docs/api/text_results/uie_result.md).
## Related Documents
[Detailed introduction to the UIE model](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md)
[How to export the UIE model](https://github.com/PaddlePaddle/PaddleNLP/blob/develop/model_zoo/uie/README.md#%E6%A8%A1%E5%9E%8B%E9%83%A8%E7%BD%B2)
[UIE C++ deployment](../cpp/README.md)


@@ -71,7 +71,7 @@ int main(int argc, char* argv[]) {
auto predictor =
fastdeploy::text::UIEModel(model_path, param_path, vocab_path, 0.5, 128,
{"时间", "选手", "赛事名称"}, option);
-fastdeploy::FDINFO << "After init predictor" << std::endl;
+std::cout << "After init predictor" << std::endl;
std::vector<std::unordered_map<std::string, std::vector<UIEResult>>> results;
// Named Entity Recognition
predictor.Predict({"2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷"
@@ -80,6 +80,16 @@ int main(int argc, char* argv[]) {
std::cout << results << std::endl;
results.clear();
+predictor.SetSchema(
+    {"肿瘤的大小", "肿瘤的个数", "肝癌级别", "脉管内癌栓分级"});
+predictor.Predict({"右肝肿瘤肝细胞性肝癌II-"
+                   "III级梁索型和假腺管型肿瘤包膜不完整紧邻肝被膜"
+                   "及周围肝组织未见脉管内癌栓MVI分级M0级及卫星子灶形"
+                   "成。肿物1个大小4.2×4.0×2.8cm)。"},
+                  &results);
+std::cout << results << std::endl;
+results.clear();
// Relation Extraction
predictor.SetSchema(
{SchemaNode("竞赛名称", {SchemaNode("主办方"), SchemaNode("承办方"),