Commit Graph

960 Commits

Author SHA1 Message Date
GoldPancake
e7fa57ebae Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)
* fix mtp eh_proj layer

* fix mtp update_cfg function

* fix stringdoc

* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9 [feature]add fd whl version info (#2698) 2025-07-04 14:12:42 +08:00
ltd0924
87e638498c [RL] update reschedule finish reason (#2709) 2025-07-04 13:47:36 +08:00
freeliuzc
667547be59 support chunk_prefill in MTP (#2705) 2025-07-04 11:55:48 +08:00
Yuanle Liu
240bdac2a4 [feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd [Bug] fix logger format (#2689)
2025-07-03 19:58:03 +08:00
Jiang-Jia-Jun
9fd74f75bd Update dynamic_weight_manager.py 2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593 [Sync] Update to latest code (#2679)
* [Sync] Update to latest code

* Add new code files

* Add new code files

* update code

* Try to fix build.sh

* Try to fix build.sh

* Update code

* Update requirements.txt

* Update code

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
AIbin
a197dcd729 [Inference Optimize] Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference (#2666)
* Support TP2&TP4 Wint

* Support TP2&TP4 Wint2 Inference
2025-07-01 18:29:11 +08:00
ltd0924
50aa4080c0 [Serving] fix offline inference sampling parameters overwrite (#2654) 2025-07-01 10:17:46 +08:00
Jiang-Jia-Jun
92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00
jiangjiajun
fb18f3092d [LLM] Add output module and polish docs 2025-06-09 20:26:53 +08:00
jiangjiajun
684703fd72 [LLM] First commit the llm deployment code 2025-06-09 19:20:15 +08:00
Zheng-Bicheng
9faf1b5ad9 Merge branch 'PaddlePaddle:develop' into develop 2025-02-12 21:23:36 +08:00
Jiang-Jia-Jun
d4bbdbefea Merge pull request #2559 from MaaXYZ/fix/build_error
fix: build error without ENABLE_PADDLE2ONNX
2025-01-10 13:55:49 +08:00
Jiang-Jia-Jun
618826b39d Merge pull request #2560 from MaaXYZ/perf/read_file
perf: ReadBinaryFromFile supports Chinese path
2025-01-10 13:55:27 +08:00
Jiang-Jia-Jun
17d204b975 Merge pull request #2561 from MaaXYZ/feat/directml
feat: select adapter id for DirectML
2025-01-10 13:54:44 +08:00
MistEO
ec3d4c714c fix: valid_directml_backends 2024-11-21 16:47:16 +08:00
MistEO
11214a642f fix: typo of log 2024-11-21 16:11:31 +08:00
MistEO
227cc37a7b fix: config including 2024-11-21 16:02:19 +08:00
MistEO
2507a172f8 feat: select adapter id for DirectML 2024-11-20 14:38:32 +08:00
MistEO
e321ae55a1 fix: windows build error 2024-11-20 00:27:27 +08:00
MistEO
5662fa3903 fix: build error for CI 2024-11-19 19:16:16 +08:00
MistEO
1bbba6fbe3 fix: build error on old C++ std 2024-11-19 19:12:17 +08:00
MistEO
f5ccf62fda perf: ReadBinaryFromFile supports Chinese path 2024-11-19 19:04:19 +08:00
MistEO
eca2ae8b94 fix: build error without ENABLE_PADDLE2ONNX 2024-11-19 16:37:46 +08:00
Zheng-Bicheng
527536565a Update cosine_similarity.cc 2024-11-07 16:41:03 +08:00
ChaoII
0ae0dc9d82 [BUG FIX] fix memory leak for ort backend 2024-03-29 09:00:05 +08:00
ChaoII
cc8d1f3c9f [BUG FIX] add export declaration of GaussianRandom function (#2379)
* Update fd_type.cc

[bug fix]add define for destroy TwoDimArrayCstr for c_api

* [BUG FIX] add export declaration of GaussianRandom function
2024-03-05 13:37:20 +08:00
Albin
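1314f3267e bug fix: fix a bug in the contrib det model postprocessing where an image with no detection boxes caused the results of all subsequent images in the same batch to be empty (#2378) 2024-02-27 19:21:17 +08:00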
ThinkWD
cfd80e95a1 Fix #2359 (#2363)
Update pptinypose_utils.cc

Add a range limit on py to avoid an out-of-bounds array access inside utils::DarkParse
2024-01-29 19:16:19 +08:00
DefTruth
cfebd24dfc [Backend] fix ort backend windows build error (#2269)
* support ort offline graph optimize option

* support ort offline graph optimize option

* [Backend] fix windows build error
2023-11-02 12:51:46 +08:00
DefTruth
6a8cd4d759 [Backend] support ort offline graph optimize option (#2268)
* support ort offline graph optimize option

* support ort offline graph optimize option
2023-11-02 09:19:33 +08:00
Jason
d910d6116c [Model] Fix nms problem (#2230)
fix nms problem
2023-10-10 16:24:33 +08:00
ChaoII
771dcf4b91 [Backend]Update sophgo_backend.cc for FP16 data type (#2161) 2023-08-16 14:12:58 +08:00
DefTruth
ade27d29cb [Sync][Internal] sync some internal features of paddle3d inference (#2118)
* [Sync][Internal] sync some internal codes

* [Sync][Internal] sync some internal features of paddle3d inference

* [Sync][Internal] sync some internal features of paddle3d inference
2023-07-17 23:06:51 +08:00
DefTruth
681ccc4c24 [Sync][Internal] sync some internal paddle3d codes (#2108) 2023-07-13 22:06:28 +08:00
DefTruth
d5ad3d9c8d [Bug Fix] fixed paddle custom ops windows build error (#2103)
* [cmake] upgrade windows paddle inference -> 2.5.0

* [cmake] upgrade windows paddle inference -> 2.5.0

* fix paddle custom ops bug on windows

* [Backend] refactor paddle custom ops

* [Bug Fix] fixed paddle custom ops windows build error
2023-07-13 14:04:01 +08:00
DefTruth
99c2b6592d [Backend] refactor paddle custom ops -> fastdeploy::paddle_custom_ops (#2101)
* [cmake] upgrade windows paddle inference -> 2.5.0

* [cmake] upgrade windows paddle inference -> 2.5.0

* fix paddle custom ops bug on windows

* [Backend] refactor paddle custom ops
2023-07-13 09:00:03 +08:00
DefTruth
2542a75b61 [cmake] upgrade windows paddle inference -> 2.5.0 (#2100)
* [cmake] upgrade windows paddle inference -> 2.5.0

* [cmake] upgrade windows paddle inference -> 2.5.0

* fix paddle custom ops bug on windows
2023-07-12 18:39:06 +08:00
zengshao0622
c95fc1fba8 [Bug Fix] fix centerpoint malloc bug (#2099) 2023-07-12 17:17:22 +08:00
DefTruth
cf1ff2077d [Bug Fix] fix trt backend page-locked error (#2095)
* [Bug Fix] fix trt backend page-locked error

* Update trt_backend.cc
2023-07-11 13:49:47 +08:00
DefTruth
4c1e80b723 [Bug Fix] fixed ocr visualize error (#2090) 2023-07-07 17:43:08 +08:00
yeliang2258
ad1f46f7d9 Add ORT fp16 support in server (#2069)
* add ort fp16 support in server

* update paddle2onnx url

* update ort fp16 api

* add disable_ort_fp16_op_types in serving
2023-07-05 17:50:00 +08:00
zengshao0622
79a3587339 [Model] Add Paddle3D CenterPoint model (#2078)
* add centerpoint

* update for review comments
2023-07-03 13:39:16 +08:00
DefTruth
b2426aefa9 [Backend] add paddle custom ops compatible policy (#2070)
* Add centerpoint

* fix postprocess op file name

* [Backend] add paddle custom ops compatible policy

* [Backend] add paddle custom ops compatible policy

* [Backend] add paddle custom ops compatible policy

* upgrade linux paddle gpu -> 2.5

* add custom op compatible policy for paddle 2.5

* add custom op compatible policy for paddle 2.5

* add custom op compatible policy for paddle 2.5

* add collect_trt_shape_by_device option for paddle backend

* add collect_trt_shape_by_device option for paddle backend

* add custom op option for python build

* fix python build bugs

* update paddle linux x86 cpu only lib

* update paddle linux gpu lib

* update patchelf cmake

* fix paddle backend option pybind

* update paddle_inference.cmake

* add cuda sm_80 support (A100)

---------

Co-authored-by: zengshao0622 <peter_z96@163.com>
Co-authored-by: qiuyanjun <qiuyanjun@baidu.com>
2023-06-29 22:32:14 +08:00
JugendTraum
4c3e7030e1 Update rknpu2_backend.cc (#2064)
Update rknpu2_backend.cc

Signed-off-by: JugendTraum <443248173@qq.com>

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
2023-06-26 04:58:12 -07:00
zengshao0622
709ba51612 [WIP]Add VI-LayoutXLM (#2048)
* WIP, add VI-LayoutXLM

* fix pybind

* update the dir of ser_vi_layoutxlm model

* update dir and name of ser_vi_layoutxlm model

* update model name to StructureV2SerViLayoutXLMModel

* fix import paddle bug

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>
2023-06-26 16:40:05 +08:00
DefTruth
dea3795930 [Backend] Add backward compatible for paddle inference 2.4.x (#2062)
* [Backend] Add backward compatible for paddle inference 2.4.x

* [Backend] Add backward compatible for paddle inference 2.4.x
2023-06-25 19:27:36 +08:00
DefTruth
ba8649a69d [Model] update PP-ShiTuV2-rec preprocess parser policy (#2061)
* [benchmark] fixed paddlex benchmark for picodet 320

* [Bug Fix] fixed paddlex ppseg pp-trt infer error

* [Bug Fix] fixed paddlex dino benchmark trt shapes

* [benchmark] support paddlex ppyoloe pptrt benchmark

* [benchmark] adjust paddlex dino trt shapes

* [benchmark] add max_workspace_size flags for tensorrt/pptrt backend

* [benchmark] add max_workspace_size flags for tensorrt/pptrt backend

* [benchmark] add max_workspace_size flags for tensorrt/pptrt backend

* [benchmark] add ort/paddle h2d gpu configs for paddlex

* [benchmark] update paddlex benchmark scripts

* [benchmark] update paddlex benchmark scripts

* [Model] update PP-ShituV2-rec preprocess parser policy

---------

Co-authored-by: qiuyanjun <qiuyanjun@baidu.com>
2023-06-25 13:50:02 +08:00