Ryan
fefbd65cf8
[SOT] Remove BreakGraph with paddle.maximum ( #2731 )
* rm if with clip
* clip -> maximum
* int64 -> int32
2025-07-08 11:44:25 +08:00
ming1753
1eb8ea7328
[Bug fix] fix compile bug when sm < 89 ( #2738 )
2025-07-08 11:24:52 +08:00
ming1753
ef6649a577
[Optimize] Optimize tensorwise fp8 performance ( #2729 )
* [Optimize] Optimize tensorwise fp8 performance
2025-07-07 20:06:28 +08:00
liddk1121
1b54a2831e
Adapt for iluvatar gpu ( #2684 )
2025-07-07 16:53:14 +08:00
lddfym
4e293e50fa
Check if the controller port is available ( #2724 )
2025-07-07 13:24:55 +08:00
ltd0924
68b4755587
[LLM] support multi node deploy ( #2708 )
* [LLM] support multi node deploy
* Update engine.py
* fix bugs
* fix
* [LLM] support multi node deploy
* [LLM] support multi node deploy
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-06 10:33:51 +08:00
Ting
a6e9161045
fix bug. ( #2718 )
2025-07-05 08:19:19 +08:00
Ting
90ef28d982
spec token map lazy. ( #2715 )
2025-07-05 00:14:54 +08:00
lizexu123
9cb08e71e8
add support QWQ enable_thinking ( #2706 )
* add support QWQ enable_thinking
* add stream=True
* fix stream=true
* fix qwen
---------
Co-authored-by: lizexu <lizexu@baidu.com>
2025-07-04 20:55:23 +08:00
GoldPancake
e7fa57ebae
Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue ( #2707 )
* fix mtp eh_proj layer
* fix mtp update_cfg function
* fix stringdoc
* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9
[feature]add fd whl version info ( #2698 )
2025-07-04 14:12:42 +08:00
ltd0924
87e638498c
[RL] update reschedule finish reason ( #2709 )
2025-07-04 13:47:36 +08:00
freeliuzc
667547be59
support chunk_prefill in MTP ( #2705 )
2025-07-04 11:55:48 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated ( #2695 )
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd
[Bug] fix logger format ( #2689 )
2025-07-03 19:58:03 +08:00
Jiang-Jia-Jun
9fd74f75bd
Update dynamic_weight_manager.py
2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code ( #2679 )
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
AIbin
a197dcd729
【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference ( #2666 )
* Support TP2&TP4 Wint
* Support TP2&TP4 Wint2 Inference
2025-07-01 18:29:11 +08:00
ltd0924
50aa4080c0
[Serving] fix offline inference sampling parameters overwrite ( #2654 )
2025-07-01 10:17:46 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
fb18f3092d
[LLM] Add output module and polish docs
2025-06-09 20:26:53 +08:00
jiangjiajun
684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00
Zheng-Bicheng
9faf1b5ad9
Merge branch 'PaddlePaddle:develop' into develop
2025-02-12 21:23:36 +08:00
Jiang-Jia-Jun
d4bbdbefea
Merge pull request #2559 from MaaXYZ/fix/build_error
fix: build error without ENABLE_PADDLE2ONNX
2025-01-10 13:55:49 +08:00
Jiang-Jia-Jun
618826b39d
Merge pull request #2560 from MaaXYZ/perf/read_file
perf: ReadBinaryFromFile supports Chinese path
2025-01-10 13:55:27 +08:00
Jiang-Jia-Jun
17d204b975
Merge pull request #2561 from MaaXYZ/feat/directml
feat: select adapter id for DirectML
2025-01-10 13:54:44 +08:00
MistEO
ec3d4c714c
fix: valid_directml_backends
2024-11-21 16:47:16 +08:00
MistEO
11214a642f
fix: typo of log
2024-11-21 16:11:31 +08:00
MistEO
227cc37a7b
fix: config including
2024-11-21 16:02:19 +08:00
MistEO
2507a172f8
feat: select adapter id for DirectML
2024-11-20 14:38:32 +08:00
MistEO
e321ae55a1
fix: windows build error
2024-11-20 00:27:27 +08:00
MistEO
5662fa3903
fix: build error for CI
2024-11-19 19:16:16 +08:00
MistEO
1bbba6fbe3
fix: build error on old C++ std
2024-11-19 19:12:17 +08:00
MistEO
f5ccf62fda
perf: ReadBinaryFromFile supports Chinese path
2024-11-19 19:04:19 +08:00
MistEO
eca2ae8b94
fix: build error without ENABLE_PADDLE2ONNX
2024-11-19 16:37:46 +08:00
Zheng-Bicheng
527536565a
Update cosine_similarity.cc
2024-11-07 16:41:03 +08:00
ChaoII
0ae0dc9d82
[BUG FIX] fix memory leak for ort backend
2024-03-29 09:00:05 +08:00
ChaoII
cc8d1f3c9f
[BUG FIX] add export declaration of GaussianRandom function ( #2379 )
* Update fd_type.cc
[bug fix]add define for destroy TwoDimArrayCstr for c_api
* [BUG FIX] add export declaration of GaussianRandom function
2024-03-05 13:37:20 +08:00
Albin
1314f3267e
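bug fix: fix a bug in the contrib det model postprocessing where an image with no detection boxes caused the results of all subsequent images in the same batch to be empty ( #2378 )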
2024-02-27 19:21:17 +08:00
ThinkWD
cfd80e95a1
Fix #2359 ( #2363 )
Update pptinypose_utils.cc
Add a range limit on py to avoid an out-of-bounds array access inside utils::DarkParse
2024-01-29 19:16:19 +08:00
DefTruth
cfebd24dfc
[Backend] fix ort backend windows build error ( #2269 )
* support ort offline graph optimize option
* support ort offline graph optimize option
* [Backend] fix windows build error
2023-11-02 12:51:46 +08:00
DefTruth
6a8cd4d759
[Backend] support ort offline graph optimize option ( #2268 )
* support ort offline graph optimize option
* support ort offline graph optimize option
2023-11-02 09:19:33 +08:00
Jason
d910d6116c
[Model] Fix nms problem ( #2230 )
fix nms problem
2023-10-10 16:24:33 +08:00
ChaoII
771dcf4b91
[Backend]Update sophgo_backend.cc for FP16 data type ( #2161 )
2023-08-16 14:12:58 +08:00
DefTruth
ade27d29cb
[Sync][Internal] sync some internal features of paddle3d inference ( #2118 )
* [Sync][Internal] sync some internal codes
* [Sync][Internal] sync some internal features of paddle3d inference
* [Sync][Internal] sync some internal features of paddle3d inference
2023-07-17 23:06:51 +08:00
DefTruth
681ccc4c24
[Sync][Internal] sync some internal paddle3d codes ( #2108 )
2023-07-13 22:06:28 +08:00
DefTruth
d5ad3d9c8d
[Bug Fix] fixed paddle custom ops windows build error ( #2103 )
* [cmake] upgrade windows paddle inference -> 2.5.0
* [cmake] upgrade windows paddle inference -> 2.5.0
* fix paddle custom ops bug on windows
* [Backend] refactor paddle custom ops
* [Bug Fix] fixed paddle custom ops windows build error
2023-07-13 14:04:01 +08:00
DefTruth
99c2b6592d
[Backend] refactor paddle custom ops -> fastdeploy::paddle_custom_ops ( #2101 )
* [cmake] upgrade windows paddle inference -> 2.5.0
* [cmake] upgrade windows paddle inference -> 2.5.0
* fix paddle custom ops bug on windows
* [Backend] refactor paddle custom ops
2023-07-13 09:00:03 +08:00
DefTruth
2542a75b61
[cmake] upgrade windows paddle inference -> 2.5.0 ( #2100 )
* [cmake] upgrade windows paddle inference -> 2.5.0
* [cmake] upgrade windows paddle inference -> 2.5.0
* fix paddle custom ops bug on windows
2023-07-12 18:39:06 +08:00
zengshao0622
c95fc1fba8
[Bug Fix] fix centerpoint malloc bug ( #2099 )
2023-07-12 17:17:22 +08:00