Commit Graph

3104 Commits

Author SHA1 Message Date
Yuanle Liu
240bdac2a4 [feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* support fa3 backend run in pd disaggregated

* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd [Bug] fix logger format (#2689)
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79 [doc] update docs (#2690) 2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd Update dynamic_weight_manager.py 2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593 [Sync] Update to latest code (#2679)
* [Sync] Update to latest code

* Add new code files

* Add new code files

* update code

* Try to fix build.sh

* Try to fix build.sh

* Update code

* Update requirements.txt

* Update code

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00 Update README.md 2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117 Update README.md 2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22 Update gh-pages.yml 2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992 Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0 Update Dockerfile.gpu 2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572 add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl (#2682)
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571 Update gh-pages.yml (#2680) 2025-07-02 17:36:18 +08:00
handiz
b8a8a19689 add wint2 performance (#2673) 2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f Update nvidia_gpu.md 2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun
685265a97d Update nvidia_gpu.md 2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun
fc4d643634 Update nvidia_gpu.md 2025-07-02 15:39:48 +08:00
YuBaoku
bb880c8d7c Update CI test cases (#2671)
* set git identity to avoid merge failure in CI

* add ci cases
2025-07-02 15:08:39 +08:00
liddk1121
865e856a94 update iluvatar gpu fastdeploy whl (#2675) 2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun
9f4a65d817 Update README.md
2025-07-02 10:04:58 +08:00
YuBaoku
e3aac0c5b8 set git identity to avoid merge failure in CI (#2665)
2025-07-01 19:06:46 +08:00
AIbin
a197dcd729 【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference (#2666)
* Support TP2&TP4 Wint

* Support TP2&TP4 Wint2 Inference
2025-07-01 18:29:11 +08:00
freeliuzc
2b7f74d427 fix docs (#2669)
Co-authored-by: liuzichang01 <liuzichang01@baidu.com>
2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun
164b83ab0b [Doc] Update nvidia gpu installation description 2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun
01d5d66d95 [Doc] Update nvidia gpu installation description 2025-07-01 15:20:40 +08:00
Jiang-Jia-Jun
8f1dddcf35 [Doc] Update nvidia gpu installation description 2025-07-01 15:20:21 +08:00
hong19860320
8e335db645 Update kunlunxin_xpu.md (#2662) 2025-07-01 15:10:45 +08:00
AIbin
1bb296c5ad update quantization doc (#2659) 2025-07-01 15:05:02 +08:00
hong19860320
92428a5ae4 Update kunlunxin_xpu.md (#2657)
2025-07-01 12:28:49 +08:00
RichardWooSJTU
85090ed799 remove unuseful scripts (#2652) 2025-07-01 10:18:25 +08:00
ltd0924
50aa4080c0 [Serving] fix offline inference sampling parameters overwrite (#2654) 2025-07-01 10:17:46 +08:00
YUNSHEN XIE
d5af78945b Add ci (#2650)
* add ci ut and workflow

* Automatically cancel any previous CI runs for the ci.yml workflow, keeping only the latest one active
2025-06-30 20:20:49 +08:00
hong19860320
6bead64f48 Update kunlunxin_xpu.md 2025-06-30 15:59:22 +08:00
hong19860320
6b95b42986 Update kunlunxin_xpu.md 2025-06-30 15:49:32 +08:00
hong19860320
b0d3a630ba Merge branch 'develop' of https://github.com/hong19860320/FastDeploy into hongming/fix_xpu_doc 2025-06-30 15:42:29 +08:00
hong19860320
ef72873695 Update kunlunxin_xpu.md 2025-06-30 15:27:48 +08:00
qingqing01
4a5db82fb2 Merge pull request #2644 from kevincheng2/develop
[docs] update docs
2025-06-30 14:55:54 +08:00
kevin
4f7b42ce3e update docs 2025-06-30 14:45:41 +08:00
qingqing01
df1e22b595 Merge pull request #2642 from MARD1NO/remove_redundant_sync
use shfl_xor_sync to reduce redundant shfl broadcast
2025-06-30 14:33:12 +08:00
MARD1NO
ac5f860536 use shfl_xor_sync to reduce redundant shfl broadcast 2025-06-30 13:12:21 +08:00
qingqing01
90a5b18742 Update disaggregated.md
2025-06-30 11:57:12 +08:00
qingqing01
7c43500060 Update disaggregated.md 2025-06-30 11:56:33 +08:00
Jiang-Jia-Jun
ea29b01a68 Update quick_start.md 2025-06-30 11:52:05 +08:00
Jiang-Jia-Jun
51f1306de8 Merge pull request #2641 from yongqiangma/doc
fix format
2025-06-30 11:42:52 +08:00
yongqiangma
f9431106d8 Merge branch 'develop' into doc 2025-06-30 11:42:43 +08:00
Jiang-Jia-Jun
f4ce0393f3 Merge pull request #2640 from chang-wenbin/fix_wint2_doc
【Update Doc】Update Wint2 Doc
2025-06-30 11:40:41 +08:00
mayongqiang
0d39e23ab9 fix format 2025-06-30 11:39:59 +08:00
changwenbin
634d3c3642 update wint2 doc 2025-06-30 11:36:15 +08:00
Jiang-Jia-Jun
cb54462303 Update README.md 2025-06-30 11:16:00 +08:00
Jiang-Jia-Jun
c9b358c502 Merge pull request #2639 from ZhangYulongg/patch-1
Update README.md
2025-06-30 11:10:17 +08:00
Divano
733cc47b00 Merge pull request #2638 from PaddlePaddle/DDDivano-FixWorkflow
Update gh-pages.yml
2025-06-30 10:55:52 +08:00