Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated (#2695)
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd
[Bug] fix logger format (#2689)
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79
[doc] update docs (#2690)
2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd
Update dynamic_weight_manager.py
2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* update code
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00
Update README.md
2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117
Update README.md
2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22
Update gh-pages.yml
2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992
Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0
Update Dockerfile.gpu
2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572
add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl (#2682)
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571
Update gh-pages.yml (#2680)
2025-07-02 17:36:18 +08:00
handiz
b8a8a19689
add wint2 performance (#2673)
2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f
Update nvidia_gpu.md
2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun
685265a97d
Update nvidia_gpu.md
2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun
fc4d643634
Update nvidia_gpu.md
2025-07-02 15:39:48 +08:00
YuBaoku
bb880c8d7c
Update CI test cases (#2671)
* set git identity to avoid merge failure in CI
* add ci cases
2025-07-02 15:08:39 +08:00
liddk1121
865e856a94
update iluvatar gpu fastdeploy whl (#2675)
2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun
9f4a65d817
Update README.md
2025-07-02 10:04:58 +08:00
YuBaoku
e3aac0c5b8
set git identity to avoid merge failure in CI (#2665)
2025-07-01 19:06:46 +08:00
AIbin
a197dcd729
【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference (#2666)
* Support TP2&TP4 Wint
* Support TP2&TP4 Wint2 Inference
2025-07-01 18:29:11 +08:00
freeliuzc
2b7f74d427
fix docs (#2669)
Co-authored-by: liuzichang01 <liuzichang01@baidu.com>
2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun
164b83ab0b
[Doc] Update nvidia gpu installation description
2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun
01d5d66d95
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:40 +08:00
Jiang-Jia-Jun
8f1dddcf35
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:21 +08:00
hong19860320
8e335db645
Update kunlunxin_xpu.md (#2662)
2025-07-01 15:10:45 +08:00
AIbin
1bb296c5ad
update quantization doc (#2659)
2025-07-01 15:05:02 +08:00
hong19860320
92428a5ae4
Update kunlunxin_xpu.md (#2657)
2025-07-01 12:28:49 +08:00
RichardWooSJTU
85090ed799
remove unuseful scripts (#2652)
2025-07-01 10:18:25 +08:00
ltd0924
50aa4080c0
[Serving] fix offline inference sampling parameters overwrite (#2654)
2025-07-01 10:17:46 +08:00
YUNSHEN XIE
d5af78945b
Add ci (#2650)
* add ci ut and workflow
* Automatically cancel any previous CI runs for the ci.yml workflow, keeping only the latest one active
2025-06-30 20:20:49 +08:00
hong19860320
6bead64f48
Update kunlunxin_xpu.md
2025-06-30 15:59:22 +08:00
hong19860320
6b95b42986
Update kunlunxin_xpu.md
2025-06-30 15:49:32 +08:00
hong19860320
b0d3a630ba
Merge branch 'develop' of https://github.com/hong19860320/FastDeploy into hongming/fix_xpu_doc
2025-06-30 15:42:29 +08:00
hong19860320
ef72873695
Update kunlunxin_xpu.md
2025-06-30 15:27:48 +08:00
qingqing01
4a5db82fb2
Merge pull request #2644 from kevincheng2/develop
[docs] update docs
2025-06-30 14:55:54 +08:00
kevin
4f7b42ce3e
update docs
2025-06-30 14:45:41 +08:00
qingqing01
df1e22b595
Merge pull request #2642 from MARD1NO/remove_redundant_sync
use shfl_xor_sync to reduce redundant shfl broadcast
2025-06-30 14:33:12 +08:00
MARD1NO
ac5f860536
use shfl_xor_sync to reduce redundant shfl broadcast
2025-06-30 13:12:21 +08:00
qingqing01
90a5b18742
Update disaggregated.md
2025-06-30 11:57:12 +08:00
qingqing01
7c43500060
Update disaggregated.md
2025-06-30 11:56:33 +08:00
Jiang-Jia-Jun
ea29b01a68
Update quick_start.md
2025-06-30 11:52:05 +08:00
Jiang-Jia-Jun
51f1306de8
Merge pull request #2641 from yongqiangma/doc
fix format
2025-06-30 11:42:52 +08:00
yongqiangma
f9431106d8
Merge branch 'develop' into doc
2025-06-30 11:42:43 +08:00
Jiang-Jia-Jun
f4ce0393f3
Merge pull request #2640 from chang-wenbin/fix_wint2_doc
【Update Doc】Update Wint2 Doc
2025-06-30 11:40:41 +08:00
mayongqiang
0d39e23ab9
fix format
2025-06-30 11:39:59 +08:00
changwenbin
634d3c3642
update wint2 doc
2025-06-30 11:36:15 +08:00
Jiang-Jia-Jun
cb54462303
Update README.md
2025-06-30 11:16:00 +08:00
Jiang-Jia-Jun
c9b358c502
Merge pull request #2639 from ZhangYulongg/patch-1
Update README.md
2025-06-30 11:10:17 +08:00
Divano
733cc47b00
Merge pull request #2638 from PaddlePaddle/DDDivano-FixWorkflow
Update gh-pages.yml
2025-06-30 10:55:52 +08:00