Jiang-Jia-Jun
3ec126dc02
Update setup.py
2025-07-15 14:57:40 +08:00
gaoziyuan
337d76f094
[sync fix] ( #2759 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
* fix
2025-07-08 19:29:23 +08:00
gaoziyuan
ae2f78184d
【Sync develop】 add commit info ( #2755 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
2025-07-08 17:02:50 +08:00
gaoziyuan
6851489425
【Sync】Release/2.0.1 ( #2745 )
...
* add rl qwen model support
* fix
* fix
2025-07-08 14:38:18 +08:00
Jiang-Jia-Jun
ea787d8f62
fix bug. ( #2718 ) ( #2720 )
...
Co-authored-by: Ting <wtmlon@foxmail.com>
2025-07-05 09:00:01 +08:00
Ting
90ef28d982
spec token map lazy. ( #2715 )
2025-07-05 00:14:54 +08:00
YuBaoku
b37585e693
[BugFix] fix paddle_git_commit_id error ( #2714 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
* [BugFix] fix paddle_git_commit_id error
2025-07-04 22:16:37 +08:00
lizexu123
9cb08e71e8
add support QWQ enable_thinking ( #2706 )
...
* add support QWQ enable_thinking
* add stream=True
* fix stream=true
* fix qwen
---------
Co-authored-by: lizexu <lizexu@baidu.com>
2025-07-04 20:55:23 +08:00
YuBaoku
dacc46f04c
[CI] Add validation for MTP and CUDAGraph ( #2710 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
2025-07-04 18:13:54 +08:00
Jiang-Jia-Jun
09ded7715f
Update mkdocs.yml
2025-07-04 17:55:52 +08:00
LQX
11cfdf5d89
Add XPU CI, test=model ( #2701 )
...
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
2025-07-04 16:16:06 +08:00
GoldPancake
e7fa57ebae
Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue ( #2707 )
...
* fix mtp eh_proj layer
* fix mtp update_cfg function
* fix stringdoc
* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9
[feature]add fd whl version info ( #2698 )
2025-07-04 14:12:42 +08:00
ltd0924
87e638498c
[RL] update reschedule finish reason ( #2709 )
2025-07-04 13:47:36 +08:00
freeliuzc
667547be59
support chunk_prefill in MTP ( #2705 )
2025-07-04 11:55:48 +08:00
LiqinruiG
b38823bc66
modify reasoning_output docs ( #2696 )
2025-07-04 11:30:02 +08:00
Divano
050d9658a5
Update requirements.txt
2025-07-04 09:53:03 +08:00
Divano
be5cabaf80
add quick benchmark ( #2703 )
...
Test scripts do not need to go through CI
2025-07-04 09:32:36 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated ( #2695 )
...
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd
[Bug] fix logger format ( #2689 )
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79
[doc] update docs ( #2690 )
2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd
Update dynamic_weight_manager.py
2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code ( #2679 )
...
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00
Update README.md
2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117
Update README.md
2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22
Update gh-pages.yml
2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992
Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0
Update Dockerfile.gpu
2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572
add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl ( #2682 )
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571
Update gh-pages.yml ( #2680 )
2025-07-02 17:36:18 +08:00
handiz
b8a8a19689
add wint2 performance ( #2673 )
2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f
Update nvidia_gpu.md
2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun
685265a97d
Update nvidia_gpu.md
2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun
fc4d643634
Update nvidia_gpu.md
2025-07-02 15:39:48 +08:00
YuBaoku
bb880c8d7c
Update CI test cases ( #2671 )
...
* set git identity to avoid merge failure in CI
* add ci cases
2025-07-02 15:08:39 +08:00
liddk1121
865e856a94
update iluvatar gpu fastdeploy whl ( #2675 )
2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun
9f4a65d817
Update README.md
2025-07-02 10:04:58 +08:00
YuBaoku
e3aac0c5b8
set git identity to avoid merge failure in CI ( #2665 )
2025-07-01 19:06:46 +08:00
AIbin
a197dcd729
【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference ( #2666 )
...
* Support TP2&TP4 Wint
* Support TP2&TP4 Wint2 Inference
2025-07-01 18:29:11 +08:00
freeliuzc
2b7f74d427
fix docs ( #2669 )
...
Co-authored-by: liuzichang01 <liuzichang01@baidu.com>
2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun
164b83ab0b
[Doc] Update nvidia gpu installation description
2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun
01d5d66d95
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:40 +08:00
Jiang-Jia-Jun
8f1dddcf35
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:21 +08:00
hong19860320
8e335db645
Update kunlunxin_xpu.md ( #2662 )
2025-07-01 15:10:45 +08:00
AIbin
1bb296c5ad
update quantization doc ( #2659 )
2025-07-01 15:05:02 +08:00
hong19860320
92428a5ae4
Update kunlunxin_xpu.md ( #2657 )
2025-07-01 12:28:49 +08:00
RichardWooSJTU
85090ed799
remove unuseful scripts ( #2652 )
2025-07-01 10:18:25 +08:00
ltd0924
50aa4080c0
[Serving] fix offline inference sampling parameters overwrite ( #2654 )
2025-07-01 10:17:46 +08:00
YUNSHEN XIE
d5af78945b
Add ci ( #2650 )
...
* add ci ut and workflow
* Automatically cancel any previous CI runs for the ci.yml workflow, keeping only the latest one active
2025-06-30 20:20:49 +08:00
hong19860320
6bead64f48
Update kunlunxin_xpu.md
2025-06-30 15:59:22 +08:00