YuanRisheng
09c979f3dd
[V1 Loader] Support Ernie text(moe and dense) ( #3110 )
...
* new loader support 0.3B
* fix weight
* support parallel load
* support parallel load
* fix slice
* support moe
* delete code
* perfect code
* perfect code
2025-08-14 20:25:28 +08:00
xjkmfa
ab60292f89
【CI】 evil case ( #3359 )
...
* Add ci case for min token and max token
* 【CI case】include total_tokens in the last packet of completion interface stream output
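* Edge-case checks, adversarial testing
* Edge-case checks, adversarial testing
* Edge-case checks, adversarial testing
* Edge-case checks, adversarial testing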
---------
Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-14 20:00:47 +08:00
freeliuzc
cacc52bf21
modify readme ( #3409 )
2025-08-14 19:47:36 +08:00
Sunny-bot1
79d8ae4c38
[UT Fix] Fix bad_words test ( #3385 )
...
* fix bad_words test
* add streaming
* fix
* fix
2025-08-14 03:55:02 -07:00
lzy
1e06b9fa6d
make append_attn supports mask_offset ( #3138 )
...
* make append_attn supports mask_offset
* add unittest
2025-08-14 03:40:55 -07:00
memoryCoderC
6031f9a5f5
[BugFix] fix ErnieProcessor not set raw_prediction ( #3400 )
2025-08-14 18:07:49 +08:00
YUNSHEN XIE
f72db9386c
Add requirements for running unit tests ( #3350 )
...
* Add requirements for running unit tests
* update
2025-08-14 17:37:18 +08:00
lizexu123
7b596d0877
[BugFix] fix real_bsz in ep ( #3366 )
...
* Your commit message here
* fix ep
* delete cuda_graph
2025-08-14 17:31:19 +08:00
gaoziyuan
0ea8712018
fix op tests ( #3398 )
2025-08-14 16:45:25 +08:00
Sunny-bot1
2e7831185f
[Optimize]Add norm_weights feature for topk_gating_softmax ( #3372 )
2025-08-14 15:05:23 +08:00
Jiang-Jia-Jun
666ab65a51
[Polish Code] Remove useless notes
2025-08-14 14:04:52 +08:00
Jiang-Jia-Jun
dd583fb16a
[BugFix] Fix default log level of paddleformers ( #3376 )
...
* [BugFix] Fix default log level of paddleformers
* [BugFix] Fix default log level of paddleformers
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-08-14 11:36:24 +08:00
xiaolei373
d4f610e4cd
feat(log):add_request_and_response_log ( #3373 )
2025-08-13 23:27:41 +08:00
ming1753
396dba0d62
[Bug Fix] Fix V1 video bug ( #3388 )
2025-08-13 23:04:07 +08:00
YUNSHEN XIE
1ace375fc3
Optimize CI execution workflow ( #3371 )
...
* Optimize CI execution workflow
* fix
2025-08-13 18:47:31 +08:00
Zero Rains
be94bdd0b0
[Loader V1] modify layername for DeepSeekV3 ( #3336 )
...
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-13 15:47:06 +08:00
memoryCoderC
f702a675a1
fix TestOpenAIServingCompletion fail ( #3368 )
2025-08-13 15:45:07 +08:00
EnflameGCU
d1a92e3e17
[GCU] Enable gcu CI ( #3190 )
...
* [GCU] Update to the latest version
* [GCU] Enable CI
2025-08-13 11:48:24 +08:00
yzwu
ce9180241e
[Iluvatar GPU] Modify the names of some variables ( #3273 )
2025-08-13 11:38:02 +08:00
Kane2011
b4fef2cf29
[MetaxGPU] Support FastDeploy on metax gpu ( #3241 )
...
* [MetaxGPU] Support FastDeploy on metax gpu
* Update metax_worker.py
1. change worker log;
2. remove custom allreduce, adapt it later;
3. remove cuda graph;
* Update __init__.py
1. remove metax's key word comment
* Update __init__.py
1. remove metax's key word comment;
2. add fused_moe_kernel_paddle import
---------
Co-authored-by: yongqiangma <xing.wo@163.com>
2025-08-13 11:11:54 +08:00
Ryan
ed6bff215a
fix custom op order rms_norm_eps ( #3348 )
2025-08-13 10:12:49 +08:00
Sunny-bot1
8224b21525
Refactor moe_topk_select op to use apply_norm_weight as a template parameter ( #3345 )
...
* Refactor moe_topk_select op to use apply_norm_weight as a template parameter
* update test
2025-08-13 08:44:16 +08:00
luukunn
eda83ca672
add Tool Parser ( #3272 )
...
* add tool-parser
* add tool-parser
* add tool parser
* add tool parser
* fix
* add offline
* add offline
* fix
* parsers:tool&reasoning
* rename tool parser
* update
* fix reasoning-parser
* add requirements
* fix finish reason
* fix
* fix reasoning-parser
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: zhuzixuan <zhuzixuan@baidu.com>
2025-08-13 01:06:55 +08:00
memoryCoderC
2d1a4cacdf
Completion add raw_prediction/text_after_process ( #3356 )
2025-08-12 23:06:45 +08:00
zhink
2c0d853067
add test for CustomAllreduce ( #3313 )
2025-08-12 20:44:47 +08:00
YUNSHEN XIE
8791ad4e61
Pre ce modified ( #3335 )
...
* update
* update
* fix
* fix
* update
* update
* update
* fix
* update
2025-08-12 20:25:03 +08:00
memoryCoderC
c575611a5b
[BugFix] v1/completions add finish_reason ( #3246 )
...
* [BugFix] v1/completions add finish_reason
* update TestOpenAIServingCompletion for merge
---------
Co-authored-by: YUNSHEN XIE <1084314248@qq.com>
2025-08-12 19:40:26 +08:00
Jiang-Jia-Jun
90bfa0be9c
Update envs.py
2025-08-12 16:24:47 +08:00
Jiang-Jia-Jun
5620bd12de
Update envs.py
2025-08-12 16:24:33 +08:00
YUNSHEN XIE
7d0d5a543a
Use latest PaddlePaddle package ( #3347 )
...
* Use latest PaddlePaddle package
* fix
2025-08-12 16:23:41 +08:00
gaoziyuan
ccc7f1beb3
fix mapping ( #3320 )
2025-08-12 16:15:59 +08:00
RichardWooSJTU
283da92bfa
fix ep lm head ( #3244 )
...
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-08-12 15:38:28 +08:00
ming1753
f5164215be
[Bug Fix] fix vl V1 schedule bug ( #3323 )
...
* [Bug Fix] fix vl V1 schedule bug
* fix format
2025-08-12 11:31:39 +08:00
yangjianfengo1
b808c49585
[Doc] Add Chinese/English language switch ( #3318 )
...
* Add Chinese/English language switch
* Add Chinese/English language switch
* Update readme
2025-08-12 11:20:45 +08:00
chenjian
b21272d9ff
[Bug fix] fix block num setting in scheduler v1 for develop ( #3303 )
...
* fix block num setting in scheduler v1
* fix block num setting in scheduler v1
* fix max_block_num and max_num_batched_tokens setting
* fix max_block_num and max_num_batched_tokens setting
* fix max_block_num and max_num_batched_tokens setting
* fix max_block_num and max_num_batched_tokens setting
2025-08-12 10:38:51 +08:00
Jiang-Jia-Jun
183e3863e8
Remove useless code ( #3337 )
2025-08-12 10:32:31 +08:00
Sunny-bot1
19fda4e912
fix docs ( #3332 )
2025-08-11 21:03:49 +08:00
JYChen
973ddad91e
fix unittest ( #3328 )
2025-08-11 20:58:24 +08:00
Divano
f27e879785
Update _base_test.yml ( #3331 )
2025-08-11 20:57:20 +08:00
Sunny-bot1
789dc67ff7
[Docs]fix sampling docs ( #3113 )
...
* fix sampling docs
* fix sampling docs
* update
2025-08-11 20:42:27 +08:00
Divano
8bf96217b4
Update test_evil_cases.py
2025-08-11 20:27:02 +08:00
YUNSHEN XIE
770b0aa3c5
fix ci pypi index error ( #3326 )
2025-08-11 20:21:08 +08:00
kevin
9627619235
fix uvicorn multi worker error ( #3300 )
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-11 19:39:41 +08:00
Zero Rains
b23af29d0b
Launch expert_service before kv_cache initialization in worker_process ( #3045 )
...
* launch expert_service before kv_cache initialization
* add two signals to make sure model loading and expert_service launching have finished
* fix the EP bug
* fix ep
* update launching way
* fix ep
* update
* rollback ep
* pre-commit all files
---------
Co-authored-by: RAM <gstian5555@outlook.com>
Co-authored-by: Divano <dddivano@outlook.com>
2025-08-11 19:38:46 +08:00
Zhang Yulong
c27a3dc43b
Update deploy.py ( #3310 )
...
* Update deploy.py
Update the deployment tool
* Update deploy.py
2025-08-11 19:11:57 +08:00
Jiang-Jia-Jun
c56c99837a
Revert "[BugFix] num_seqs ( #3291 )" ( #3316 )
...
This reverts commit e0aeac58e1.
2025-08-11 16:16:51 +08:00
Yuanle Liu
9571c458f0
enhance eos_tokens ( #3274 )
...
* enhance eos_tokens
* update
* update
2025-08-11 14:47:52 +08:00
Divano
21caa63794
update base test ( #3304 )
...
* update base test
Start the service one additional time to test repetition stop
* Update _base_test.yml
2025-08-11 14:15:45 +08:00
Zero Rains
42af0b4b64
[V1 Loader] Support DeepSeekV3(bf16) ( #3294 )
...
* Support new loader for DeepSeekV3(bf16)
* update paddle version
* remove useless attr
2025-08-11 13:39:28 +08:00
lizexu123
e0aeac58e1
[BugFix] num_seqs ( #3291 )
...
* fix num_seqs
* merge develop
2025-08-11 13:38:55 +08:00