FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

Author	SHA1	Message	Date
YuanRisheng	09c979f3dd	[V1 Loader] Support Ernie text (moe and dense) (#3110 ) * new loader support 0.3B * fix weight * support parallel load * support parallel load * fix slice * support moe * delete code * perfect code * perfect code	2025-08-14 20:25:28 +08:00
xjkmfa	ab60292f89	[CI] evil case (#3359 ) * Add ci case for min token and max token * [CI case] include total_tokens in the last packet of completion interface stream output * Boundary checks, adversarial testing * Boundary checks, adversarial testing * Boundary checks, adversarial testing * Boundary checks, adversarial testing --------- Co-authored-by: xujing43 <xujing43@baidu.com>	2025-08-14 20:00:47 +08:00
freeliuzc	cacc52bf21	modify readme (#3409 )	2025-08-14 19:47:36 +08:00
Sunny-bot1	79d8ae4c38	[UT Fix] Fix bad_words test (#3385 ) * fix bad_words test * add streaming * fix * fix	2025-08-14 03:55:02 -07:00
lzy	1e06b9fa6d	make append_attn supports mask_offset (#3138 ) * make append_attn supports mask_offset * add unittest	2025-08-14 03:40:55 -07:00
memoryCoderC	6031f9a5f5	[BugFix] fix ErnieProcessor not set raw_prediction (#3400 )	2025-08-14 18:07:49 +08:00
YUNSHEN XIE	f72db9386c	Add requirements for running unit tests (#3350 ) * Add requirements for running unit tests * update	2025-08-14 17:37:18 +08:00
lizexu123	7b596d0877	[BugFix] fix real_bsz in ep (#3366 ) * Your commit message here * fix ep * delete cuda_graph	2025-08-14 17:31:19 +08:00
gaoziyuan	0ea8712018	fix op tests (#3398 )	2025-08-14 16:45:25 +08:00
Sunny-bot1	2e7831185f	[Optimize]Add norm_weights feature for topk_gating_softmax (#3372 )	2025-08-14 15:05:23 +08:00
Jiang-Jia-Jun	666ab65a51	[Polish Code] Remove useless notes	2025-08-14 14:04:52 +08:00
Jiang-Jia-Jun	dd583fb16a	[BugFix] Fix default log level of paddleformers (#3376 ) * [BugFix] Fix default log level of paddleformers * [BugFix] Fix default log level of paddleformers --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-08-14 11:36:24 +08:00
xiaolei373	d4f610e4cd	feat(log):add_request_and_response_log (#3373 )	2025-08-13 23:27:41 +08:00
ming1753	396dba0d62	[Bug Fix] Fix V1 video bug (#3388 )	2025-08-13 23:04:07 +08:00
YUNSHEN XIE	1ace375fc3	Optimize CI execution workflow (#3371 ) * Optimize CI execution workflow * fix	2025-08-13 18:47:31 +08:00
Zero Rains	be94bdd0b0	[Loader V1] modify layername for DeepSeekV3 (#3336 ) Co-authored-by: Yuanle Liu <yuanlehome@163.com> Co-authored-by: YUNSHEN XIE <1084314248@qq.com>	2025-08-13 15:47:06 +08:00
memoryCoderC	f702a675a1	fix TestOpenAIServingCompletion fail (#3368 )	2025-08-13 15:45:07 +08:00
EnflameGCU	d1a92e3e17	[GCU] Enable gcu CI (#3190 ) * [GCU] Update to the latest version * [GCU] Enable CI	2025-08-13 11:48:24 +08:00
yzwu	ce9180241e	[Iluvatar GPU] Modify the names of some variables (#3273 )	2025-08-13 11:38:02 +08:00
Kane2011	b4fef2cf29	[MetaxGPU] Support FastDeploy on metax gpu (#3241 ) * [MetaxGPU] Support FastDeploy on metax gpu * Update metax_worker.py 1. change worker log; 2. remove custom allreduce, adapt it later; 3. remove cuda graph; * Update __init__.py 1. remove metax's key work comment * Update __init__.py 1. remove metax's key word comment; 2. add fused_moe_kernel_paddle import --------- Co-authored-by: yongqiangma <xing.wo@163.com>	2025-08-13 11:11:54 +08:00
Ryan	ed6bff215a	fix custom op order rms_norm_eps (#3348 )	2025-08-13 10:12:49 +08:00
Sunny-bot1	8224b21525	Refactor moe_topk_select op to use apply_norm_weight as a template parameter (#3345 ) * Refactor moe_topk_select op to use apply_norm_weight as a template parameter * update test	2025-08-13 08:44:16 +08:00
luukunn	eda83ca672	add Tool Parser (#3272 ) * add tool-parser * add tool-parser * add tool parser * add tool parser * fix * add offline * add offline * fix * parsers:tool&reasoning * rename tool parser * update * fix reasoning-parser * add requirements * fix finish reason * fix * fix reasoning-parser * fix * fix * fix * fix * fix --------- Co-authored-by: zhuzixuan <zhuzixuan@baidu.com>	2025-08-13 01:06:55 +08:00
memoryCoderC	2d1a4cacdf	Completion add raw_prediction/text_after_process (#3356 )	2025-08-12 23:06:45 +08:00
zhink	2c0d853067	add test for CustomAllreduce (#3313 )	2025-08-12 20:44:47 +08:00
YUNSHEN XIE	8791ad4e61	Pre ce modified (#3335 ) * update * update * fix * fix * update * update * update * fix * update	2025-08-12 20:25:03 +08:00
memoryCoderC	c575611a5b	[BugFix] v1/completions add finish_reason (#3246 ) * [BugFix] v1/completions add finish_reason * update TestOpenAIServingCompletion for merge --------- Co-authored-by: YUNSHEN XIE <1084314248@qq.com>	2025-08-12 19:40:26 +08:00
Jiang-Jia-Jun	90bfa0be9c	Update envs.py	2025-08-12 16:24:47 +08:00
Jiang-Jia-Jun	5620bd12de	Update envs.py	2025-08-12 16:24:33 +08:00
YUNSHEN XIE	7d0d5a543a	Use latest PaddlePaddle package (#3347 ) * Use latest PaddlePaddle package * fix	2025-08-12 16:23:41 +08:00
gaoziyuan	ccc7f1beb3	fix mapping (#3320 )	2025-08-12 16:15:59 +08:00
RichardWooSJTU	283da92bfa	fix ep lm head (#3244 ) Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>	2025-08-12 15:38:28 +08:00
ming1753	f5164215be	[Bug Fix] fix vl V1 schedule bug (#3323 ) * [Bug Fix] fix vl V1 schedule bug * fix format	2025-08-12 11:31:39 +08:00
yangjianfengo1	b808c49585	[Doc] Add Chinese/English language switch (#3318 ) * Add Chinese/English language switch * Add Chinese/English language switch * Modify readme	2025-08-12 11:20:45 +08:00
chenjian	b21272d9ff	[Bug fix] fix block num setting in scheduler v1 for develop (#3303 ) * fix block num setting in scheduler v1 * fix block num setting in scheduler v1 * fix max_block_num and max_num_batched_tokens setting * fix max_block_num and max_num_batched_tokens setting * fix max_block_num and max_num_batched_tokens setting * fix max_block_num and max_num_batched_tokens setting	2025-08-12 10:38:51 +08:00
Jiang-Jia-Jun	183e3863e8	Remove useless code (#3337 )	2025-08-12 10:32:31 +08:00
Sunny-bot1	19fda4e912	fix docs (#3332 )	2025-08-11 21:03:49 +08:00
JYChen	973ddad91e	fix unittest (#3328 )	2025-08-11 20:58:24 +08:00
Divano	f27e879785	Update _base_test.yml (#3331 )	2025-08-11 20:57:20 +08:00
Sunny-bot1	789dc67ff7	[Docs]fix sampling docs (#3113 ) * fix sampling docs * fix sampling docs * update	2025-08-11 20:42:27 +08:00
Divano	8bf96217b4	Update test_evil_cases.py	2025-08-11 20:27:02 +08:00
YUNSHEN XIE	770b0aa3c5	fix ci pypi index error (#3326 )	2025-08-11 20:21:08 +08:00
kevin	9627619235	fix uvicorn multi worker error (#3300 ) Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-08-11 19:39:41 +08:00
Zero Rains	b23af29d0b	Launch expert_service before kv_cache initialization in worker_process (#3045 ) * launch expert_service before kv_cache initialization * add two signal make sure model loading and expert_service lauching finished * fix the EP bug * fix ep * update launching way * fix ep * update * roback ep * pre-commit all files --------- Co-authored-by: RAM <gstian5555@outlook.com> Co-authored-by: Divano <dddivano@outlook.com>	2025-08-11 19:38:46 +08:00
Zhang Yulong	c27a3dc43b	Update deploy.py (#3310 ) * Update deploy.py Update the deployment tool * Update deploy.py	2025-08-11 19:11:57 +08:00
Jiang-Jia-Jun	c56c99837a	Revert "[BugFix] num_seqs (#3291 )" (#3316 ) This reverts commit `e0aeac58e1`.	2025-08-11 16:16:51 +08:00
Yuanle Liu	9571c458f0	enhance eos_tokens (#3274 ) * enhance eos_tokens * update * update	2025-08-11 14:47:52 +08:00
Divano	21caa63794	update base test (#3304 ) * update base test Start the service one additional time to test repetition stop * Update _base_test.yml	2025-08-11 14:15:45 +08:00
Zero Rains	42af0b4b64	[V1 Loader] Support DeepSeekV3(bf16) (#3294 ) * Support new loader for DeepSeekV3(bf16) * update paddle version * remove useless attr	2025-08-11 13:39:28 +08:00
lizexu123	e0aeac58e1	[BugFix] num_seqs (#3291 ) * fix num_seqs * merge develop	2025-08-11 13:38:55 +08:00


3260 Commits