FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-15 21:20:53 +08:00

Author	SHA1	Message	Date
YuanRisheng	85fbf5455a	[V1 Loader]Ernie VL support loader v1 (#3494 ) * ernie vl support new loader * add unittest * fix test	2025-08-22 11:16:57 +08:00
YuanRisheng	c389a4013c	Unify server-side and model-side Config(Part-5) (#3497 ) * move config * fix xpu * fix * fix vl * fix vl * fix unitest * fix args * add unitest * fix test	2025-08-21 19:00:21 +08:00
李泳桦	8bea4b1e25	[fix] fix output tokens count in streaming completion api (#3507 )	2025-08-21 18:19:13 +08:00
李泳桦	e4f0b755b4	[fix] setting disable_chat_template while passing prompt_token_ids led to response error (#3228 ) * [fix] setting disable_chat_template while passing prompt_token_ids led to response error * [fix] code syntax * [test] add test case for this bug * [test] add test case for empty message list * [test] fix test case for empty message list	2025-08-21 17:30:51 +08:00
luukunn	371fb3f853	[Feature] add tool parser (#3483 ) * add tool parser * add x1 enable_thinking * restart ci * fix vl reasoning parser * modify call style * modify call style * add offline enablethinking * fix completion * fix * fix unit test * fix unit test * fix unit test * fix vl reasoning parser * fix vl reasoning parser	2025-08-21 17:25:44 +08:00
Yzc216	466cbb5a99	[Feature] Models api (#3073 ) * add v1/models interface related * add model parameters * default model verification * unit test * check model err_msg * unit test * type annotation * model parameter in response * modify document description * modify document description * unit test * verification * verification update * model_name * pre-commit * update test case * update test case * Update tests/entrypoints/openai/test_serving_models.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update tests/entrypoints/openai/test_serving_models.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update tests/entrypoints/openai/test_serving_models.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update tests/entrypoints/openai/test_serving_models.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/entrypoints/openai/serving_models.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-21 17:02:56 +08:00
qw86972190	c83381d650	revert pr (#3481 ) Co-authored-by: iosmers <yinwei_hust@163.com>	2025-08-21 14:19:50 +08:00
ltd0924	51f68ae593	[Feature] add dealer manager to reuse the connection (#3471 ) * [BugFix] fix control signal release failed * [BugFix] fix control signal release failed * update * update * update * [Feature] add dealer manager to reuse the connection * fix * fix * fix * fix * fix * fix * Create test_dealer_connection_manager.py * Delete test/entrypoints/openai directory * Update test_dealer_connection_manager.py * Update test_dealer_connection_manager.py	2025-08-21 13:11:13 +08:00
memoryCoderC	31f639f10b	[Feature] add prompt_tokens and completion_tokens (#3504 )	2025-08-21 10:23:27 +08:00
Zero Rains	30b3f2dc07	[BugFix][V1 Loader] fix the bug in creat weight for block_wise_fp8 (#3486 )	2025-08-20 05:52:54 -07:00
Ryan	bcdfc1d6b9	Add custom op declaration for `all_reduce` (#3473 ) * add custom op declaration * roll back try except	2025-08-20 20:29:58 +08:00
xiaolei373	5d131485d8	add error log to file (#3431 ) * feat(log):add_request_and_response_log * feat[log]:add error log to file	2025-08-20 09:52:34 +08:00
kevin	67298cf4c0	add error traceback info (#3419 ) * add error traceback info * update error msg * update code --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-08-19 19:32:04 +08:00
Zero Rains	fef447e350	[V1 Loader] Support MOE parameters create and load for DeepGemm and marlin backend (#3447 ) * support deepgemm backend * support marlin backend * remove print * fix process_prequanted_weights	2025-08-19 14:15:53 +08:00
chen	6735626014	fix request_output sampling_params (#3154 ) (#3464 )	2025-08-19 13:52:50 +08:00
ltd0924	bca8905b40	[BugFix] fix control signal release failed (#3390 ) * [BugFix] fix control signal release failed * [BugFix] fix control signal release failed * update * update * update	2025-08-19 13:51:38 +08:00
Zero Rains	8b12c80f90	[FixBug] compute early stopping with real batch size (#3418 ) * [FixBug] compute early stopping with real batch size * update * fix test_sampler	2025-08-18 22:09:21 -07:00
luukunn	3a7a20d191	[Feature] Pass through the `chat_template_kwargs` to the data processing module (#3421 ) * fix chat_template_args * fix args * add offline * add offline * fix * fix * fix default enable_thinking value * fix default enable_thinking value * modify condition * Revert "modify condition" This reverts commit `26430bdeb1`. * fix unit test	2025-08-19 10:50:01 +08:00
lizexu123	a053ab889b	[BugFix] fix num_running_requests in cuda_graph (#3457 ) * fix cuda_grpah * add note --------- Co-authored-by: RAM <gstian5555@outlook.com>	2025-08-19 10:47:22 +08:00
AIbin	beec24fd89	【Inference Optimize】DeepSeek-v3 model inference performance optimization (#3455 ) * DSK_OPT_01 * update FA3	2025-08-19 10:42:42 +08:00
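zhuzixuan	c95b3395e9	[BugFix] Support echo in the completion API (#3245 ) * wenxin-tools-511: fix v1/completion being unable to echo. * Support echo for multiple prompts * Support streaming echo with multiple prompts * Add unit tests for echo support in the completion API * pre-commit * Remove redundant test files * Fix the unit test method for echo support in the completion API * Add unit test files * Add unit tests * unittest * Add unit tests * Fix unit tests * Remove unnecessary asserts. * Resubmit * Update test method * ut * Unit test to verify whether the approach is correct * Unit test to verify whether the approach is correct * Unit test to verify whether the approach is correct 3 * Optimize unit test code, narrowing the unit test scope in a targeted way. * Optimize unit test code 2, narrowing the unit test scope in a targeted way. * Optimize unit test code 3, narrowing the unit test scope in a targeted way. * support 'echo' in chat/completion. * update * update * update * update * update * update * Add unit tests for tokenid * update * Fix index error * Fix index error	2025-08-19 10:41:51 +08:00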
lizexu123	32b39620bc	[Code Simplification] remove cum_offsets (#3410 )	2025-08-18 20:21:25 +08:00
luukunn	9c129813f9	[Feature] add custom chat template (#3251 ) * add custom chat_template * add custom chat_template * add unittest * fix * add docs * fix comment * add offline chat * fix unit test * fix unit test * fix * fix pre commit * fix unit test * add unit test * add unit test * add unit test * fix pre_commit * fix enable_thinking * fix pre commit * fix pre commit * fix unit test * add requirements	2025-08-18 16:34:08 +08:00
Jundong Liu	70ee910cd5	[Excutor] Change cudagraph hashkey from batch size to num_tokens (#3454 )	2025-08-18 16:16:48 +08:00
Jundong Liu	ea4a3b479c	[Excutor] Increase buffer size to prevent address corruption; add forward metadata debug tool (#3404 ) * Fix insufficiently large buffer allocation; add a tool to print forwardmetadata * fix mistake * Make CPU tensor in CPUPlace * Add test about forward_meta_str and Add unitest_requirement --------- Co-authored-by: RAM <gstian5555@outlook.com>	2025-08-18 16:14:09 +08:00
chen	5585cf7aa5	fix mtp_rej_topp input (#3450 )	2025-08-18 16:12:42 +08:00
gaoziyuan	6fdd83da10	fix some bug (#3434 )	2025-08-18 14:39:13 +08:00
freeliuzc	a12d0bc549	[Feature][MTP]update multi-draft-token strategy (#3369 ) * update multi-draft-token strategy * fix format --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-08-18 13:59:56 +08:00
chen	e88f5552db	fix cpu __ini__.py (#3448 )	2025-08-17 12:38:54 +08:00
chen	f0f00a6025	[OPs] Universal optimization and Fix early_stop cuda 700 (#3375 ) * delete nonzero * delete setup_ops_base.py * check if * check gcp infer_seed.cpu() * fix repetition_early_stopper_kernel cuda 700	2025-08-14 22:40:44 +08:00
YuanRisheng	09c979f3dd	[V1 Loader] Support Ernie text（moe and dense） (#3110 ) * new loader support 0.3B * fix weight * support parallel load * support parallel load * fix slice * support moe * delete code * perfect code * perfect code	2025-08-14 20:25:28 +08:00
lzy	1e06b9fa6d	make append_attn supports mask_offset (#3138 ) * make append_attn supports mask_offset * add unittest	2025-08-14 03:40:55 -07:00
memoryCoderC	6031f9a5f5	[BugFix] fix ErnieProcessor not set raw_prediction (#3400 )	2025-08-14 18:07:49 +08:00
lizexu123	7b596d0877	[BugFix] fix real_bsz in ep (#3366 ) * Your commit message here * fix ep * delete cuda_graph	2025-08-14 17:31:19 +08:00
Jiang-Jia-Jun	666ab65a51	[Polish Code] Remove useless notes	2025-08-14 14:04:52 +08:00
Jiang-Jia-Jun	dd583fb16a	[BugFix] Fix default log level of paddleformers (#3376 ) * [BugFix] Fix default log level of paddleformers * [BugFix] Fix default log level of paddleformers --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-08-14 11:36:24 +08:00
xiaolei373	d4f610e4cd	feat(log):add_request_and_response_log (#3373 )	2025-08-13 23:27:41 +08:00
ming1753	396dba0d62	[Bug Fix] Fix V1 video bug (#3388 )	2025-08-13 23:04:07 +08:00
Zero Rains	be94bdd0b0	[Loader V1] modify layername for DeepSeekV3 (#3336 ) Co-authored-by: Yuanle Liu <yuanlehome@163.com> Co-authored-by: YUNSHEN XIE <1084314248@qq.com>	2025-08-13 15:47:06 +08:00
EnflameGCU	d1a92e3e17	[GCU] Enable gcu CI (#3190 ) * [GCU] Update to the latest version * [GCU] Enable CI	2025-08-13 11:48:24 +08:00
yzwu	ce9180241e	[Iluvatar GPU] Modify the names of some variables (#3273 )	2025-08-13 11:38:02 +08:00
Kane2011	b4fef2cf29	[MetaxGPU] Support FastDeploy on metax gpu (#3241 ) * [MetaxGPU] Support FastDeploy on metax gpu * Update metax_worker.py 1. change worker log; 2. remove custom allreduce, adapt it later; 3. remove cuda graph; * Update __init__.py 1. remove metax's key work comment * Update __init__.py 1. remove metax's key word comment; 2. add fused_moe_kernel_paddle import --------- Co-authored-by: yongqiangma <xing.wo@163.com>	2025-08-13 11:11:54 +08:00
luukunn	eda83ca672	add Tool Parser (#3272 ) * add tool-parser * add tool-parser * add tool parser * add tool parser * fix * add offline * add offline * fix * parsers:tool&reasoning * rename tool parser * update * fix reasoning-parser * add requirements * fix finish reason * fix * fix reasoning-parser * fix * fix * fix * fix * fix --------- Co-authored-by: zhuzixuan <zhuzixuan@baidu.com>	2025-08-13 01:06:55 +08:00
memoryCoderC	2d1a4cacdf	Completion add raw_prediction/text_after_process (#3356 )	2025-08-12 23:06:45 +08:00
memoryCoderC	c575611a5b	[BugFix] v1/completions add finish_reason (#3246 ) * [BugFix] v1/completions add finish_reason * update TestOpenAIServingCompletion for merge --------- Co-authored-by: YUNSHEN XIE <1084314248@qq.com>	2025-08-12 19:40:26 +08:00
Jiang-Jia-Jun	90bfa0be9c	Update envs.py	2025-08-12 16:24:47 +08:00
Jiang-Jia-Jun	5620bd12de	Update envs.py	2025-08-12 16:24:33 +08:00
gaoziyuan	ccc7f1beb3	fix mapping (#3320 )	2025-08-12 16:15:59 +08:00
RichardWooSJTU	283da92bfa	fix ep lm head (#3244 ) Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>	2025-08-12 15:38:28 +08:00
ming1753	f5164215be	[Bug Fix] fix vl V1 schedule bug (#3323 ) * [Bug Fix] fix vl V1 schedule bug * fix format	2025-08-12 11:31:39 +08:00


946 Commits