FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 16:48:03 +08:00

Author	SHA1	Message	Date
Divano	e24929efa3	Ce add bad cases (#3215 ) * add repetition early stop cases * add repetition early stop cases * add bad cases * add bad cases	2025-08-05 16:37:28 +08:00
chen	04fc7eb931	fix test_air_top_p_sampling name (#3211 )	2025-08-05 15:47:50 +08:00
Divano	9f1936ae28	Ce add repetition early stop cases (#3213 ) * add repetition early stop cases * add repetition early stop cases	2025-08-05 15:47:28 +08:00
ming1753	14ed75f7d3	[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 (#3210 )	2025-08-05 15:25:28 +08:00
yangjianfengo1	40f7f3e0d8	[New Feature] fa3 supports flash mask (#3184 ) * support flash mask * modify test_flash_mask * modify test.sh	2025-08-05 12:20:48 +08:00
Divano	fb7a0689cc	add more cases (#3207 )	2025-08-05 11:17:36 +08:00
RAM	c593e1a39c	[Bug Fix]Fix bug of append attention test case (#3202 )	2025-08-05 11:04:45 +08:00
Divano	88596c0c63	Add more base chat cases (#3203 ) * add test base class * fix codestyle * fix codestyle * add base chat	2025-08-05 10:24:12 +08:00
lizhenyun01	fe540f6caa	[plugin] Custom model_runner/model support (#3186 ) * support custom model&&model_runner * fix merge * add test && update doc * fix codestyle * fix unittest * load model in rl	2025-08-04 18:52:39 -07:00
YuBaoku	3eb9a5df60	[CI] add test_compare_top_logprobs (#3191 )	2025-08-04 19:49:24 +08:00
SunLei	68bc1d12c0	[Bugfix] Fix uninitialized decoded_token and add corresponding unit test. (#3195 )	2025-08-04 19:23:58 +08:00
Zero Rains	17f51f0c92	[unitest] fix the bug in test_sampler (#3157 )	2025-08-04 01:23:25 -07:00
Divano	3bfb2eca92	Update test_base_chat.py (#3183 )	2025-08-04 15:09:53 +08:00
gaoziyuan	4021d66ea5	【Feature】add fd plugins && rm model_classes (#3123 ) * add fd plugins && rm model_classed * fix reviews * add docs * fix * fix unitest ci	2025-08-03 19:53:20 -07:00
Divano	66d3bb89ad	Update __init__.py (#3163 ) Upgrade test base class compatibility	2025-08-04 09:40:09 +08:00
Zhang Yulong	0eb32bb9c8	add cases (#3155 )	2025-08-01 18:38:57 +08:00
Divano	50db0d7ba9	add case (#3150 ) * add test base class * fix codestyle * fix codestyle * add base chat	2025-08-01 17:30:58 +08:00
JYChen	c34088b0fd	fix stop seq unittest (#3126 )	2025-08-01 16:50:05 +08:00
Divano	1d93565082	[CE] Add base test class for web server testing (#3120 ) * add test base class * fix codestyle * fix codestyle	2025-07-31 23:28:50 +08:00
Zhang Yulong	1a543bca29	Fix test_EB_Lite_serving.py (#3119 ) * Fix test_EB_Lite_serving.py * fix test_EB_Lite_serving.py	2025-07-31 20:15:25 +08:00
LiqinruiG	25005fee30	[Doc] add chat_template_kwagrs and update params docs (#3103 ) * add chat_template_kwagrs and update params docs * add chat_template_kwagrs and update params docs * update enable_thinking * pre-commit * update test case --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-07-31 19:44:06 +08:00
YUNSHEN XIE	583eae2fd1	fix ci (#3106 ) * fix ci * disable test_non_streaming_chat_with_min_tokens	2025-07-31 17:25:08 +08:00
Jiang-Jia-Jun	0616c208d2	[Feature] Support include_stop_str_in_output in completion api (#3096 ) * [Feature] Support include_stop_str_in_output in completion api * Fix ci test --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-30 22:18:48 +08:00
李泳桦	b242150f94	[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client (#3058 ) * [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client * [fix] delete ci test case for enable_thinking * [fix] add reasoning_parser when server starts * [fix] fix ci consistency test error with reasoning parser * [doc] update docs related to metadata * [fix] cancel enable_thinking default value	2025-07-30 19:25:20 +08:00
AIbin	28fff1b035	Revert "Add uinttest for moe_ffn_wint2. (#3037 )" (#3085 ) This reverts commit `327e1943fa`.	2025-07-30 19:04:07 +08:00
Jiang-Jia-Jun	ffa0f4d99b	[Fix] Fix version function (#3076 ) * [Fix] Fix version function * Fix commit * Fix commit * fix code sync * Update coverage_run.sh --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-30 16:05:24 +08:00
YuanRisheng	eeadbf332a	delete unused unittest (#3065 )	2025-07-30 15:11:58 +08:00
Yiqun Liu	327e1943fa	Add uinttest for moe_ffn_wint2. (#3037 ) Change-Id: Ifd452527eaf87ea96c3fa4fa9aeb17729b33c2de	2025-07-30 15:03:09 +08:00
Sunny-bot1	74aa31d15b	[Feature] support bad_words (#3055 ) * support bad_words * support online infer bad_words * update * add CI test * update * update * update --------- Co-authored-by: Yuanle Liu <yuanlehome@163.com>	2025-07-30 09:31:29 +08:00
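zhuzixuan	ad7bb52a28	Fix the error raised when max_tokens=1 is passed (#3068 ) * Fix the error raised when max_tokens=1 is passed * Fix the error raised when max_tokens=1 is passed * Fix the error raised when max_tokens=1 is passed * Fix the error raised when max_tokens=1 is passed * Fix the error raised when max_tokens=1 is passed * Fix the error raised when max_tokens=1 is passed	2025-07-29 23:49:28 +08:00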
Zero Rains	b2f9a42d87	[Feature] Support repetition early stop (#3024 ) * support repetition early stop and support user to set the parameter * remove log * fix codestyle * add the early_stop_config to rollout_config * update config and EarlyStopper class * fix the bug for triton * modify the stop method * update description * modify the usage for stop_flags --------- Co-authored-by: Yuanle Liu <yuanlehome@163.com>	2025-07-29 22:42:54 +08:00
JYChen	dafe02a7b9	[stop sequence] support stop sequence (#3025 ) * stop seqs in multi-ends * unittest for gpu stop op * kernel tid==0	2025-07-29 14:17:37 +08:00
李泳桦	69996a40da	[feat] add disable_chat_template in chat api as a substitute for previous raw_request (#3020 ) * [feat] add disable_chat_template in chat api as a substitute for previous raw_request * [fix] pre-commit code check	2025-07-25 20:57:32 +08:00
EnflameGCU	7634ffb709	[GCU] Add CI (#3006 )	2025-07-25 10:59:29 +08:00
Zero Rains	0fb37ab7e4	update flake8 version to support pre-commit in python3.12 (#3000 ) * update flake8 version to support pre-commit in python3.12 * polish code	2025-07-24 01:43:31 -07:00
Yzc216	e14587a954	[Feature] multi-source download (#2986 ) * multi-source download * multi-source download * huggingface download revision * requirement * style * add revision arg * test * pre-commit	2025-07-24 14:26:37 +08:00
李泳桦	8a619e9db5	[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body (#2940 ) * [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body * [fix] return_token_ids not working in curl request * [test] improve some test cases of return_token_ids and prompt_token_ids * [fix] the server responds ok even if request.messages is an empty list	2025-07-21 19:31:14 +08:00
Yuanle Liu	2f74e93d7e	use dist.all_reduce(min) to sync num_blocks_local (#2933 ) * pre-commit all files check * reduce min num_blocks_local * fix nranks=1 * pre-commit when commit-msg	2025-07-21 01:23:36 -07:00
lizexu123	67990e0572	[Feature] support min_p_sampling (#2872 ) * Fastdeploy support min_p * add test_min_p * fix * min_p_sampling * update * delete vl_gpu_model_runner.py * fix * Align usage of min_p with vLLM * fix * modified unit test * fix test_min_sampling * pre-commit all files * fix * fix * fix * fix xpu_model_runner.py	2025-07-20 23:17:59 -07:00
gaoziyuan	95a214ae43	support trainer_degree in name_mapping (#2935 )	2025-07-20 23:12:55 -07:00
YuanRisheng	bce2c6cd7c	rename test dir (#2934 )	2025-07-21 14:05:45 +08:00
liddk1121	17c5d3a241	[Iluvatar GPU] Add CI scripts (#2876 )	2025-07-21 09:44:42 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923 )	2025-07-19 23:19:27 +08:00
ZhangYulongg	b8676d71a8	update ci cases	2025-07-18 21:44:07 +08:00
ZhangYulongg	43976138de	update ci cases	2025-07-18 21:44:07 +08:00
ZhangYulongg	e546e6b1b0	update ci cases	2025-07-18 21:44:07 +08:00
ZhangYulongg	eb77b1be6d	update ci cases	2025-07-18 21:44:07 +08:00
Jiang-Jia-Jun	fbe3547c95	[Feature] Support include_stop_str_in_output in chat/completion (#2910 ) * [Feature] Support include_stop_str_in_output in chat/completion * Add ci test for include_stop_str_in_output * Update version of openai * Fix ci test --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-18 16:59:18 +08:00
ming1753	1f15ca21e4	[Feature] support prompt repetition_penalty (#2806 )	2025-07-17 12:05:52 +08:00
RAM	0fad10b35a	[Executor] CUDA Graph support padding batch (#2844 ) * cuda graph support padding batch * Integrate the startup parameters for the graph optimization backend and provide support for user - defined capture sizes. * Do not insert max_num_seqs when the user specifies a capture list * Support set graph optimization config from YAML file * update cuda graph ci * fix ci bug * fix ci bug	2025-07-15 19:49:01 -07:00

111 Commits