FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-31 11:56:44 +08:00

Author	SHA1	Message	Date
Zero Rains	e37e86b3b8	[V1 Loader]support param create and load for wint2 and xpu backend (#3581) * support wint2 backend * [V1 Loader]support param create and load for wint2 and xpu backend * update weight shape name * update * update * update baseline.txt * update model name * update baseline.txt * fix codestyle * remove debug code	2025-08-28 09:49:36 +08:00
李泳桦	b2afdf4fc6	[fix] qwen output inconsistency when top_p=0 (#3634) * [fix] qwen output inconsistency when top_p=0 * [fix] remove decode pre_id code	2025-08-27 17:16:23 +08:00
Yuanle Liu	cbce94a00e	rename ernie_xxx to ernie4_5_xxx (#3621) * rename ernie_xxx to ernie4_5_xxx * ci fix	2025-08-26 19:29:27 +08:00
Sunny-bot1	c68c3c4b8b	[Feature] bad words support v1 scheduler and specify token ids (#3608) * support bad_words_token_ids * docs * fix test * fix * bad words support kvcache v1 and token ids * fix	2025-08-25 20:14:51 -07:00
Kane2011	2ae7ab28d2	[MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492)	2025-08-25 17:44:20 +08:00
Kane2011	b4fef2cf29	[MetaxGPU] Support FastDeploy on metax gpu (#3241) * [MetaxGPU] Support FastDeploy on metax gpu * Update metax_worker.py 1. change worker log; 2. remove custom allreduce, adapt it later; 3. remove cuda graph; * Update __init__.py 1. remove metax's key word comment * Update __init__.py 1. remove metax's key word comment; 2. add fused_moe_kernel_paddle import --------- Co-authored-by: yongqiangma <xing.wo@163.com>	2025-08-13 11:11:54 +08:00