| Zero Rains | e37e86b3b8 | [V1 Loader]support param create and load for wint2 and xpu backend (#3581) * support wint2 backend
* [V1 Loader]support param create and load for wint2 and xpu backend
* update weight shape name
* update
* update
* update baseline.txt
* update model name
* update baseline.txt
* fix codestyle
* remove debug code | 2025-08-28 09:49:36 +08:00 |  | 
			
				
| 李泳桦 | b2afdf4fc6 | [fix] qwen output inconsistency when top_p=0 (#3634) * [fix] qwen output inconsistency when top_p=0
* [fix] remove decode pre_id code | 2025-08-27 17:16:23 +08:00 |  | 
			
				
| Yuanle Liu | cbce94a00e | rename ernie_xxx to ernie4_5_xxx (#3621) * rename ernie_xxx to ernie4_5_xxx
* ci fix | 2025-08-26 19:29:27 +08:00 |  | 
			
				
| Sunny-bot1 | c68c3c4b8b | [Feature] bad words support v1 scheduler and specify token ids (#3608) * support bad_words_token_ids
* docs
* fix test
* fix
* bad words support kvcache v1 and token ids
* fix | 2025-08-25 20:14:51 -07:00 |  | 
			
				
| Kane2011 | 2ae7ab28d2 | [MetaxGPU] adapt to the latest fastdeploy on metax gpu (#3492) | 2025-08-25 17:44:20 +08:00 |  | 
			
				
| Kane2011 | b4fef2cf29 | [MetaxGPU] Support FastDeploy on metax gpu (#3241) * [MetaxGPU] Support FastDeploy on metax gpu
* Update metax_worker.py
1. change worker log;
2. remove custom allreduce, adapt it later;
3. remove cuda graph;
* Update __init__.py
1. remove metax's key word comment
* Update __init__.py
1. remove metax's key word comment;
2. add fused_moe_kernel_paddle import
---------
Co-authored-by: yongqiangma <xing.wo@163.com> | 2025-08-13 11:11:54 +08:00 |  |