| 
							
							
								 gaoziyuan | a799d14df1 | [Bugfix] Fix model accuracy in some ops (#3231) * fix noaux_tc op
* fix
* update
* fix qk norm
* fix linear for prequant loader
* test
* fix
* fix
* rm some print
* fix noaux_tc op
* test
* Fix the confused enable_early_stop when only set early_stop_config (#3214)
* fix the confused early_stop_config when only set early_stop_config
* pre-commit
* write a general method
* Add ci case for min token and max token (#3229)
Co-authored-by: xujing43 <xujing43@baidu.com>
* add some evil cases (#3240)
* add repitation early stop cases
* add repitation early stop cases
* add bad cases
* add bad cases
* add evil cases
* qwen3_moe (#3084)
* [Feature] support seed parameter (#3161)
* support seed
* fix
* add SamplingMetadata seed test
* The next_tokens values are inconsistent!
* add air and rejection seed test
* fix
* add SamplingParams seed test
* fix seed=0
* Default to defualt
* fix
* fix args_utils
* fix review
* fix review
* fix
* fix
* add xpu,gcu,iluvatar support seed
* fix
* [Fix Bug] Fix fa3 centralized-mode support bug (#3235)
* fix fa3 centralized-mode bug
* add qknorm parameter
* fix qk norm
* fix
* update
* fix linear for prequant loader
* fix
* fix
* rm some print
* fix
* fix moe init weight&scale
* fix moe init weight&scale
---------
Co-authored-by: bukejiyu <395822456@qq.com>
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
Co-authored-by: Zero Rains <linjunlu@zerorains.top>
Co-authored-by: xjkmfa <108254620+xjkmfa@users.noreply.github.com>
Co-authored-by: xujing43 <xujing43@baidu.com>
Co-authored-by: Divano <dddivano@outlook.com>
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
Co-authored-by: lizexu123 <39205361+lizexu123@users.noreply.github.com>
Co-authored-by: yangjianfengo1 <125249383+yangjianfengo1@users.noreply.github.com>
Co-authored-by: qingqing01 <dangqingqing@baidu.com> | 2025-08-08 17:30:37 +08:00 |  | 
			
				
					| 
							
							
								 Yuan Xiaolan | af543b7f0f | revise get_moe_scores (#3164) | 2025-08-05 16:43:07 +08:00 |  | 
			
				
					| 
							
							
								 RichardWooSJTU | f5c64a074c | [EP] Refactor DeepEP Engine Organization for Mixed Mode & Buffer Management Optimization  (#3182) * Add support for mixed-ep across multi nodes
* code refine
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com> | 2025-08-05 15:40:11 +08:00 |  | 
			
				
					| 
							
							
								 Longzhi Wang | 907d561523 | fix ep when paddle version mismatch (#3056) | 2025-07-29 15:06:49 +08:00 |  | 
			
				
					| 
							
							
								 Longzhi Wang | 0700c90caa | [Feat] support mixed ep (#2969) * Support mixed ep
* fix comment
* fix comment
* update mixep
* fix conflict
* fix typo
* update
* fix typo
* fix code style
* fix conflict | 2025-07-25 15:29:30 +08:00 |  | 
			
				
					| 
							
							
								 xiaoxiaohehe001 | 2970b00dfa | [Feature] Support_eplb (#2997) * [Feature] support_eplb
* [Feature] support_eplb
* [Fix] fix mm ep | 2025-07-24 20:22:45 +08:00 |  | 
			
				
					| 
							
							
								 Zero Rains | 0fb37ab7e4 | update flake8 version to support pre-commit in python3.12 (#3000) * update flake8 version to support pre-commit in python3.12
* polish code | 2025-07-24 01:43:31 -07:00 |  | 
			
				
					| 
							
							
								 周周周 | ff4569f135 | remove some code in ep.py (#2947) | 2025-07-21 22:44:57 +08:00 |  | 
			
				
					| 
							
							
								 Zero Rains | 25698d56d1 | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |  | 
			
				
					| 
							
							
								 Jiang-Jia-Jun | 92c2cfa2e7 | Sync v2.0 version of code to github repo | 2025-06-29 23:29:37 +00:00 |  | 
			
				
					| 
							
							
								 jiangjiajun | 684703fd72 | [LLM] First commit the llm deployment code | 2025-06-09 19:20:15 +08:00 |  |