| 
							
							
								 chen | f0f00a6025 | [OPs] Universal optimization and Fix early_stop cuda 700 (#3375) * delete nonzero
* delete setup_ops_base.py
* check if
* check gcp infer_seed.cpu()
* fix repetition_early_stopper_kernel cuda 700 | 2025-08-14 22:40:44 +08:00 |  | 
			
				
					| 
							
							
								 yzwu | fbdd6b0663 | [Iluvatar GPU] Optimze attention and moe performance (#3234) | 2025-08-08 10:51:24 +08:00 |  | 
			
				
					| 
							
							
								 Sunny-bot1 | 7c5e34e72d | [FIX]fix rejection sampling when topp=0 using _SAMPLING_EPS (#2967) * fix rejection sampling when topp=0
* fix | 2025-07-22 05:53:37 -07:00 |  | 
			
				
					| 
							
							
								 lizexu123 | 67990e0572 | [Feature] support min_p_sampling (#2872) * Fastdeploy support min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py | 2025-07-20 23:17:59 -07:00 |  | 
			
				
					| 
							
							
								 Sunny-bot1 | e45050cae3 | [Feature] support top_k_top_p sampling (#2753) * support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix | 2025-07-09 20:58:58 -07:00 |  | 
			
				
					| 
							
							
								 liddk1121 | 1b54a2831e | Adapt for iluvatar gpu (#2684) | 2025-07-07 16:53:14 +08:00 |  | 
			
				
					| 
							
							
								 Jiang-Jia-Jun | 92c2cfa2e7 | Sync v2.0 version of code to github repo | 2025-06-29 23:29:37 +00:00 |  |