lizexu123 
							
						 
					 
					
						
						
							
						
						afff4d37ea 
					 
					
						
						
							
							[Feature] support seed parameter ( #3161 )  
						
						... 
						
						
						
						* support seed
* fix
* add SamplingMetadata seed test
* The next_tokens values are inconsistent!
* add air and rejection seed test
* fix
* add SamplingParams seed test
* fix seed=0
* Default to defualt
* fix
* fix args_utils
* fix review
* fix review
* fix
* fix
* add xpu,gcu,iluvatar support seed
* fix 
						
						
					 
					
						2025-08-06 15:20:47 +08:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						b01cfd6007 
					 
					
						
						
							
							[BugFix] support real batch_size ( #3109 )  
						
						... 
						
						
						
						* support real bsz
* fix
* fix xpu_model_runner.py,gpu_model_runner.py,gcu_model_runner.py,iluvatar_model_runner.py
* add event_loop_ep
* fix
* Add comments
* fix
* support mtp real_batch_size
* fix
* self.tmp_seq_lens_this_time->self.seq_lens_this_time_buffer
* fix
* fix VL real_seq_lens_this_time
* fix
* fix mtp
* fix
* fix mtp
* fix xpu
* fix 
						
						
					 
					
						2025-08-05 16:33:54 +08:00 
						 
				 
			
				
					
						
							
							
								Ryan 
							
						 
					 
					
						
						
							
						
						73cfe1fd37 
					 
					
						
						
							
							[SOT]  Extend SOT warmup support to new hardware ( #3032 )  
						
						... 
						
						
						
						* add new hardware
* add_sot_warmup4new_hardware
* fix conflict
* rm Optional 
						
						
					 
					
						2025-07-29 22:45:20 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						3792345c3a 
					 
					
						
						
							
							[LLM] update function name ( #2985 )  
						
						... 
						
						
						
						* [LLM] update function name 
						
						
					 
					
						2025-07-24 15:03:40 +08:00 
						 
				 
			
				
					
						
							
							
								Zero Rains 
							
						 
					 
					
						
						
							
						
						89a485b69f 
					 
					
						
						
							
							[Feature] Support using prefix-caching + cudagraph for inference ( #2924 )  
						
						... 
						
						
						
						* fix the bug in cudagraph+prefix-caching but still have some bug with profile
Change-Id: Ibf2ba3f2e3b08641d03f4b1391d7c862c3efa397
* add the signal to make sure cache manager launched
* fix judge condition
* reomove useless control
* update control stream
* update
* fix xpu
* change the do_profile flag
* update
* add new threads to init cache_manager
---------
Co-authored-by: RAM <gstian5555@outlook.com > 
						
						
					 
					
						2025-07-22 00:59:45 -07:00 
						 
				 
			
				
					
						
							
							
								Zero Rains 
							
						 
					 
					
						
						
							
						
						25698d56d1 
					 
					
						
						
							
							polish code with new pre-commit rule ( #2923 )  
						
						
						
						
					 
					
						2025-07-19 23:19:27 +08:00 
						 
				 
			
				
					
						
							
							
								EnflameGCU 
							
						 
					 
					
						
						
							
						
						d0f4d6ba3a 
					 
					
						
						
							
							[GCU] Support gcu platform ( #2702 )  
						
						... 
						
						
						
						baseline: e7fa57ebaexing.wo@163.com > 
						
						
					 
					
						2025-07-08 13:00:52 +08:00