李泳桦 
							
						 
					 
					
						
						
							
						
						2a8a2c06de 
					 
					
						
						
							
							[fix] non-streaming api now returns full output ids if return_token_ids is enabled ( #2951 )  
						
						
						
						
					 
					
						2025-07-22 14:35:56 +08:00 
						 
				 
			
				
					
						
							
							
								lifulll 
							
						 
					 
					
						
						
							
						
						2c6a9e887e 
					 
					
						
						
							
							native top_p_sampling ( #2901 )  
						
						
						
						
					 
					
						2025-07-22 14:09:59 +08:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						0eedbdaee0 
					 
					
						
						
							
							fix import error ( #2944 )  
						
						
						
						
					 
					
						2025-07-22 14:06:01 +08:00 
						 
				 
			
				
					
						
							
							
								K11OntheBoat 
							
						 
					 
					
						
						
							
						
						8020927f50 
					 
					
						
						
							
							[BugFix] Rename attention params of deepseekv3 ( #2939 )  
						
						
						
						
						Co-authored-by: K11OntheBoat <ruianmaidanglao@163.com>
						
						
					 
					
						2025-07-22 14:01:30 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						56102e91e1 
					 
					
						
						
							
							[Polish] Return error message of raw_request ( #2946 )  
						
						
						
						
						Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
						
						
					 
					
						2025-07-22 10:21:32 +08:00 
						 
				 
			
				
					
						
							
							
								zhink 
							
						 
					 
					
						
						
							
						
						0262ef7eb3 
					 
					
						
						
							
							custom all reduce support cuda graph ( #2938 )  
						
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* Support enabling cuda graph and custom all reduce at the same time, and fix the overwritten custom all reduce flag
* rename communication_op to communication 
						
						
					 
					
						2025-07-21 22:52:03 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						ff4569f135 
					 
					
						
						
							
							remove some code in ep.py ( #2947 )  
						
						
						
						
					 
					
						2025-07-21 22:44:57 +08:00 
						 
				 
			
				
					
						
							
							
								李泳桦 
							
						 
					 
					
						
						
							
						
						8a619e9db5 
					 
					
						
						
							
							[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body ( #2940 )  
						
						
						
						
						* [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body
* [fix] return_token_ids not working in curl request
* [test] improve some test cases of return_token_ids and prompt_token_ids
* [fix] the server responds ok even if request.messages is an empty list 
						
						
					 
					
						2025-07-21 19:31:14 +08:00 
						 
				 
			
				
					
						
							
							
								littledgg 
							
						 
					 
					
						
						
							
						
						2845bde964 
					 
					
						
						
							
							[Executor] Avoid OOM when starting the service with Chunked Prefill + CudaGraph enabled ( #2936 )  
						
						
						
						
						* [Executor] Avoid OOM when starting the service with Chunked Prefill + CudaGraph enabled
* Fix: Apply black formatting 
						
						
					 
					
						2025-07-21 16:25:51 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						2f74e93d7e 
					 
					
						
						
							
							use dist.all_reduce(min) to sync num_blocks_local ( #2933 )  
						
						
						
						
						* pre-commit all files check
* reduce min num_blocks_local
* fix nranks=1
* pre-commit when commit-msg 
						
						
					 
					
						2025-07-21 01:23:36 -07:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						67990e0572 
					 
					
						
						
							
							[Feature] support min_p_sampling ( #2872 )  
						
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* Fastdeploy support min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py 
						
						
					 
					
						2025-07-20 23:17:59 -07:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						95a214ae43 
					 
					
						
						
							
							support trainer_degree in name_mapping ( #2935 )  
						
						
						
						
					 
					
						2025-07-20 23:12:55 -07:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						bce2c6cd7c 
					 
					
						
						
							
							rename test dir ( #2934 )  
						
						
						
						
					 
					
						2025-07-21 14:05:45 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						cc4cec0a74 
					 
					
						
						
							
							Update engine_client.py ( #2931 )  
						
						
						
						
					 
					
						2025-07-21 11:42:16 +08:00 
						 
				 
			
				
					
						
							
							
								liddk1121 
							
						 
					 
					
						
						
							
						
						17c5d3a241 
					 
					
						
						
							
							[Iluvatar GPU] Add CI scripts ( #2876 )  
						
						
						
						
					 
					
						2025-07-21 09:44:42 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						8c5407d9e4 
					 
					
						
						
							
							remove cum_offsets from ForwardMeta ( #2925 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-07-19 23:57:27 +08:00 
						 
				 
			
				
					
						
							
							
								Zero Rains 
							
						 
					 
					
						
						
							
						
						25698d56d1 
					 
					
						
						
							
							polish code with new pre-commit rule ( #2923 )  
						
						
						
						
					 
					
						2025-07-19 23:19:27 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						b8676d71a8 
					 
					
						
						
							
							update ci cases  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						43976138de 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						e546e6b1b0 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						9c8292fb19 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						a5e95013b5 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						93481a5478 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ZhangYulongg 
							
						 
					 
					
						
						
							
						
						eb77b1be6d 
					 
					
						
						
							
							update ci cases  
						
						
						
						
					 
					
						2025-07-18 21:44:07 +08:00 
						 
				 
			
				
					
						
							
							
								ming1753 
							
						 
					 
					
						
						
							
						
						5328daa333 
					 
					
						
						
							
							[Bug Fix] fix ep config bug ( #2920 )  
						
						
						
						
					 
					
						2025-07-18 19:12:56 +08:00 
						 
				 
			
				
					
						
							
							
								xiaoxiaohehe001 
							
						 
					 
					
						
						
							
						
						a42fc3f40b 
					 
					
						
						
							
							[Feature] Support 45tVL EP FP8 Infer. ( #2909 )  
						
						
						
						
						* support_mm_ep_fp8
* support_mm_ep 
						
						
					 
					
						2025-07-18 17:57:15 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						fbe3547c95 
					 
					
						
						
							
							[Feature] Support include_stop_str_in_output in chat/completion ( #2910 )  
						
						
						
						
						* [Feature] Support include_stop_str_in_output in chat/completion
* Add ci test for include_stop_str_in_output
* Update version of openai
* Fix ci test
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
						
						
					 
					
						2025-07-18 16:59:18 +08:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						6efad14b95 
					 
					
						
						
							
							support vl ori_vacab_size ( #2900 )  
						
						
						
						
					 
					
						2025-07-18 16:26:14 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						d306944f4f 
					 
					
						
						
							
							remove cum_offsets from get_block_shape_and_split_kv_block ( #2913 )  
						
						
						
						
						* remove padding_offsets from get_padding_offset.cu
* remove cum_offsets from get_block_shape_and_split_kv_block 
						
						
					 
					
						2025-07-18 16:13:32 +08:00 
						 
				 
			
				
					
						
							
							
								YUNSHEN XIE 
							
						 
					 
					
						
						
							
						
						e81137e581 
					 
					
						
						
							
							fix ci workflow ( #2896 )  
						
						
						
						
					 
					
						2025-07-18 16:01:00 +08:00 
						 
				 
			
				
					
						
							
							
								RAM 
							
						 
					 
					
						
						
							
						
						cd52dc0f65 
					 
					
						
						
							
							[Executor] Fix set capture sizes bug ( #2902 )  
						
						
						
						
					 
					
						2025-07-18 15:12:19 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						1339e56282 
					 
					
						
						
							
							[XPU] Remove padding_offsets from get_padding_offset.cu ( #2911 )  
						
						
						
						
					 
					
						2025-07-18 14:16:44 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						0eb5dc18d3 
					 
					
						
						
							
							[BugFix]Fix sample rejection ( #2908 )  
						
						
						
						
						* fix config
* fix rejection 
						
						
					 
					
						2025-07-18 13:44:30 +08:00 
						 
				 
			
				
					
						
							
							
								sg263 
							
						 
					 
					
						
						
							
						
						e679567d59 
					 
					
						
						
							
							[Trace] Fix OpenTelemetry not working under uvicorn ( #2906 )  
						
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* add opentelemetry
* add opentelemetry on dequeue
* fix annotation
* fix annotation when add opentelemetry
* fix opentelemetry-instrumentation-fastapi
* fix opentelemetry-bootstrap
* fix opentelemetry can not work in uvicorn
* move conf to env
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
						
						
					 
					
						2025-07-17 23:16:45 +08:00 
						 
				 
			
				
					
						
							
							
								RAM 
							
						 
					 
					
						
						
							
						
						bbe2c5c968 
					 
					
						
						
							
							Update GraphOptimizationBackend docs ( #2898 )  
						
						
						
						
					 
					
						2025-07-17 21:38:18 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						4b14dca1d6 
					 
					
						
						
							
							[LLM] delete fixed slots ( #2893 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-07-17 19:19:54 +08:00 
						 
				 
			
				
					
						
							
							
								yulangz 
							
						 
					 
					
						
						
							
						
						c8c280c4d3 
					 
					
						
						
							
							[XPU][Doc] fix typo ( #2892 )  
						
						
						
						
					 
					
						2025-07-17 19:13:54 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						ddb10ac509 
					 
					
						
						
							
							[Inference, rename] remove padding_offsets from atten use batch_id_per_token ( #2880 )  
						
						
						
						
						* remove padding_offsets from atten 
						
						
					 
					
						2025-07-17 18:41:31 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						d49f8fb30a 
					 
					
						
						
							
							[Feature][MTP] Support cacheKV transfer in per_chunk mode ( #2890 )  
						
						
						
						
						* support chunk_prefill both normal and speculative_decoding(mtp)
* optimize pd-disaggregation config
* fix bug 
						
						
					 
					
						2025-07-17 17:58:08 +08:00 
						 
				 
			
				
					
						
							
							
								ming1753 
							
						 
					 
					
						
						
							
						
						67180c1ff9 
					 
					
						
						
							
							[Bug Fix] fix bug of prompt penalty ( #2888 )  
						
						
						
						
					 
					
						2025-07-17 17:21:37 +08:00 
						 
				 
			
				
					
						
							
							
								Xintong Yu 
							
						 
					 
					
						
						
							
						
						273efba76f 
					 
					
						
						
							
							[Fix] remove misleading variables ( #2841 )  
						
						
						
						
						Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
						
						
					 
					
						2025-07-17 16:49:14 +08:00 
						 
				 
			
				
					
						
							
							
								YUNSHEN XIE 
							
						 
					 
					
						
						
							
						
						1cfba5ba3e 
					 
					
						
						
							
							enable CI workflow for pull requests targeting release/* branches ( #2887 )  
						
						
						
						
					 
					
						2025-07-17 16:48:03 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						31cab9f87b 
					 
					
						
						
							
							Update test_openai.py  
						
						
						
						
					 
					
						2025-07-17 16:07:31 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						d3dfa1446c 
					 
					
						
						
							
							Update test_openai.py  
						
						
						
						
					 
					
						2025-07-17 16:07:07 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						b630031414 
					 
					
						
						
							
							[LLM] fix several bugs ( #2878 )  
						
						
						
						
					 
					
						2025-07-17 14:21:05 +08:00 
						 
				 
			
				
					
						
							
							
								LokeZhou 
							
						 
					 
					
						
						
							
						
						f50c25178b 
					 
					
						
						
							
							[MM_PROCESS] add _extract_labels ( #2879 )  
						
						
						
						
					 
					
						2025-07-17 14:20:01 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						dbb9e2506b 
					 
					
						
						
							
							Fix rollout_model init ( #2881 )  
						
						
						
						
					 
					
						2025-07-16 22:36:21 -07:00 
						 
				 
			
				
					
						
							
							
								ming1753 
							
						 
					 
					
						
						
							
						
						1f15ca21e4 
					 
					
						
						
							
							[Feature] support prompt repetition_penalty ( #2806 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-07-17 12:05:52 +08:00 
						 
				 
			
				
					
						
							
							
								yulangz 
							
						 
					 
					
						
						
							
						
						7dfd2ea052 
					 
					
						
						
							
							[XPU][doc] Update minimum required FastDeploy version ( #2863 )  
						
						
						
						
						* [XPU][doc] update minimum required FastDeploy version 
						
						
					 
					
						2025-07-17 11:33:22 +08:00 
						 
				 
			
				
					
						
							
							
								GoldPancake 
							
						 
					 
					
						
						
							
						
						42d4001400 
					 
					
						
						
							
							[Features] Add speculative metrics ( #2857 )  
						
						
						
						
					 
					
						2025-07-17 11:08:55 +08:00