chenjian 
							
						 
					 
					
						
						
							
						
						918ccdb123 
					 
					
						
						
							
							[Feature] Support pd ep deployment with yiyan adapter ( #4029 )  
						
						
						
						
						* [Feature] Support mixed deployment with yiyan adapter in release2.2
* fix metrics
* add unit test
* add unit test
* add unit test
* Support pd ep deployment with yiyan adapter
* Support pd ep deployment with yiyan adapter
* refactor cache messager
* support scheduler v1 in PD
* support pd v1 + chunk prefill
* support pd v1 + chunk prefill
* add eplb
* support eplb
* support eplb
* support eplb
* support v1
* fix
* fix
* fix bug
* remove eplb support
* support prefix cache in P
* fix bug
* fix bug
* support one stop in V1
* fix bug
* fix ci
* fix ci
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
						
						
					 
					
						2025-09-22 16:41:38 +08:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						c86945ef49 
					 
					
						
						
							
							[Feature] support pool ( #3827 )  
						
						
						
						
						* support pool
* update pooling
* add pooler_config and check
* update
* support AutoWeightsLoader load weight
* fix
* update
* delete print
* update pre-commit
* fix
* fix xpu
* fix ModelRegistry->model_registry
* fix Copilot review
* fix pooler.py
* delete StepPooler
* fix abstract
* fix default_loader_v1
* fix Pre Commit
* support torch qwen3 dense
* add test and fix torch-qwen
* fix
* fix
* adapter ci:
* fix review
* fix pooling_params.py
* fix
* fix tasks.py 2025
* fix print and logger
* Modify ModelRegistry and delete AutoWeightsLoader
* fix logger
* fix test_embedding
* fix ci bug
* ernie4_5 model_registry
* fix test
* support Qwen3-Embedding-0.6B tp=1 load
* fix extra code
* fix
* delete fix vocab_size
* delete prepare_params_dict
* fix: 
						
						
					 
					
						2025-09-22 14:09:09 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						da74a5f0b3 
					 
					
						
						
							
							fix glm all_reduce tp group ( #4187 )  
						
						
						
						
					 
					
						2025-09-22 10:56:55 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						24180fba0a 
					 
					
						
						
							
							[FDConfig]Remove splitwise_role and engine_worker_queue_port in FDConfig ( #4147 )  
						
				
			 
		
		
	 
 
	 
						
						* remove splitwise_role and engine_worker_queue_port
* fix xpu
* fix xpu
* fix xpu
* fix unittest
* resolve conflict
						
						
					 
					
						2025-09-19 17:01:52 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						c3b8ebeb18 
					 
					
						
						
							
							[Optimize] Machete using group scale default ( #4121 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-18 13:51:11 +08:00 
						 
				 
			
				
					
						
							
							
								chenjian 
							
						 
					 
					
						
						
							
						
						618ccdbfba 
					 
					
						
						
							
							[Feature] Support mixed deployment with yiyan adapter in develop ( #3976 )  
						
				
			 
		
		
	 
 
	 
						
						* [Feature] Support mixed deployment with yiyan adapter in release2.2
* fix metrics
* add unit test
* add unit test
* add unit test
* fix ci
* fix for eb5
* fix ci
* fix ci
* fix ci
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
						
						
					 
					
						2025-09-18 01:52:20 +08:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						896e3bb606 
					 
					
						
						
							
							[NewFeature] add ep rollout model init and update/clear ep buffer ( #4039 )
						
						
						
						
						* fix gid
* merge
* fix test
* fix bug
* fix
* fix ci 
						
						
					 
					
						2025-09-17 20:24:53 +08:00 
						 
				 
			
				
					
						
							
							
								RichardWooSJTU 
							
						 
					 
					
						
						
							
						
						2adca04f1f 
					 
					
						
						
							
							Reconstruct streaming data transfer with zmq ( #3836 )  
						
				
			 
		
		
	 
 
	 
						
						* reconstruct USE_GET_SAVE_OUTPUT_V1
* fix ut
* use dp rank
* fix ci 
						
						
					 
					
						2025-09-17 14:30:39 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						2e9e53ff7e 
					 
					
						
						
							
							[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config ( #4116 )  
						
						
						
						
						* remove max_num_batched_tokens in parallel config
* remove max_num_seqs
* update test case
* fix test
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
						
						
					 
					
						2025-09-17 10:43:35 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						03b3d6175d 
					 
					
						
						
							
							fix mtp ( #4105 )  
						
						
						
						
					 
					
						2025-09-15 20:26:07 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						113e330030 
					 
					
						
						
							
							fix bf16 and add comments ( #4106 )  
						
						
						
						
					 
					
						2025-09-15 17:23:07 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						b1b33211e8 
					 
					
						
						
							
							[CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 ( #4062 )  
						
						
						
						
						* refine cudagraph
* refine cudagraph
* typo
* fix
* fix plugins
* fix
* update
* update
* update 
						
						
					 
					
						2025-09-15 16:21:30 +08:00 
						 
				 
			
				
					
						
							
							
								zhupengyang 
							
						 
					 
					
						
						
							
						
						9409665713 
					 
					
						
						
							
							[xpu] support ep ( #4067 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-15 13:53:11 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						29ed617f0f 
					 
					
						
						
							
							[v1 loader]qwen Offline fp8 ( #4036 )  
						
						
						
						
						* support offline fp8
* update ut
* update ut
* update ut
* fix
* update
* update 
						
						
					 
					
						2025-09-15 13:44:11 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						b1a5b756a3 
					 
					
						
						
							
							[Optimize] Support WINT8 and group scale for Machete ( #3905 )  
						
						
						
						
					 
					
						2025-09-15 12:01:34 +08:00 
						 
				 
			
				
					
						
							
							
								Ayakouji 
							
						 
					 
					
						
						
							
						
						987609c894 
					 
					
						
						
							
							[BugFix] Fix image_feature 0-Size causing insert failed ( #4042 )  
						
						
						
						
						* update
* fix image_feature 
						
						
					 
					
						2025-09-12 19:13:08 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						88ea565aba 
					 
					
						
						
							
							[BugFix]Fix load kv cache quant scale ( #4077 )  
						
						
						
						
						* fix kv cache
* fix kv_cache
* fix kv cache 
						
						
					 
					
						2025-09-12 17:44:03 +08:00 
						 
				 
			
				
					
						
							
							
								SuperNova 
							
						 
					 
					
						
						
							
						
						805f29a06c 
					 
					
						
						
							
							[Feature] refactor metax_gpu attention and moe and remove some useless code ( #3688 )  
						
						
						
						
						Co-authored-by: yongqiangma <xing.wo@163.com>
						
						
					 
					
						2025-09-12 14:40:25 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						8466219ec8 
					 
					
						
						
							
							fix typos ( #3840 )  
						
						
						
						
						* fix typos
* ci
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
						
						
					 
					
						2025-09-12 11:04:38 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						4859f40b20 
					 
					
						
						
							
							[Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) ( #4051 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-11 20:08:09 +08:00 
						 
				 
			
				
					
						
							
							
								xiaoxiaohehe001 
							
						 
					 
					
						
						
							
						
						abdcef30aa 
					 
					
						
						
							
							[BugFix] mm_post_fix ( #4005 )  
						
						
						
						
						* mm_post_fix
* mm_post_fix_1 
						
						
					 
					
						2025-09-11 19:09:46 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						2650f58740 
					 
					
						
						
							
							[docs] Update environment variables documentation ( #3957 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-10 21:17:06 -07:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						a7392a0ff9 
					 
					
						
						
							
							【Inference Optimize】DeepSeek-V3-model MLA Optimize ( #3886 )  
						
						
						
						
						* support MLA chunk_size auto search & cuda_graph 
						
						
					 
					
						2025-09-11 10:46:09 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						637d96c6ae 
					 
					
						
						
							
							[Feature] Support zai-org/GLM-4.5-Air BF16 model ( #3928 )  
						
				
			 
		
		
	 
 
	 
						
						* support glm45_air 
						
						
					 
					
						2025-09-10 19:36:10 +08:00 
						 
				 
			
				
					
						
							
							
								RAM 
							
						 
					 
					
						
						
							
						
						d3e4ae3d49 
					 
					
						
						
							
							[Executor] Adjust signal sending order in RL training ( #3773 )  
						
						
						
						
						* Adjust processing order
* fix bug
* fix update_parameters bug
* refine code 
						
						
					 
					
						2025-09-10 13:24:20 +08:00 
						 
				 
			
				
					
						
							
							
								Ayakouji 
							
						 
					 
					
						
						
							
						
						453487d5b0 
					 
					
						
						
							
							[Feat] ernie4_5_vl_moe support CudaGraph ( #3226 )  
						
						
						
						
						* delete dynamic control flow for decode
* code style
* fix scatter/gather typos and use input stream instead default stream
* support 0-Size Tensor
* update runner and model
* using static mem address as input
* fix mem leak
* refine code
* update mm_buffer
* fix typo
* fix buffersize
* fix unk token
* refine code
* refine
* support other arch
* open cudagraph in vlci
* fix
* update
* update
* update
* fix cmd
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
						
						
					 
					
						2025-09-10 13:11:57 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						c3b2a60fb8 
					 
					
						
						
							
							[BugFix] Fix the abnormal memory usage caused by shape errors in the triton moe backend ( #4026 )  
						
						
						
						
						* fix device_id to in
* fix triton_moe bug 
						
						
					 
					
						2025-09-09 20:05:54 -07:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						3b1da6e4dd 
					 
					
						
						
							
							support v1 loader for machete ( #3999 )  
						
						
						
						
					 
					
						2025-09-10 10:21:33 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						b3fac5bde1 
					 
					
						
						
							
							[V1 Loader] Ernie kv cache quant support v1 loader ( #3899 )  
						
				
			 
		
		
	 
 
	 
						
						* support c8 for ernie
* add unittest
* support vl
* fix c8 
						
						
					 
					
						2025-09-09 05:25:08 -07:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						c60adf4281 
					 
					
						
						
							
							Revert "【FIX】Change the name of sparse attn from moba to plas ( #3845 )" ( #4001 )  
						
				
			 
		
		
	 
 
	 
						
						This reverts commit e31c8f7336 
						
						
					 
					
						2025-09-09 11:08:23 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						e31c8f7336 
					 
					
						
						
							
							【FIX】Change the name of sparse attn from moba to plas ( #3845 )  
						
						
						
						
						* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci 
						
						
					 
					
						2025-09-09 10:56:50 +08:00 
						 
				 
			
				
					
						
							
							
								Jundong Liu 
							
						 
					 
					
						
						
							
						
						3d0aaa5923 
					 
					
						
						
							
							[Executor] Experimental Feature: Support Prefill in cudagraph ( #3459 )
						
						
						
						
						* Support prefill in Cudagraph
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.1
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.3
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.4
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.5
* Solve problem about encoder_num_blocks_x_cpu
* Add early-exit mechanism for attention kernel
* fix test case about append-attention
* Update testcode, Add annotations to related tensors
* move get_input_length_list
* solve test_code
* Add annotations about early-exit for attention kernel
* Add annotations about early-exit for attention kernel2
* solve comment
* solve mtp
---------
Co-authored-by: RAM <gstian5555@outlook.com>
						
						
					 
					
						2025-09-08 13:12:24 +08:00 
						 
				 
			
				
					
						
							
							
								lzy 
							
						 
					 
					
						
						
							
						
						af49b81ffd 
					 
					
						
						
							
							supports dynamic Cfp8 ( #3767 )  
						
				
			 
		
		
	 
 
	 
						
						* supports dynamic Cfp8
* add unittest 
						
						
					 
					
						2025-09-07 20:41:29 -07:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						e52ce1c4b1 
					 
					
						
						
							
							cache feature ( #3857 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-07 18:52:46 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						0d989829bb 
					 
					
						
						
							
							Compatible with EB 0.3B torch model arch ( #3913 )  
						
						... 
						
						
						
						* fix
* check 
						
						
					 
					
						2025-09-05 19:04:59 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						2cf55168ca 
					 
					
						
						
							
							load hadamard_block_size from config ( #3797 )  
						
						
						
						
					 
					
						2025-09-05 17:07:58 +08:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						41aee08982 
					 
					
						
						
							
							【Inference Optimize】Update MergedReplicatedLinear for DSK qkv_a_proj_with_mqa. ( #3673 )  
						
						... 
						
						
						
						* support MergedReplicatedLinear
* update MergedReplicatedLinear to support DSK_wint4 V1_load
* update model name
* update linear class
* fix
* fix v0 moe_bias load
---------
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
						
						
					 
					
						2025-09-04 21:16:05 -07:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						ab1929f5ff 
					 
					
						
						
							
							fix mem boom in ep ( #3854 )  
						
						
						
						
					 
					
						2025-09-05 11:48:21 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						88d44a2c93 
					 
					
						
						
							
							support mtp in v1_scheduler mode ( #3695 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-04 17:39:59 +08:00 
						 
				 
			
				
					
						
							
							
								Ayakouji 
							
						 
					 
					
						
						
							
						
						31313e0f3d 
					 
					
						
						
							
							[Feature] ernie4_5_vl_moe support huggingface safetensor loading ( #3750 )  
						
						... 
						
						
						
						* update
* update
* update in tp
* add todo
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com>
						
						
					 
					
						2025-09-03 02:58:59 -07:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						0a1ce612c2 
					 
					
						
						
							
							V1 loader support ep ( #3801 )  
						
						
						
						
					 
					
						2025-09-03 16:05:41 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						ce998449e0 
					 
					
						
						
							
							fix w8a8.py ( #3733 )  
						
						
						
						
					 
					
						2025-09-03 10:57:26 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						5441538173 
					 
					
						
						
							
							rename fused_get_rope.cu ( #3752 )  
						
						... 
						
						
						
						* rename fused_get_rope.cu
* fix
* fix typos
* fix
* fix 
						
						
					 
					
						2025-09-03 10:54:34 +08:00 
						 
				 
			
				
					
						
							
							
								Longzhi Wang 
							
						 
					 
					
						
						
							
						
						e0c9a6c76c 
					 
					
						
						
							
							[Feat] Support streaming transfer data using ZMQ ( #3521 )  
						
						... 
						
						
						
						* Support streaming transfer data of ZMQ
* fix typo
* fix typo
* support tp
* add unittest
* update
* update
* fix typo
* fix typo
* fix tp_num in ci machine
---------
Co-authored-by: Wanglongzhi2001 <> 
						
						
					 
					
						2025-09-02 19:52:19 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						8e1b35a09b 
					 
					
						
						
							
							[Fix bug] Fix the w4afp8 nblock at 256, and add a mask parameter to the fa3 append attn ( #3771 )  
						
						... 
						
						
						
						* fix w4afp8
* add centralized configuration
* codestyle
* fix fa3 append attn 
						
						
					 
					
						2025-09-02 19:17:01 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						b6a4115369 
					 
					
						
						
							
							[v1loader]Reduce EB300B model loading time ( #3700 )  
						
						... 
						
						
						
						* speed up eb45
* update 
						
						
					 
					
						2025-09-02 19:13:57 +08:00 
						 
				 
			
				
					
						
							
							
								RAM 
							
						 
					 
					
						
						
							
						
						205b706ef8 
					 
					
						
						
							
							[Executor] Fix bug of import paddle with RLHF ( #3781 )  
						
						
						
						
					 
					
						2025-09-02 17:32:13 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						306c024ff3 
					 
					
						
						
							
							[BugFix] fix error of import paddle.base.core.Config ( #3761 )  
						
						... 
						
						
						
						* defer import of Config
* support chunked_prefill
* support chunked_prefill 
						
						
					 
					
						2025-09-02 17:23:27 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						905d89e42f 
					 
					
						
						
							
							[Feature] support model weight update in ep ( #3765 )  
						
						... 
						
						
						
						* support model weight update in ep
* support model weight update in ep
* support model weight update in ep
* support model weight update in ep
* Update fused_moe_backend_base.py
* Update worker_process.py
* Update worker_process.py
* Update dynamic_weight_manager.py 
						
						
					 
					
						2025-09-02 17:16:03 +08:00 
						 
				 
			
				
					
						
							
							
								kevin 
							
						 
					 
					
						
						
							
						
						1908465542 
					 
					
						
						
							
							[Feature] mm and thinking model support structured output ( #2749 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* mm support structured output
* update code
* update code
* update format
* update code
* update code
* add enable_thinking default
* update code
* add structured_outputs test case
* add ci install xgrammar
* add ci timeout time
* update test for structured_outputs
* update code
* add error traceback info
* update error msg
* update structured output code
* update code
* update code
* update config
* update torch version
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
						
						
					 
					
						2025-09-02 16:21:09 +08:00