Lucas 
							
						 
					 
					
						
						
							
						
						87179cb744 
					 
					
						
						
							
							[XPU] support XPU VL model inference ( #4030 )  
						
						... 
						
						
						
						* [XPU] support XPU VL model inference
* fix image op import and device check
* rebase develop
* fix perf 
						
						
					 
					
						2025-09-25 14:34:15 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						7c1fd19f0f 
					 
					
						
						
							
							[OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 ( #4238 )  
						
						
						
						
					 
					
						2025-09-24 16:39:51 +08:00 
						 
				 
			
				
					
						
							
							
								Yohanna 
							
						 
					 
					
						
						
							
						
						44010cee13 
					 
					
						
						
							
							[FIX] Fix CUDA error(700): 'cudaErrorIllegalAddress' in CascadeAppendWriteCacheKVQKV cache_kernel(). Continue when batch_id_per_token[token_idx] is the default value -1. ( #4218 )  
						
						
						
						
					 
					
						2025-09-24 14:08:49 +08:00 
						 
				 
			
				
					
						
							
							
								fmiao2372 
							
						 
					 
					
						
						
							
						
						f1b5392e20 
					 
					
						
						
							
							[Intel HPU] Support intel hpu platform ( #4161 )  
						
						... 
						
						
						
						* [Intel HPU] Support intel hpu platform
* fix some issues
* apply precommit and move AttentionBackend_HPU
* fix format issue
* correct ops import
* fix ci issue
* update code in layers
* fix code style issue
* remove dense tp moe ep mode
* fix enc_dec_block_num
* fix rebase issue
* rename hpu to gaudi in readme
* rename ForwardMeta_HPU to HPUForwardMeta 
						
						
					 
					
						2025-09-24 12:27:50 +08:00 
						 
				 
			
				
					
						
							
							
								yyssys 
							
						 
					 
					
						
						
							
						
						d6e59447f5 
					 
					
						
						
							
							[XPU] Enable XPU V1 mode based on environment variable ( #4213 )  
						
						... 
						
						
						
						* Enable XPU V1 mode based on environment variable
* add default param to xft_moe_fc_block_eb for latest xvllm compatibility; update run_ci_xpu to use latest xvllm 
						
						
					 
					
						2025-09-24 10:29:48 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						1a6283424e 
					 
					
						
						
							
							Fix noaux_tc cuda Error 700 in CUDAGraph ( #4174 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-23 18:41:33 +08:00 
						 
				 
			
				
					
						
							
							
								yzwu 
							
						 
					 
					
						
						
							
						
						504461b6b5 
					 
					
						
						
							
							[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error ( #3651 )  
						
						
						
						
					 
					
						2025-09-22 21:13:59 +08:00 
						 
				 
			
				
					
						
							
							
								chenjian 
							
						 
					 
					
						
						
							
						
						918ccdb123 
					 
					
						
						
							
							[Feature] Support pd ep deployment with yiyan adapter ( #4029 )  
						
						... 
						
						
						
						* [Feature] Support mixed deployment with yiyan adapter in release2.2
* fix metrics
* add unit test
* add unit test
* add unit test
* Support pd ep deployment with yiyan adapter
* Support pd ep deployment with yiyan adapter
* refactor cache messager
* support scheduler v1 in PD
* support pd v1 + chunk prefill
* support pd v1 + chunk prefill
* add eplb
* support eplb
* support eplb
* support eplb
* support v1
* fix
* fix
* fix bug
* remove eplb support
* support prefix cache in P
* fix bug
* fix bug
* support one stop in V1
* fix bug
* fix ci
* fix ci
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> 
						
						
					 
					
						2025-09-22 16:41:38 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						c4830ef24c 
					 
					
						
						
							
							fix typos ( #4176 )  
						
						... 
						
						
						
						* fix typos
* fix 
						
						
					 
					
						2025-09-22 14:27:17 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						66a98b44ed 
					 
					
						
						
							
							ep support logprob ( #4089 ) ( #4151 )  
						
						
						
						
					 
					
						2025-09-19 14:07:31 +08:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						896e3bb606 
					 
					
						
						
							
							[NewFeature] add ep rollout model init and update/clear ep buffer ( #4039 )  
						
						... 
						
						
						
						* fix gid
* merge
* fix test
* fix bug
* fix
* fix ci 
						
						
					 
					
						2025-09-17 20:24:53 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						de8638b1e9 
					 
					
						
						
							
							fix dynamic Cfp8 computing error ( #4119 )  
						
						... 
						
						
						
						Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> 
						
						
					 
					
						2025-09-16 20:21:49 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						442543cd6b 
					 
					
						
						
							
							fix ep wint8 ( #4102 )  
						
						
						
						
					 
					
						2025-09-16 11:05:33 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						17a27170bc 
					 
					
						
						
							
							fix typos ( #4093 )  
						
						
						
						
					 
					
						2025-09-15 18:33:30 +08:00 
						 
				 
			
				
					
						
							
							
								zhupengyang 
							
						 
					 
					
						
						
							
						
						9409665713 
					 
					
						
						
							
							[xpu] support ep ( #4067 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-15 13:53:11 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						b1a5b756a3 
					 
					
						
						
							
							[Optimize] Support WINT8 and group scale for Machete ( #3905 )  
						
						
						
						
					 
					
						2025-09-15 12:01:34 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						8466219ec8 
					 
					
						
						
							
							fix typos ( #3840 )  
						
						... 
						
						
						
						* fix typos
* ci
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> 
						
						
					 
					
						2025-09-12 11:04:38 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						d2d04c2d5e 
					 
					
						
						
							
							[setup optimize]Support git submodule ( #4033 )  
						
						... 
						
						
						
						* support git submodule
* update setup
* fix ci network
* fix clone
* revert clone linux
* delete args
* fix ci
* update 
						
						
					 
					
						2025-09-11 17:41:16 +08:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						a7392a0ff9 
					 
					
						
						
							
							【Inference Optimize】DeepSeek-V3-model MLA Optimize ( #3886 )  
						
						... 
						
						
						
						* support MLA chunk_size auto search & cuda_graph 
						
						
					 
					
						2025-09-11 10:46:09 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						637d96c6ae 
					 
					
						
						
							
							[Feature] Support zai-org/GLM-4.5-Air BF16 model ( #3928 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* support glm45_air 
						
						
					 
					
						2025-09-10 19:36:10 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						7ee100903f 
					 
					
						
						
							
							support rope_3d in spec mode ( #4034 )  
						
						
						
						
					 
					
						2025-09-10 03:15:05 -07:00 
						 
				 
			
				
					
						
							
							
								Ayakouji 
							
						 
					 
					
						
						
							
						
						453487d5b0 
					 
					
						
						
							
							[Feat] ernie4_5_vl_moe support CudaGraph ( #3226 )  
						
						... 
						
						
						
						* delete dynamic control flow for decode
* code-style
* fix scatter/gather typos and use input stream instead default stream
* support 0-Size Tensor
* update runner and model
* using static mem address as input
* fix mem leak
* refine code
* update mm_buffer
* fix typo
* fix buffersize
* fix unk token
* refine code
* refine
* support other arch
* open cudagraph in vlci
* fix
* update
* update
* update
* fix cmd
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com> 
						
						
					 
					
						2025-09-10 13:11:57 +08:00 
						 
				 
			
				
					
						
							
							
								zhupengyang 
							
						 
					 
					
						
						
							
						
						9d0074a91a 
					 
					
						
						
							
							[xpu] add ep custom ops ( #3911 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-10 12:22:50 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						dbab579299 
					 
					
						
						
							
							clean code ( #4020 )  
						
						
						
						
					 
					
						2025-09-10 10:56:15 +08:00 
						 
				 
			
				
					
						
							
							
								lzy 
							
						 
					 
					
						
						
							
						
						f12159b630 
					 
					
						
						
							
							del batch id per token ( #3963 )  
						
						... 
						
						
						
						* Update decoder_write_cache_with_rope_kernel.cu
del batch_id_per_token
* Update decoder_write_cache_with_rope_impl.cuh
* Update test_append_attention.py
* Update test_append_attention.py 
						
						
					 
					
						2025-09-08 21:58:34 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						aadd6a94d8 
					 
					
						
						
							
							fix typos ( #3951 )  
						
						
						
						
					 
					
						2025-09-08 15:22:41 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						2033450391 
					 
					
						
						
							
							rename ep_moe_prefill_func ep_moe_expert_dispatch ( #3938 )  
						
						
						
						
					 
					
						2025-09-08 15:19:28 +08:00 
						 
				 
			
				
					
						
							
							
								Jundong Liu 
							
						 
					 
					
						
						
							
						
						3d0aaa5923 
					 
					
						
						
							
							[Executor] Experiment Feature-Support Prefill in cudagraph ( #3459 )  
						
						... 
						
						
						
						* Support prefill in Cudagraph
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.1
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.3
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.4
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.5
* Solve problem about encoder_num_blocks_x_cpu
* Add early-exit mechanism for attention kernel
* fix test case about append-attention
* Update testcode, Add annotations to related tensors
* move get_input_length_list
* solve test_code
* Add annotations about early-exit for attention kernel
* Add annotations about early-exit for attention kernel2
* solve comment
* solve mtp
---------
Co-authored-by: RAM <gstian5555@outlook.com> 
						
						
					 
					
						2025-09-08 13:12:24 +08:00 
						 
				 
			
				
					
						
							
							
								lzy 
							
						 
					 
					
						
						
							
						
						af49b81ffd 
					 
					
						
						
							
							supports dynamic Cfp8 ( #3767 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* supports dynamic Cfp8
* add unittest 
						
						
					 
					
						2025-09-07 20:41:29 -07:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						30a1c1783f 
					 
					
						
						
							
							rename eagle_get_base_model_hidden_states.cu ( #3753 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-07 10:24:58 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						f6f726c773 
					 
					
						
						
							
							clean code in attention ( #3917 )  
						
						
						
						
					 
					
						2025-09-05 20:49:01 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						2cf55168ca 
					 
					
						
						
							
							load hadamard_block_size from config ( #3797 )  
						
						
						
						
					 
					
						2025-09-05 17:07:58 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						88d44a2c93 
					 
					
						
						
							
							support mtp in v1_scheduler mode ( #3695 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-04 17:39:59 +08:00 
						 
				 
			
				
					
						
							
							
								xiaoxiaohehe001 
							
						 
					 
					
						
						
							
						
						f265a26f8b 
					 
					
						
						
							
							support mtp rope_3d ( #3791 )  
						
						... 
						
						
						
						* support mtp rope_3d
* Update speculate_write_cache_with_rope_kernel.cu 
						
						
					 
					
						2025-09-04 17:18:05 +08:00 
						 
				 
			
				
					
						
							
							
								plusNew001 
							
						 
					 
					
						
						
							
						
						3790505319 
					 
					
						
						
							
							[XPU] Update XPU stable xvllm and xtdk version for 2.2 ( #3853 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* Add debug environment variable exports
Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.
* Lock paddlepaddle-xpu version in CI script
Temporarily lock paddlepaddle-xpu version due to framework update issues.
* Update no_proxy environment variable in CI workflow
* Install lsof tool in run_ci_xpu.sh
* Update dependency versions for stable release
* Update paddlepaddle-xpu installation command 
						
						
					 
					
						2025-09-03 23:21:00 +08:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						4c998c3636 
					 
					
						
						
							
							[Code Simplification] delete cum_offsets_out ( #3815 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* fix
* fix 
						
						
					 
					
						2025-09-03 16:15:33 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						fa58a9fa8f 
					 
					
						
						
							
							qk norm for speculate decode C16 ( #3637 )  
						
						
						
						
					 
					
						2025-09-03 14:53:56 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						5441538173 
					 
					
						
						
							
							rename fused_get_rope.cu ( #3752 )  
						
						... 
						
						
						
						* rename fused_get_rope.cu
* fix
* fix typos
* fix
* fix 
						
						
					 
					
						2025-09-03 10:54:34 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						8e1b35a09b 
					 
					
						
						
							
							[Fix bug] Fix w4afp8 nblock at 256 and add a mask parameter to fa3 append attn ( #3771 )  
						
						... 
						
						
						
						* fix w4afp8
* add centralized configuration
* codestyle
* fix fa3 append attn 
						
						
					 
					
						2025-09-02 19:17:01 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						aa067a3106 
					 
					
						
						
							
							rename speculate_token_penalty_multi_scores.cu ( #3735 )  
						
						
						
						
					 
					
						2025-09-02 18:12:11 +08:00 
						 
				 
			
				
					
						
							
							
								lzy 
							
						 
					 
					
						
						
							
						
						7a521bbf62 
					 
					
						
						
							
							Modify mask_offset's format ( #3525 )  
						
						... 
						
						
						
						* modify mask_offset in decode
* modify mask_offset unittest
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> 
						
						
					 
					
						2025-09-02 03:05:35 -07:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						f296aff6cf 
					 
					
						
						
							
							rename speculate_stop_generation_multi_stop_seqs ( #3743 )  
						
						
						
						
					 
					
						2025-09-02 18:04:29 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						d6369b4d51 
					 
					
						
						
							
							fix typos ( #3684 )  
						
						
						
						
					 
					
						2025-09-01 17:50:17 +08:00 
						 
				 
			
				
					
						
							
							
								lizhenyun01 
							
						 
					 
					
						
						
							
						
						bed09ae8f8 
					 
					
						
						
							
							fix mask_offset in append_attn ( #3745 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* fix mask_offset in append_attn
* fix test 
						
						
					 
					
						2025-08-31 15:03:16 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						fe5d09f9ee 
					 
					
						
						
							
							[FIX]Fix Machete compile via ENABLE_MACHETE ( #3727 )  
						
						... 
						
						
						
						* add ENABLE_MACHETE
* fix
* revert
* update
* pre_commit
* fix
* fix
---------
Co-authored-by: Ayakouji <yuhongh@qq.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: aquagull <hongyuh@qq.com> 
						
						
					 
					
						2025-08-30 17:50:17 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						3754a9906d 
					 
					
						
						
							
							[Feature] block sparse attention ( #3668 )  
						
						... 
						
						
						
						* support sparse attn
* fix bug
* code style
* fix moba attn get kv shape
* fix A100 compilation
* codestyle
* code style
* code style
* code style
* fix conflict
* add unit test
* code style
* increase eblite loading time
* fix bug
* for ci
* for ci
* for ci
* for ci
* support mlp block size 128
* add unit tests for small operators
* fix mlp unit test
* add environment variables to config
* fix rollout config
* fix GPU memory
* add test server
* add test server
* fix mlp: use full attn for the last layer 
						
						
					 
					
						2025-08-29 19:46:30 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						c71ee0831c 
					 
					
						
						
							
							add w4afp8 offline script ( #3636 )  
						
						
						
						
					 
					
						2025-08-29 17:56:05 +08:00 
						 
				 
			
				
					
						
							
							
								Ryan 
							
						 
					 
					
						
						
							
						
						45f81b34f0 
					 
					
						
						
							
							add dtype int32 ( #3692 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-08-29 14:56:35 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						b6edd15d55 
					 
					
						
						
							
							fix scaled_gemm_f8_i4_f16_weight_quantize input ( #3685 )  
						
						
						
						
					 
					
						2025-08-29 11:04:04 +08:00 
						 
				 
			
				
					
						
							
							
								lifulll 
							
						 
					 
					
						
						
							
						
						72094d4d82 
					 
					
						
						
							
							enable dcu ci ( #3402 )  
						
						
						
						
					 
					
						2025-08-29 10:23:08 +08:00