yangjianfengo1 
							
						 
					 
					
						
						
							
						
						8e1b35a09b 
					 
					
						
						
							
							【Fix bug]  w4afp8 的nblock固定为256,并且fa3的append attn 增加mask参数 ( #3771 )  
						
						... 
						
						
						
						* fix w4afp8
* 增加集中式配置
* codestyle
* fix fa3 append attn 
						
						
					 
					
						2025-09-02 19:17:01 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						d6369b4d51 
					 
					
						
						
							
							fix typos ( #3684 )  
						
						
						
						
					 
					
						2025-09-01 17:50:17 +08:00 
						 
				 
			
				
					
						
							
							
								Liumengyuan 
							
						 
					 
					
						
						
							
						
						e93d4cfcdd 
					 
					
						
						
							
							Add with_output version AppendAttention ( #3302 )  
						
						... 
						
						
						
						* get use_output from fd_config
* add clear TODO description
* add mask_offset para to align with develop
* fix bug
* fix use_output logic
* fix sot bug 
						
						
					 
					
						2025-08-28 17:10:18 +08:00 
						 
				 
			
				
					
						
							
							
								xiaoxiaohehe001 
							
						 
					 
					
						
						
							
						
						ad319a87cc 
					 
					
						
						
							
							support fa3 rope3d ( #3622 )  
						
						
						
						
					 
					
						2025-08-27 11:31:29 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						3a15e0c53e 
					 
					
						
						
							
							【Fix Bug】 修复 fa3 支持集中式bug ( #3235 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
	Deploy GitHub Pages / deploy (push) Has been cancelled 
				
			 
		
		
	 
 
	 
						
						* fix fa3 集中式bug
* 增加qknorm参数 
						
						
					 
					
						2025-08-06 16:24:27 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						64d7a3194d 
					 
					
						
						
							
							集中式支持fa3 ( #3112 )  
						
						
						
						
					 
					
						2025-08-01 18:03:36 +08:00 
						 
				 
			
				
					
						
							
							
								RAM 
							
						 
					 
					
						
						
							
						
						d850660872 
					 
					
						
						
							
							[Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel ( #2989 )  
						
						... 
						
						
						
						* reset decoder_block_shape_q buffer
* refactor GetBlockShapeAndSplitKVBlock Kernel and cudagraph padding batch
* update decode_max_tile_size
* fix pre-commit
* update block_multihead_attn_backend
* update flas attn backend
* update MLA Attention
* update XPU Attention
* update gcu,iluvatar model runner
* Update MTP
* fix MTP bug 
						
						
					 
					
						2025-07-31 00:09:31 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						6ccc10ad47 
					 
					
						
						
							
							Unify server-side and model-side Config (Part1) ( #3018 )  
						
						... 
						
						
						
						* move cache config
* fix mtp 
						
						
					 
					
						2025-07-28 10:51:52 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						332154f504 
					 
					
						
						
							
							[feature] Support FA2 ( #3009 )  
						
						
						
						
					 
					
						2025-07-25 14:09:00 +08:00 
						 
				 
			
				
					
						
							
							
								lizhenyun01 
							
						 
					 
					
						
						
							
						
						29c3292f02 
					 
					
						
						
							
							support c4 attn && fix cache  
						
						
						
						
					 
					
						2025-07-24 12:00:52 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						172e69fe17 
					 
					
						
						
							
							FA3 fix bug ( #2987 )  
						
						
						
						
					 
					
						2025-07-23 19:07:43 +08:00 
						 
				 
			
				
					
						
							
							
								lizhenyun01 
							
						 
					 
					
						
						
							
						
						e51f018577 
					 
					
						
						
							
							support chunk_prefill in fa3  
						
						
						
						
					 
					
						2025-07-23 12:19:20 +08:00 
						 
				 
			
				
					
						
							
							
								Nyakku Shigure 
							
						 
					 
					
						
						
							
						
						48e6a0ca26 
					 
					
						
						
							
							[SOT] Mark dynamic dims by type annotations ( #2771 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
	Deploy GitHub Pages / deploy (push) Has been cancelled 
				
			 
		
		
	 
 
	 
						
						* [SOT] Mark dynamic dims by type annotations
* fix conflict of forward_meta
* mark more attn backend
* fix missing annotated and add env SOT_SPECIALIZED_DIM_NUMBERS
* auto infer implicit 0 dim dynamic dim
* revert manual marked dims
* revert missing update
* auto infer can use unsafe code in warmup stage
* check -> type_match
* fix codestyle
* restore blank line
* empty commit
* add need_warmup nonlocal;
* add doc for resolver
* add missing type hints
* unquote "ForwardMeta" 
						
						
					 
					
						2025-07-22 00:23:52 -07:00 
						 
				 
			
				
					
						
							
							
								Zero Rains 
							
						 
					 
					
						
						
							
						
						25698d56d1 
					 
					
						
						
							
							polish code with new pre-commit rule ( #2923 )  
						
						
						
						
					 
					
						2025-07-19 23:19:27 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						d306944f4f 
					 
					
						
						
							
							remove cum_offsets from get_block_shape_and_split_kv_block ( #2913 )  
						
						... 
						
						
						
						* remove padding_offsets from get_padding_offset.cu
* remove padding_offsets from get_padding_offset.cu
* remove padding_offsets from get_padding_offset.cu
* remove cum_offsets from get_block_shape_and_split_kv_block
* remove cum_offsets from get_block_shape_and_split_kv_block 
						
						
					 
					
						2025-07-18 16:13:32 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						d49f8fb30a 
					 
					
						
						
							
							[Feature][MTP] Support cacheKV transfer in per_chunk mode ( #2890 )  
						
						... 
						
						
						
						* support chunk_prefill both normal and speculative_decoding(mtp)
* optimize pd-disaggregation config
* fix bug 
						
						
					 
					
						2025-07-17 17:58:08 +08:00 
						 
				 
			
				
					
						
							
							
								YuanRisheng 
							
						 
					 
					
						
						
							
						
						4c7b8bc458 
					 
					
						
						
							
							Simplify the Config code ( #2770 )  
						
						... 
						
						
						
						* simplify the code
* fix vl
* delete config
* fix
* perfect code
* fix ci
* fix xpu
* fix xpu
* fix server
* resolve conflict
* fix mtp
* resolve conflict
* fix xpu
* fix xpu
* fix vl
* fix log
* fix qwen moe
* fix qwen moe
* fix qwen moe 
						
						
					 
					
						2025-07-14 19:50:05 +08:00 
						 
				 
			
				
					
						
							
							
								littledgg 
							
						 
					 
					
						
						
							
						
						59071268b6 
					 
					
						
						
							
							[Executor] Move forward_meta.py to fastdeploy/model_executor ( #2774 )  
						
						... 
						
						
						
						* Use PEP 563 in attention.py and fix conflict
* merge commit
* Change what was left out last time 
						
						
					 
					
						2025-07-10 20:36:51 +08:00 
						 
				 
			
				
					
						
							
							
								RichardWooSJTU 
							
						 
					 
					
						
						
							
						
						fee544e808 
					 
					
						
						
							
							fix ep prefill ( #2762 )  
						
						
						
						
					 
					
						2025-07-09 14:03:05 +08:00 
						 
				 
			
				
					
						
							
							
								RichardWooSJTU 
							
						 
					 
					
						
						
							
						
						6610aa29d0 
					 
					
						
						
							
							Revert "[Bug fix] fix attention rank init ( #2743 )" ( #2761 )  
						
						... 
						
						
						
						This reverts commit e8bbe7244b 
						
						
					 
					
						2025-07-09 10:38:12 +08:00 
						 
				 
			
				
					
						
							
							
								RichardWooSJTU 
							
						 
					 
					
						
						
							
						
						e8bbe7244b 
					 
					
						
						
							
							[Bug fix] fix attention rank init ( #2743 )  
						
						... 
						
						
						
						* fix attention rank init
* fix attention rank init 
						
						
					 
					
						2025-07-08 17:19:49 +08:00 
						 
				 
			
				
					
						
							
							
								gaoziyuan 
							
						 
					 
					
						
						
							
						
						26d5d737dd 
					 
					
						
						
							
							【Fearture】support qwen2 some func ( #2740 )  
						
						... 
						
						
						
						* add rl qwen model support
* fix
* fix 
						
						
					 
					
						2025-07-08 12:03:04 +08:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						240bdac2a4 
					 
					
						
						
							
							[feat] support fa3 backend for pd disaggregated ( #2695 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
	Deploy GitHub Pages / deploy (push) Has been cancelled 
				
			 
		
		
	 
 
	 
						
						* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn 
						
						
					 
					
						2025-07-03 22:33:27 +08:00