yangjianfengo1 
							
						 
					 
					
						
						
							
						
						4325b737e7 
					 
					
						
						
							
							【FIX】Change the name of sparse attn from moba to plas ( #4006 ) ( #4076 )  
						
						... 
						
						
						
						* 【FIX】Change the name of sparse attn from moba to plas (#4006 )
* update docs
* 【docs】 update readme (#4000 )
* update docs
* update readme
* update docs
* 【FIX】Change the name of sparse attn from moba to plas (#3845 )
* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci
* code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* fix max_num_seqs
* fix test load attn
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com > 
						
						
					 
					
						2025-09-23 10:26:40 +08:00 
						 
				 
			
				
					
						
							
							
								yzwu 
							
						 
					 
					
						
						
							
						
						504461b6b5 
					 
					
						
						
							
							[Iluvatar GPU] Optimize attention performance and fix moe load ckpt error ( #3651 )  
						
						
						
						
					 
					
						2025-09-22 21:13:59 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						c4830ef24c 
					 
					
						
						
							
							fix typos ( #4176 )  
						
						... 
						
						
						
						* fix typos
* fix 
						
						
					 
					
						2025-09-22 14:27:17 +08:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						c86945ef49 
					 
					
						
						
							
							[Feature] support pool ( #3827 )  
						
						... 
						
						
						
						* support pool
* update pooling
* add pooler_config and check
* update
* support AutoWeightsLoader load weight
* fix
* update
* delete print
* update pre-commit
* fix
* fix xpu
* fix ModelRegistry->model_registry
* fix Copilot review
* fix pooler.py
* delete StepPooler
* fix abstract
* fix default_loader_v1
* fix Pre Commit
* support torch qwen3 dense
* add test and fix torch-qwen
* fix
* fix
* adapter ci:
* fix review
* fix pooling_params.py
* fix
* fix tasks.py 2025
* fix print and logger
* Modify ModelRegistry and delete AutoWeightsLoader
* fix logger
* fix test_embedding
* fix ci bug
* ernie4_5 model_registry
* fix test
* support Qwen3-Embedding-0.6B tp=1 load
* fix extra code
* fix
* delete fix vocab_size
* delete prepare_params_dict
* fix: 
						
						
					 
					
						2025-09-22 14:09:09 +08:00 
						 
				 
			
				
					
						
							
							
								Lucas 
							
						 
					 
					
						
						
							
						
						5c33be5a7d 
					 
					
						
						
							
							[TEST] init first commit ( #4192 )  
						
						
						
						
					 
					
						2025-09-22 10:51:27 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						17a27170bc 
					 
					
						
						
							
							fix typos ( #4093 )  
						
						
						
						
					 
					
						2025-09-15 18:33:30 +08:00 
						 
				 
			
				
					
						
							
							
								freeliuzc 
							
						 
					 
					
						
						
							
						
						46911f903d 
					 
					
						
						
							
							[MTP]update hybrid-mtp-with-ngram ( #4047 )  
						
						
						
						
					 
					
						2025-09-15 17:13:31 +08:00 
						 
				 
			
				
					
						
							
							
								qwes5s5 
							
						 
					 
					
						
						
							
						
						553adb299e 
					 
					
						
						
							
							【FastDeploy CLI】collect-env subcommand ( #4044 )  
						
						... 
						
						
						
						* collect-env subcommand
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com > 
						
						
					 
					
						2025-09-15 10:31:23 +08:00 
						 
				 
			
				
					
						
							
							
								qwes5s5 
							
						 
					 
					
						
						
							
						
						58e0785bab 
					 
					
						
						
							
							[metrics] update metrics markdown file ( #4061 )  
						
						... 
						
						
						
						* adjust md
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com > 
						
						
					 
					
						2025-09-12 11:13:43 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						2650f58740 
					 
					
						
						
							
							[docs] Update environment variables documentation ( #3957 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-10 21:17:06 -07:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						a7392a0ff9 
					 
					
						
						
							
							【Inference Optimize】DeepSeek-V3-model MLA Optimize ( #3886 )  
						
						... 
						
						
						
						* support MLA chunk_size auto search & cuda_graph 
						
						
					 
					
						2025-09-11 10:46:09 +08:00 
						 
				 
			
				
					
						
							
							
								zhupengyang 
							
						 
					 
					
						
						
							
						
						9d0074a91a 
					 
					
						
						
							
							[xpu] add ep custom ops ( #3911 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-10 12:22:50 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						c60adf4281 
					 
					
						
						
							
							Revert "【FIX】Change the name of sparse attn from moba to plas ( #3845 )" ( #4001 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						This reverts commit e31c8f7336 
						
						
					 
					
						2025-09-09 11:08:23 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						e31c8f7336 
					 
					
						
						
							
							【FIX】Change the name of sparse attn from moba to plas ( #3845 )  
						
						... 
						
						
						
						* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci 
						
						
					 
					
						2025-09-09 10:56:50 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						de34222842 
					 
					
						
						
							
							Update docs ( #3998 )  
						
						
						
						
					 
					
						2025-09-09 10:44:15 +08:00 
						 
				 
			
				
					
						
							
							
								JYChen 
							
						 
					 
					
						
						
							
						
						8e8a5913da 
					 
					
						
						
							
							add a3b-thinking doc ( #3994 )  
						
						
						
						
					 
					
						2025-09-09 10:27:01 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						1dc1397ef6 
					 
					
						
						
							
							Update docs for thinking model support  
						
						
						
						
					 
					
						2025-09-09 10:08:05 +08:00 
						 
				 
			
				
					
						
							
							
								ming1753 
							
						 
					 
					
						
						
							
						
						12326b60e1 
					 
					
						
						
							
							[Docs] update VL best_practices for release/2.2 ( #3965 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* [Docs] update VL best_practices for release/2.2
* fix bug
* modify 
						
						
					 
					
						2025-09-08 22:07:37 +08:00 
						 
				 
			
				
					
						
							
							
								bukejiyu 
							
						 
					 
					
						
						
							
						
						08b3153661 
					 
					
						
						
							
							update doc ( #3990 )  
						
						... 
						
						
						
						Co-authored-by: root <root@tjdm-inf-sci-k8s-hzz2-h12ni8-0214.tjdm.baidu.com > 
						
						
					 
					
						2025-09-08 21:04:26 +08:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						d00faeec69 
					 
					
						
						
							
							update dsk doc ( #3989 )  
						
						
						
						
					 
					
						2025-09-08 20:42:48 +08:00 
						 
				 
			
				
					
						
							
							
								yinwei 
							
						 
					 
					
						
						
							
						
						7e0bfd024f 
					 
					
						
						
							
							update release note ( #3986 )  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-08 19:03:14 +08:00 
						 
				 
			
				
					
						
							
							
								JYChen 
							
						 
					 
					
						
						
							
						
						1f056a7469 
					 
					
						
						
							
							[docs] update best practice docs ( #3969 )  
						
						... 
						
						
						
						* update best practice docs
* add version and v1 loader info 
						
						
					 
					
						2025-09-08 17:39:38 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						9ead10e1bc 
					 
					
						
						
							
							Update docs ( #3975 )  
						
						
						
						
					 
					
						2025-09-08 16:53:37 +08:00 
						 
				 
			
				
					
						
							
							
								xiaolei373 
							
						 
					 
					
						
						
							
						
						571ddc677b 
					 
					
						
						
							
							Modify markdown ( #3896 )  
						
						... 
						
						
						
						* feat(log):add_request_and_response_log
* modify markdown graceful shutdown 
						
						
					 
					
						2025-09-08 16:42:34 +08:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						316ac546d3 
					 
					
						
						
							
							update_wint2_doc ( #3968 )  
						
						
						
						
					 
					
						2025-09-08 15:53:09 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						ed5133f704 
					 
					
						
						
							
							update env docs for Machete ( #3959 )  
						
						
						
						
					 
					
						2025-09-08 14:44:31 +08:00 
						 
				 
			
				
					
						
							
							
								qwes5s5 
							
						 
					 
					
						
						
							
						
						17169a14f2 
					 
					
						
						
							
							[metrics] Add several observability metrics ( #3868 )  
						
						... 
						
						
						
						* Add several observability metrics
* [wenxin-tools-584] 【Observability】Support viewing this node's concurrency, remaining block_size, number of queued requests, and related information
* adjust some metrics and md files
* trigger ci
* adjust ci file
* trigger ci
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com > 
						
						
					 
					
						2025-09-08 14:13:13 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						472402bf4e 
					 
					
						
						
							
							Update sparse attn documentation ( #3954 )  
						
						... 
						
						
						
						* update docs
* update docs
* update docs
* update docs 
						
						
					 
					
						2025-09-08 12:23:18 +08:00 
						 
				 
			
				
					
						
							
							
								ltd0924 
							
						 
					 
					
						
						
							
						
						7643e6e6b2 
					 
					
						
						
							
							[Docs] add data parallel ( #3883 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* [Docs] add data parallel
* [Docs] add data parallel 
						
						
					 
					
						2025-09-04 20:33:50 +08:00 
						 
				 
			
				
					
						
							
							
								xiaolei373 
							
						 
					 
					
						
						
							
						
						ed97cf8396 
					 
					
						
						
							
							Graceful shut down ( #3785 )  
						
						... 
						
						
						
						* feat(log):add_request_and_response_log
* graceful shutdown: add an exit-duration parameter to the interface 
						
						
					 
					
						2025-09-04 19:33:50 +08:00 
						 
				 
			
				
					
						
							
							
								AIbin 
							
						 
					 
					
						
						
							
						
						54b458fd98 
					 
					
						
						
							
							[Doc] update wint2 doc ( #3819 )  
						
						... 
						
						
						
						* update_wint2_doc 
						
						
					 
					
						2025-09-03 11:27:43 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						18e5d355a1 
					 
					
						
						
							
							Update version in docs  
						
						
						
						
					 
					
						2025-09-02 19:21:10 +08:00 
						 
				 
			
				
					
						
							
							
								kevin 
							
						 
					 
					
						
						
							
						
						1908465542 
					 
					
						
						
							
							[Feature] mm and thinking model support structured output ( #2749 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* mm support structured output
* update code
* update code
* update format
* update code
* update code
* add enable_thinking default
* update code
* add structured_outputs test case
* add ci install xgrammar
* add ci timeout time
* update test for structured_outputs
* update code
* add error traceback info
* update error msg
* update structured output code
* update code
* update code
* update config
* update torch version
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com > 
						
						
					 
					
						2025-09-02 16:21:09 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						27f2e7a6f1 
					 
					
						
						
							
							Create faq.md  
						
						
						
						
					 
					
						2025-09-02 11:07:37 +08:00 
						 
				 
			
				
					
						
							
							
								lizexu123 
							
						 
					 
					
						
						
							
						
						6dd61a1bab 
					 
					
						
						
							
							fix Document ( #3782 )  
						
						... 
						
						
						
						Co-authored-by: example_name <example_email> 
						
						
					 
					
						2025-09-01 20:22:43 +08:00 
						 
				 
			
				
					
						
							
							
								co63oc 
							
						 
					 
					
						
						
							
						
						d6369b4d51 
					 
					
						
						
							
							fix typos ( #3684 )  
						
						
						
						
					 
					
						2025-09-01 17:50:17 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						0513a78ecc 
					 
					
						
						
							
							Update docs for reasoning-parser  
						
						
						
						
					 
					
						2025-09-01 17:42:58 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						2bd7d90929 
					 
					
						
						
							
							Remove useless parameters  
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						
					 
					
						2025-09-01 14:43:56 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						3754a9906d 
					 
					
						
						
							
							[Feature] block sparse attention ( #3668 )  
						
						... 
						
						
						
						* support sparse attn
* fix bug
* code style
* fix moba attn get kv shape
* fix a100 compilation
* codestyle
* code style
* code style
* code style
* fix conflict
* add unit tests
* code style
* increase eblite load time
* fix bug
* for ci
* for ci
* for ci
* for ci
* support mlp block size 128
* add unit tests for small operators
* fix mlp unit test
* add environment variables to config
* fix rollout config
* fix GPU memory
* add test server
* add test server
* fix mlp: use full attn for the last layer 
						
						
					 
					
						2025-08-29 19:46:30 +08:00 
						 
				 
			
				
					
						
							
							
								Yuan Xiaolan 
							
						 
					 
					
						
						
							
						
						c71ee0831c 
					 
					
						
						
							
							add w4afp8 offline script ( #3636 )  
						
						
						
						
					 
					
						2025-08-29 17:56:05 +08:00 
						 
				 
			
				
					
						
							
							
								周周周 
							
						 
					 
					
						
						
							
						
						17b414c2df 
					 
					
						
						
							
							MoE Default use triton's blockwise fp8 in TP Case ( #3678 )  
						
						
						
						
					 
					
						2025-08-29 11:07:30 +08:00 
						 
				 
			
				
					
						
							
							
								Mattheliu 
							
						 
					 
					
						
						
							
						
						108d989d9d 
					 
					
						
						
							
							[Docs] add fastdeploy_unit_test_guide.md ( #3484 )  
						
						... 
						
						
						
						* docs:add fastdeploy_unit_test_guide.md
* docs:fix fastdeploy_unit_test_guide.md
* docs: add FastDeploy unit test spec (EN) and update usage nav
* fix codestyle 
						
						
					 
					
						2025-08-28 14:12:25 +08:00 
						 
				 
			
				
					
						
							
							
								Jiang-Jia-Jun 
							
						 
					 
					
						
						
							
						
						c694fa2879 
					 
					
						
						
							
							Revert "[Feature] block sparse attention ( #3209 )" ( #3647 )  
						
						... 
						
						
						
						This reverts commit 646a0c2fd8 
						
						
					 
					
						2025-08-27 17:35:04 +08:00 
						 
				 
			
				
					
						
							
							
								JYChen 
							
						 
					 
					
						
						
							
						
						e645db348b 
					 
					
						
						
							
							[docs] Update best practice doc ( #3539 )  
						
						... 
						
						
						
						* fix some docs error
* [docs] x1 best-practice
* update docs
* fix docs 
						
						
					 
					
						2025-08-27 15:45:30 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						ce9c0917c5 
					 
					
						
						
							
							[Precision] Support lm_head layer running in float32 ( #3597 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* support lm_head fp32 bf16 fp16
* support lm_head fp32 bf16 fp16
* add doc and check code
* lm_head_fp32 specify lm_head as fp32
* code check
* check doc 
						
						
					 
					
						2025-08-27 11:34:53 +08:00 
						 
				 
			
				
					
						
							
							
								yangjianfengo1 
							
						 
					 
					
						
						
							
						
						646a0c2fd8 
					 
					
						
						
							
							[Feature] block sparse attention ( #3209 )  
						
						... 
						
						
	
		
			
	 
	
	
		
	
	
		
			
				
				
			 
		
		
	 
 
	 
						
						* support sparse attn
* fix bug
* code style
* fix moba attn get kv shape
* fix a100 compilation
* codestyle
* code style
* code style
* code style
* fix conflict
* add unit tests
* code style
* increase eblite load time
* fix bug
* for ci
* for ci
* for ci
* for ci
* support mlp block size 128
* add unit tests for small operators
* fix mlp unit test
* add environment variables to config
* fix rollout config 
						
						
					 
					
						2025-08-26 07:16:04 -07:00 
						 
				 
			
				
					
						
							
							
								Yuanle Liu 
							
						 
					 
					
						
						
							
						
						cbce94a00e 
					 
					
						
						
							
							rename ernie_xxx to ernie4_5_xxx ( #3621 )  
						
						... 
						
						
						
						* rename ernie_xxx to ernie4_5_xxx
* ci fix 
						
						
					 
					
						2025-08-26 19:29:27 +08:00 
						 
				 
			
				
					
						
							
							
								Sunny-bot1 
							
						 
					 
					
						
						
							
						
						c68c3c4b8b 
					 
					
						
						
							
							[Feature] bad words support v1 scheduler and specify token ids ( #3608 )  
						
						... 
						
						
						
						* support bad_words_token_ids
* docs
* fix test
* fix
* bad words support kvcache v1 and token ids
* fix 
						
						
					 
					
						2025-08-25 20:14:51 -07:00 
						 
				 
			
				
					
						
							
							
								Kane2011 
							
						 
					 
					
						
						
							
						
						2ae7ab28d2 
					 
					
						
						
							
							[MetaxGPU] adapt to the latest fastdeploy on metax gpu ( #3492 )  
						
						
						
						
					 
					
						2025-08-25 17:44:20 +08:00 
						 
				 
			
				
					
						
							
							
								chen 
							
						 
					 
					
						
						
							
						
						9cab3f47ff 
					 
					
						
						
							
							[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing ( #3552 )  
						
						... 
						
						
						
						* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing
* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs
* delete some code
* code check
* code check and add doc
* fix tokenizer.decoder(-1), return 'Invalid Token'
* add ci for temp_scaled and top_p logprobs
* check test
* check seq len time shape
* logprob clip inf
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com > 
						
						
					 
					
						2025-08-25 14:11:49 +08:00