| 
							
							
								 xiaozude | 7c919070f7 | [Metax] support cutlass moe & optimize flash attention (#4208) | 2025-09-29 11:22:43 +08:00 |  | 
			
				
					| 
							
							
								 chen | 7c1fd19f0f | [OPs] MoE support wfp8afp8(channelwise) and improve per_token_quant_fp8 (#4238) | 2025-09-24 16:39:51 +08:00 |  | 
			
				
					| 
							
							
								 yzwu | 504461b6b5 | [Iluvatar GPU] Optimize attention performance and fix moe load ckpt error (#3651) | 2025-09-22 21:13:59 +08:00 |  | 
			
				
					| 
							
							
								 AIbin | a7392a0ff9 | 【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886) * support MLA chunk_size auto search & cuda_graph | 2025-09-11 10:46:09 +08:00 |  | 
			
				
					| 
							
							
								 Yuan Xiaolan | 9205c88da1 | support w4afp8 EP inference (#3044) | 2025-08-25 11:27:45 +08:00 |  | 
			
				
					| 
							
							
								 Kane2011 | b4fef2cf29 | [MetaxGPU] Support FastDeploy on metax gpu  (#3241) * [MetaxGPU] Support FastDeploy on metax gpu
* Update metax_worker.py
1. change worker log;
2. remove custom allreduce, adapt it later;
3. remove cuda graph;
* Update __init__.py
1. remove metax's key work comment
* Update __init__.py
1. remove metax's key word comment;
2. add fused_moe_kernel_paddle import
---------
Co-authored-by: yongqiangma <xing.wo@163.com> | 2025-08-13 11:11:54 +08:00 |  | 
			
				
					| 
							
							
								 lifulll | 1f28bdf994 | dcu adapter ernie45t (#2756) Co-authored-by: lifu <lifu@sugon.com>
Co-authored-by: yongqiangma <xing.wo@163.com> | 2025-07-09 18:56:27 +08:00 |  | 
			
				
					| 
							
							
								 liddk1121 | 1b54a2831e | Adapt for iluvatar gpu (#2684) | 2025-07-07 16:53:14 +08:00 |  | 
			
				
					| 
							
							
								 Jiang-Jia-Jun | 05c670e593 | [Sync] Update to latest code (#2679) * [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com> | 2025-07-03 15:43:53 +08:00 |  | 
			
				
					| 
							
							
								 Jiang-Jia-Jun | 92c2cfa2e7 | Sync v2.0 version of code to github repo | 2025-06-29 23:29:37 +00:00 |  | 
			
				
					| 
							
							
								 jiangjiajun | 684703fd72 | [LLM] First commit the llm deployment code | 2025-06-09 19:20:15 +08:00 |  |