fmiao2372
e43a5fc055
[Intel HPU] enable level 1 prefix caching and fix some bugs ( #4971 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* [Intel HPU] enable prefix caching and dense tp moe ep and fix some bugs
* update code by copilot
* remove dense tp and moe ep code
2025-11-14 19:42:50 +08:00
Daci
5fc12eddfe
[Optimization] xgrammar async compile, multi thread, speed up ( #4835 )
...
* xgrammar async compile, multi thread, speed up
* fix test_sampler.py & pre-commit err
* add redis version check && fix request.llm_engine_recv_req_timestamp
* xgrammar prefill & decode & v0
* fix test_gpu_prompt_logprobs.py
* add test_guided_decoding.py
* Update fastdeploy/scheduler/splitwise_scheduler.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix torch xgrammar unittest env
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-14 18:05:26 +08:00
周周周
51b1f13547
[Executor]move batch_id_per_token ( #4853 )
2025-11-14 15:38:48 +08:00
yangjianfengo1
ae7bee8122
【New Feature】W4afp8 supports per group quantization ( #4987 )
...
* w4afp8 支持per group
* code style
* fix transpose
* revert fast hardmard
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:17:27 +08:00
bukejiyu
4a0d881e15
update ( #4985 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-11-13 15:58:01 +08:00
Yuanle Liu
2272160faf
fix mtp tsp ( #4990 )
2025-11-12 22:05:19 +08:00
ming1753
3148dbca06
[BugFix] fix VL fp8 bug when moe token_num is 0 ( #4928 )
...
* [BugFix] fix VL fp8 bug when moe token_num is 0
* fix bug
* format
* fix bug
2025-11-12 21:19:36 +08:00
bukejiyu
f0189292df
[CI] fix test_model_cache ( #4982 )
...
* ci
* update
2025-11-12 20:26:49 +08:00
xiaozude
c45b3ccb52
[Metax] optimize flash mla ( #4915 )
2025-11-12 16:43:46 +08:00
MingkunZhang
9d9f5df8d0
[Metax] support default_v1 loader & thinking model ( #4956 )
...
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-12 16:32:26 +08:00
BossPi
bde6e2f931
[BugFix] Avoid loading training file ( #4966 )
...
* bug fix
don't put scheduler.pdparams into model weights
* run pre-commit
2025-11-12 15:49:14 +08:00
bukejiyu
6e2e2fcd29
xpu ( #4969 )
2025-11-12 15:12:59 +08:00
ltd0924
5bf48de999
[KVCache] support unified cache backend ( #4903 )
...
* [Feature] support unified cache backend
* fix
* fix
* fix
* fix
* Update metax_model_runner.py
* fix
* update
* Update test_moba_attention_backend.py
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com >
2025-11-12 14:54:52 +08:00
yzwu
76e60e98f8
[Iluvatar][CI] fix safetensors_rust.SafetensorError: framework paddle is invalid ( #4972 )
2025-11-12 14:13:40 +08:00
bukejiyu
b09ebb2813
refactor pt loading ( #4532 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-11-11 21:30:39 +08:00
周周周
da6b4c10e5
[ATTENTION] make buffer alloc as a function ( #4945 )
2025-11-11 19:17:08 +08:00
SunLei
3098aee05f
[Perf] Support tensor transmission between work and engine with zero-copy to improve efficiency ( #4839 )
...
* feat(zmq): support tensor transmission with zero-copy for improved efficiency
* perf: zmq.send disable copy
* zmq recv data for debug
* convert logprobs tensor to cpu
2025-11-11 15:43:11 +08:00
yzwu
3707af7a4f
[Iluvatar] add vl into ci and support v1 loader ( #4774 )
2025-11-11 10:50:17 +08:00
Ryan
07a82afcae
add tie_word_embeddings for lmhead ( #4916 )
2025-11-11 10:46:35 +08:00
Yuanle Liu
3dc0ffa46d
[TSP] Support qwen3 moe tsp + cudagraph ( #4871 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support qwen3_moe tsp mode
* fix
* fix
* update
* update
* update
* fix
* support external_rmsnorm
* update
* fix
2025-11-10 23:37:51 +08:00
Sunny-bot1
59d2edde29
[BugFix] Add support for weight shape constraints and group size selection in Machete ( #4911 )
2025-11-10 20:57:35 +08:00
周周周
54536267db
[DeepEP] support P async_finish ( #4899 )
2025-11-10 18:24:02 +08:00
chenjian
78895e2c7d
[Bug Fix] fix bug for PD EP ( #4823 )
...
* fix bug for PD EP
* fix
* optimize perf for engine worker queue
* fix bug
* fix internode ll two stage
* fix for ci
* fix bug
2025-11-10 15:33:29 +08:00
Neil Zhu
6de1ce3b25
[Metax] support ERNIE-4.5-VL-28B ( #4820 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-11-07 04:55:49 -08:00
ming1753
cba185f1fe
[Feature] Optim PaddleOCR-VL ( #4873 )
...
* [Feature] Optim PaddleOCR-VL
* fix bug
2025-11-07 14:56:44 +08:00
Haonan Luo
79e6bf4bdc
[Others] Delete PaddleOCR Useless Function ( #4815 )
...
* fix paddleocr prefix cache bug
* add test for paddleocr_vl
* disable prefix-caching in ocr
* add test for paddleocr_vl
* Fix top_p for rejection sampling
* delete useless func for paddleocr
---------
Co-authored-by: ming1753 <ideaminghp@163.com >
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com >
2025-11-07 11:14:41 +08:00
YuBaoku
fa28745f19
[CI] Update ERNIE-4.5-VL baseline to adapt to MoE changes ( #4867 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-11-06 22:02:10 +08:00
Jiang-Jia-Jun
6b68c58e8d
Revert "[Bug Fix] fix ernie4_5_vl_moe ( #4843 )" ( #4863 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
This reverts commit 6460d4df27 .
2025-11-06 19:18:29 +08:00
LokeZhou
6460d4df27
[Bug Fix] fix ernie4_5_vl_moe ( #4843 )
...
* fix ernie4_5_vl_moe
* fix vl moe meta
2025-11-06 19:16:33 +08:00
YuBaoku
819b2dbbae
Revert "【New Feature】W4afp8 supports per group quantization ( #4272 )" ( #4854 )
...
This reverts commit 93fcf7e4ec .
2025-11-06 17:48:28 +08:00
zhupengyang
b54eb7ad81
[XPU] ep+tp all2all ( #4836 )
2025-11-06 17:26:14 +08:00
Echo-Nie
c18b177f21
fix the get_act_fn,_load_st_projector ( #4824 )
2025-11-06 16:13:35 +08:00
Ayakouji
831266da7a
[Fix] fix ernie4_5_vl model torch format loadding ( #4447 )
...
* fix
* add test
* fix test
* fix test
* update
2025-11-06 14:34:21 +08:00
周周周
69fa741763
remove seq_lens_this_time ( #4821 )
2025-11-06 11:06:28 +08:00
K11OntheBoat
62dfad4a5f
[PD Disaggregation] Support Qwen3-MoE use PD + EP inference. ( #4691 )
...
support Qwen-MoE PD/EP
2025-11-06 10:32:15 +08:00
yangjianfengo1
93fcf7e4ec
【New Feature】W4afp8 supports per group quantization ( #4272 )
...
* w4afp8 支持per group
* code style
* 精度完成
* revert append attn utils
* ffn1 动态量化
* ffn2 支持动态量化
* code style
* code style
* 修改单测
* 修改单测
* fix bug
* Implement conditional parameter creation for layers
Add parameter creation for up_gate_proj_in_scale when ep_size > 1.
* code style
* fix conflict
* code style
* code style
* 修复w4aint8 精度
* fix ci
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
2025-11-05 21:00:23 +08:00
yinwei
ea1dd0e735
[XPU]Support V1 loader in weight_only Model ( #4808 )
...
* support v1 loader in wint8
* code style
* update
---------
Co-authored-by: root <root@gajl-bbc-onlinec-com-1498356.gajl.baidu.com >
2025-11-05 17:09:11 +08:00
周周周
876e4a8935
remove input_ids from ForwardMeta ( #4793 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-11-05 11:55:51 +08:00
kxz2002
9676cc87d6
fix parser register name ( #4795 )
...
Co-authored-by: luukunn <83932082+luukunn@users.noreply.github.com >
2025-11-05 11:27:30 +08:00
zhupengyang
2fd254e5b7
support ep+tp at op layer ( #4688 )
2025-11-05 11:15:57 +08:00
周周周
937eb3c6ed
[get_padding_offset.] clean get_padding_offset.cu ( #4777 )
...
[get_padding_offset.] clean get_padding_offset.cu (#4777 )
2025-11-05 10:47:40 +08:00
chen
1c3ca48128
[Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs ( #4769 )
2025-11-05 10:43:25 +08:00
xiaozude
74722308f2
[Metax] adapt cutlass moe and fix mla attention ( #4602 )
...
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-11-05 10:03:49 +08:00
lzy
3e9dda39ab
supports pd partn ( #4615 )
...
* supports pd partn
* fix codestype
2025-11-04 16:36:35 +08:00
lzy
af7e0f27f3
supports internode_ll_two_stage ( #4162 )
...
* supports internode_ll_two_stage
* supports internode_ll_two_stage
* supports internode_ll_two_stage
* supports internode_ll_two_stage
* supports D internode_ll_two_stage
* fix codestype
* fix xpu internode_ll_two_stage
* fix xpu internode_ll_two_stage
2025-11-04 16:35:40 +08:00
freeliuzc
855a2a609a
fix attn_params ( #4787 )
2025-11-04 13:01:38 +08:00
Yuan Xiaolan
8690cf8569
fix Cfp8 for RL load ( #4144 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-11-03 17:51:51 +08:00
Neil Zhu
c95d0740ec
[Metax] adapt cutlass moe for ernie-vl ( #4685 )
2025-11-03 17:44:27 +08:00
yinwei
377f3bf5f2
[XPU] add v1 support for bf16 ( #4744 )
...
* support v1 loader
* update code style
* update code
2025-11-03 14:13:17 +08:00
freeliuzc
11398790d3
[Speculative Decoding][MTP]Support attn mask offset ( #4641 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [MTP]Merge support attn (#4591 )
* support mask_offset in speculate decoding
* fix dummpy run output
* add unit test
* fix unit test import
* support attn_mask_offset in mtp mode
* add update_attn_mask op
* fix unit test && fix code-style
2025-11-03 10:08:01 +08:00