Neil Zhu
|
4403a21d4b
|
[Metax] refactor cutlass moe and optimize flash attention (#5361)
* [Metax] refactor moe and flash attention backend
---------
Co-authored-by: zhangchenyi_dl <16219492+zhangchenyidl@user.noreply.gitee.com>
|
2025-12-10 17:15:17 +08:00 |
|
xiaozude
|
c06a6234b9
|
[Metax] optimize mla attention (#5258)
|
2025-12-09 11:18:19 +08:00 |
|
RAM
|
b2908b8e82
|
[New][RL] Support Rollout Routing Replay (#5405)
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add routing replay ci
* support glm topk
* support orther top_k
* fix ci bug
* pre-commit
* only support chatcmpl
* Revert "Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)"
This reverts commit c45e064f3d.
* Fix XPU and NPU bug
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
|
2025-12-05 22:06:26 +08:00 |
|
Jiang-Jia-Jun
|
c45e064f3d
|
Revert "[RL] Support Rollout Routing Replay (#5321)" (#5402)
This reverts commit 96d2d4877b.
|
2025-12-05 20:19:39 +08:00 |
|
RAM
|
96d2d4877b
|
[RL] Support Rollout Routing Replay (#5321)
* [RL] Support Rollout Routing Replay
* add routing indices cache
* fix config bug and moe forward bug
* R3 Support GLM
* support eb4.5
* fix merge bug
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add routing replay ci
* support glm topk
* support orther top_k
* fix ci bug
* pre-commit
* only support chatcmpl
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
|
2025-12-05 20:01:33 +08:00 |
|
zccjjj
|
5b900667e3
|
[XPU] support ep4tp1+v1 loader (#5398)
|
2025-12-05 18:51:15 +08:00 |
|
zccjjj
|
e927c65742
|
[XPU] [Optimization] [EP] EP communication optimization. (#5145)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
|
2025-12-05 10:03:45 +08:00 |
|
fmiao2372
|
209006e6a6
|
[Intel HPU] fix memory fragmentation issue due to warmup process and fix moe all_reduce issue (#5357)
|
2025-12-04 11:29:41 +08:00 |
|
K11OntheBoat
|
2e1680838f
|
[PD Disaggregation] Support PD deployment of DeepSeekv3. (#5251)
* Support deepseekv3 cache transfer for PD deploy
* clean some log info
---------
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>
|
2025-12-02 14:11:50 +08:00 |
|
fmiao2372
|
2c7683d551
|
[Intel HPU] change MoE weights and scales from list to tensor and add… (#5289)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* [Intel HPU] change MoE weights and scales from list to tensor and add q/k rms norm
* update doc
* move HPU_CHUNK_SIZE into envs
|
2025-11-28 19:17:05 +08:00 |
|
Yuanle Liu
|
cb56d46694
|
[Optimization] Refine row parallel bias and nranks and moe all_reduce (#5247)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* rename nranks to tp_size and fix bias in v1 loader
* fix
* update
|
2025-11-26 05:09:09 -08:00 |
|
xiaozude
|
d5bd64336a
|
[Metax] support ENABLE_V1_KVCACHE_SCHEDULER (#5163)
|
2025-11-24 19:19:49 +08:00 |
|
周周周
|
6fa34102e8
|
[Others]get_block_shape_and_split_kv_block clean code (#5123)
|
2025-11-20 16:40:04 +08:00 |
|
Neil Zhu
|
0edda75a56
|
[Metax] optimize cutlass moe and flash attention backend (#5128)
|
2025-11-20 16:12:35 +08:00 |
|
MingkunZhang
|
a36c958c66
|
[Metax] support default_v1 loader based #4988 (#5001)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
|
2025-11-18 09:44:30 +08:00 |
|
fmiao2372
|
74f33efdbf
|
[Intel HPU] fix bugs caused by other commits (#5074)
* [Intel HPU] fix bugs caused by other commits
* update code by copilot
|
2025-11-17 15:28:55 +08:00 |
|
xiaozude
|
68f638f8b9
|
[Metax] support default_v1 loader and quant_config is None for triton moe (#5030)
|
2025-11-17 10:38:00 +08:00 |
|
fmiao2372
|
e43a5fc055
|
[Intel HPU] enable level 1 prefix caching and fix some bugs (#4971)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* [Intel HPU] enable prefix caching and dense tp moe ep and fix some bugs
* update code by copilot
* remove dense tp and moe ep code
|
2025-11-14 19:42:50 +08:00 |
|
xiaozude
|
c45b3ccb52
|
[Metax] optimize flash mla (#4915)
|
2025-11-12 16:43:46 +08:00 |
|
MingkunZhang
|
9d9f5df8d0
|
[Metax] support default_v1 loader & thinking model (#4956)
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
|
2025-11-12 16:32:26 +08:00 |
|
bukejiyu
|
b09ebb2813
|
refactor pt loading (#4532)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
|
2025-11-11 21:30:39 +08:00 |
|
Neil Zhu
|
6de1ce3b25
|
[Metax] support ERNIE-4.5-VL-28B (#4820)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
|
2025-11-07 04:55:49 -08:00 |
|
yinwei
|
ea1dd0e735
|
[XPU]Support V1 loader in weight_only Model (#4808)
* support v1 loader in wint8
* code style
* update
---------
Co-authored-by: root <root@gajl-bbc-onlinec-com-1498356.gajl.baidu.com>
|
2025-11-05 17:09:11 +08:00 |
|
zhupengyang
|
2fd254e5b7
|
support ep+tp at op layer (#4688)
|
2025-11-05 11:15:57 +08:00 |
|
xiaozude
|
74722308f2
|
[Metax] adapt cutlass moe and fix mla attention (#4602)
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
|
2025-11-05 10:03:49 +08:00 |
|
yinwei
|
377f3bf5f2
|
[XPU] add v1 support for bf16 (#4744)
* support v1 loader
* update code style
* update code
|
2025-11-03 14:13:17 +08:00 |
|
Lucas
|
5c6105f4a2
|
[XPU] bind some OPs for VL model with pybind (#4522)
|
2025-10-27 10:50:08 +08:00 |
|
yyssys
|
822dea8d5f
|
[XPU]Moe uses a new operator (#4585)
* [XPU]Moe uses a new operator
* [XPU]Moe uses a new operator
* update response
|
2025-10-24 23:01:46 +08:00 |
|
xiaozude
|
f7069b8057
|
[Metax] adapt DeepSeek (#4498)
|
2025-10-24 10:14:53 +08:00 |
|
zhupengyang
|
3a43dbf82d
|
[XPU] merge apply_tp, ops support token_num = 0 (#4507)
|
2025-10-23 19:09:58 +08:00 |
|
yinwei
|
bf03b6fcea
|
fix vl bug (#4485)
|
2025-10-20 20:13:34 +08:00 |
|
yyssys
|
97ee3c403a
|
[XPU]Fix w4a8 garbled code issue (#4493)
|
2025-10-20 19:41:11 +08:00 |
|
SuperNova
|
80a16c4c87
|
[fix] adjust mctlass moe api (#4474)
|
2025-10-20 14:23:54 +08:00 |
|
yinwei
|
a64c0408b9
|
[XPU]Fix w4a8 precision bug && rollback moe algo (#4463)
* fix w4a8 precision bug
* add env
* code stype check
|
2025-10-17 18:27:53 +08:00 |
|
chen
|
b134e6afe6
|
[BugFix]Dev fix custom ar unstable result (#4437)
|
2025-10-17 11:47:16 +08:00 |
|
YuanRisheng
|
0355235fb9
|
[FDConfig]Remove total_block_num/dtype/block_size/enc_dec_block_num in ParallelConfig (#4400)
* delete some attr in parallel config
* delete comment
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-16 20:00:37 +08:00 |
|
zhupengyang
|
26ff2f8683
|
[XPU] refine fused moe (#4219)
|
2025-10-16 19:04:07 +08:00 |
|
Lucas
|
a5063b96c8
|
[XPU] moe support VL 0-dim input (#4408)
|
2025-10-16 14:01:01 +08:00 |
|
zhupengyang
|
d6f775e33b
|
[XPU] fix ep (#4393)
|
2025-10-15 11:41:05 +08:00 |
|
YuanRisheng
|
a2ec2c4152
|
[FDConfig]Remove max_model_len in FDConfig (#4350)
* modify max_model_len
* fix unittest
* fix unittest
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-11 14:04:17 +08:00 |
|
yinwei
|
20c7b741f4
|
[XPU] Support W4A8C8-TP4-300B Model (#4068)
* support w4a8
* delete ep block attn
* delete moe_topk_select
* update note
* update
* delte useless info
* update
* add some note
* fix some format
* update scale info
* add ans baseline
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
|
2025-10-10 15:41:32 +08:00 |
|
xiaozude
|
7c919070f7
|
[Metax] support cutlass moe & optimize flash attention (#4208)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
|
2025-09-29 11:22:43 +08:00 |
|
Lucas
|
87179cb744
|
[XPU] support XPU VL model inference (#4030)
* [XPU] support XPU VL model inference
* fix image op import and device check
* rebase develop
* fix perf
|
2025-09-25 14:34:15 +08:00 |
|
fmiao2372
|
f1b5392e20
|
[Intel HPU] Support intel hpu platform (#4161)
* [Intel HPU] Support intel hpu platform
* fix some issues
* apply precommit and move AttentionBackend_HPU
* fix format issue
* correct ops import
* fix ci issue
* update code in layers
* fix code style issue
* remove dense tp moe ep mode
* fix enc_dec_block_num
* fix rebase issue
* rename hpu to gaudi in readme
* rename ForwardMeta_HPU to HPUForwardMeta
|
2025-09-24 12:27:50 +08:00 |
|
YuanRisheng
|
2e9e53ff7e
|
[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)
* remove max_num_batched_tokens in parallel config
* remove max_num_seqs
* update test case
* fix test
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-09-17 10:43:35 +08:00 |
|
zhupengyang
|
9409665713
|
[xpu] support ep (#4067)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
|
2025-09-15 13:53:11 +08:00 |
|
SuperNova
|
805f29a06c
|
[Feature] refactor metax_gpu attention and moe and remove some useless code (#3688)
Co-authored-by: yongqiangma <xing.wo@163.com>
|
2025-09-12 14:40:25 +08:00 |
|
co63oc
|
d6369b4d51
|
fix typos (#3684)
|
2025-09-01 17:50:17 +08:00 |
|
lifulll
|
72094d4d82
|
enable dcu ci (#3402)
|
2025-08-29 10:23:08 +08:00 |
|
xiaoxiaohehe001
|
9afa236e39
|
[NewFeatures] support eplb (#3547)
* [NewFeatures] support eplb
* fix eplb
|
2025-08-26 16:19:30 +08:00 |
|