chen
|
fbb4e0f8d1
|
[CP]Glm45 air 2.2 (#4073)
* [Feature] Support zai-org/GLM-4.5-Air BF16 model (#3928)
* support glm45_air
* [Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) (#4051)
* check
* fix v1 load for mix and wint8
* check --quantizations 'None'
* check
* support RL rollout
* check v1 loader
* check glm rollout_model, change wfp8afp8 per_token_cast_to_fp8 to native impl
* check rollout moe gate begin layer_id
* check rollout e_score_correction_bias
* delete infer_to_train_mapping={}
* code check
|
2025-09-15 18:52:58 +08:00 |
|
zhouchong
|
ccd52b5596
|
[Model]support qwen2_5_vl (#3557)
* adapt qwen_2_5_vl model
* adapt qwen_2_5_vl VIT model
* adapt qwen2_5_vl images_embeds
* adapt qwen2_5_vl 3D rope
* adapt qwen2_5_vl 3D rope v2
* adapt qwen2_5_vl processor
* adapt qwen2_5_vl bypass resampler_model
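* adapt qwen2_5_vl bypass part of the ernie logic
* adapt qwen2_5_vl bypass part of the ernie logic v2
* adapt qwen2_5_vl weight loading and naming changes
* adapt qwen2_5_vl make think_end_id optional
* adapt qwen2_5_vl distinguish extract_vision_features across multiple models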
* fix:adapt qwen2_5_vl model
* adapt qwen2_5_vl norm
* adapt qwen2_5_vl processor update
* adapt qwen2_5_vl image and video success
* adapt qwen2_5_vl partial code cleanup
* adapt qwen2_5_vl support multi-GPU
* adapt qwen2_5_vl on latest develop
* adapt qwen2_5_vl RL
* adapt qwen2_5_vl code cleanup
* support noex rope3d
* adapt qwen2_5_vl add init.py
* adapt qwen2_5_vl add init.py v2
* adapt qwen2_5_vl remove space
* adapt qwen2_5_vl remove space v2
* adapt qwen2_5_vl pre-commit
* adapt qwen2_5_vl update
* adapt qwen2_5_vl pre-commit v2
* adapt qwen2_5_vl modify comments
* adapt qwen2_5_vl fix indentation
* adapt qwen2_5_vl fix indentation v2
---------
Co-authored-by: wangyafeng <wangyafeng@baidu.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: CSWYF3634076 <58356743+CSWYF3634076@users.noreply.github.com>
|
2025-08-29 18:28:39 +08:00 |
|
Kane2011
|
b4fef2cf29
|
[MetaxGPU] Support FastDeploy on metax gpu (#3241)
* [MetaxGPU] Support FastDeploy on metax gpu
* Update metax_worker.py
1. change worker log;
2. remove custom allreduce, adapt it later;
3. remove cuda graph;
* Update __init__.py
1. remove metax's key word comment
* Update __init__.py
1. remove metax's key word comment;
2. add fused_moe_kernel_paddle import
---------
Co-authored-by: yongqiangma <xing.wo@163.com>
|
2025-08-13 11:11:54 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Yuanle Liu
|
63d6e7ce06
|
fix and refine vl (#2866)
* refine vl config
* delete attn_sep
* fix vl accuracy
|
2025-07-16 05:59:28 -07:00 |
|
EnflameGCU
|
d0f4d6ba3a
|
[GCU] Support gcu platform (#2702)
baseline: e7fa57ebae
Co-authored-by: yongqiangma <xing.wo@163.com>
|
2025-07-08 13:00:52 +08:00 |
|
liddk1121
|
1b54a2831e
|
Adapt for iluvatar gpu (#2684)
|
2025-07-07 16:53:14 +08:00 |
|
Jiang-Jia-Jun
|
05c670e593
|
[Sync] Update to latest code (#2679)
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
|
2025-07-03 15:43:53 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|