36 Commits

Author SHA1 Message Date
gaoziyuan
6ef3b611b0 add dp config (#3822) 2025-09-04 11:46:48 +08:00
ltd0924
905d89e42f [Feature] support model weight update in ep (#3765)
* support model weight update in ep

* support model weight update in ep

* support model weight update in ep

* support model weight update in ep

* Update fused_moe_backend_base.py

* Update worker_process.py

* Update worker_process.py

* Update dynamic_weight_manager.py
2025-09-02 17:16:03 +08:00
yangjianfengo1
3754a9906d [Feature] block sparse attention (#3668)
* support sparse attn

* fix bug

* code style

* fix moba attn get kv shape

* fix A100 compilation

* codestyle

* code style

* code style

* code style

* fix conflict

* add unit test

* code style

* increase eblite loading time

* fix bug

* for ci

* for ci

* for ci

* for ci

* support mlp block size 128

* add unit tests for small operators

* fix mlp unit test

* add environment variables to config

* fix rollout config

* fix GPU memory

* add test server

* add test server

* fix mlp: use full attn for the last layer
2025-08-29 19:46:30 +08:00
zhouchong
ccd52b5596 [Model]support qwen2_5_vl (#3557)
* adapt qwen_2_5_vl model

* adapt qwen_2_5_vl VIT model

* adapt qwen2_5_vl images_embeds

* adapt qwen2_5_vl 3D rope

* adapt qwen2_5_vl 3D rope v2

* adapt qwen2_5_vl processor

* adapt qwen2_5_vl bypass resampler_model

* adapt qwen2_5_vl: bypass some ernie logic

* adapt qwen2_5_vl: bypass some ernie logic v2

* adapt qwen2_5_vl: weight loading and naming changes

* adapt qwen2_5_vl: make think_end_id optional

* adapt qwen2_5_vl: distinguish extract_vision_features across model types

* fix:adapt qwen2_5_vl model

* adapt qwen2_5_vl norm

* adapt qwen2_5_vl processor update

* adapt qwen2_5_vl image and video success

* adapt qwen2_5_vl: partial code cleanup

* adapt qwen2_5_vl: support multiple GPUs

* adapt qwen2_5_vl on latest develop

* adapt qwen2_5_vl RL

* adapt qwen2_5_vl: code cleanup

* support noex rope3d

* adapt qwen2_5_vl add init.py

* adapt qwen2_5_vl add init.py v2

* adapt qwen2_5_vl remove space

* adapt qwen2_5_vl remove space v2

* adapt qwen2_5_vl pre-commit

* adapt qwen2_5_vl update

* adapt qwen2_5_vl pre-commit v2

* adapt qwen2_5_vl modify comments

* adapt qwen2_5_vl fix indentation

* adapt qwen2_5_vl fix indentation v2

---------

Co-authored-by: wangyafeng <wangyafeng@baidu.com>
Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com>
Co-authored-by: CSWYF3634076 <58356743+CSWYF3634076@users.noreply.github.com>
2025-08-29 18:28:39 +08:00
Yuanle Liu
4957908275 add input_processor plugin (#3657)
* add input_processor plugin

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-08-28 22:53:57 +08:00
Jiang-Jia-Jun
c694fa2879 Revert "[Feature] block sparse attention (#3209)" (#3647)
This reverts commit 646a0c2fd8.
2025-08-27 17:35:04 +08:00
yangjianfengo1
646a0c2fd8 [Feature] block sparse attention (#3209)
* support sparse attn

* fix bug

* code style

* fix moba attn get kv shape

* fix A100 compilation

* codestyle

* code style

* code style

* code style

* fix conflict

* add unit test

* code style

* increase eblite loading time

* fix bug

* for ci

* for ci

* for ci

* for ci

* support mlp block size 128

* add unit tests for small operators

* fix mlp unit test

* add environment variables to config

* fix rollout config
2025-08-26 07:16:04 -07:00
bukejiyu
77514e3e1e [V1 Loader] support weight_only (#3413)
* support wint4/wint8

* delete smoe case

* update ci

* print log
2025-08-23 13:13:41 +08:00
YuanRisheng
5b66462f0e Fix fdconfig bugs (#3528)
* fix config

* fix parallel

* fix ips

* fix rl

* open code
2025-08-22 16:17:15 +08:00
gaoziyuan
6fdd83da10 fix some bug (#3434) 2025-08-18 14:39:13 +08:00
gaoziyuan
ccc7f1beb3 fix mapping (#3320) 2025-08-12 16:15:59 +08:00
bukejiyu
20839abccf qwen3_moe (#3084) 2025-08-06 14:45:27 +08:00
lizhenyun01
fe540f6caa [plugin] Custom model_runner/model support (#3186)
* support custom model&&model_runner

* fix merge

* add test && update doc

* fix codestyle

* fix unittest

* load model in rl
2025-08-04 18:52:39 -07:00
gaoziyuan
0443587a57 [Feature] support qwen3 name_mapping (#3179)
* add fd plugins && rm model_classes

* fix reviews

* add docs

* fix

* fix unittest ci

* support qwen3 name_mapping
2025-08-04 01:34:07 -07:00
gaoziyuan
4021d66ea5 [Feature] add fd plugins && rm model_classes (#3123)
* add fd plugins && rm model_classes

* fix reviews

* add docs

* fix

* fix unittest ci
2025-08-03 19:53:20 -07:00
bukejiyu
db698bda01 qwen loader (#3057) 2025-07-30 19:09:38 +08:00
Zero Rains
b2f9a42d87 [Feature] Support repetition early stop (#3024)
* support repetition early stop and support user to set the parameter

* remove log

* fix codestyle

* add the early_stop_config to rollout_config

* update config and EarlyStopper class

* fix the bug for triton

* modify the stop method

* update description

* modify the usage for stop_flags

---------

Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-07-29 22:42:54 +08:00
YuanRisheng
502ee92a0a Unify server-side and model-side Config (Part3) (#3047)
* merge model config

* fix arch

* fix rl
2025-07-29 17:07:44 +08:00
YuanRisheng
1a815b7a2a Fix Speculative Config bug (#3049)
* fix speculative bug

* fix rl
2025-07-29 10:50:48 +08:00
Zero Rains
0fb37ab7e4 update flake8 version to support pre-commit in python3.12 (#3000)
* update flake8 version to support pre-commit in python3.12

* polish code
2025-07-24 01:43:31 -07:00
gaoziyuan
dbe6225b33 fix rl config local rank (#2957) 2025-07-22 04:39:54 -07:00
gaoziyuan
95a214ae43 support trainer_degree in name_mapping (#2935) 2025-07-20 23:12:55 -07:00
Zero Rains
25698d56d1 polish code with new pre-commit rule (#2923) 2025-07-19 23:19:27 +08:00
gaoziyuan
6efad14b95 support vl ori_vocab_size (#2900) 2025-07-18 16:26:14 +08:00
Yuanle Liu
dbb9e2506b Fix rollout_model init (#2881) 2025-07-16 22:36:21 -07:00
Yuanle Liu
63d6e7ce06 fix and refine vl (#2866)
* refine vl config

* delete attn_sep

* fix vl accuracy
2025-07-16 05:59:28 -07:00
Yuanle Liu
dda4a9f848 rl update (#2861) 2025-07-16 00:33:10 -07:00
RAM
0fad10b35a [Executor] CUDA Graph support padding batch (#2844)
* cuda graph support padding batch

* Integrate the startup parameters for the graph optimization backend and provide support for user-defined capture sizes.

* Do not insert max_num_seqs when the user specifies a capture list

* Support setting graph optimization config from a YAML file

* update cuda graph ci

* fix ci bug

* fix ci bug
2025-07-15 19:49:01 -07:00
Yuanle Liu
61b3997b85 refactor rl get_name_mappings_to_training (#2847)
* refactor rl get_name_mappings_to_training

* fix tp>1

* change variable name(ffn1->up_gate_proj/ffn2->down_proj)

* change variable name(linear_weight->weight/linear_bias->bias)

* add rl names mapping for vl

* fix ernie 0.3B error

* fix develop code

* fix
2025-07-15 07:31:42 -07:00
YuanRisheng
4c7b8bc458 Simplify the Config code (#2770)
* simplify the code

* fix vl

* delete config

* fix

* perfect code

* fix ci

* fix xpu

* fix xpu

* fix server

* resolve conflict

* fix mtp

* resolve conflict

* fix xpu

* fix xpu

* fix vl

* fix log

* fix qwen moe

* fix qwen moe

* fix qwen moe
2025-07-14 19:50:05 +08:00
bukejiyu
bad53c6b6e [vl]remove duplicated load logic (#2744)
2025-07-13 07:36:26 +08:00
gaoziyuan
749b2e9c89 support qwen3moe name_mapping (#2820) 2025-07-12 12:05:54 +08:00
chen
2c3607407f check (#2811) 2025-07-11 13:54:52 +08:00
gaoziyuan
26d5d737dd [Feature] support qwen2 some func (#2740)
* add rl qwen model support

* fix

* fix
2025-07-08 12:03:04 +08:00
Jiang-Jia-Jun
9fd74f75bd Update dynamic_weight_manager.py 2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593 [Sync] Update to latest code (#2679)
* [Sync] Update to latest code

* Add new code files

* Add new code files

* update code

* Try to fix build.sh

* Try to fix build.sh

* Update code

* Update requirements.txt

* Update code

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00