yangjianfengo1
3754a9906d
[Feature] block sparse attention ( #3668 )
...
* support sparse attn
* fix bug
* code style
* fix moba attn get kv shape
* fix A100 compilation
* codestyle
* code style
* code style
* code style
* fix conflict
* add unit tests
* code style
* increase eblite loading time
* fix bug
* for ci
* for ci
* for ci
* for ci
* support mlp block size 128
* add unit tests for small operators
* fix mlp unit test
* add environment variables to config
* fix rollout config
* fix GPU memory usage
* add test server
* add test server
* fix mlp: use full attn for the last layer
2025-08-29 19:46:30 +08:00
Jiang-Jia-Jun
c694fa2879
Revert "[Feature] block sparse attention ( #3209 )" ( #3647 )
...
This reverts commit 646a0c2fd8.
2025-08-27 17:35:04 +08:00
yangjianfengo1
646a0c2fd8
[Feature] block sparse attention ( #3209 )
...
* support sparse attn
* fix bug
* code style
* fix moba attn get kv shape
* fix A100 compilation
* codestyle
* code style
* code style
* code style
* fix conflict
* add unit tests
* code style
* increase eblite loading time
* fix bug
* for ci
* for ci
* for ci
* for ci
* support mlp block size 128
* add unit tests for small operators
* fix mlp unit test
* add environment variables to config
* fix rollout config
2025-08-26 07:16:04 -07:00
YuanRisheng
5b66462f0e
Fix fdconfig bugs ( #3528 )
...
* fix config
* fix parallel
* fix ips
* fix rl
* open code
2025-08-22 16:17:15 +08:00
Zero Rains
b2f9a42d87
[Feature] Support repetition early stop ( #3024 )
...
* support repetition early stop and support user to set the parameter
* remove log
* fix codestyle
* add the early_stop_config to rollout_config
* update config and EarlyStopper class
* fix the bug for triton
* modify the stop method
* update description
* modify the usage for stop_flags
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-07-29 22:42:54 +08:00
YuanRisheng
502ee92a0a
Unify server-side and model-side Config (Part3) ( #3047 )
...
* merge model config
* fix arch
* fix rl
2025-07-29 17:07:44 +08:00
YuanRisheng
1a815b7a2a
Fix Speculative Config bug ( #3049 )
...
* fix speculative bug
* fix rl
2025-07-29 10:50:48 +08:00
gaoziyuan
dbe6225b33
fix rl config local rank ( #2957 )
2025-07-22 04:39:54 -07:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
Yuanle Liu
dbb9e2506b
Fix rollout_model init ( #2881 )
2025-07-16 22:36:21 -07:00
Yuanle Liu
63d6e7ce06
fix and refine vl ( #2866 )
...
* refine vl config
* delete attn_sep
* fix vl accuracy
2025-07-16 05:59:28 -07:00
Yuanle Liu
dda4a9f848
rl update ( #2861 )
2025-07-16 00:33:10 -07:00
RAM
0fad10b35a
[Executor] CUDA Graph support padding batch ( #2844 )
...
* cuda graph support padding batch
* Integrate the startup parameters for the graph optimization backend and provide support for user-defined capture sizes.
* Do not insert max_num_seqs when the user specifies a capture list
* Support set graph optimization config from YAML file
* update cuda graph ci
* fix ci bug
* fix ci bug
2025-07-15 19:49:01 -07:00
bukejiyu
bad53c6b6e
[vl]remove duplicated load logic ( #2744 )
2025-07-13 07:36:26 +08:00
chen
2c3607407f
check ( #2811 )
2025-07-11 13:54:52 +08:00
gaoziyuan
26d5d737dd
[Feature] support some qwen2 functions ( #2740 )
...
* add rl qwen model support
* fix
* fix
2025-07-08 12:03:04 +08:00