freeliuzc
|
c8985727a6
|
support mtp in hybird-dp-tp mode (#4299)
|
2025-09-28 15:58:45 +08:00 |
|
YuanRisheng
|
808b548761
|
support tmp (#3675)
|
2025-08-28 19:42:32 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Yuanle Liu
|
61b3997b85
|
refactor rl get_name_mappings_to_training (#2847)
Deploy GitHub Pages / deploy (push) Has been cancelled
* refactor rl get_name_mappings_to_training
* fix tp>1
* change variable name(ffn1->up_gate_proj/ffn2->down_proj)
* change variable name(linear_weight->weight/linear_bias->bias)
* add rl names mapping for vl
* fix ernie 0.3B error
* fix develop code
* fix
|
2025-07-15 07:31:42 -07:00 |
|
GoldPancake
|
e7fa57ebae
|
Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707)
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix mtp eh_proj layer
* fix mtp update_cfg function
* fix stringdoc
* simplify class name
|
2025-07-04 14:15:04 +08:00 |
|