周周周
|
b23e684b67
|
revert group size 3 (#5079)
|
2025-11-17 18:54:13 +08:00 |
|
周周周
|
c0a4393d72
|
[ATTENTION] unitest (#4962)
|
2025-11-14 13:45:53 +08:00 |
|
Haonan Luo
|
1b9f351d21
|
Support GPT-OSS-BF16 (#4240)
* [Feature] AppendAtten support sinks & HEAD_DIM=64
* fix bug
* fix bug
* fix bug
* fix bug
* [Feature] support gpt-oss
* fix bug
* add mask
* support-gpt-oss
* support-gpt-oss
* fix long seq
* support wint8
* support wint8
* support wint8
* update test
* change sliding windows init pos
---------
Co-authored-by: ming1753 <ideaminghp@163.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
|
2025-10-20 14:44:58 +08:00 |
|
Zhenghai Zhang
|
6adfbe07ad
|
【Hackathon 9th No.86】autogen MultiQueryDecoderAttention template_instantiation -part (#4383)
* split MultiQueryDecoderAttention template_instantiation
* update comment
* CI
|
2025-10-16 17:08:19 +08:00 |
|