Haonan Luo
|
1b9f351d21
|
Support GPT-OSS-BF16 (#4240)
* [Feature] AppendAtten support sinks & HEAD_DIM=64
* fix bug
* fix bug
* fix bug
* fix bug
* [Feature] support gpt-oss
* fix bug
* add mask
* support-gpt-oss
* support-gpt-oss
* fix long seq
* support wint8
* support wint8
* support wint8
* update test
* change sliding windows init pos
---------
Co-authored-by: ming1753 <ideaminghp@163.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
|
2025-10-20 14:44:58 +08:00 |
|
Zhenghai Zhang
|
6adfbe07ad
|
【Hackathon 9th No.86】autogen MultiQueryDecoderAttention template_instantiation -part (#4383)
* split MultiQueryDecoderAttention template_instantiation
* update comment
* CI
|
2025-10-16 17:08:19 +08:00 |
|
Zhenghai Zhang
|
c46d5e48f8
|
【Hackathon 9th No.86】autogen MultiQueryAppendC8Attention template_instantiation -part (#4330)
* split MultiQueryAppendC8Attention template_instantiation
* update setup_ops.py
* fix ci
* fix bug
|
2025-10-10 15:07:48 +08:00 |
|
lzy
|
af49b81ffd
|
supports dynamic Cfp8 (#3767)
* supports dynamic Cfp8
* add unittest
|
2025-09-07 20:41:29 -07:00 |
|
周周周
|
ddb10ac509
|
[Inference, rename] remove padding_offsets from atten use batch_id_per_token (#2880)
* remove padding_offsets from atten
|
2025-07-17 18:41:31 +08:00 |
|
周周周
|
aa76085d1f
|
[Attention] remove cum_offsets from atten, and use cu_seqlens_q (#2870)
|
2025-07-16 20:10:57 +08:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|