freeliuzc
|
76759108c9
|
[Feature][SpeculativeDecoding]Support tree-attention (#3514)
* support tree-attention
* fix merge bug
* fix unit-test api
* fix merge bug
|
2025-08-22 13:36:41 +08:00 |
|
lzy
|
1e06b9fa6d
|
make append_attn supports mask_offset (#3138)
* make append_attn supports mask_offset
* add unittest
|
2025-08-14 03:40:55 -07:00 |
|
周周周
|
ddb10ac509
|
[Inference, rename] remove padding_offsets from atten use batch_id_per_token (#2880)
* remove padding_offsets from atten
|
2025-07-17 18:41:31 +08:00 |
|
周周周
|
aa76085d1f
|
[Attention] remove cum_offsets from atten, and use cu_seqlens_q (#2870)
|
2025-07-16 20:10:57 +08:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|