ming1753
|
69be77c8c0
|
[Feature] support prompt repetition_penalty (#2954)
* [Feature] support prompt repetition_penalty (#2806)
* [Bug Fix] fix bug of prompt penalty (#2888)
|
2025-07-22 19:42:33 +08:00 |
|
chen
|
d33105baeb
|
[Feature] Online Chat API Support Return logprobs (#2777)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner and xpu_model_runner
* get_worker() check platform
|
2025-07-10 16:33:40 +08:00 |
|
Sunny-bot1
|
1e2319cbef
|
Rename top_p_sampling to top_k_top_p_sampling (#2791)
|
2025-07-10 00:09:25 -07:00 |
|
Sunny-bot1
|
e45050cae3
|
[Feature] support top_k_top_p sampling (#2753)
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
|
2025-07-09 20:58:58 -07:00 |
|
GoldPancake
|
f7cad30a38
|
[Feature] Add speculative decoding simulation benchmark. (#2751)
* Add speculative decoding simulation benchmark
* Fix the name of the parameter
|
2025-07-09 12:08:43 +08:00 |
|
EnflameGCU
|
d0f4d6ba3a
|
[GCU] Support gcu platform (#2702)
baseline: e7fa57ebae
Co-authored-by: yongqiangma <xing.wo@163.com>
|
2025-07-08 13:00:52 +08:00 |
|
liddk1121
|
1b54a2831e
|
Adapt for iluvatar gpu (#2684)
|
2025-07-07 16:53:14 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|