chen · 2136990144
[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3536)
* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing
* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs
* code check
* code check
* fix tokenizer.decode(-1) to return 'Invalid Token'
* check seq len time shape
* logprob clip inf
* code check
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
2025-08-25 14:11:18 +08:00
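The two parameters this PR adds postprocess logprobs after generation. As a rough illustration of what temperature-scaled logprobs and top-p-normalized logprobs typically mean, here is a plain-Python sketch; the function names and exact behavior are illustrative assumptions, not FastDeploy's actual implementation:

```python
import math

def temp_scaled_logprobs(logits, temperature):
    """Divide logits by the temperature, then take log-softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]

def top_p_normalized_logprobs(logprobs, top_p):
    """Renormalize mass over the smallest set of tokens whose cumulative
    probability reaches top_p; everything outside the set becomes -inf."""
    order = sorted(range(len(logprobs)), key=lambda i: logprobs[i], reverse=True)
    out = [float("-inf")] * len(logprobs)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += math.exp(logprobs[i])
        if cum >= top_p:
            break
    log_cum = math.log(cum)
    for i in kept:
        out[i] = logprobs[i] - log_cum  # divide by kept mass in log space
    return out
```

The `-inf` entries produced here are presumably what the "logprob clip inf" bullet refers to: before serialization, infinite logprobs are usually clipped to a large finite negative value.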

lizexu123 · 67990e0572
[Feature] support min_p_sampling (#2872)
* Fastdeploy support min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py
2025-07-20 23:17:59 -07:00
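Min-p sampling (which the bullets say is aligned with vLLM's usage) keeps only tokens whose probability is at least `min_p` times the most likely token's probability. A minimal sketch under that definition — illustrative only, not the repo's sampling kernel:

```python
import math

def min_p_filter(logits, min_p):
    """Mask tokens whose probability falls below min_p * max(probs)."""
    m = max(logits)
    probs = [math.exp(x - m) for x in logits]
    z = sum(probs)
    probs = [p / z for p in probs]          # normalized probabilities
    threshold = min_p * max(probs)          # dynamic, scales with confidence
    return [x if p >= threshold else float("-inf")
            for x, p in zip(logits, probs)]
```

Because the threshold scales with the top token's probability, the filter is strict when the model is confident and permissive when the distribution is flat.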

Zero Rains · 25698d56d1
polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00

Sunny-bot1 · f6ad26fc08
fix topp default value (#2814)
2025-07-11 17:10:21 +08:00

chen · d33105baeb
[Feature] Online Chat API Support Return logprobs (#2777)
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner and xpu_model_runner
* get_worker() check platform
2025-07-10 16:33:40 +08:00
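Returning logprobs from a chat API generally means reporting, per generated token, the chosen token's log-probability plus its top-k alternatives. A self-contained sketch of that shape of data (the function name and dict layout are assumptions for illustration, not FastDeploy's response schema):

```python
import math

def token_logprobs(logits, token_id, top_k=5):
    """Log-softmax the logits, then return the chosen token's logprob
    and the top_k alternatives, OpenAI-style."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    lps = [x - log_z for x in logits]
    top = sorted(range(len(lps)), key=lambda i: lps[i], reverse=True)[:top_k]
    return {"token_logprob": lps[token_id],
            "top_logprobs": {i: lps[i] for i in top}}
```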

Sunny-bot1 · e45050cae3
[Feature] support top_k_top_p sampling (#2753)
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
2025-07-09 20:58:58 -07:00
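Combined top-k/top-p sampling first truncates to the k highest-scoring tokens, then to the top-p nucleus within them, and samples from what remains. A plain-Python sketch of the combined procedure, assuming the conventional top-k-then-top-p order (not this repo's CUDA kernel):

```python
import math
import random

def top_k_top_p_sample(logits, top_k, top_p, rng=random):
    """Keep the top_k highest logits, then the top-p nucleus within them,
    and sample a token id from the renormalized distribution."""
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = order[:top_k]                       # top-k truncation
    m = logits[kept[0]]
    probs = [math.exp(logits[i] - m) for i in kept]
    z = sum(probs)
    probs = [p / z for p in probs]
    nucleus, cum = [], 0.0
    for tok, p in zip(kept, probs):            # top-p (nucleus) truncation
        nucleus.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    r = rng.random() * cum                     # sample within kept mass
    for tok, p in nucleus:
        r -= p
        if r <= 0:
            return tok
    return nucleus[-1][0]
```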

Jiang-Jia-Jun · 92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00

jiangjiajun · 684703fd72
[LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00