Commit Graph

6 Commits

Author SHA1 Message Date
chen
2136990144 [Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3536)
* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing

* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs

* code check

* code check

* fix tokenizer.decoder(-1), return 'Invalid Token'

* check seq len time shape

* logprob clip inf

* code check

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
2025-08-25 14:11:18 +08:00
lizexu123
67990e0572 [Feature] support min_p_sampling (#2872)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* Fastdeploy support min_p

* add test_min_p

* fix

* min_p_sampling

* update

* delete vl_gpu_model_runner.py

* fix

* Align usage of min_p with vLLM

* fix

* modified unit test

* fix test_min_sampling

* pre-commit all files

* fix

* fix

* fix

* fix xpu_model_runner.py
2025-07-20 23:17:59 -07:00
ming1753
1f15ca21e4 [Feature] support prompt repetition_penalty (#2806)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-07-17 12:05:52 +08:00
chen
d33105baeb [Feature] Online Chat API Support Return logprobs (#2777)
* online chat support logprobs

* check xpu

* check vl_gpu_model_runner and xpu_model_runner

* get_worker() check platform
2025-07-10 16:33:40 +08:00
Jiang-Jia-Jun
92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72 [LLM] First commit the llm deployment code 2025-06-09 19:20:15 +08:00