Juncai
|
80efe98f8d
|
[PD Disaggregation] Add timestamp for analyzing splitwise deployment (#5317)
* Add timestamp for analyzing splitwise deployment
* up
* up
* up
* up
* up
* up
* fix format
* fix
|
2025-12-08 10:08:44 +08:00 |
|
qwes5s5
|
117980dd4e
|
[LogProbs]Enable prompt logprobs output and modify data transmission method for the online interface. (#5089)
* add prompt logprobs
* Merge prompt_logprobs_tensors and prompt_logprobs
* fix param check
* trigger ci
* fix unitest
* fix logprobs bug
|
2025-12-02 13:49:51 +08:00 |
|
SunLei
|
c424e08dc5
|
[Speculative Decoding] split draft_tokens into standalone post-processing path (#5205)
* refactor(mtp): split draft_tokens into standalone post-processing path for MTP + logprobs
* Restore Request.__repr__ implementation
* ci
* add envs
* fix unittest
|
2025-11-27 11:22:41 +08:00 |
|
chenjian
|
3ea1b44a58
|
[Optimization] Improve perf for fd response token with internal adapter (#4992)
* [Optimize] Improve perf for fd response token with internal adapter
* fix
* fix bug
* fix ci
* fix ci
* fix ci
* fix ci
|
2025-11-21 19:02:03 +08:00 |
|
qwes5s5
|
a2d06118e1
|
[Logprobs]Support prompt_logprobs and max_logprobs (#4897)
* add prompt logprobs
* trigger ci
* fix unitest
* Update fastdeploy/config.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/entrypoints/llm.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix max_logprobs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-11-12 19:29:48 +08:00 |
|