Daci
|
5fc12eddfe
|
[Optimization] xgrammar async compile, multi thread, speed up (#4835)
* xgrammar async compile, multi thread, speed up
* fix test_sampler.py & pre-commit errors
* add Redis version check and fix request.llm_engine_recv_req_timestamp
* xgrammar prefill & decode & v0
* fix test_gpu_prompt_logprobs.py
* add test_guided_decoding.py
* Update fastdeploy/scheduler/splitwise_scheduler.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
* Apply suggestions from code review
* fix torch xgrammar unittest env
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-11-14 18:05:26 +08:00 |
|
李泳桦
|
a012e3608b
|
[Feature] support logits processors (#4515)
* [feat] provide an interface for logits processors and a builtin LogitBiasLogitsProcessor
* [chore] fix code style
* [fix] add unit test & fix existing bugs
* [feat] add engine/worker arg --logits-processors
* [fix] redefine user args as logits_processors_args and fix some bugs
* [fix] fix test_sampler
* Update fastdeploy/model_executor/logits_processor/builtin.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/model_executor/logits_processor/__init__.py
* Update tests/model_executor/test_logits_processor.py
* [fix] fix typo
* Update fastdeploy/engine/sampling_params.py
* [fix] fix bracket
* [chore] redefine logits processor interface: pass the entire share_inputs into LP, do not copy share_inputs and logits
* [doc] add docs
* [fix] fix logit bias processor not applied when decoding is too fast & add docs and tests
* [fix] fix redundant code
* [feat] skip apply() if no bias is specified
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-10-29 00:08:53 +08:00 |
|
chen
|
5c63a089f6
|
[Feature] Support logprobs_mode (#4567)
|
2025-10-27 14:27:48 +08:00 |
|
YUNSHEN XIE
|
3a6058e445
|
Add stable ci (#3460)
* add stable ci
* fix
* update
* fix
* rename tests dir; fix stable ci bug
* add timeout limit
* update
|
2025-08-20 08:57:17 +08:00 |
|