Daci
|
83dbc4e5dd
|
[Feature] Guided Decoding add LLguidance backend (#5124)
* llguidance
* add requirements_guided_decoding.txt and doc
* fix test_guidance_*.py
* fix test_guidance_*.py && mv
* fix llguidance choice
* test_guidance_*
* rm lazy loader
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
|
2025-12-03 20:23:57 +08:00 |
|
Daci
|
eab8384da6
|
[Feature] ThreadPoolExecutor async fill_token_bitmask (#5083)
* ThreadPoolExecutor async fill_token_bitmask
* ThreadPoolExecutor async fill_token_bitmask logging
* fix test_guided_decoding
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add fill_bitmask_parallel_batch_size ENV
* FD_FILL_BITMASK_BATCH fastdeploy.envs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-11-19 10:04:16 +08:00 |
|
Daci
|
5fc12eddfe
|
[Optimization] xgrammar async compile, multi thread, speed up (#4835)
* xgrammar async compile, multi thread, speed up
* fix test_sampler.py & pre-commit err
* add redis version check && fix request.llm_engine_recv_req_timestamp
* xgrammar prefill & decode & v0
* fix test_gpu_prompt_logprobs.py
* add test_guided_decoding.py
* Update fastdeploy/scheduler/splitwise_scheduler.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix torch xgrammar unittest env
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-11-14 18:05:26 +08:00 |
|
YuanRisheng
|
a37c9416ac
|
[FDConfig]Remove reasoning_parser/guided_decoding_backend/disable_any_whitespace/device_ids in FDConfig (#4362)
* remove devices id
* fix unittest
* fix ce
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-17 10:40:59 +08:00 |
|
YuanRisheng
|
2e9e53ff7e
|
[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)
* remove max_num_batched_tokens in parallel config
* remove max_num_seqs
* update test case
* fix test
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-09-17 10:43:35 +08:00 |
|
kevin
|
1908465542
|
[Feature] mm and thinking model support structred output (#2749)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* mm support structured output
* update code
* update code
* update format
* update code
* update code
* add enable_thinking default
* update code
* add structured_outputs test case
* add ci install xgrammar
* add ci timeout time
* update test for structured_outputs
* update code
* add error traceback info
* update error msg
* update structred output code
* update code
* update code
* update config
* update torch version
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-09-02 16:21:09 +08:00 |
|
kevin
|
67298cf4c0
|
add error traceback info (#3419)
Deploy GitHub Pages / deploy (push) Has been cancelled
* add error traceback info
* update error msg
* update code
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-08-19 19:32:04 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|