Yonghua Li
0c8c6369ed
[Feature] [PD Disaggregation] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports ( #5415 )
...
* [feat] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports
* [fix] fix some bugs
* [fix] fix rdma port for cache manager/messager
* [fix] temporarily cancel port availability check to see if it can pass ci test
* [feat] simplify args for multi api server
* [fix] fix dp
* [fix] fix port for xpu
* [fix] add tests for ports post processing & fix ci
* [test] fix test_multi_api_server
* [fix] fix rdma_comm_ports args for multi_api_server
* [fix] fix test_common_engine
* [fix] fix test_cache_transfer_manager
* [chore] automatically setting FD_ENABLE_MULTI_API_SERVER
* [fix] avoid api server from creating engine_args twice
* [fix] fix test_run_batch
* [fix] fix test_metrics
* [fix] fix splitwise connector init
* [test] add test_rdma_transfer and test_expert_service
* [fix] fix code syntax
* [fix] fix test_rdma_transfer and build wheel with rdma script
2025-12-17 15:50:42 +08:00
xiaolei373
a30b4da260
[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 ( #5458 )
2025-12-16 16:36:09 +08:00
GoldPancake
909059c60a
[Feature] Support for request-level speculative decoding metrics monitoring. ( #5518 )
...
* support spec metrics monitor per request
* fix bug
* remove debug log
* fix ut bugs
2025-12-12 12:22:18 +08:00
kevin
954a145d57
[Optimization] support mm prefill batch ( #5313 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support mm prefill batch
* update code
* update code
* update code
* update code
* fix encoder cache bug
* update code
* update code
* fix bug
* fix paddle ocr bug
* fix xpu bug
* update code
2025-12-11 22:21:14 +08:00
qwes5s5
d79438bb86
add detoken switch ( #5463 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-12-10 21:44:02 +08:00
luukunn
fbc9bce1e9
[Feature]Optimization of Thinking Pattern Framework ( #4302 )
...
* add model status in vl
* add x1 parser
* add model_status
* fix parser
* fix parser
* fix parser
* fix parser
* Revert "fix parser"
This reverts commit 300f446d8a .
* fix parser
* fix
* fix
* fix
* fix
* fix parser
* fix unit test
* fix unit test
* add unit test
* fix
* fix
* add unit test
* fix unit test
* add unit test
* add unit test
* fix unit test
* fix unit test
* fix bug
* fix unit test
* x1 tool parser
* fix unit test
* fix unit test
* fix unit test
* fix n
* fix unit test
* add unit test
* add unit test
* remove pring
2025-12-10 16:17:06 +08:00
ming1753
9e15191cce
[BugFix] fix audio end bug ( #5464 )
2025-12-10 13:37:26 +08:00
Juncai
80efe98f8d
[PD Disaggregation] Add timestamp for analyzing splitwise deployment ( #5317 )
...
* Add timestamp for analyzing splitwise deployment
* up
* up
* up
* up
* up
* up
* fix format
* fix
2025-12-08 10:08:44 +08:00
lizexu123
d4979347ca
[Bug fix] Fix the multi-input accuracy issue in the pooling model. ( #5374 )
...
* fix multi-inputs
* fix threshold
* fix threshold
* fix
2025-12-05 20:18:17 +08:00
Ayakouji
a8f8791668
[Optimization] Qwen2.5-VL support multi-batch prefill ( #5269 )
...
* update
* fix
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix dict access
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-05 18:22:39 +08:00
ming1753
dd2e9a14c7
[BugFix] Compatible with asynchronous functions ( #5378 )
...
* [BugFix] fix data_processor asyn bug
* fix bug
2025-12-05 11:05:21 +08:00
lizexu123
946025480e
[Bug fix] fix pooling models ( #5358 )
...
* fix
* fix
* fix test
* fix gpu_model_runner
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-04 11:06:30 +08:00
qwes5s5
a52aea073c
fix logprobs ( #5335 )
2025-12-04 10:38:51 +08:00
ming1753
5f8d4aedea
[Feature] support audio tts ( #5333 )
2025-12-03 21:06:48 +08:00
xiaolei373
a4bb3e9960
[bugfix]remove metrics middleware ( #5332 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-12-03 17:07:45 +08:00
lizexu123
c563eca791
[Feature] support reward model ( #5301 )
...
* Your commit message here
* add test
* update develop
* support reward
* support enable_chunk_prefill
* support bingfa
* support convert is reward
* update test
* delete print
* fix enable_thinking
* add document
* fix place
* fix test
* fix
* support enable_prefix_caching
* add no-enable_prefix-caching test
* fix
* support enable_prefix_caching
* delete print
* fix document
* fix
* fix test
* fix document and delete chinese
* udpate
* enable_thinking
* fix test
2025-12-02 14:55:31 +08:00
qwes5s5
117980dd4e
[LogProbs]Enable prompt logprobs output and modify data transmission method for the online interface. ( #5089 )
...
* add prompt logprobs
* Merge prompt_logprobs_tensors and prompt_logprobs
* fix param check
* trigger ci
* fix unitest
* fix logprobs bug
2025-12-02 13:49:51 +08:00
Yonghua Li
a535050b11
[FDConfig] remove engine client args, use fd_config instead ( #5217 )
...
* [refactor] remove engine client args, use fd_config instead
* [chore] update
* [fix] fix
* [fix] fix
* [chore] rename config to fd_config
* [fix] fix run_batch
* [ci] add ci case for engine client
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-11-28 01:20:54 -08:00
fl0w2o48
e63d715fc3
[BugFix][Metrics] Fix Prometheus Multiprocess Metrics Issues and Add ZMQ Communication Metrics ( #5185 )
...
* [Feature] add metrics for ZMQ and fix multiprocess metrics
* fix test_metrics.py
---------
Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com >
2025-11-27 15:05:09 +08:00
SunLei
c424e08dc5
[Speculative Decoding] split draft_tokens into standalone post-processing path ( #5205 )
...
* refactor(mtp): split draft_tokens into standalone post-processing path for MTP + logprobs
* Restore Request.__repr__ implementation
* ci
* add envs
* fix unittest
2025-11-27 11:22:41 +08:00
kxz2002
2d787590c4
[Feature] The 45VL supports prompt_token_ids + messages input. ( #5148 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support prompt_token_ids + messages
* fix bug
* refact code structure
* support cache mm items
* refact code structure
* delete test cases
* modify unit test
* add unit test
* add unit test
* fix append
* add check for messages
2025-11-25 23:11:44 +08:00
Yonghua Li
09379183e2
[BugFix] fix work metrics not returned by metrics api ( #4912 )
...
* [BugFix] fix work metrics not returned by metrics api
* [fix] fix conflict
* [fix] fix ci
2025-11-25 19:12:29 +08:00
kevin
8e4e3ff510
[Feature] support eplb in api_server ( #4782 )
...
* support eplb in api_server
* update code
* add eplb test case
* update eplb
* support tp+dp eplb
* update test cese
* update code
* update code
* fix bug
* update copilot review
* update test case name
2025-11-24 20:22:29 +08:00
LiqinruiG
a5cd7c9039
[BugFix] rollback max_tokens and min_tokens when continue to infer ( #5082 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [BugFix] rollback max_tokens and min_tokens when continue to infer
* [fix] add more logger info: max_tokens
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-19 18:43:42 +08:00
kxz2002
97189079b9
[BugFix] unify max_tokens ( #4968 )
...
* unify max tokens
* modify and add unit test
* modify and add unit test
* modify and add unit tests
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-18 20:01:33 +08:00
LiqinruiG
33f96ff93a
[BugFix] rollback max_tokens and min_tokens when continue to infer ( #5052 )
...
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-17 14:31:26 +08:00
qwes5s5
36216e62f0
[Log] Add trace log and add loggingInstrumentor tool ( #4692 )
...
* add trace logger and trace print
* trigger ci
* fix unittest
* translate notes and add copyright
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-17 11:08:57 +08:00
zhouchong
5444af6ff6
[APIServer] metrics use port the same as api_port ( #5016 )
...
* metrics use port the same as api_port
* Be tolerant to tests that monkeypatch/partially mock args.
* Reduce code redundancy
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-11-17 10:42:45 +08:00
kxz2002
9703108c28
[BugFix] adjust max_tokens and min_tokens when continue to generate tokens ( #5010 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* fix max and min tokens initial commit
* fix double subtraction
* add unit tests
2025-11-13 23:52:54 +08:00
qwes5s5
a2d06118e1
[Logprobs]Support prompt_logprobs and max_logprobs ( #4897 )
...
* add prompt logprobs
* trigger ci
* fix unitest
* Update fastdeploy/config.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/llm.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix max_logprobs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 19:29:48 +08:00
Yuanle Liu
3dc0ffa46d
[TSP] Support qwen3 moe tsp + cudagraph ( #4871 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support qwen3_moe tsp mode
* fix
* fix
* update
* update
* update
* fix
* support external_rmsnorm
* update
* fix
2025-11-10 23:37:51 +08:00
kxz2002
2dfbcf3cc9
[BugFix] Fix inference_start_time ( #4922 )
...
* fix inference_start_time
* fix inference_start_time
2025-11-10 19:28:44 +08:00
luukunn
41c0bef964
[BugFix] When the value of "temperature" is 0, adjust it to 1e-06 ( #4900 )
...
* add default temperature value
* add unit test
* update
* update
* add unit test
* update
* fix unit test
2025-11-10 13:24:33 +08:00
kxz2002
87911b7cf1
[Feature] Enable FastDeploy to support adding the “--api-key” authentication parameter. ( #4806 )
...
* add api key initial commit
* add unit test
* modify unit test
* move middleware to a single file and add unit tests
2025-11-08 18:24:02 +08:00
Juncai
08ca0f6aea
[Feature] [PD] add simple router and refine splitwise deployment ( #4709 )
...
* add simple router and refine splitwise deployment
* fix
2025-11-06 14:56:02 +08:00
李泳桦
fcd2f05dff
[BugFix] fix messages being inplace modified in offline chat api ( #4831 )
2025-11-05 20:46:33 +08:00
luukunn
7b35488779
【DataProcessor】add options thinking_mode ( #4735 )
...
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add unit test
2025-11-03 14:30:07 +08:00
kxz2002
a2870ed4a9
[Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” ( #4668 )
...
* parser register name unify
* change ernie_x1 to ernie-x1
* change ernie4_5_vl to ernie-45-vl
* fix unit test
2025-10-31 10:45:27 +08:00
kxz2002
82bd7e5db4
[BugFix] Fix finish reason in _create_chat_completion_choice ( #4582 )
...
* fix n_param _create_chat_completion_choicel
* fix unit test
* fix final_res
* modify unit tests
2025-10-31 10:42:19 +08:00
kxz2002
c30bfb294f
[Feature] add a new reasoning parser ( #4571 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add new reasoning_parser initial commit
* add parser file content
* add register
* ernie_test_reasoning_parser
* support <tool_call> token and add tool_parser
* add and fix unit tests
* modify reasoning_parser
* modify reasoning parser and tool parser
* modify unit tests
* modify reasoning_parser and tool_parser
* modify unit tests
* fix tool_parser
* modify the logic of reasoning_parser and tool_parser
* add and modify unit tests
* standardize code style
* simplify reasoning_parser and tool_parser
* modify unit test
2025-10-29 18:16:50 +08:00
ApplEOFDiscord
14f8cddaf1
[Feature] add mm token usage ( #4570 )
...
* add mm token usage
* fix unit test
* fix unit test
* fix unit test
* fix model path
* fix unit test
* fix unit test
* fix unit test
* remove uncomment
* change var name
* fix code style
* fix code style
* fix code style
* fix code style
* fix unit test
2025-10-29 14:37:12 +08:00
xiaolei373
14e7d88ea4
[feature] support reward api ( #4518 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Co-authored-by: SunLei <sunlei5788@gmail.com >
2025-10-29 00:20:28 +08:00
李泳桦
a012e3608b
[Feature] support logits processors ( #4515 )
...
* [feat] provide an interface for logits processors and a builtin LogitBiasLogitsProcessor
* [chore] fix code style
* [fix] add unit test & fix existing bugs
* [feat] add engine/worker arg --logits-processors
* [fix] redefine user args as logits_processors_args and fix some bugs
* [fix] fix test_sampler
* Update fastdeploy/model_executor/logits_processor/builtin.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/logits_processor/__init__.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/model_executor/test_logits_processor.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [fix] fix typo
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [fix] fix bracelet
* [chore] redefine logits processor interface: pass the entire share_inputs into LP, do not copy share_inputs and logits
* [doc] add docs
* [fix] fix logit bias processor not applied when decoding is too fast & add docs and tests
* [fix] fix redundant code
* [feat] skip apply() if no bias is specified
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-10-29 00:08:53 +08:00
SunLei
2a9ed72533
feat: add support for API usage with multimodal models ( #4548 )
...
* feat: add support for API usage with multimodal models
* completion_tokens contains num_image_tokens
* remove test_request.py
* fix: paddle.device.is_compiled_with_cuda()
* fix test_unstream_without_logprobs
2025-10-28 20:23:46 +08:00
Daci
6426414a0f
[Feature] EngineWorkerQueue anonymous port ( #4597 )
...
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
2025-10-28 10:22:37 +08:00
zhouchong
6dcf5a3e87
fix: resolve decode bug in offline stream output ( #4603 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-10-27 20:17:10 +08:00
kevin
8aab4e367f
[Feature] mm support prefix cache ( #4134 )
...
* support mm prefix caching
* update code
* fix mm_hashes
* support encoder cache
* add encoder cache
* update code
* update encoder cache
* fix features bug
* fix worker bug
* support processor cache, need to optimize yet
* refactor multimodal data cache
* update code
* update code
* update v1 scheduler
* update code
* update code
* update codestyle
* support turn off processor cache and encoder cache
* update pre-commit
* fix code
* solve review
* update code
* update code
* update test case
* set processor cache in GiB
* update test case
* support mm prefix caching for qwen model
* fix code style check
* update pre-commit
* fix unit test
* fix unit test
* add ci test case
* fix rescheduled bug
* change text_after_process to prompt_tokens
* fix unit test
* fix chat template
* change model path
* [EP] fix adapter bugs (#4572 )
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* fix v1 hang bug (#4573 )
* fix import image_ops error on some platforms (#4559 )
* [CLI]Update parameters in bench latecy cli tool and fix collect-env cli tool (#4558 )
* add collect-env
* del files
* [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578 )
* add new branch for sot
* reorder
* fix batch bug
* [XPU]Moe uses a new operator (#4585 )
* [XPU]Moe uses a new operator
* [XPU]Moe uses a new operator
* update response
* [Feature] Support Paddle-OCR (#4396 )
* init
* update code
* fix code style & disable thinking
* adapt for common_engine.update_mm_requests_chunk_size
* use 3d rope
* use flash_attn_unpadded
* opt siglip
* update to be compatible with the latest codebase
* fix typo
* optim OCR performance
* fix bug
* fix bug
* fix bug
* fix bug
* normlize name
* modify xpu rope
* revert logger
* fix bug
* fix bug
* fix bug
* support default_v1
* optim performance
* fix bug
---------
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com >
Co-authored-by: zhangyue66 <zhangyue66@baidu.com >
* [DataProcessor] add reasoning_tokens into usage info (#4520 )
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
* perf: Optimize task queue communication from engine to worker (#4531 )
* perf: Optimize task queue communication from engine to worker
* perf: get_tasks to numpy
* perf: get_tasks remove to_numpy
* fix: request & replace ENV
* remove test_e2w_perf.py
* fix code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Clean up ports after processing results (#4587 )
* [CI] Add /re-run command in PR comments to restart failed CI workflows (#4593 )
* [Others] api server exits when worker process is dead (#3271 )
* [fix] fix terminal hangs when worker process is dead
* [chore] change sleep time of monitor
* [chore] remove redundant comments
* update docs
---------
Co-authored-by: ApplEOFDiscord <wwy640130@163.com >
Co-authored-by: ApplEOFDiscord <31272106+ApplEOFDiscord@users.noreply.github.com >
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com >
Co-authored-by: yinwei <yinwei_hust@163.com >
Co-authored-by: JYChen <zoooo0820@qq.com >
Co-authored-by: qwes5s5 <45442318+qwes5s5@users.noreply.github.com >
Co-authored-by: Ryan <zihaohuang@aliyun.com >
Co-authored-by: yyssys <atyangshuang@foxmail.com >
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com >
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com >
Co-authored-by: zhangyue66 <zhangyue66@baidu.com >
Co-authored-by: kxz2002 <115912648+kxz2002@users.noreply.github.com >
Co-authored-by: SunLei <sunlei5788@gmail.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com >
2025-10-27 17:39:51 +08:00
李泳桦
cdc40cdc2a
[Others] api server exits when worker process is dead ( #3271 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [fix] fix terminal hangs when worker process is dead
* [chore] change sleep time of monitor
* [chore] remove redundant comments
2025-10-27 10:23:48 +08:00
kxz2002
327fa4c255
[DataProcessor] add reasoning_tokens into usage info ( #4520 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
2025-10-25 16:57:58 +08:00
ming1753
e4e3cede7f
[Feature] Support Paddle-OCR ( #4396 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* init
* update code
* fix code style & disable thinking
* adapt for common_engine.update_mm_requests_chunk_size
* use 3d rope
* use flash_attn_unpadded
* opt siglip
* update to be compatible with the latest codebase
* fix typo
* optim OCR performance
* fix bug
* fix bug
* fix bug
* fix bug
* normlize name
* modify xpu rope
* revert logger
* fix bug
* fix bug
* fix bug
* support default_v1
* optim performance
* fix bug
---------
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com >
Co-authored-by: zhangyue66 <zhangyue66@baidu.com >
2025-10-24 23:34:30 +08:00