chenjian
25498efcf3
[Optimize] Support and robust for tpN for PD ( #4595 )
...
* [Optimize] Support and robust for tpN for PD
* fix
* fix
* support dpM tpN for cache messager
* fix
* fix token counter
* fix bug for merge develop
* fix bug
* robust cache messager for v0
2025-11-03 15:38:31 +08:00
luukunn
7b35488779
【DataProcessor】add options thinking_mode ( #4735 )
...
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add thinking_mode
* add unit test
2025-11-03 14:30:07 +08:00
yinwei
377f3bf5f2
[XPU] add v1 support for bf16 ( #4744 )
...
* support v1 loader
* update code style
* update code
2025-11-03 14:13:17 +08:00
chenjian
f83d0cf127
[Feature] Support eplb for fd ( #4599 )
...
* support eplb
* support eplb
---------
Co-authored-by: kevin <chengyf112@gmail.com >
2025-11-03 14:08:15 +08:00
freeliuzc
11398790d3
[Speculative Decoding][MTP]Support attn mask offset ( #4641 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [MTP]Merge support attn (#4591 )
* support mask_offset in speculate decoding
* fix dummpy run output
* add unit test
* fix unit test import
* support attn_mask_offset in mtp mode
* add update_attn_mask op
* fix unit test && fix code-style
2025-11-03 10:08:01 +08:00
freeliuzc
f44f4bafd1
support mtp in splitewise and scheduler_v1 mode ( #4743 )
2025-11-03 10:07:15 +08:00
lizexu123
4ac6de9a3c
[Feature] support pooling model runner ( #4590 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support qwen3-embedding
* support qwen3-embedding-0.6b
* fix
* fix bug
* fix test_return_token_ids.py and update enable_thinking
* fix mtp dummy_run
* merge develop
* fix np.float32
* delete FD_DISABLE_CHUNKED_PREFILL and FD_USE_GET_SAVE_OUTPUT_V1
* delete and build_stream_transfer_data
* fix test_update_v1:
* fix
* fix
* update dummy_run post_process
* delete test_update_v1
* fix
* fix dummy_run
* fix model_path
* fix model_path
* fix dummy_run
2025-10-31 22:32:05 +08:00
Yuanle Liu
b301bd6c31
[BugFix] fix thinking bug ( #4710 )
...
* fix thinking bug
* fix ut
* update
* fix
2025-10-31 22:00:31 +08:00
ddchenhao66
3cbca75cc8
[XPU] xpu support neox style ROPE ( #4719 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-31 18:14:25 +08:00
Jundong Liu
88a94c821b
[FDConfig] [PD Disaggregation] [Graph Optimization] Close Cudagraph for P node when PD Disaggregation ( #4632 )
...
* Close cudagraph for P node when PD Disaggregation
* fix problem
2025-10-31 16:44:25 +08:00
AIbin
316f784016
fix wint2 config ( #4721 )
2025-10-31 15:44:14 +08:00
kevin
c801d31c9c
add checker ( #4711 )
2025-10-31 15:26:35 +08:00
kevin
096d87d335
fix bug ( #4679 )
2025-10-31 14:59:18 +08:00
李泳桦
0f75b62de2
[BugFix] Fix profile run in pd-disaggregated deployment ( #4584 )
...
* [fix] fix pd+dp+ep bug
* [fix] fix again
* [ci] fix code style
2025-10-31 14:42:00 +08:00
kevin
64e875b460
[Scheduler] update v1 prefill batch ( #4611 )
...
* update v1 prefill batch
* update code
* update code
2025-10-31 14:03:01 +08:00
Sunny-bot1
9b18f0b55d
cache scale load ( #4624 )
2025-10-31 11:58:33 +08:00
GoldPancake
1f3ce65b58
[Feature] support mtp distribution equivalence verification ( #4699 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-10-31 11:45:04 +08:00
Ryan
28de91b50f
[Graph Optimization] SOT+CUDAGraph support ERNIE4.5T VL 28B / 424B ( #4645 )
...
* 45TVL support sot+CUDAGraph
* mv unitest from ce_deploy 2 e2e
* add test_EB_VL_Lite_sot_serving
* rm useless line
* add openai_client
* fix unitest && reduce computing resources
2025-10-31 11:38:43 +08:00
kxz2002
a2870ed4a9
[Feature] Unify the registration name recognition for tool_parser and reasoning_parser to “-” ( #4668 )
...
* parser register name unify
* change ernie_x1 to ernie-x1
* change ernie4_5_vl to ernie-45-vl
* fix unit test
2025-10-31 10:45:27 +08:00
kxz2002
82bd7e5db4
[BugFix] Fix finish reason in _create_chat_completion_choice ( #4582 )
...
* fix n_param _create_chat_completion_choicel
* fix unit test
* fix final_res
* modify unit tests
2025-10-31 10:42:19 +08:00
ddchenhao66
b87384aa70
[XPU] xpu currently disable prefix cache for VL model ( #4695 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-31 10:36:39 +08:00
chen
b73a78155f
fix --logprobs-mode raw_logits ( #4681 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-10-30 19:53:42 +08:00
zhouchong
35286ce31a
fix total_block_num init error in worker_process ( #4687 )
2025-10-30 19:53:09 +08:00
kxz2002
7dc9d9885e
[BugFix] fix offline llm chat "enable_thinking" is always "False" ( #4686 )
...
* fix enable_thinking
* recover ernie4_5_vl_processor
2025-10-30 19:45:41 +08:00
周周周
0089287534
[noauxtc_kernel] remove useless code ( #4643 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* remove num_tokens
* remove num_tokens
* false
* final commit
2025-10-30 18:59:04 +08:00
Jiang-Jia-Jun
f1de348cbf
Update common_engine.py
2025-10-30 14:05:04 +08:00
Haonan Luo
d7d0112bbf
[CI] Add test for paddleocr_vl ( #4627 )
2025-10-30 13:40:04 +08:00
RAM
cd3b7cc392
[Graph Optimization] Add the CUDAGraph usage switch for Draft Model ( #4601 )
...
* add draft model using cudagraph switch
* set default as false
* capture draft model in ci
* fix bug
2025-10-30 11:44:50 +08:00
GoldPancake
fddda50cb9
Add ut for speculative sampler ( #4650 )
2025-10-30 10:37:49 +08:00
ltd0924
50be19a88a
[EP] fix several bugs in data parallel ( #4657 )
...
* Simplify profiling block setup in expert_service.py
Refactor profiling block initialization to avoid duplication.
* Update common_engine.py
2025-10-30 09:50:49 +08:00
kxz2002
c30bfb294f
[Feature] add a new reasoning parser ( #4571 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add new reasoning_parser initial commit
* add parser file content
* add register
* ernie_test_reasoning_parser
* support <tool_call> token and add tool_parser
* add and fix unit tests
* modify reasoning_parser
* modify reasoning parser and tool parser
* modify unit tests
* modify reasoning_parser and tool_parser
* modify unit tests
* fix tool_parser
* modify the logic of reasoning_parser and tool_parser
* add and modify unit tests
* standardize code style
* simplify reasoning_parser and tool_parser
* modify unit test
2025-10-29 18:16:50 +08:00
Lucas
8f40dfa9bf
[XPU] fix pos_emb_type bug ( #4638 )
2025-10-29 17:14:32 +08:00
ApplEOFDiscord
14f8cddaf1
[Feature] add mm token usage ( #4570 )
...
* add mm token usage
* fix unit test
* fix unit test
* fix unit test
* fix model path
* fix unit test
* fix unit test
* fix unit test
* remove uncomment
* change var name
* fix code style
* fix code style
* fix code style
* fix code style
* fix unit test
2025-10-29 14:37:12 +08:00
RichardWooSJTU
0dde936e93
[BugFix] fix total_block_num init error in worker_process ( #4553 )
...
* fix total_block_num init error in worker_process
* fix req and token client
* fix req and token client
* fix xpu xi
* fix xpu ci
2025-10-28 20:42:12 -07:00
xiaolei373
14e7d88ea4
[feature] support reward api ( #4518 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Co-authored-by: SunLei <sunlei5788@gmail.com >
2025-10-29 00:20:28 +08:00
李泳桦
a012e3608b
[Feature] support logits processors ( #4515 )
...
* [feat] provide an interface for logits processors and a builtin LogitBiasLogitsProcessor
* [chore] fix code style
* [fix] add unit test & fix existing bugs
* [feat] add engine/worker arg --logits-processors
* [fix] redefine user args as logits_processors_args and fix some bugs
* [fix] fix test_sampler
* Update fastdeploy/model_executor/logits_processor/builtin.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/model_executor/logits_processor/__init__.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/model_executor/test_logits_processor.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [fix] fix typo
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [fix] fix bracelet
* [chore] redefine logits processor interface: pass the entire share_inputs into LP, do not copy share_inputs and logits
* [doc] add docs
* [fix] fix logit bias processor not applied when decoding is too fast & add docs and tests
* [fix] fix redundant code
* [feat] skip apply() if no bias is specified
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-10-29 00:08:53 +08:00
Yuanle Liu
20756cd2bb
fix import jit.marker.unified ( #4622 )
2025-10-28 22:11:03 +08:00
ming1753
561b9f38d3
[BugFix] fix paddleocr prefix cache bug ( #4625 )
...
* fix paddleocr prefix cache bug
* disable prefix-caching in ocr
2025-10-28 21:38:12 +08:00
RAM
fff5fb5e39
[Graph Optimization] Refactor default capture list ( #4617 )
...
* fix bug and refine code
* add debug count
* refine code
2025-10-28 21:31:02 +08:00
Lucas
0a0c74e717
[XPU] Support PaddleOCR-VL model for XPU ( #4529 )
...
* [XPU] support PaddleOCR-VL in XPU
* [XPU] fix PaddleOCR-VL pos_emb_type
2025-10-28 20:35:04 +08:00
SunLei
2a9ed72533
feat: add support for API usage with multimodal models ( #4548 )
...
* feat: add support for API usage with multimodal models
* completion_tokens contains num_image_tokens
* remove test_request.py
* fix: paddle.device.is_compiled_with_cuda()
* fix test_unstream_without_logprobs
2025-10-28 20:23:46 +08:00
Ryan
07956a87b3
[Graph Optimization] Fix IR graph dependency error exposed after enabling SOT by updating the return value of TextImageGatherScatter ( #4610 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* fix TextImageGatherScatter in sot
* fix codestyle
2025-10-28 18:31:23 +08:00
lizhenyun01
4d2f478d53
[BugFix] fix TPDP mix parallel infer ( #4583 )
...
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-10-28 16:58:20 +08:00
freeliuzc
c63361fd1d
[Speculative Decoding][MTP]Support mtp in epdptp mode ( #4614 )
...
* support mtp many features
* support mtp reshard in rl mode
* fix function
* support mtp ep
* support mtp in hybird-dp-tp mode
* default open scheduler_v1 in mtp
2025-10-28 16:02:47 +08:00
Daci
6426414a0f
[Feature] EngineWorkerQueue anonymous port ( #4597 )
...
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
* EngineWorkerQueue 支持匿名端口设置
2025-10-28 10:22:37 +08:00
ming1753
7681375a19
[BugFix] PaddleOCR-VL fix FD_DEBUG type and support v1 loader ( #4605 )
...
* [Bug Fix] PaddleOCRVL fix FD_DEBUG type and support HF model
* fix bug
* fix bug
* fix bug
2025-10-28 09:47:47 +08:00
zhouchong
6dcf5a3e87
fix: resolve decode bug in offline stream output ( #4603 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-10-27 20:17:10 +08:00
周周周
3729e910a6
remove dev sync in prefill ( #4598 )
2025-10-27 19:54:43 +08:00
RAM
25a983ba9c
1.fix the bug of draft model with ep 2.fix sampler bug ( #4589 )
2025-10-27 17:47:34 +08:00
kevin
8aab4e367f
[Feature] mm support prefix cache ( #4134 )
...
* support mm prefix caching
* update code
* fix mm_hashes
* support encoder cache
* add encoder cache
* update code
* update encoder cache
* fix features bug
* fix worker bug
* support processor cache, need to optimize yet
* refactor multimodal data cache
* update code
* update code
* update v1 scheduler
* update code
* update code
* update codestyle
* support turn off processor cache and encoder cache
* update pre-commit
* fix code
* solve review
* update code
* update code
* update test case
* set processor cache in GiB
* update test case
* support mm prefix caching for qwen model
* fix code style check
* update pre-commit
* fix unit test
* fix unit test
* add ci test case
* fix rescheduled bug
* change text_after_process to prompt_tokens
* fix unit test
* fix chat template
* change model path
* [EP] fix adapter bugs (#4572 )
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* fix v1 hang bug (#4573 )
* fix import image_ops error on some platforms (#4559 )
* [CLI]Update parameters in bench latecy cli tool and fix collect-env cli tool (#4558 )
* add collect-env
* del files
* [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578 )
* add new branch for sot
* reorder
* fix batch bug
* [XPU]Moe uses a new operator (#4585 )
* [XPU]Moe uses a new operator
* [XPU]Moe uses a new operator
* update response
* [Feature] Support Paddle-OCR (#4396 )
* init
* update code
* fix code style & disable thinking
* adapt for common_engine.update_mm_requests_chunk_size
* use 3d rope
* use flash_attn_unpadded
* opt siglip
* update to be compatible with the latest codebase
* fix typo
* optim OCR performance
* fix bug
* fix bug
* fix bug
* fix bug
* normlize name
* modify xpu rope
* revert logger
* fix bug
* fix bug
* fix bug
* support default_v1
* optim performance
* fix bug
---------
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com >
Co-authored-by: zhangyue66 <zhangyue66@baidu.com >
* [DataProcessor] add reasoning_tokens into usage info (#4520 )
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
* perf: Optimize task queue communication from engine to worker (#4531 )
* perf: Optimize task queue communication from engine to worker
* perf: get_tasks to numpy
* perf: get_tasks remove to_numpy
* fix: request & replace ENV
* remove test_e2w_perf.py
* fix code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Clean up ports after processing results (#4587 )
* [CI] Add /re-run command in PR comments to restart failed CI workflows (#4593 )
* [Others] api server exits when worker process is dead (#3271 )
* [fix] fix terminal hangs when worker process is dead
* [chore] change sleep time of monitor
* [chore] remove redundant comments
* update docs
---------
Co-authored-by: ApplEOFDiscord <wwy640130@163.com >
Co-authored-by: ApplEOFDiscord <31272106+ApplEOFDiscord@users.noreply.github.com >
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com >
Co-authored-by: yinwei <yinwei_hust@163.com >
Co-authored-by: JYChen <zoooo0820@qq.com >
Co-authored-by: qwes5s5 <45442318+qwes5s5@users.noreply.github.com >
Co-authored-by: Ryan <zihaohuang@aliyun.com >
Co-authored-by: yyssys <atyangshuang@foxmail.com >
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com >
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com >
Co-authored-by: zhangyue66 <zhangyue66@baidu.com >
Co-authored-by: kxz2002 <115912648+kxz2002@users.noreply.github.com >
Co-authored-by: SunLei <sunlei5788@gmail.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com >
2025-10-27 17:39:51 +08:00