kxz2002
|
8a40374bfe
|
[BugFix] Fix ernie4_5_vl_processor.py and qwen_vl_processor.py can not disable thinking (#4762)
* fix ernie4_5_vl_processor.py and qwen_vl_processor.py
* add unit test
|
2025-11-04 16:00:32 +08:00 |
|
kevin
|
8aab4e367f
|
[Feature] mm support prefix cache (#4134)
* support mm prefix caching
* update code
* fix mm_hashes
* support encoder cache
* add encoder cache
* update code
* update encoder cache
* fix features bug
* fix worker bug
* support processor cache, need to optimize yet
* refactor multimodal data cache
* update code
* update code
* update v1 scheduler
* update code
* update code
* update codestyle
* support turn off processor cache and encoder cache
* update pre-commit
* fix code
* solve review
* update code
* update code
* update test case
* set processor cache in GiB
* update test case
* support mm prefix caching for qwen model
* fix code style check
* update pre-commit
* fix unit test
* fix unit test
* add ci test case
* fix rescheduled bug
* change text_after_process to prompt_tokens
* fix unit test
* fix chat template
* change model path
* [EP] fix adapter bugs (#4572)
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* fix v1 hang bug (#4573)
* fix import image_ops error on some platforms (#4559)
* [CLI]Update parameters in bench latecy cli tool and fix collect-env cli tool (#4558)
* add collect-env
* del files
* [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578)
* add new branch for sot
* reorder
* fix batch bug
* [XPU]Moe uses a new operator (#4585)
* [XPU]Moe uses a new operator
* [XPU]Moe uses a new operator
* update response
* [Feature] Support Paddle-OCR (#4396)
* init
* update code
* fix code style & disable thinking
* adapt for common_engine.update_mm_requests_chunk_size
* use 3d rope
* use flash_attn_unpadded
* opt siglip
* update to be compatible with the latest codebase
* fix typo
* optim OCR performance
* fix bug
* fix bug
* fix bug
* fix bug
* normlize name
* modify xpu rope
* revert logger
* fix bug
* fix bug
* fix bug
* support default_v1
* optim performance
* fix bug
---------
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>
* [DataProcessor] add reasoning_tokens into usage info (#4520)
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
* perf: Optimize task queue communication from engine to worker (#4531)
* perf: Optimize task queue communication from engine to worker
* perf: get_tasks to numpy
* perf: get_tasks remove to_numpy
* fix: request & replace ENV
* remove test_e2w_perf.py
* fix code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* Clean up ports after processing results (#4587)
* [CI] Add /re-run command in PR comments to restart failed CI workflows (#4593)
* [Others] api server exits when worker process is dead (#3271)
* [fix] fix terminal hangs when worker process is dead
* [chore] change sleep time of monitor
* [chore] remove redundant comments
* update docs
---------
Co-authored-by: ApplEOFDiscord <wwy640130@163.com>
Co-authored-by: ApplEOFDiscord <31272106+ApplEOFDiscord@users.noreply.github.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: yinwei <yinwei_hust@163.com>
Co-authored-by: JYChen <zoooo0820@qq.com>
Co-authored-by: qwes5s5 <45442318+qwes5s5@users.noreply.github.com>
Co-authored-by: Ryan <zihaohuang@aliyun.com>
Co-authored-by: yyssys <atyangshuang@foxmail.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: kxz2002 <115912648+kxz2002@users.noreply.github.com>
Co-authored-by: SunLei <sunlei5788@gmail.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com>
|
2025-10-27 17:39:51 +08:00 |
|
LiqinruiG
|
4251ac5e95
|
【Fix】 remove text_after_process & raw_prediction (#4421)
* remove text_after_process & raw_prediction
* remove text_after_process & raw_prediction
|
2025-10-16 19:00:18 +08:00 |
|
luukunn
|
ee9d8a840a
|
[fix]Modify follow-up push parameters and Modify the verification method for thinking length (#4086)
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* 续推参数 generated_token_ids 修改成 completion_token_ids;修改思考长度校验方式
* add completion_token_ids
* add logger
* fix reasoning_max_tokens ParameterError
* add unittest
* add unittest
* add unittest
* add unittest
* add unittest
* add unit test
|
2025-09-19 14:26:01 +08:00 |
|
lddfym
|
2056a428bd
|
[bug fix] Fix the placeholder in qwen prompt and add some unittests (#4065)
* fix the placeholder in qwen prompt
* fix the placeholder in qwen prompt
* add soem unittests for qwen_vl_processor
|
2025-09-11 20:00:02 +08:00 |
|
Yuanle Liu
|
cbce94a00e
|
rename ernie_xxx to ernie4_5_xxx (#3621)
* rename ernie_xxx to ernie4_5_xxx
* ci fix
|
2025-08-26 19:29:27 +08:00 |
|
lddfym
|
27666ee586
|
[Feature] Add Qwen25-VL Processor (#3501)
* add qwen-2.5-vl processor
* add qwen25-vl processor
* add qwen25-vl processor
* add qwen25-vl processor
* add qwen25-vl processor position_ids
* add qwen25-vl processor
* add qwen25-vl processor
* position_ids
* add test for qwen25-vl
* organize comments
* formatted
* qwen_vl_processor
* add qwen_vl_processor unittest
* update model path
* update model path
* update qwen_vl_processor unittest
* add unittest and bug fix
* add unittest and bug fix
* Update fastdeploy/input/qwen_mm_processor/image_processor.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/input/qwen_vl_processor.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-08-22 16:49:42 +08:00 |
|