kevin
|
109d48e456
|
[Feature] support async download features (#5003)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support async download features
* add test case
* update code
|
2025-11-19 22:23:36 +08:00 |
|
ltd0924
|
303c986cc7
|
[FDConfig] add block number verfied (#4983)
* Update config.py
* fix
* update unit test
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com>
|
2025-11-13 09:48:44 +08:00 |
|
Echo-Nie
|
ff653503ff
|
[Docs] Add License in Unittest (#4957)
* add copyright
* add CopyRight
|
2025-11-12 10:44:09 +08:00 |
|
kevin
|
cc34487810
|
[Feature] support mm disable_chunked (#4803)
* support mm disable_chunked
* update code
* update code
* update code
|
2025-11-06 21:32:25 +08:00 |
|
kevin
|
8aab4e367f
|
[Feature] mm support prefix cache (#4134)
* support mm prefix caching
* update code
* fix mm_hashes
* support encoder cache
* add encoder cache
* update code
* update encoder cache
* fix features bug
* fix worker bug
* support processor cache, need to optimize yet
* refactor multimodal data cache
* update code
* update code
* update v1 scheduler
* update code
* update code
* update codestyle
* support turn off processor cache and encoder cache
* update pre-commit
* fix code
* solve review
* update code
* update code
* update test case
* set processor cache in GiB
* update test case
* support mm prefix caching for qwen model
* fix code style check
* update pre-commit
* fix unit test
* fix unit test
* add ci test case
* fix rescheduled bug
* change text_after_process to prompt_tokens
* fix unit test
* fix chat template
* change model path
* [EP] fix adapter bugs (#4572)
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* fix v1 hang bug (#4573)
* fix import image_ops error on some platforms (#4559)
* [CLI]Update parameters in bench latecy cli tool and fix collect-env cli tool (#4558)
* add collect-env
* del files
* [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578)
* add new branch for sot
* reorder
* fix batch bug
* [XPU]Moe uses a new operator (#4585)
* [XPU]Moe uses a new operator
* [XPU]Moe uses a new operator
* update response
* [Feature] Support Paddle-OCR (#4396)
* init
* update code
* fix code style & disable thinking
* adapt for common_engine.update_mm_requests_chunk_size
* use 3d rope
* use flash_attn_unpadded
* opt siglip
* update to be compatible with the latest codebase
* fix typo
* optim OCR performance
* fix bug
* fix bug
* fix bug
* fix bug
* normlize name
* modify xpu rope
* revert logger
* fix bug
* fix bug
* fix bug
* support default_v1
* optim performance
* fix bug
---------
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>
* [DataProcessor] add reasoning_tokens into usage info (#4520)
* add reasoning_tokens into usage info initial commit
* add unit tests
* modify unit test
* modify and add unit tests
* fix unit test
* move steam usage to processor
* modify processor
* modify test_logprobs
* modify test_logprobs.py
* modify stream reasoning tokens accumulation
* fix unit test
* perf: Optimize task queue communication from engine to worker (#4531)
* perf: Optimize task queue communication from engine to worker
* perf: get_tasks to numpy
* perf: get_tasks remove to_numpy
* fix: request & replace ENV
* remove test_e2w_perf.py
* fix code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* Clean up ports after processing results (#4587)
* [CI] Add /re-run command in PR comments to restart failed CI workflows (#4593)
* [Others] api server exits when worker process is dead (#3271)
* [fix] fix terminal hangs when worker process is dead
* [chore] change sleep time of monitor
* [chore] remove redundant comments
* update docs
---------
Co-authored-by: ApplEOFDiscord <wwy640130@163.com>
Co-authored-by: ApplEOFDiscord <31272106+ApplEOFDiscord@users.noreply.github.com>
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: yinwei <yinwei_hust@163.com>
Co-authored-by: JYChen <zoooo0820@qq.com>
Co-authored-by: qwes5s5 <45442318+qwes5s5@users.noreply.github.com>
Co-authored-by: Ryan <zihaohuang@aliyun.com>
Co-authored-by: yyssys <atyangshuang@foxmail.com>
Co-authored-by: ming1753 <61511741+ming1753@users.noreply.github.com>
Co-authored-by: root <root@szzj-acg-tge1-fdda9.szzj.baidu.com>
Co-authored-by: zhangyue66 <zhangyue66@baidu.com>
Co-authored-by: kxz2002 <115912648+kxz2002@users.noreply.github.com>
Co-authored-by: SunLei <sunlei5788@gmail.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: 李泳桦 <39643373+liyonghua0910@users.noreply.github.com>
|
2025-10-27 17:39:51 +08:00 |
|
YuanRisheng
|
a2ec2c4152
|
[FDConfig]Remove max_model_len in FDConfig (#4350)
* modify max_model_len
* fix unittest
* fix unittest
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-11 14:04:17 +08:00 |
|
AIbin
|
7b1689f437
|
schedule_bugfix (#4336)
|
2025-10-09 20:40:10 +08:00 |
|
YuanRisheng
|
0d3a57a2c6
|
fix unittest (#4155)
|
2025-09-17 20:20:26 +08:00 |
|
YuanRisheng
|
2e9e53ff7e
|
[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)
* remove max_num_batched_tokens in parallel config
* remove max_num_seqs
* update test case
* fix test
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-09-17 10:43:35 +08:00 |
|
chenjian
|
67e6d8c691
|
[Feature] Set prefix caching as default (#3814)
* Set prefix caching as default
* Set prefix caching as default
* Set prefix caching as default
* skip dynamic load scene
* fix kill bug
* fix kill bug
* fix kill bug
* fix
* fix
* fix ci
|
2025-09-16 20:34:27 +08:00 |
|
chenjian
|
37f1632732
|
[Optimize] optimize prefix cache in develop (#3890)
* optimize prefix cache in release22
* fix
* fix
* fix
* add ci for v1
* add unit test
---------
Co-authored-by: xiegegege <46314656+xiegegege@users.noreply.github.com>
|
2025-09-12 10:15:59 +08:00 |
|