JYChen
dafe02a7b9
[stop sequence] support stop sequence ( #3025 )
...
* stop seqs in multi-ends
* unittest for gpu stop op
* kernel tid==0
2025-07-29 14:17:37 +08:00
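Below is a minimal client-side sketch of using stop sequences against a FastDeploy OpenAI-compatible server; the URL, port, and model name are placeholders, not taken from this log.

    from openai import OpenAI

    # Placeholder endpoint and model name; adjust to the actual deployment.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "Count from 1 to 10."}],
        stop=["5"],  # generation halts once a stop sequence is emitted
    )
    print(resp.choices[0].message.content)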
李泳桦
69996a40da
[feat] add disable_chat_template in chat api as a substitute for previous raw_request ( #3020 )
...
* [feat] add disable_chat_template in chat api as a substitute for previous raw_request
* [fix] pre-commit code check
2025-07-25 20:57:32 +08:00
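A hedged sketch of how a client might set the new disable_chat_template flag from #3020; it is not a standard OpenAI field, so it is passed through extra_body, and the server-side semantics are assumed from the commit title only.

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "Text that is already fully formatted"}],
        # Assumed field name from #3020: skip server-side chat template rendering.
        extra_body={"disable_chat_template": True},
    )
    print(resp.choices[0].message.content)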
EnflameGCU
7634ffb709
[GCU] Add CI ( #3006 )
2025-07-25 10:59:29 +08:00
Zero Rains
0fb37ab7e4
update flake8 version to support pre-commit in python3.12 ( #3000 )
...
* update flake8 version to support pre-commit in python3.12
* polish code
2025-07-24 01:43:31 -07:00
Yzc216
e14587a954
[Feature] multi-source download ( #2986 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
2025-07-24 14:26:37 +08:00
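The revision argument added in #2986 pins a model download to a specific commit, tag, or branch; as an illustration only (this is not FastDeploy's internal downloader), the same idea with huggingface_hub looks like:

    from huggingface_hub import snapshot_download

    # Placeholder repo id; revision may be a branch name, tag, or commit hash.
    local_dir = snapshot_download(repo_id="org/model-name", revision="main")
    print(local_dir)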
李泳桦
8a619e9db5
[Feature] Add return_token_ids, prompt_token_ids, and delete training, raw_request in request body ( #2940 )
...
* [feat] add return_token_ids, prompt_token_ids, delete raw_request in request body
* [fix] return_token_ids not working in curl request
* [test] improve some test cases of return_token_ids and prompt_token_ids
* [fix] the server responds ok even if request.messages is an empty list
2025-07-21 19:31:14 +08:00
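A sketch of how a client might opt in to token ids after #2940; the field name follows the commit title, while the exact response layout is an assumption and is therefore not printed here.

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "Hello"}],
        # Assumed field name from #2940; the server is expected to attach
        # prompt and completion token ids to its response.
        extra_body={"return_token_ids": True},
    )
    print(resp.choices[0].message.content)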
Yuanle Liu
2f74e93d7e
use dist.all_reduce(min) to sync num_blocks_local ( #2933 )
...
* pre-commit all files check
* reduce min num_blocks_local
* fix nranks=1
* pre-commit when commit-msg
2025-07-21 01:23:36 -07:00
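The idea in #2933 is that every rank computes its own num_blocks_local and the group then agrees on the minimum, so no rank over-allocates KV cache blocks; a minimal sketch with paddle.distributed (the per-rank value is a placeholder):

    import paddle
    import paddle.distributed as dist

    # Run under paddle.distributed.launch; each rank starts with its own count.
    dist.init_parallel_env()
    num_blocks_local = paddle.to_tensor([1234], dtype="int64")  # placeholder per-rank value
    dist.all_reduce(num_blocks_local, op=dist.ReduceOp.MIN)     # sync to the group-wide minimum
    print(num_blocks_local.item())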
lizexu123
67990e0572
[Feature] support min_p_sampling ( #2872 )
...
* FastDeploy supports min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py
2025-07-20 23:17:59 -07:00
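Min-p sampling keeps only tokens whose probability is at least min_p times the most likely token's probability (the commit aligns its usage with vLLM); a small NumPy sketch of that filter:

    import numpy as np

    def min_p_filter(logits: np.ndarray, min_p: float) -> np.ndarray:
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        keep = probs >= min_p * probs.max()
        return np.where(keep, logits, -np.inf)  # masked tokens can never be sampled

    print(min_p_filter(np.array([2.0, 1.0, 0.1, -1.0]), min_p=0.2))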
gaoziyuan
95a214ae43
support trainer_degree in name_mapping ( #2935 )
2025-07-20 23:12:55 -07:00
YuanRisheng
bce2c6cd7c
rename test dir ( #2934 )
2025-07-21 14:05:45 +08:00
liddk1121
17c5d3a241
[Iluvatar GPU] Add CI scripts ( #2876 )
2025-07-21 09:44:42 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
ZhangYulongg
b8676d71a8
update ci cases
2025-07-18 21:44:07 +08:00
ZhangYulongg
43976138de
update ci cases
2025-07-18 21:44:07 +08:00
ZhangYulongg
e546e6b1b0
update ci cases
2025-07-18 21:44:07 +08:00
ZhangYulongg
eb77b1be6d
update ci cases
2025-07-18 21:44:07 +08:00
Jiang-Jia-Jun
fbe3547c95
[Feature] Support include_stop_str_in_output in chat/completion ( #2910 )
...
* [Feature] Support include_stop_str_in_output in chat/completion
* Add ci test for include_stop_str_in_output
* Update version of openai
* Fix ci test
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-18 16:59:18 +08:00
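A client-side sketch of the include_stop_str_in_output option from #2910, passed via extra_body because it is not a standard OpenAI field; endpoint and model name are placeholders.

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "List three fruits, then say DONE."}],
        stop=["DONE"],
        extra_body={"include_stop_str_in_output": True},  # keep the matched stop string in the text
    )
    print(resp.choices[0].message.content)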
ming1753
1f15ca21e4
[Feature] support prompt repetition_penalty ( #2806 )
2025-07-17 12:05:52 +08:00
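A repetition penalty over prompt tokens typically divides positive logits and multiplies negative logits of already-seen tokens by the penalty; the sketch below shows that common scheme, not FastDeploy's actual kernel.

    import numpy as np

    def apply_repetition_penalty(logits: np.ndarray, seen_ids: set, penalty: float) -> np.ndarray:
        out = logits.copy()
        for tid in seen_ids:  # tokens that appeared in the prompt (or prior output)
            out[tid] = out[tid] / penalty if out[tid] > 0 else out[tid] * penalty
        return out

    print(apply_repetition_penalty(np.array([3.0, -1.0, 0.5]), seen_ids={0, 1}, penalty=1.2))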
RAM
0fad10b35a
[Executor] CUDA Graph support padding batch ( #2844 )
...
* cuda graph support padding batch
* Integrate the startup parameters for the graph optimization backend and provide support for user-defined capture sizes.
* Do not insert max_num_seqs when the user specifies a capture list
* Support set graph optimization config from YAML file
* update cuda graph ci
* fix ci bug
* fix ci bug
2025-07-15 19:49:01 -07:00
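Padding a batch for CUDA Graph replay means running it on the smallest captured size that fits it; a toy sketch of that selection (the capture sizes are placeholders, configurable per #2844 via a user-defined capture list or YAML):

    def pick_capture_size(batch_size: int, capture_sizes: list) -> int:
        # Smallest captured graph size that can hold the incoming batch.
        for size in sorted(capture_sizes):
            if size >= batch_size:
                return size
        raise ValueError("batch larger than any captured size")

    print(pick_capture_size(3, [1, 2, 4, 8, 16]))  # -> 4; the batch is padded up to 4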
xiegegege
16940822a7
add result save for ci ( #2824 )
...
LGTM
2025-07-12 23:34:46 +08:00
littledgg
59071268b6
[Executor] Move forward_meta.py to fastdeploy/model_executor ( #2774 )
...
* Use PEP 563 in attention.py and fix conflict
* merge commit
* Change what was left out last time
2025-07-10 20:36:51 +08:00
RAM
03a74995b8
Clear dead code and add supplementary notes ( #2757 )
...
* 1. supplementary notes 2. delete dead code
* fix bug of forward meta
* Global modification of forward meta
* fix vl model_runner bug
2025-07-09 16:17:34 +08:00
xiegetest
f6ffbc3cbd
add precision check for ci ( #2732 )
...
* add precision check for ci
* add precision check for ci
* add precision check for ci
* add precision check for ci
---------
Co-authored-by: xiegegege <xiege01@baidu.com>
2025-07-08 18:43:53 +08:00
YuBaoku
dacc46f04c
[CI] Add validation for MTP and CUDAGraph ( #2710 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
2025-07-04 18:13:54 +08:00
LQX
11cfdf5d89
Add XPU CI, test=model ( #2701 )
...
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
2025-07-04 16:16:06 +08:00
YuBaoku
bb880c8d7c
Update CI test cases ( #2671 )
...
* set git identity to avoid merge failure in CI
* add ci cases
2025-07-02 15:08:39 +08:00
YUNSHEN XIE
d5af78945b
Add ci ( #2650 )
...
* add ci ut and workflow
* Automatically cancel any previous CI runs for the ci.yml workflow, keeping only the latest one active
2025-06-30 20:20:49 +08:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
XieYunshen
0825146538
add ci ut and workflow
2025-06-16 02:18:00 +08:00
jiangjiajun
149c79699d
[LLM] First commit the llm deployment code
2025-06-16 00:04:48 +08:00