Yonghua Li
6961130e04
[Cherry-Pick] [BugFix] fix scheduler hang when input length is very close to max_model_len ( #5394 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
* [fix] fix scheduler hang when input length is very close to max_model_len
* [fix] update local_scheduler for v1 scheduler
* [fix] code style
2025-12-05 21:51:59 +08:00
chen
bce3739a57
[BugFix] fix v1_loader for wint8 rl ( #5224 )
* fix v1_loader for wint8 rl
* check
2025-11-26 21:19:54 +08:00
Yonghua Li
52e5db9983
[BugFix] fix num_requests_running after clear_data ( #4923 )
* [BugFix] fix num_requests_running after clear_data
* [fix] fix tasks_list and stop flags not cleared when _free_blocks failed
2025-11-13 13:49:42 +08:00
RevL147
b0d213f750
fix token_type_ids for eb45-vl ( #4775 )
2025-11-06 14:19:57 +08:00
ApplEOFDiscord
359dec7431
process transparent image ( #4832 )
2025-11-06 13:42:36 +08:00
ApplEOFDiscord
a7562ddf4b
http get retry ( #4770 )
2025-11-03 16:52:26 +08:00
李泳桦
f6f9c12b87
[fix] fix ipc signal suffix for ep ( #4324 )
2025-10-20 16:19:29 +08:00
chen
8d2aaf3ba4
[cp][Loader] 2.2 check paddle version for v1 loader ( #4478 )
* check
* check
* check import
2025-10-20 15:27:59 +08:00
chen
f660188a85
[cp][BugFix]2.2_fix_custom_ar_unstable_result ( #4436 )
* [BugFix]Dev fix custom ar unstable result (#4437 )
* code check
2025-10-17 16:04:54 +08:00
ApplEOFDiscord
4178c110d2
[Bug Fix] fix outdated doc and disable mm model prefix caching ( #4425 )
* fix outdated doc and disable mm model prefix caching
* fix outdated doc and disable mm model prefix caching
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-10-16 11:10:33 +08:00
chen
adeee84dd6
fix block_wise_fp8_v1_loader_moe_shape ( #4385 )
2025-10-15 14:23:38 +08:00
李泳桦
e0946ae128
[fix] fix requests & block metrics ( #4325 )
* [fix] fix requests & block metrics
* [chore] rename variables
2025-10-15 11:19:20 +08:00
Jiang-Jia-Jun
836ba294fc
Remove unused import in engine_client.py ( #3961 )
Removed unused import statement for model_executor.
2025-10-11 10:50:03 +08:00
gaoziyuan
b489943261
Update rollout_model.py ( #4347 )
2025-10-10 16:21:05 +08:00
ltd0924
e42dc8c694
[BUGFIX] clear request ( #4320 )
Co-authored-by: ltd0924 <luotingdan@baidu.com>
2025-09-29 20:37:58 +08:00
chen
63a03ee152
[feature]2.2 custom_allreduce support cudagraph recapture ( #4307 )
* custom_allreduce support cudagraph recapture
* delete code
* add shut_down/restart default group
2025-09-29 18:14:21 +08:00
kxz2002
9cc2c99539
initial commit ( #4304 )
2025-09-29 11:21:57 +08:00
luukunn
31e32b5821
[fix]remove reasoning_max_tokens=max_tokens*0.8 in sampling_params ( #4294 )
* [fix]Modify follow-up push parameters and Modify the verification method for thinking length (#4086 )
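* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method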
* add completion_token_ids
* add logger
* fix reasoning_max_tokens ParameterError
* add unittest
* add unittest
* add unittest
* add unittest
* add unittest
* add unit test
* fix
* [fix]update apply_chat_template (#4137 )
* update apply_chat_template
* fix unittest
* fix unittest
* fix
* fix
* fix unit test
* fix
* fix unit test
* add unit test
* fix reasoning_max_tokens
2025-09-28 14:44:54 +08:00
luukunn
aebe12a58d
[fix]update apply_chat_template ( #4249 )
* [fix]Modify follow-up push parameters and Modify the verification method for thinking length (#4086 )
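* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method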
* add completion_token_ids
* add logger
* fix reasoning_max_tokens ParameterError
* add unittest
* add unittest
* add unittest
* add unittest
* add unittest
* add unit test
* fix
* [fix]update apply_chat_template (#4137 )
* update apply_chat_template
* fix unittest
* fix unittest
* fix
* fix
* fix unit test
* fix
* fix unit test
* add unit test
2025-09-25 16:41:56 +08:00
chen
8fdb950e9f
include_stop_str_in_output=False not return eos text ( #4231 )
2025-09-24 14:07:30 +08:00
Zhong Hui
a460462d2a
fix ernie vl distributed attr. ( #4217 )
2025-09-23 19:37:38 +08:00
李泳桦
cb8d87b945
[fix] fix clearing caches synchronization and add more logs ( #4212 )
* [fix] fix clearing caches synchronization and add more logs
* [chore] print cache_ready_signal in log
2025-09-23 19:36:38 +08:00
ltd0924
de4feff147
[Feature]CP support data clear ( #4214 )
* Update serving_chat.py
* Update serving_completion.py
* Update serving_completion.py
* mv connection_manager init
* [BugFix] fix kv cache
* fix format
* [Feature] support clear data
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: RAM <gstian5555@outlook.com>
2025-09-23 16:53:39 +08:00
chen
f38b174a75
Fix noaux_tc cuda Error 700 in CUDAGraph and Add wfp8apf8 moe quant method ( #4115 )
* improve per_token_quant_fp8 performance
* support moe wfp8apf8
* check glm test
* fix noaux_tc op in cudagraph, support noaux_tc return the correct
* check
* check inf and overwrite score in noaux_tc
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-22 21:27:37 +08:00
luukunn
6b47773bd6
[fix]Modify follow-up push parameters and Modify the verification method for thinking length ( #4177 )
* [fix]Modify follow-up push parameters and Modify the verification method for thinking length (#4086 )
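* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method
* Rename the follow-up push parameter generated_token_ids to completion_token_ids; change the thinking-length validation method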
* add completion_token_ids
* add logger
* fix reasoning_max_tokens ParameterError
* add unittest
* add unittest
* add unittest
* add unittest
* add unittest
* add unit test
* fix
2025-09-22 21:12:05 +08:00
李泳桦
0358329946
[fix] initialize available_gpu_block_num with max_gpu_block_num ( #4193 )
2025-09-22 18:56:00 +08:00
RAM
01f6934162
[Executor] Adjust signal sending order in RL training ( #3773 ) ( #4066 ) ( #4178 )
* Adjust processing order
* fix bug
* fix update_parameters bug
* refine code
2025-09-22 14:31:36 +08:00
chen
7bdc6f41e5
fix glm all_reduce tp group ( #4188 )
2025-09-22 10:57:13 +08:00
ltd0924
bba279cf38
[Feature] support rdma IB transfer ( #4123 )
* Update serving_chat.py
* Update serving_completion.py
* Update serving_completion.py
* mv connection_manager init
* [BugFix] fix kv cache
* fix format
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-09-19 12:54:49 +08:00
Sunny-bot1
4f460db556
[CP2.2] Machete support group scale & wint8 & v1 loader ( #4166 )
* support v1 loader for machete (#3999 )
* [Optimize] Support WINT8 and group scale for Machete (#3905 )
* [Optimize] Machete using group scale default (#4121 )
2025-09-19 11:13:12 +08:00
JYChen
74d7b9151d
fix mtp ( #4153 )
Co-authored-by: YuanRisheng <yuanrisheng@baidu.com>
2025-09-18 10:53:07 +08:00
李泳桦
0fa28b1068
[fix] fix ep group all-reduce ( #4140 )
* [fix] fix ep group all-reduce
* [fix] fix clear/update lock not working when workers > 1
* [chore] add preemption triggered info log
* [fix] fix code style
* fix model_weights_signal (#4092 )
* fix model_weights_signal
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
2025-09-18 10:34:49 +08:00
Jiang-Jia-Jun
cffde70949
Add assertion for ENABLE_V1_KVCACHE_SCHEDULER ( #4146 )
2025-09-17 16:02:56 +08:00
K11OntheBoat
7f9a9b37f3
Support limit thinking lengths ( #4070 )
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>
2025-09-17 12:40:08 +08:00
gaoziyuan
b41988f4bc
fix gid ( #4038 )
2025-09-16 20:56:36 +08:00
李泳桦
7ccbcc5a62
[feat] support prefix cache clearing when /clear_load_weight is called ( #4091 )
* [feat] support clearing prefix cache (cherry-picked from release/2.1)
* [fix] fix ipc suffix, use port instead
* [fix] fix prefix caching not enabled
* [fix] fix code style
* [fix] wait for rank0 to update weight status
2025-09-16 11:11:20 +08:00
chen
fbb4e0f8d1
[CP]Glm45 air 2.2 ( #4073 )
* [Feature] Support zai-org/GLM-4.5-Air BF16 model (#3928 )
* support glm45_air
* [Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) (#4051 )
* check
* fix v1 load for mix and wint8
* check --quantizations 'None'
* check
* support RL rollout
* check v1 loader
* check glm rollout_model, change wfp8afp8 per_token_cast_to_fp8 to native impl
* check rollout moe gate begin layer_id
* check rollout e_score_correction_bias
* delete infer_to_train_mapping={}
* code check
2025-09-15 18:52:58 +08:00
chenjian
4f8ff478b3
[Feature] Support mixed deployment with yiyan adapter in release22 ( #3974 )
* [Feature] Support mixed deployment with yiyan adapter in release2.2
* [Feature] Support mixed deployment with yiyan adapter in release2.2
* fix metrics
* add unit test
* add unit test
* add unit test
* add unit test
* add unit test
* add unit test
2025-09-10 16:01:13 +08:00
guozhuangzhuang
c4098d56a0
Fixed the issue of metrics file conflicts between multiple instances … ( #4010 )
* Fixed the issue of metrics file conflicts between multiple instances on a single machine
* Use uuid to name the metrics shared folder
* Use uuid to name the metrics shared folder
2025-09-10 13:48:24 +08:00
ltd0924
a6b161b007
[Fix] fix multi api server log dir ( #3966 )
* fix scheduler bug
* fix
* Update api_server.py
* Update multi_api_server.py
* [Fix]
2025-09-10 13:48:17 +08:00
Yuanle Liu
7272afe3dc
Fix down projection weight shape in fused MOE layer ( #4041 )
2025-09-10 12:49:03 +08:00
yangjianfengo1
dfc94371ee
【FIX】Change the name of sparse attn from moba to plas ( #4006 )
* update docs
* 【docs】 update readme (#4000 )
* update docs
* update readme
* update docs
* 【FIX】Change the name of sparse attn from moba to plas (#3845 )
* update docs
* update docs
* update docs
* update docs
* rename moba to plas
* code style
* update ci
* code style
* update ci
* code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-10 10:04:29 +08:00
Zero Rains
35b8362804
get org_vocab_size from args ( #3984 )
2025-09-09 15:07:51 +08:00
zhuzixuan
d43c2f2577
[Optimize]Error messages about Model api. ( #3839 ) ( #3972 )
* add v1/models interface related
* add model parameters
* default model verification
* unit test
* check model err_msg
* unit test
* type annotation
* model parameter in response
* modify document description
* modify document description
* unit test
* verification
* verification update
* model_name
* pre-commit
* update test case
* update test case
* Update tests/entrypoints/openai/test_serving_models.py
* Update tests/entrypoints/openai/test_serving_models.py
* Update tests/entrypoints/openai/test_serving_models.py
* Update tests/entrypoints/openai/test_serving_models.py
* Update fastdeploy/entrypoints/openai/serving_models.py
* Improve error messages.
---------
Co-authored-by: yangzichao01 <yangzichao01@baidu.com>
Co-authored-by: Yzc216 <101054010+Yzc216@users.noreply.github.com>
Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-09 10:58:11 +08:00
lizhenyun01
d40a1046de
[Feature] support rl_tp_degree ( #3934 )
* [Feature] support rl_tp_degree
* add rl_tp_degree in lmhead
* add rl_tp_degree in bias
* fix split_axis=0 in bias
* fix split_axis in weight
* fix bias rl_tp_degree
* fix bias rl_tp_degree
* change attr to dict
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-08 16:20:32 +08:00
luukunn
1023a67765
[BugFix] fix default parser ( #3932 )
* add reasoning parser plugin
* fix finish reason
* fix default parser
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-08 14:12:13 +08:00
Zero Rains
d43549953c
[Cherry-Pick][Bug Fix]fix the bug for real size 0 in cudagraph ( #3888 )
* fix the bug for real size 0 in cudagraph
* fix cache_messager
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-09-08 14:06:10 +08:00
ming1753
d6bf6de5e6
[Bug Fix] Fix mm performance degradation ( #3942 )
* [Bug Fix] Fix mm performance degradation
* format
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: chenjian <1435317881@qq.com>
2025-09-08 00:32:22 +08:00
chenjian
38e734e183
[Feature] support hierarchical cache in v1 ( #3939 )
2025-09-08 00:31:34 +08:00
chenjian
b2bb37d7c0
[Fix] when prompt token ids is numpy ( #3944 )
2025-09-07 23:02:03 +08:00