yangjianfengo1
|
329d074326
|
[Docx] fix the broken link (#4479)
* update docs
* update docs
|
2025-10-17 18:28:50 +08:00 |
|
yinwei
|
a64c0408b9
|
[XPU]Fix w4a8 precision bug && rollback moe algo (#4463)
* fix w4a8 precision bug
* add env
* code style check
|
2025-10-17 18:27:53 +08:00 |
|
chen
|
63ef593450
|
check paddle version for v1 loader (#4473)
|
2025-10-17 17:25:03 +08:00 |
|
yzwu
|
4b661512ca
|
[Iluvatar GPU] Adapt VL model (#4313)
|
2025-10-17 16:13:38 +08:00 |
|
yangjianfengo1
|
ba5c2b7e37
|
[Docx] add language (en/cn) switch links (#4470)
* add install docs
* update docs
* update docs
|
2025-10-17 15:47:41 +08:00 |
|
Ayakouji
|
a3e0a15495
|
fix seqlen sync (#4442)
|
2025-10-17 14:37:52 +08:00 |
|
xiaolei373
|
720697e265
|
add environment variables (#4466)
|
2025-10-17 14:20:01 +08:00 |
|
YuBaoku
|
01510876ab
|
[CI] Fix partial instability issues (#4461)
|
2025-10-17 14:17:06 +08:00 |
|
ddchenhao66
|
14785eb65d
|
[XPU] abstract a hardware-agnostic operator wrapper for prefix cache and specify xpu device id definition (#4455)
Co-authored-by: ddchenhao66 <dhaochen163.com>
|
2025-10-17 14:05:33 +08:00 |
|
lizexu123
|
c234b995ab
|
[Feature] support pooling model dummy_run (#4345)
* support qwen3-embedding
* fix ci bug
* support pooling dummy_run
* fix
* delete print
* parallel_config.max_model_len
* delete is_pooling_model in dummy_run
* fix
* fd_model
* fix embedding load
* fix
* fix post_process
|
2025-10-17 13:30:55 +08:00 |
|
Ryan
|
15b6b8dc25
|
[CINN] Remove the restriction of automatically falling back to SOT after enabling CINN (#4411)
* remove CINN limitation
* fix unittest
* fix codestyle
|
2025-10-17 12:51:07 +08:00 |
|
chen
|
b134e6afe6
|
[BugFix]Dev fix custom ar unstable result (#4437)
|
2025-10-17 11:47:16 +08:00 |
|
Ryan
|
6160145f82
|
[SOT] Change warnings to errors and remove fallback operations (#4378)
* Change warnings to errors and remove fallback operations
* fix unittest
* fix codestyle
|
2025-10-17 11:27:04 +08:00 |
|
chenjian
|
0413c32b8f
|
[Optimize] Set preempted schedule log as info level (#4453)
|
2025-10-17 11:25:46 +08:00 |
|
Zero Rains
|
5885953211
|
[Others] add PR Template (#4452)
* add PR Template
* update
* update
* update
* update
* update
* update
|
2025-10-17 11:09:51 +08:00 |
|
Sunny-bot1
|
930f7b781c
|
[Optimization] Put get_block_shape_and_split_kv_block in cuda graph for append attention backend (#4443)
* get block in cuda graph
* fix sot
|
2025-10-17 10:59:56 +08:00 |
|
Ryan
|
49cea8fb1c
|
[SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)
* rm inplace info && to(gpu)
* update append_attention
* unpin paddle version
* add full_cuda_graph=False
* add blank line
---------
Co-authored-by: SigureMo <sigure.qaq@gmail.com>
|
2025-10-17 10:57:55 +08:00 |
|
YuanRisheng
|
a37c9416ac
|
[FDConfig]Remove reasoning_parser/guided_decoding_backend/disable_any_whitespace/device_ids in FDConfig (#4362)
* remove devices id
* fix unittest
* fix ce
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-17 10:40:59 +08:00 |
|
xiaolei373
|
d1637db86a
|
modify_comment (#4460)
|
2025-10-17 10:10:09 +08:00 |
|
chen
|
db82e9a022
|
[BugFix]Fix wfp8afp8 triton moe group_topk renormalized=True (#4449)
* fix group_topk renormalized=True
* check test
|
2025-10-16 23:17:48 +08:00 |
|
xiaolei373
|
dbca63f862
|
[bugfix] kill cache_transfer_manager process (#4401)
|
2025-10-16 20:45:24 +08:00 |
|
YuanRisheng
|
0355235fb9
|
[FDConfig]Remove total_block_num/dtype/block_size/enc_dec_block_num in ParallelConfig (#4400)
* delete some attr in parallel config
* delete comment
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-16 20:00:37 +08:00 |
|
Ryan
|
b87e2c6184
|
[CUDAGraph]Add support for custom all-reduce operators under SOT mode (#4386)
|
2025-10-16 19:31:19 +08:00 |
|
zhupengyang
|
26ff2f8683
|
[XPU] refine fused moe (#4219)
|
2025-10-16 19:04:07 +08:00 |
|
Jianyu Li
|
3bbe99eae7
|
[Intel HPU] Enable dist sampler on intel hpu platform (#4445)
|
2025-10-16 19:02:27 +08:00 |
|
LiqinruiG
|
4251ac5e95
|
【Fix】 remove text_after_process & raw_prediction (#4421)
* remove text_after_process & raw_prediction
* remove text_after_process & raw_prediction
|
2025-10-16 19:00:18 +08:00 |
|
Zhang Yulong
|
8f77adc381
|
Add data dictionary for API response processing (#4454)
Initialize data dictionary for response handling.
|
2025-10-16 17:23:11 +08:00 |
|
Zhenghai Zhang
|
6adfbe07ad
|
【Hackathon 9th No.86】autogen MultiQueryDecoderAttention template_instantiation -part (#4383)
* split MultiQueryDecoderAttention template_instantiation
* update comment
* CI
|
2025-10-16 17:08:19 +08:00 |
|
kevin
|
f72be7a2c8
|
[BUG] fix ep bug (#4275)
* fix ep bug
* update code
* update code
* update code
* [BugFix] fix config bugs (#4370)
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* Update expert_service.py
* Update expert_service.py
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
* update code
---------
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-10-16 16:46:40 +08:00 |
|
SunLei
|
5abf59715d
|
perf: optimize ZMQ communication with async queue and single-threaded… (#4444)
* perf: optimize ZMQ communication with async queue and single-threaded model
* perf: _async_output_busy_loop
* fix: async_output_queue init
|
2025-10-16 15:46:26 +08:00 |
|
Zhang Yulong
|
98f8c3703a
|
Add filtering for failed requests in benchmark outputs (#4448)
Filter out requests with end_timestamp == 0.0
|
2025-10-16 14:57:47 +08:00 |
|
Zhang Yulong
|
9dc3968c13
|
[benchmark] Fix benchmark duration calculation logic (#4446)
* Fix benchmark duration calculation logic
Calculate benchmark duration using filtered outputs.
* Fix benchmark duration calculation using benchmark_outputs
|
2025-10-16 14:36:29 +08:00 |
|
Lucas
|
a5063b96c8
|
[XPU] moe support VL 0-dim input (#4408)
|
2025-10-16 14:01:01 +08:00 |
|
gaoziyuan
|
fd5dd1a0f1
|
[Bugfix]fix ep clear buffer perf (#4389)
* fix
* Update fused_moe_backend_base.py
|
2025-10-16 13:05:39 +08:00 |
|
chenjian
|
670aaa3f83
|
[Bug fix] Fix pd for x1 thinking (#4433)
|
2025-10-16 12:03:45 +08:00 |
|
ddchenhao66
|
8e392f0ea6
|
[XPU] support prefix cache (#4423)
Co-authored-by: ddchenhao66 <dhaochen163.com>
|
2025-10-16 11:27:41 +08:00 |
|
ltd0924
|
5bde20b0c9
|
[BugFix] fix config bugs (#4370)
* Update expert_service.py
* Update common_engine.py
* Update expert_service.py
* Update expert_service.py
* Update expert_service.py
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-10-16 10:25:21 +08:00 |
|
Zhang Yulong
|
7f94f063ff
|
Update benchmark_serving.py (#4438)
Discarded requests are still saved for result analysis
|
2025-10-15 20:36:19 +08:00 |
|
SunLei
|
b4b579a7ed
|
Feature:Add support for Pooling Model Embedding and provide an OpenAI-compatible API. (#4344)
* feat: add OpenAIServing
* feat: add ZmqOpenAIServing & OpenAIServingEmbedding
* feat: Refine the basic ServingEngine class and introduce ServingContext
* fix: codestyle
* fix: request
* fix: pooling_params
* feat: _process_chat_template_kwargs
* feat: support batch request
* feat: pooling_params verify & default parameters
---------
Co-authored-by: sunlei1024 <sunlei1024@example.com>
|
2025-10-15 19:42:59 +08:00 |
|
freeliuzc
|
744287e1a9
|
fix param (#4419)
|
2025-10-15 18:44:24 +08:00 |
|
ltd0924
|
fbdb056de0
|
[BUGFIX] clear request #4286 (#4402)
Co-authored-by: ltd0924 <luotingdan@baidu.com>
|
2025-10-15 17:43:28 +08:00 |
|
Lucas
|
bdc0207277
|
[XPU] fix VL multi-batch accuracy issue (#4394)
|
2025-10-15 17:27:43 +08:00 |
|
ltd0924
|
d8841b7b40
|
[BugFix] fix workers=1 (#4364)
* [Feature] support prefix cache in DP
* fix
* Update common_engine.py
* Update common_engine.py
* Update common_engine.py
* Update common_engine.py
* [BugFix] fix workers more than 1
* fix
* Update api_server.py
* fix
* Update api_server.py
* fix
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com>
|
2025-10-15 17:06:25 +08:00 |
|
bukejiyu
|
bcaa98ff9c
|
V1 loader default (#4251)
* v1 loader
* update
* update
|
2025-10-15 16:49:17 +08:00 |
|
tianshuo78520a
|
e98c1c2f47
|
Disable gcu ci (#4427)
* Disable GCU CI
* Disable GCU CI
* Update _ci_gcu.yml
|
2025-10-15 16:06:25 +08:00 |
|
AIbin
|
6938df9c23
|
【Fix CI Bug】Fix ci bug (#4413)
* Support DSK-v3.2 model
* Support DSK-v3.2
* Support DSK-v3.2
* Support DSK-3.2
* fix CI bug
* fix_CI_BUG
* update ci bug
|
2025-10-15 14:19:04 +08:00 |
|
chen
|
4efd073a41
|
fix block_wise_fp8_v1_loader_moe_shape (#4384)
|
2025-10-15 14:08:53 +08:00 |
|
freeliuzc
|
582aebd48b
|
[MTP]support mtp chunk_prefill_v1 (#4366)
* support mtp chunk_prefill_v1
* fix mtp chunkprefill output, fix unit test
* fix unit test
* fix save_output
|
2025-10-15 13:21:32 +08:00 |
|
李泳桦
|
ffe7af8a97
|
[fix] fix requests & block metrics (#4404)
* [fix] fix requests & block metrics
* [chore] rename variables
|
2025-10-15 11:49:24 +08:00 |
|
qwes5s5
|
abb62624b8
|
[fix] Fixed the issue of excessive/redundant spans being returned for streaming requests. (#4375)
* fix stream span
* fix stream span
|
2025-10-15 11:47:47 +08:00 |
|