Divano
|
91dc87f1c5
|
add some evil cases (#3240)
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
|
2025-08-06 14:23:55 +08:00 |
|
xjkmfa
|
256a82b0b3
|
Add ci case for min token and max token (#3229)
Co-authored-by: xujing43 <xujing43@baidu.com>
|
2025-08-06 14:10:57 +08:00 |
|
Zero Rains
|
36dc73470d
|
Fix the confused enable_early_stop when only set early_stop_config (#3214)
* fix the confused early_stop_config when only set early_stop_config
* pre-commit
* write a general method
|
2025-08-06 11:42:27 +08:00 |
|
YuanRisheng
|
a6e8b780f8
|
fix approve (#3224)
|
2025-08-06 10:36:01 +08:00 |
|
yangjianfengo1
|
89397516a8
|
[New Feature] Support W4Afp8 MoE GroupGemm (#3171)
* init
* add multi-threaded compilation
* fix bug
* fix bug
* code style
* add fp16
* replace print with assert
* fix stmatrix
* reduce unit test shape
* reduce unit test shape
|
2025-08-06 10:34:05 +08:00 |
|
sg263
|
841e831575
|
[Trace]add trace when fd start (#3174)
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix annotation
* fix annotation when add opentelemetry
* fix opentelemetry-instrumentation-fastapi
* fix opentelemetry-bootstrap
* fix opentelemetry can not work in uvicorn
* move conf to env
* fd start add trace
* fix pre-commit
* fix pre-commit
* change FD_JOB_ID
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: shige <shige@baidu.com>
|
2025-08-05 21:18:27 +08:00 |
|
YUNSHEN XIE
|
e0bbd3b6ca
|
fix approve ci (#3212)
|
2025-08-05 17:21:26 +08:00 |
|
Yuan Xiaolan
|
7ce00e597c
|
support qk norm (#3145)
|
2025-08-05 16:46:14 +08:00 |
|
RAM
|
4a10e29804
|
fix mla attention backend (#3176)
|
2025-08-05 16:43:15 +08:00 |
|
Yuan Xiaolan
|
af543b7f0f
|
revise get_moe_scores (#3164)
|
2025-08-05 16:43:07 +08:00 |
|
Divano
|
e24929efa3
|
Ce add bad cases (#3215)
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
|
2025-08-05 16:37:28 +08:00 |
|
lizexu123
|
b01cfd6007
|
[BugFix] support real batch_size (#3109)
* support real bsz
* fix
* fix xpu_model_runner.py,gpu_model_runner.py,gcu_model_runner.py,iluvatar_model_runner.py
* add event_loop_ep
* fix
* Add comments
* fix
* support mtp real_batch_size
* fix
* self.tmp_seq_lens_this_time->self.seq_lens_this_time_buffer
* fix
* fix VL real_seq_lens_this_time
* fix
* fix mtp
* fix
* fix mtp
* fix xpu
* fix
|
2025-08-05 16:33:54 +08:00 |
|
Jiang-Jia-Jun
|
55939f7942
|
Update engine.py
|
2025-08-05 16:10:36 +08:00 |
|
chen
|
04fc7eb931
|
fix test_air_top_p_sampling name (#3211)
|
2025-08-05 15:47:50 +08:00 |
|
Divano
|
9f1936ae28
|
Ce add repetition early stop cases (#3213)
* add repetition early stop cases
* add repetition early stop cases
|
2025-08-05 15:47:28 +08:00 |
|
RichardWooSJTU
|
1e9a8e8cef
|
fix lm head bias (#3185)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-05 15:40:24 +08:00 |
|
RichardWooSJTU
|
f5c64a074c
|
[EP] Refactor DeepEP Engine Organization for Mixed Mode & Buffer Management Optimization (#3182)
* Add support for mixed-ep across multi nodes
* code refine
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-05 15:40:11 +08:00 |
|
ming1753
|
14ed75f7d3
|
[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 (#3210)
|
2025-08-05 15:25:28 +08:00 |
|
yangjianfengo1
|
40f7f3e0d8
|
[New Feature] fa3 supports flash mask (#3184)
* support flash mask
* modify test_flash_mask
* modify test.sh
|
2025-08-05 12:20:48 +08:00 |
|
YUNSHEN XIE
|
b8f3c73aac
|
fix coverage report (#3198)
* fix coverage report
* fix
|
2025-08-05 11:24:55 +08:00 |
|
Divano
|
fb7a0689cc
|
add more cases (#3207)
|
2025-08-05 11:17:36 +08:00 |
|
RAM
|
c593e1a39c
|
[Bug Fix]Fix bug of append attention test case (#3202)
|
2025-08-05 11:04:45 +08:00 |
|
RichardWooSJTU
|
e39159f3bd
|
Add switch to apply fine-grained per token quant fp8 (#3192)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-04 19:54:03 -07:00 |
|
Divano
|
88596c0c63
|
Add more base chat cases (#3203)
* add test base class
* fix codestyle
* fix codestyle
* add base chat
|
2025-08-05 10:24:12 +08:00 |
|
lizhenyun01
|
fe540f6caa
|
[plugin] Custom model_runner/model support (#3186)
* support custom model&&model_runner
* fix merge
* add test && update doc
* fix codestyle
* fix unittest
* load model in rl
|
2025-08-04 18:52:39 -07:00 |
|
Sunny-bot1
|
72ef5a9c93
|
[FIX]fix bad_words when sending requests consecutively (#3197)
* fix bad_words
* fix log
* fix log
|
2025-08-04 05:59:41 -07:00 |
|
Yuan Xiaolan
|
1f8289e106
|
fix expertwise_scale (#3181)
|
2025-08-04 20:06:15 +08:00 |
|
YuBaoku
|
3eb9a5df60
|
[CI] add test_compare_top_logprobs (#3191)
|
2025-08-04 19:49:24 +08:00 |
|
SunLei
|
68bc1d12c0
|
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test. (#3195)
|
2025-08-04 19:23:58 +08:00 |
|
Longzhi Wang
|
01d7586661
|
[Bug fix] Fix cudagraph when use ep. (#3130)
* fix cudagraph when use ep
* fix typo
* reduce full length to adapt to large bsz such as 128/256
|
2025-08-04 18:06:18 +08:00 |
|
周周周
|
2bd8a50649
|
remove useless code (#3166)
|
2025-08-04 18:03:08 +08:00 |
|
gaoziyuan
|
0443587a57
|
[Feature] support qwen3 name_mapping (#3179)
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
* support qwen3 name_mapping
|
2025-08-04 01:34:07 -07:00 |
|
Zero Rains
|
17f51f0c92
|
[unittest] fix the bug in test_sampler (#3157)
|
2025-08-04 01:23:25 -07:00 |
|
YuanRisheng
|
79bbacc152
|
Fix approve shell scripts (#3108)
* fix approve
* fix
|
2025-08-04 15:51:33 +08:00 |
|
Divano
|
3bfb2eca92
|
Update test_base_chat.py (#3183)
|
2025-08-04 15:09:53 +08:00 |
|
ltd0924
|
c9e6ce1518
|
Update cache_messager.py (#3172)
|
2025-08-04 14:32:34 +08:00 |
|
gaoziyuan
|
4021d66ea5
|
[Feature] add fd plugins && rm model_classes (#3123)
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
|
2025-08-03 19:53:20 -07:00 |
|
bukejiyu
|
1582814905
|
fix load_pre_sharded_checkpoint (#3152)
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-08-04 10:44:20 +08:00 |
|
Divano
|
66d3bb89ad
|
Update __init__.py (#3163)
Improve test base class compatibility
|
2025-08-04 09:40:09 +08:00 |
|
AIbin
|
22fe695f1c
|
[Inference Optimize] Support automatic generation of marlin kernel (#3149)
* Support automatic generation of marlin kernel
|
2025-08-01 22:43:18 +08:00 |
|
ApplEOFDiscord
|
b71cbb466d
|
[Feature] remove dependency on enable_mm and refine multimodal's code (#3014)
* remove dependency on enable_mm
* fix codestyle check error
* fix codestyle check error
* update docs
* resolve conflicts on model config
* fix unit test error
* fix code style check error
---------
Co-authored-by: shige <1021937542@qq.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-08-01 20:01:18 +08:00 |
|
plusNew001
|
243394044d
|
[XPU]Update XPU dockerfiles (#3144)
* [CI] add xpu ci case
* [CI]Update run_ci_xpu.sh
* [XPU]Update Dockerfile.xpu
* Update Dockerfile.xpu
|
2025-08-01 19:41:59 +08:00 |
|
Zhang Yulong
|
0eb32bb9c8
|
add cases (#3155)
|
2025-08-01 18:38:57 +08:00 |
|
yangjianfengo1
|
64d7a3194d
|
Support fa3 in centralized deployment (#3112)
|
2025-08-01 18:03:36 +08:00 |
|
YUNSHEN XIE
|
bdb83e007d
|
fix ci (#3141)
|
2025-08-01 17:42:26 +08:00 |
|
Divano
|
50db0d7ba9
|
add case (#3150)
* add test base class
* fix codestyle
* fix codestyle
* add base chat
|
2025-08-01 17:30:58 +08:00 |
|
Ryan
|
94264bbf60
|
[Code Simplification] Refactor Post-processing in VL Model Forward Method (#2937)
* rm sth useless
* refactor model forward
* mv bool index to kernel
|
2025-08-01 17:28:07 +08:00 |
|
yinwei
|
3a4db15765
|
Fix out-of-memory issue during single-XPU deployment (#3133)
|
2025-08-01 17:12:03 +08:00 |
|
JYChen
|
c34088b0fd
|
fix stop seq unittest (#3126)
|
2025-08-01 16:50:05 +08:00 |
|
ming1753
|
fc5f43c6bc
|
[Docs] Optimal Deployment (#2768)
|
2025-08-01 11:56:27 +08:00 |
|