YuBaoku
|
55ac449c31
|
[CI] remove useless case (#3261)
|
2025-08-07 15:09:40 +08:00 |
|
RAM
|
820798aec5
|
[Executor]Update graph test case and delete test_attention (#3257)
* 1.update graph test case 2.delete test_attention
* code style
* delete print
|
2025-08-07 14:05:15 +08:00 |
|
YuanRisheng
|
0074b423a9
|
fix ci bug (#3239)
|
2025-08-07 11:32:39 +08:00 |
|
hong19860320
|
93a1731891
|
[Doc] Update deps and fix dead links (#3252)
|
2025-08-07 11:04:31 +08:00 |
|
李泳桦
|
09cc4e2802
|
[fix] fix completion stream api output_tokens not in usage (#3247)
|
2025-08-07 10:36:00 +08:00 |
|
Yzc216
|
d9e3f88f9e
|
[Feature] multi source download (#3125)
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
* modify model download path
* add requirements
* error optimization
* fallback on connection failure
* fallback on connection failure
* fallback on connection failure
* unit test
* unit test
* unit test
* test
* test
|
2025-08-07 00:40:27 +08:00 |
|
bukejiyu
|
9408e667a5
|
[bugfix]fix blockwisefp8 and all_reduce (#3243)
* fix
* update
* fix linear for prequant loader
|
2025-08-06 23:54:33 +08:00 |
|
yangjianfengo1
|
3a15e0c53e
|
【Fix Bug】 Fix fa3 centralized-mode support bug (#3235)
* fix fa3 centralized-mode bug
* add qknorm parameter
|
2025-08-06 16:24:27 +08:00 |
|
lizexu123
|
afff4d37ea
|
[Feature] support seed parameter (#3161)
* support seed
* fix
* add SamplingMetadata seed test
* The next_tokens values are inconsistent!
* add air and rejection seed test
* fix
* add SamplingParams seed test
* fix seed=0
* Default to default
* fix
* fix args_utils
* fix review
* fix review
* fix
* fix
* add xpu,gcu,iluvatar support seed
* fix
|
2025-08-06 15:20:47 +08:00 |
|
bukejiyu
|
20839abccf
|
qwen3_moe (#3084)
|
2025-08-06 14:45:27 +08:00 |
|
Divano
|
91dc87f1c5
|
add some evil cases (#3240)
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
* add evil cases
|
2025-08-06 14:23:55 +08:00 |
|
xjkmfa
|
256a82b0b3
|
Add ci case for min token and max token (#3229)
Co-authored-by: xujing43 <xujing43@baidu.com>
|
2025-08-06 14:10:57 +08:00 |
|
Zero Rains
|
36dc73470d
|
Fix the confused enable_early_stop when only set early_stop_config (#3214)
* fix the confused early_stop_config when only set early_stop_config
* pre-commit
* write a general method
|
2025-08-06 11:42:27 +08:00 |
|
YuanRisheng
|
a6e8b780f8
|
fix approve (#3224)
|
2025-08-06 10:36:01 +08:00 |
|
yangjianfengo1
|
89397516a8
|
[New Feature] Support W4Afp8 MoE GroupGemm (#3171)
* init
* add multi-threaded compilation
* fix bug
* fix bug
* code style
* add fp16
* replace print with assert
* fix stmatrix
* reduce unit test shape
* reduce unit test shape
|
2025-08-06 10:34:05 +08:00 |
|
sg263
|
841e831575
|
[Trace]add trace when fd start (#3174)
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix annotation
* fix annotation when add opentelemetry
* fix opentelemetry-instrumentation-fastapi
* fix opentelemetry-bootstrap
* fix opentelemetry can not work in uvicorn
* move conf to env
* fd start add trace
* fix pre-commit
* fix pre-commit
* change FD_JOB_ID
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: shige <shige@baidu.com>
|
2025-08-05 21:18:27 +08:00 |
|
YUNSHEN XIE
|
e0bbd3b6ca
|
fix approve ci (#3212)
|
2025-08-05 17:21:26 +08:00 |
|
Yuan Xiaolan
|
7ce00e597c
|
support qk norm (#3145)
|
2025-08-05 16:46:14 +08:00 |
|
RAM
|
4a10e29804
|
fix mla attention backend (#3176)
|
2025-08-05 16:43:15 +08:00 |
|
Yuan Xiaolan
|
af543b7f0f
|
revise get_moe_scores (#3164)
|
2025-08-05 16:43:07 +08:00 |
|
Divano
|
e24929efa3
|
Ce add bad cases (#3215)
* add repetition early stop cases
* add repetition early stop cases
* add bad cases
* add bad cases
|
2025-08-05 16:37:28 +08:00 |
|
lizexu123
|
b01cfd6007
|
[BugFix] support real batch_size (#3109)
* support real bsz
* fix
* fix xpu_model_runner.py,gpu_model_runner.py,gcu_model_runner.py,iluvatar_model_runner.py
* add event_loop_ep
* fix
* Add comments
* fix
* support mtp real_batch_size
* fix
* self.tmp_seq_lens_this_time->self.seq_lens_this_time_buffer
* fix
* fix VL real_seq_lens_this_time
* fix
* fix mtp
* fix
* fix mtp
* fix xpu
* fix
|
2025-08-05 16:33:54 +08:00 |
|
Jiang-Jia-Jun
|
55939f7942
|
Update engine.py
|
2025-08-05 16:10:36 +08:00 |
|
chen
|
04fc7eb931
|
fix test_air_top_p_sampling name (#3211)
|
2025-08-05 15:47:50 +08:00 |
|
Divano
|
9f1936ae28
|
Ce add repetition early stop cases (#3213)
* add repetition early stop cases
* add repetition early stop cases
|
2025-08-05 15:47:28 +08:00 |
|
RichardWooSJTU
|
1e9a8e8cef
|
fix lm head bias (#3185)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-05 15:40:24 +08:00 |
|
RichardWooSJTU
|
f5c64a074c
|
[EP] Refactor DeepEP Engine Organization for Mixed Mode & Buffer Management Optimization (#3182)
* Add support for mixed-ep across multi nodes
* code refine
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-05 15:40:11 +08:00 |
|
ming1753
|
14ed75f7d3
|
[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 (#3210)
|
2025-08-05 15:25:28 +08:00 |
|
yangjianfengo1
|
40f7f3e0d8
|
[New Feature] fa3 supports flash mask (#3184)
* support flash mask
* modify test_flash_mask
* modify test.sh
|
2025-08-05 12:20:48 +08:00 |
|
YUNSHEN XIE
|
b8f3c73aac
|
fix coverage report (#3198)
* fix coverage report
* fix
|
2025-08-05 11:24:55 +08:00 |
|
Divano
|
fb7a0689cc
|
add more cases (#3207)
|
2025-08-05 11:17:36 +08:00 |
|
RAM
|
c593e1a39c
|
[Bug Fix]Fix bug of append attention test case (#3202)
|
2025-08-05 11:04:45 +08:00 |
|
RichardWooSJTU
|
e39159f3bd
|
Add switch to apply fine-grained per token quant fp8 (#3192)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
|
2025-08-04 19:54:03 -07:00 |
|
Divano
|
88596c0c63
|
Add more base chat cases (#3203)
* add test base class
* fix codestyle
* fix codestyle
* add base chat
|
2025-08-05 10:24:12 +08:00 |
|
lizhenyun01
|
fe540f6caa
|
[plugin] Custom model_runner/model support (#3186)
* support custom model&&model_runner
* fix merge
* add test && update doc
* fix codestyle
* fix unittest
* load model in rl
|
2025-08-04 18:52:39 -07:00 |
|
Sunny-bot1
|
72ef5a9c93
|
[FIX]fix bad_words when sending requests consecutively (#3197)
* fix bad_words
* fix log
* fix log
|
2025-08-04 05:59:41 -07:00 |
|
Yuan Xiaolan
|
1f8289e106
|
fix expertwise_scale (#3181)
|
2025-08-04 20:06:15 +08:00 |
|
YuBaoku
|
3eb9a5df60
|
[CI] add test_compare_top_logprobs (#3191)
|
2025-08-04 19:49:24 +08:00 |
|
SunLei
|
68bc1d12c0
|
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test. (#3195)
|
2025-08-04 19:23:58 +08:00 |
|
Longzhi Wang
|
01d7586661
|
[Bug fix] Fix cudagraph when use ep. (#3130)
* fix cudagraph when use ep
* fix typo
* reduce full length to adapt to large bsz such as 128/256
|
2025-08-04 18:06:18 +08:00 |
|
周周周
|
2bd8a50649
|
remove useless code (#3166)
|
2025-08-04 18:03:08 +08:00 |
|
gaoziyuan
|
0443587a57
|
【Feature】support qwen3 name_mapping (#3179)
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
* support qwen3 name_mapping
|
2025-08-04 01:34:07 -07:00 |
|
Zero Rains
|
17f51f0c92
|
[unittest] fix the bug in test_sampler (#3157)
|
2025-08-04 01:23:25 -07:00 |
|
YuanRisheng
|
79bbacc152
|
Fix approve shell scripts (#3108)
* fix approve
* fix
|
2025-08-04 15:51:33 +08:00 |
|
Divano
|
3bfb2eca92
|
Update test_base_chat.py (#3183)
|
2025-08-04 15:09:53 +08:00 |
|
ltd0924
|
c9e6ce1518
|
Update cache_messager.py (#3172)
|
2025-08-04 14:32:34 +08:00 |
|
gaoziyuan
|
4021d66ea5
|
【Feature】add fd plugins && rm model_classes (#3123)
* add fd plugins && rm model_classes
* fix reviews
* add docs
* fix
* fix unittest ci
|
2025-08-03 19:53:20 -07:00 |
|
bukejiyu
|
1582814905
|
fix load_pre_sharded_checkpoint (#3152)
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-08-04 10:44:20 +08:00 |
|
Divano
|
66d3bb89ad
|
Update __init__.py (#3163)
Upgrade test base class compatibility
|
2025-08-04 09:40:09 +08:00 |
|
AIbin
|
22fe695f1c
|
【Inference Optimize】Support automatic generation of marlin kernel (#3149)
* Support automatic generation of marlin kernel
|
2025-08-01 22:43:18 +08:00 |
|