Commit Graph

2937 Commits

Author SHA1 Message Date
YuBaoku
55ac449c31 [CI] remove useless case (#3261) 2025-08-07 15:09:40 +08:00
RAM
820798aec5 [Executor]Update graph test case and delete test_attention (#3257)
* 1.update graph test case 2.delete test_attention

* code style

* delete print
2025-08-07 14:05:15 +08:00
YuanRisheng
0074b423a9 fix ci bug (#3239)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-07 11:32:39 +08:00
hong19860320
93a1731891 [Doc] Update deps and fix dead links (#3252) 2025-08-07 11:04:31 +08:00
李泳桦
09cc4e2802 [fix] fix completion stream api output_tokens not in usage (#3247) 2025-08-07 10:36:00 +08:00
Yzc216
d9e3f88f9e [Feature] multi source download (#3125)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* multi-source download

* multi-source download

* huggingface download revision

* requirement

* style

* add revision arg

* test

* pre-commit

* Change default download

* change requirements.txt

* modify English Documentation

* documentation

* modify model download path

* add requirements

* error optimization

* 连接失败兜底

* 连接失败兜底

* 连接失败兜底

* unit test

* unit test

* unit test

* test

* test
2025-08-07 00:40:27 +08:00
bukejiyu
9408e667a5 [bugfix]fix blockwisefp8 and all_reduce (#3243)
* fix

* update

* fix linear for prequant loader
2025-08-06 23:54:33 +08:00
yangjianfengo1
3a15e0c53e 【Fix Bug】 修复 fa3 支持集中式bug (#3235)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix fa3 集中式bug

* 增加qknorm参数
2025-08-06 16:24:27 +08:00
lizexu123
afff4d37ea [Feature] support seed parameter (#3161)
* support seed

* fix

* add SamplingMetadata seed test

* The next_tokens values are inconsistent!

* add air and rejection seed test

* fix

* add SamplingParams seed test

* fix seed=0

* Default to defualt

* fix

* fix args_utils

* fix review

* fix review

* fix

* fix

* add xpu,gcu,iluvatar support seed

* fix
2025-08-06 15:20:47 +08:00
bukejiyu
20839abccf qwen3_moe (#3084) 2025-08-06 14:45:27 +08:00
Divano
91dc87f1c5 add some evil cases (#3240)
* add repitation early stop cases

* add repitation early stop cases

* add bad cases

* add bad cases

* add evil cases
2025-08-06 14:23:55 +08:00
xjkmfa
256a82b0b3 Add ci case for min token and max token (#3229)
Co-authored-by: xujing43 <xujing43@baidu.com>
2025-08-06 14:10:57 +08:00
Zero Rains
36dc73470d Fix the confused enable_early_stop when only set early_stop_config (#3214)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix the confused early_stop_config when only set early_stop_config

* pre-commit

* write a general method
2025-08-06 11:42:27 +08:00
YuanRisheng
a6e8b780f8 fix approve (#3224) 2025-08-06 10:36:01 +08:00
yangjianfengo1
89397516a8 [New Feature] Support W4Afp8 MoE GroupGemm (#3171)
* init

* 增加多线程编译

* fix bug

* fix bug

* code style

* 增加fp16

* 将print替换成assert

* 修复stmatrix

* 减小单测shape

* 减小单测shape
2025-08-06 10:34:05 +08:00
sg263
841e831575 [Trace]add trace when fd start (#3174)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* add opentelemetry

* add opentelemetry

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* add opentelemetry on dequeue

* fix annotation

* fix annotation when add opentelemetry

* fix opentelemetry-instrumentation-fastapi

* fix pentelemetry-bootstrap

* fix opentelemetry can not work in uvicorn

* move conf to env

* fd start add trace

* fix pre-commit

* fix pre-commit

* change FD_JOB_ID

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: shige <shige@baidu.com>
2025-08-05 21:18:27 +08:00
YUNSHEN XIE
e0bbd3b6ca fix approve ci (#3212)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-05 17:21:26 +08:00
Yuan Xiaolan
7ce00e597c support qk norm (#3145) 2025-08-05 16:46:14 +08:00
RAM
4a10e29804 fix mla attention backend (#3176) 2025-08-05 16:43:15 +08:00
Yuan Xiaolan
af543b7f0f revise get_moe_scores (#3164) 2025-08-05 16:43:07 +08:00
Divano
e24929efa3 Ce add bad cases (#3215)
* add repitation early stop cases

* add repitation early stop cases

* add bad cases

* add bad cases
2025-08-05 16:37:28 +08:00
lizexu123
b01cfd6007 [BugFix] support real batch_size (#3109)
* support real bsz

* fix

* fix xpu_model_runner.py,gpu_model_runner.py,gcu_model_runner.py,iluvatar_model_runner.py

* add event_loop_ep

* fix

* Add comments

* fix

* support mtp real_batch_size

* fix

* self.tmp_seq_lens_this_time->self.seq_lens_this_time_buffer

* fix

* fix VL real_seq_lens_this_time

* fix

* fix mtp

* fix

* fix mtp

* fix xpu

* fix
2025-08-05 16:33:54 +08:00
Jiang-Jia-Jun
55939f7942 Update engine.py 2025-08-05 16:10:36 +08:00
chen
04fc7eb931 fix test_air_top_p_sampling name (#3211) 2025-08-05 15:47:50 +08:00
Divano
9f1936ae28 Ce add repitation early stop cases (#3213)
* add repitation early stop cases

* add repitation early stop cases
2025-08-05 15:47:28 +08:00
RichardWooSJTU
1e9a8e8cef fix lm head bias (#3185)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-08-05 15:40:24 +08:00
RichardWooSJTU
f5c64a074c [EP] Refactor DeepEP Engine Organization for Mixed Mode & Buffer Management Optimization (#3182)
* Add support for mixed-ep across multi nodes

* code refine

---------

Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-08-05 15:40:11 +08:00
ming1753
14ed75f7d3 [Test] scaled_gemm_f8_i4_f16 skip test while sm != 89 (#3210) 2025-08-05 15:25:28 +08:00
yangjianfengo1
40f7f3e0d8 [New Feature] fa3 支持flash mask (#3184)
* 支持flash mask

* 修改test_flash_mask

* 修改test.sh
2025-08-05 12:20:48 +08:00
YUNSHEN XIE
b8f3c73aac fix coverage report (#3198)
* fix coverage report

* fix
2025-08-05 11:24:55 +08:00
Divano
fb7a0689cc add more cases (#3207) 2025-08-05 11:17:36 +08:00
RAM
c593e1a39c [Bug Fix]Fix bug of append attention test case (#3202)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-05 11:04:45 +08:00
RichardWooSJTU
e39159f3bd Add switch to apply fine-grained per token quant fp8 (#3192)
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
2025-08-04 19:54:03 -07:00
Divano
88596c0c63 Add more base chat cases (#3203)
* add test base class

* fix codestyle

* fix codestyle

* add base chat
2025-08-05 10:24:12 +08:00
lizhenyun01
fe540f6caa [plugin] Custom model_runner/model support (#3186)
* support custom model&&model_runner

* fix merge

* add test && update doc

* fix codestyle

* fix unittest

* load model in rl
2025-08-04 18:52:39 -07:00
Sunny-bot1
72ef5a9c93 [FIX]fix bad_words when sending requests consecutively (#3197)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* fix bad_words

* fix log

* fix log
2025-08-04 05:59:41 -07:00
Yuan Xiaolan
1f8289e106 fix expertwise_scale (#3181) 2025-08-04 20:06:15 +08:00
YuBaoku
3eb9a5df60 [CI] add test_compare_top_logprobs (#3191) 2025-08-04 19:49:24 +08:00
SunLei
68bc1d12c0 [Bugfix] Fix uninitialized decoded_token and add corresponding unit test. (#3195) 2025-08-04 19:23:58 +08:00
Longzhi Wang
01d7586661 [Bug fix] Fix cudagraph when use ep. (#3130)
* fix cudagraph when use ep

* fix typo

* reduce full length to adapt large bsz such 128/256
2025-08-04 18:06:18 +08:00
周周周
2bd8a50649 remove useless code (#3166) 2025-08-04 18:03:08 +08:00
gaoziyuan
0443587a57 【Feature】support qwen3 name_mapping (#3179)
* add fd plugins && rm model_classed

* fix reviews

* add docs

* fix

* fix unitest ci

* support qwen3 name_mapping
2025-08-04 01:34:07 -07:00
Zero Rains
17f51f0c92 [unitest] fix the bug in test_sampler (#3157) 2025-08-04 01:23:25 -07:00
YuanRisheng
79bbacc152 Fix approve shell scripts (#3108)
* fix approve

* fix
2025-08-04 15:51:33 +08:00
Divano
3bfb2eca92 Update test_base_chat.py (#3183) 2025-08-04 15:09:53 +08:00
ltd0924
c9e6ce1518 Update cache_messager.py (#3172) 2025-08-04 14:32:34 +08:00
gaoziyuan
4021d66ea5 【Feature】add fd plugins && rm model_classes (#3123)
* add fd plugins && rm model_classed

* fix reviews

* add docs

* fix

* fix unitest ci
2025-08-03 19:53:20 -07:00
bukejiyu
1582814905 fix load_pre_sharded_checkpoint (#3152)
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-04 10:44:20 +08:00
Divano
66d3bb89ad Update __init__.py (#3163)
升级测试基类兼容性
2025-08-04 09:40:09 +08:00
AIbin
22fe695f1c 【Inference Optimize】Support automatic generation of marlin kernel (#3149)
* Support automatic generation of marlin kernel
2025-08-01 22:43:18 +08:00