bukejiyu
0b51b9c35b
fix qwen3 235B tp 8 ( #3697 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-28 23:46:25 +08:00
Yuanle Liu
4957908275
add input_processor plugin ( #3657 )
...
* add input_processor plugin
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
* update
2025-08-28 22:53:57 +08:00
ming1753
02b3644903
[Bug Fix] VL Support w4a8/w4afp8 ( #3686 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
2025-08-28 21:38:35 +08:00
YuanRisheng
808b548761
support tmp ( #3675 )
2025-08-28 19:42:32 +08:00
gaoziyuan
fc635acc47
[BugFix]fix dp&ep&tp and muti node infer ( #3629 )
...
* rm log
* fix bug
* fix bug
* fix dp&ep&tp and muti node infer
* fix
---------
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-08-28 19:09:10 +08:00
Liumengyuan
2a73a6df03
fix_fp8_deepgemm_moe_tp_bug ( #3658 )
2025-08-28 17:19:02 +08:00
Liumengyuan
e93d4cfcdd
Add with_output version AppendAttention ( #3302 )
...
* get use_output from fd_config
* add clear TODO description
* add mask_offset para to align with develop
* fix bug
* fix use_output logic
* fix sot bug
2025-08-28 17:10:18 +08:00
ltd0924
94ded434bd
[BugFix] ep mixed offline exit ( #3661 )
...
* Update expert_service.py
* Update expert_service.py
2025-08-28 17:09:07 +08:00
ltd0924
e5015eea05
[BugFix] fix logger ( #3666 )
2025-08-28 17:08:00 +08:00
ltd0924
98c217b428
Update config.py ( #3669 )
2025-08-28 15:30:51 +08:00
Yuan Xiaolan
d37331fc71
fix w4afp8_gemm_scale_permute import error on A100 ( #3611 )
2025-08-28 11:42:23 +08:00
YuanRisheng
ad9b95e6dd
fix rl bugs ( #3654 )
2025-08-28 11:09:34 +08:00
yangjianfengo1
e81046fdad
【New Feature】集中式支持w4afp8 ( #3644 )
...
* 支持tp w4afp8
* code style
2025-08-28 10:53:24 +08:00
周周周
76513f6416
Support 45t fp8 8 GPU ( #3659 )
2025-08-28 10:52:53 +08:00
ltd0924
3d92fb09f7
[BugFix] fix parameter is 0 ( #3592 )
...
* Update engine_client.py
* fix
* Update common_engine.py
2025-08-28 09:52:36 +08:00
Sunny-bot1
479c8b85d3
[Optimize]support machete weight only gemm ( #3561 )
...
* support machete weight only gemm
* add generate
* update
* fix
* change file location
* add sm_version limit
* fix
* fix
* fix ci
* fix coverage
* fix xpu
2025-08-28 09:49:58 +08:00
Zero Rains
e37e86b3b8
[V1 Loader]support param create and load for wint2 and xpu backend ( #3581 )
...
* support wint2 backend'
* [V1 Loader]support param create and load for wint2 and xpu backend
* update weight shape name
* update
* update
* update baseline.txt
* update model name
* update baseline.txt
* fix codestyle
* remove debug coode
2025-08-28 09:49:36 +08:00
lizexu123
b28a0343a6
fix ENABLE_V1_KVCACHE_SCHEDULER ( #3625 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-08-27 21:21:29 +08:00
ltd0924
2974016103
[BugFix] fix ce bugs ( #3641 )
...
* [BugFix] fix tp8 client refuse
* fix engine port bug
* Update utils.py
2025-08-27 20:38:15 +08:00
Yuanle Liu
836345a4dd
delete ernie4_5_vl_tokenizer ( #3631 )
2025-08-27 20:36:02 +08:00
Jiang-Jia-Jun
c694fa2879
Revert "[Feature] block sparse attention ( #3209 )" ( #3647 )
...
This reverts commit 646a0c2fd8
.
2025-08-27 17:35:04 +08:00
李泳桦
b2afdf4fc6
[fix] qwen output inconsistency when top_p=0 ( #3634 )
...
* [fix] qwen output inconsistency when top_p=0
* [fix] remove decode pre_id code
2025-08-27 17:16:23 +08:00
lzy
1265f6c192
deepgemm don't support tp+ep (for ci) ( #3638 )
...
* deepgemm don't support tp+ep (for ci)
* deepgemm don't support tp+ep (for ci)
2025-08-27 16:39:19 +08:00
chen
5ad8721506
check ( #3639 )
2025-08-27 14:32:13 +08:00
chen
ce9c0917c5
[Precision] Support lm_head layer running in float32 ( #3597 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support lm_head fp32 bf16 fp16
* support lm_head fp32 bf16 fp16
* add doc and check code
* lm_head_fp32 specify lm_head as fp32
* code check
* check doc
2025-08-27 11:34:53 +08:00
xiaoxiaohehe001
ad319a87cc
support fa3 rope3d ( #3622 )
2025-08-27 11:31:29 +08:00
yangjianfengo1
646a0c2fd8
[Feature] block sparse attention ( #3209 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* 支持稀疏attn
* fix bug
* code style
* fix moba attn get kv shape
* 修复a100编译
* codestyle
* code style
* code style
* code style
* fix conflict
* 增加单侧
* code style
* 增加eblite 加载时间
* fix bug
* for ci
* for ci
* for ci
* for ci
* 支持mlp block size 128
* 增加小算子单测
* fix 单测 mlp
* 将环境变量加入到config里面
* fix rollout config
2025-08-26 07:16:04 -07:00
gaoziyuan
82e64b13e1
[NewFeature]Support dp multi api server && Fix some bug in mixed ep && merge develop ( #3598 )
...
* [Feature] update ep
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix queue ports idx
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* fix ci
* Update engine.py
* fix ci
* fix some bug in mixed ep
* add server fix and op fix
* rm some log
* fix code style
* ltd fix
* fix
* fix
* fix some bug
* fix bug
* fix bug
* fix style
* Update config.py
* Update splitwise_connector.py
* Update cache_messager.py
* Update __init__.py
* merge and fix
* Update engine.py
* Update common_engine.py
* Update run_ci_xpu.sh
* Update ernie_processor.py
* Update ernie_processor.py
---------
Co-authored-by: ltd0924 <ltd0924@sina.com >
Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com >
2025-08-26 19:59:02 +08:00
Yuanle Liu
cbce94a00e
rename ernie_xxx to ernie4_5_xxx ( #3621 )
...
* rename ernie_xxx to ernie4_5_xxx
* ci fix
2025-08-26 19:29:27 +08:00
YuanRisheng
642480f5f6
[CI] Standard unittest ( #3606 )
...
* standard unittest
* fix bugs
* fix script
2025-08-26 19:03:11 +08:00
SunLei
2f28f40d90
fix: replace list * n initialization with list comprehension to avoid shared references ( #3618 )
2025-08-26 17:53:31 +08:00
bukejiyu
3200a80de3
[v1 loader]support fp8 ( #3593 )
...
* support fp8
* update ci
2025-08-26 02:42:46 -07:00
RAM
00898603c8
[CUDAGraph]Add debug func ( #3616 )
...
* add print dot files
* refine code
2025-08-26 16:43:48 +08:00
xiaoxiaohehe001
9afa236e39
[NewFeatures] support eplb ( #3547 )
...
* [NewFeatures] support eplb
* fix eplb
2025-08-26 16:19:30 +08:00
Yuanle Liu
56e2d7e668
adaptive rms_norm's dtype ( #3617 )
...
* adaptive rms_norm's dtype
* adaptive rms_norm's dtype
* add approve coverage
---------
Co-authored-by: liuyuanle <liuyuanle@baidu.com >
2025-08-26 15:29:15 +08:00
lzy
d339df2e90
Supports DP+TP+EP hybrid parallel deployment strategy ( #3489 )
...
* Support DP+TP+EP hybrid parallel deployment strategy
* Support DP+TP+EP hybrid parallel deployment strategy
* fix conflict
* add moe_tp_ep function split_allgather_out
* del tp_group in moe_cutlass_backend
* for ci
* fix parallel_config for ci
* del log
2025-08-26 00:04:01 -07:00
freeliuzc
52eda7fdb3
[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram ( #3610 )
2025-08-26 14:29:22 +08:00
AIbin
0a0d2959b9
qkv_a_proj horizontal fusion ( #3591 )
...
Support DSK qkv_a_proj horizontal fusion under V0 Loder
2025-08-26 14:25:57 +08:00
Sunny-bot1
c68c3c4b8b
[Feature] bad words support v1 scheduler and specifiy token ids ( #3608 )
...
* support bad_words_token_ids
* docs
* fix test
* fix
* bad words support kvcache v1 and token ids
* fix
2025-08-25 20:14:51 -07:00
lizexu123
c43a4bec00
[Features] support hugging face qwen3 dense and qwen2 model ( #3574 )
...
* support qwen2 and qwen3 hugging face
* fix moe
* defualt_v1 loader
* hugging_face_format deprecated
* modify hugging_face_foramt to model_format
* model_format auto
* fix environemt
* fix bug
* fix qwen3-0.6 bug
* model_format is str
* fix
2025-08-26 10:54:53 +08:00
ltd0924
66c5addce4
[Bugfix] fix api server control signal bugs ( #3531 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* Update serving_chat.py
* Update serving_completion.py
* Update serving_completion.py
2025-08-25 21:13:04 +08:00
RAM
2fa173e327
[Executor] CUDAGraph support RL training ( #3265 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* add clear graph opt backend
* cuda graph support rl
* add branch
* 1.fix dynamic_weight_manager bug 2.add clear api for CasualLM
* open test case
* fix typo
* update mkdocs.yaml
* [Docs]Update mkdocs.yml
* update test case
* use unittest in graph test case
2025-08-25 20:59:30 +08:00
Kane2011
2ae7ab28d2
[MetaxGPU] adapt to the latest fastdeploy on metax gpu ( #3492 )
2025-08-25 17:44:20 +08:00
chen
9cab3f47ff
[Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing ( #3552 )
...
* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing
* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs
* delete some code
* code check
* code check and add doc
* fix tokenizer.decoder(-1), return 'Invalid Token'
* add ci for temp_scaled and top_p logprobs
* check test
* check seq len time shape
* logprob clip inf
---------
Co-authored-by: sunlei1024 <sunlei5788@gmail.com >
2025-08-25 14:11:49 +08:00
Yuan Xiaolan
9205c88da1
support w4afp8 EP inference ( #3044 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-08-25 11:27:45 +08:00
bukejiyu
bdbac0aa3d
support qwen2 weight only ( #3571 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
2025-08-24 11:14:34 +08:00
bukejiyu
77514e3e1e
[V1 Loader] support weight_only ( #3413 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* support wint4/wint8
* delete smoe case
* update ci
* print log
2025-08-23 13:13:41 +08:00
YuanRisheng
e481b7a779
fix sot ( #3556 )
2025-08-23 08:37:06 +08:00
Zero Rains
79f0dbbb55
[V1 Loader] Support qwen2(bf16) ( #3502 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support qwen2(bf16)
* merge bias_loader and weight_loader
2025-08-23 01:08:23 +08:00
zhink
df7c31012b
Modified to support custom all reduce by default ( #3538 )
2025-08-22 16:59:05 +08:00