bukejiyu
41bfa1090d
[CI]delete test_common_model ( #4794 )
...
* fix
* update
* update
2025-11-04 13:57:55 +08:00
bukejiyu
69c2f3cda1
[CI]test common model ( #4697 )
...
* ut
* update
2025-11-03 16:48:36 +08:00
CSWYF3634076
acd331780c
[V1 loader] Qwen25 VL support v1 loader and torch style safetensors load ( #4388 )
...
* [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix
* [Docs]offine infer add apply_chat_template add_generation_prompt parameter
* [Model]qwen2.5VL support --use-cudagraph
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v2
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v3
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v4
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v5
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v6
* [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v7
* qwen25vl v1 loader
* qwen25vl v1 loader v2
* qwen25vl v1 loader v3
* qwen25vl v1 loader fix tp2 weight PySafeSlice
* qwen25vl v1 loader no test
* qwen25vl v1 loader add unit test
* qwen25vl v1 loader add unit test v2
* qwen25vl v1 loader add torch unit test v3
* qwen25vl v1 loader add torch unit test v4
* qwen25vl v1 loader add torch unit test v5
* qwen25vl v1 loader add torch unit test v6
2025-10-27 10:54:15 +08:00
Zhang Yulong
83b720804b
Clean up ports after processing results ( #4587 )
2025-10-27 10:13:24 +08:00
Sunny-bot1
4ffe41a747
WINT4/WINT8 dense gemm default use Machete ( #4451 )
2025-10-23 17:57:59 +08:00
RAM
775edcc09a
[Executor] Default use CUDAGraph ( #3594 )
...
* add start intercept
* Adjustment GraphOptConfig
* pre-commit
* default use cudagraph
* set default value
* default use cuda graph
* pre-commit
* fix test case bug
* disable rl
* fix moba attention
* only support gpu
* Temporarily disable PD Disaggregation
* set max_num_seqs of test case as 1
* set max_num_seqs and temperature
* fix max_num_batched_tokens bug
* close cuda graph
* success run wint2
* profile run with max_num_batched_tokens
* 1.add c++ memchecker 2.success run wint2
* updatee a800 yaml
* update docs
* 1. delete check 2. fix plas attn test case
* default use use_unique_memory_pool
* add try-except for warmup
* ban mtp, mm, rl
* fix test case mock
* fix ci bug
* fix form_model_get_output_topp0 bug
* fix ci bug
* refine deepseek ci
* refine code
* Disable PD
* fix sot yaml
2025-10-21 14:25:45 +08:00
bukejiyu
62d1c48363
[v1 loader]code style ( #4204 )
...
* code style
* update
2025-09-23 19:36:00 +08:00
yangjianfengo1
4325b737e7
【FIX】Change the name of sparse attn from moba to plas ( #4006 ) ( #4076 )
...
* 【FIX】Change the name of sparse attn from moba to plas (#4006 )
* 更新文档
* 【docs】 update readme (#4000 )
* 更新文档
* update readme
* update docs
* 【FIX】Change the name of sparse attn from moba to plas (#3845 )
* 更新文档
* 更新文档
* 更新文档
* 更新文档
* 修改moba为plas
* code style
* update ci
* code style
* update ci
* code style
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* fix max_num_seqs
* fix test load attn
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-23 10:26:40 +08:00
YuBaoku
2745f37017
[CI] enhance clean port and add waiting time ( #4152 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-17 20:31:49 +08:00
Zhang Yulong
cd09913552
Update test_w4a8_model.py ( #4125 )
2025-09-16 20:43:10 +08:00
chenjian
67e6d8c691
[Feature] Set prefix caching as default ( #3814 )
...
* Set prefix caching as default
* Set prefix caching as default
* Set prefix caching as default
* skip dynamic load scene
* fix kill bug
* fix kill bug
* fix kill bug
* fix
* fix
* fix ci
2025-09-16 20:34:27 +08:00
co63oc
17a27170bc
fix typos ( #4093 )
2025-09-15 18:33:30 +08:00
bukejiyu
29ed617f0f
[v1 loader]qwen Offline fp8 ( #4036 )
...
* support offline fp8
* update ut
* update ut
* update ut
* fix
* update
* update
2025-09-15 13:44:11 +08:00
SuperNova
d60f7c4661
fix import tests.utils error in tests/model_loader/test_load_mtp.py ( #4027 )
...
Co-authored-by: yongqiangma <xing.wo@163.com >
2025-09-11 16:47:16 +08:00
bukejiyu
2650f58740
[docs] Update environment variables documentation ( #3957 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-10 21:17:06 -07:00
Sunny-bot1
3b1da6e4dd
support v1 loader for machete ( #3999 )
2025-09-10 10:21:33 +08:00
YuanRisheng
b3fac5bde1
[V1 Loader] Ernie kv cache quant support v1 loader ( #3899 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support c8 for ernie
* add unittest
* support vl
* fix c8
2025-09-09 05:25:08 -07:00
Zhang Yulong
2359c8d21c
update ci ( #3962 )
...
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-09 10:09:13 +08:00
bukejiyu
e52ce1c4b1
cache feature ( #3857 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-07 18:52:46 +08:00
Zhang Yulong
349aa6348b
add cache queue port ( #3904 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* add cache queue port
* add cache queue port
* add cache queue port
2025-09-05 21:17:06 +08:00
Yuan Xiaolan
2cf55168ca
load hadamard_block_size from config ( #3797 )
2025-09-05 17:07:58 +08:00
AIbin
41aee08982
【Inference Optimize】Update MergedReplicatedLinear for DSK qkv_a_proj_with_mqa. ( #3673 )
...
* support MergedReplicatedLinear
* update MergedReplicatedLinear to support DSK_wint4 V1_load
* update model name
* update linear class
* fix
* fix v0 moe_bias load
---------
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com >
2025-09-04 21:16:05 -07:00
co63oc
6ac7cea81b
fix test_load_mtp ( #3780 )
2025-09-02 10:21:02 +08:00
YuanRisheng
6566e29807
Add loader test for mtp ( #3724 )
...
* add test for mtp
* fix unittest
* fix
2025-09-01 10:55:49 +08:00
lizexu123
455205f991
[Features] support hugging face qwen3 moe ( #3649 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* split ut
* qwen3-30B-A3B
* fix
* add test
* add test_torch_model.py
* fix test_torch_model.py
* delete print
* fix moe
* delete init.py
* fix
* fix
---------
Co-authored-by: bukejiyu <395822456@qq.com >
Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com >
2025-08-30 15:26:05 +08:00
Yuan Xiaolan
c71ee0831c
add w4afp8 offline script ( #3636 )
2025-08-29 17:56:05 +08:00
周周周
17b414c2df
MoE Default use triton's blockwise fp8 in TP Case ( #3678 )
2025-08-29 11:07:30 +08:00
bukejiyu
73cf6096da
fix ( #3676 )
...
* fix
* update
2025-08-28 17:06:32 +08:00
bukejiyu
3200a80de3
[v1 loader]support fp8 ( #3593 )
...
* support fp8
* update ci
2025-08-26 02:42:46 -07:00
lizexu123
c43a4bec00
[Features] support hugging face qwen3 dense and qwen2 model ( #3574 )
...
* support qwen2 and qwen3 hugging face
* fix moe
* defualt_v1 loader
* hugging_face_format deprecated
* modify hugging_face_foramt to model_format
* model_format auto
* fix environemt
* fix bug
* fix qwen3-0.6 bug
* model_format is str
* fix
2025-08-26 10:54:53 +08:00
bukejiyu
bdbac0aa3d
support qwen2 weight only ( #3571 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
2025-08-24 11:14:34 +08:00
bukejiyu
77514e3e1e
[V1 Loader] support weight_only ( #3413 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
* support wint4/wint8
* delete smoe case
* update ci
* print log
2025-08-23 13:13:41 +08:00
YuanRisheng
85fbf5455a
[V1 Loader]Ernie VL support loader v1 ( #3494 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* ernie vl support new loader
* add unittest
* fix test
2025-08-22 11:16:57 +08:00