Yuanle Liu
ed2dcec829
add ignore=all for deepgemm ( #4118 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-15 21:52:00 +08:00
Jiang-Jia-Jun
a04365a0c7
Update api_server.py
2025-09-15 21:31:33 +08:00
YuanRisheng
03b3d6175d
fix mtp ( #4105 )
2025-09-15 20:26:07 +08:00
co63oc
17a27170bc
fix typos ( #4093 )
2025-09-15 18:33:30 +08:00
bukejiyu
113e330030
fix bf16 and add comments ( #4106 )
2025-09-15 17:23:07 +08:00
freeliuzc
69aa2781a1
[MTP]Support mtp reshard ( #4099 )
...
* support rl reshard
* modify model name
2025-09-15 17:13:53 +08:00
freeliuzc
46911f903d
[MTP]update hybrid-mtp-with-ngram ( #4047 )
2025-09-15 17:13:31 +08:00
Yuanle Liu
b1b33211e8
[CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 ( #4062 )
...
* refine cudagraph
* refine cudagraph
* typo
* fix
* fix plugins
* fix
* update
* update
* update
2025-09-15 16:21:30 +08:00
zhupengyang
9409665713
[xpu] support ep ( #4067 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-15 13:53:11 +08:00
bukejiyu
29ed617f0f
[v1 loader]qwen Offline fp8 ( #4036 )
...
* support offline fp8
* update ut
* update ut
* update ut
* fix
* update
* update
2025-09-15 13:44:11 +08:00
Sunny-bot1
b1a5b756a3
[Optimize] Support WINT8 and group scale for Machete ( #3905 )
2025-09-15 12:01:34 +08:00
Echo-Nie
4408dc7f67
【Hackathon 9th No.49】add test_pre_cache_len_concat ( #3847 )
...
* add test_pre_cache_len_concat
* fix according review, add ref_pre_cache_len_concat
2025-09-15 11:20:14 +08:00
co63oc
ef4a1aa2da
【Hackathon 9th No.61、65】add test_draft_model_update ( #3940 )
...
* add draft_model_update test
* fix
* fix
* fix
* fix
* fix
2025-09-15 11:19:50 +08:00
Zero Rains
f213ae1e86
[Bug Fix]fix the bug for cache_messager signal loss ( #3879 )
...
* fix the bug for real size 0 in cudagraph
* fix cache_messager
2025-09-15 11:16:24 +08:00
qwes5s5
553adb299e
【FastDeploy CLI】collect-env subcommand ( #4044 )
...
* collect-env subcommand
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com >
2025-09-15 10:31:23 +08:00
zhouchong
958abebeab
Support offline inference with streaming output ( #4071 )
...
* Support offline inference with streaming output
* add unit test
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-09-15 10:27:03 +08:00
YUNSHEN XIE
4871f18dad
fix(CE): update concurrency to stop CE tasks from canceling each other ( #4083 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-12 19:16:26 +08:00
Ayakouji
987609c894
[BugFix] Fix image_feature 0-Size causing insert failed ( #4042 )
...
* update
* fix image_feature
2025-09-12 19:13:08 +08:00
xiaolei373
9ac539471d
[format] Valid para format error info ( #4035 )
...
* feat(log):add_request_and_response_log
* 报错信息与OpenAI对齐
2025-09-12 19:05:17 +08:00
YuanRisheng
88ea565aba
[BugFix]Fix load kv cache quant scale ( #4077 )
...
* fix kv cache
* fix kv_cache
* fix kv cache
2025-09-12 17:44:03 +08:00
co63oc
c86b3357ce
【Hackathon 9th No.78】add test_chat.py ( #3958 )
2025-09-12 16:53:27 +08:00
Echo-Nie
06f4b49ca3
【Hackathon 9th No.25】add test_fused_get_rotary_embedding ( #3892 )
...
* add test_fused_get_rotary_embedding
* 增加基于 NumPy 的基准实现
* 添加,开源软件的版权和许可声明
2025-09-12 15:38:43 +08:00
SuperNova
805f29a06c
[Feature] refactor metax_gpu attention and moe and remove some useless code ( #3688 )
...
Co-authored-by: yongqiangma <xing.wo@163.com >
2025-09-12 14:40:25 +08:00
ltd0924
cab7a633fe
[CI] add multi api server test ( #4049 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [BugFix] fix max streaming tokens invalid
* fix scheduler bug
* fix scheduler bug
* Update multi_api_server.py
* Create test_multi_api_server.py
* fix
2025-09-12 11:18:38 +08:00
qwes5s5
58e0785bab
[metrics] update metrics markdown file ( #4061 )
...
* adjust md
* trigger ci
---------
Co-authored-by: K11OntheBoat <your_email@example.com >
2025-09-12 11:13:43 +08:00
co63oc
8466219ec8
fix typos ( #3840 )
...
* fix typos
* ci
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-12 11:04:38 +08:00
RichardWooSJTU
82dab8a91a
Add token processor plugin support ( #4059 )
...
* Add token processor plugin support
* fix import
* fix import
2025-09-12 10:17:23 +08:00
chenjian
37f1632732
[Optimize] optimize prefix cache in develop ( #3890 )
...
* optimize prefix cache in release22
* fix
* fix
* fix
* add ci for v1
* add unit test
---------
Co-authored-by: xiegegege <46314656+xiegegege@users.noreply.github.com >
2025-09-12 10:15:59 +08:00
chen
4859f40b20
[Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) ( #4051 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-09-11 20:08:09 +08:00
lddfym
2056a428bd
[bug fix] Fix the placeholder in qwen prompt and add some unittests ( #4065 )
...
* fix the placeholder in qwen prompt
* fix the placeholder in qwen prompt
* add soem unittests for qwen_vl_processor
2025-09-11 20:00:02 +08:00
memoryCoderC
850465e8ed
[Feature] add cli command chat,complete ( #4037 )
2025-09-11 19:53:14 +08:00
zhuzixuan
a47976e82d
[Echo] Support more types of prompt echo ( #4022 )
...
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
* wenxin-tools-700 When the prompt type is list[int] or list[list[int]], it needs to support echoing after decoding.
---------
Co-authored-by: luukunn <83932082+luukunn@users.noreply.github.com >
2025-09-11 19:34:44 +08:00
xiaoxiaohehe001
abdcef30aa
[BugFix] mm_post_fix ( #4005 )
...
* mm_post_fix
* mm_post_fix_1
2025-09-11 19:09:46 +08:00
Zhang Yulong
d2ec7f6aa2
update ci ( #4064 )
...
* update ci
* update ci
2025-09-11 18:36:25 +08:00
YuBaoku
fec58639db
[CI] skip test_structured_outputs* temporarily ( #4055 )
2025-09-11 18:07:50 +08:00
YuanRisheng
d2d04c2d5e
[setup optimize]Support git submodule ( #4033 )
...
* support git submodule
* update setup
* fix ci network
* fix clone
* revert clone linux
* delete args
* fix ci
* update
2025-09-11 17:41:16 +08:00
SuperNova
d60f7c4661
fix import tests.utils error in tests/model_loader/test_load_mtp.py ( #4027 )
...
Co-authored-by: yongqiangma <xing.wo@163.com >
2025-09-11 16:47:16 +08:00
CSWYF3634076
e4c64a71cc
[BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix ( #3921 )
2025-09-11 15:08:24 +08:00
bukejiyu
2650f58740
[docs] Update environment variables documentation ( #3957 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-10 21:17:06 -07:00
co63oc
2af0f671b1
【Hackathon 9th No.55】add test_update_inputs_v1.py ( #3992 )
2025-09-11 11:34:22 +08:00
AIbin
a7392a0ff9
【Inference Optimize】DeepSeek-V3-model MLA Optimize ( #3886 )
...
* support MLA chunk_size auto search & cuda_graph
2025-09-11 10:46:09 +08:00
chen
637d96c6ae
[Feature] Support zai-org/GLM-4.5-Air BF16 model ( #3928 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* support glm45_air
2025-09-10 19:36:10 +08:00
freeliuzc
7ee100903f
support rope_3d in spec mode ( #4034 )
2025-09-10 03:15:05 -07:00
ltd0924
684e93269b
[Fix] fix multi api server log dir ( #3967 )
...
* [BugFix] fix max streaming tokens invalid
* fix scheduler bug
* fix scheduler bug
* Update multi_api_server.py
2025-09-10 17:15:30 +08:00
wanrui
276f73cf83
【Hackathon 9th No.28】add test_cutlass_fp8_fp8_fp8_dual_gemm_fused ( #3935 )
...
* add test_cutlass_fp8_fp8_fp8_dual_gemm_fused
* fix the version
* fix code style
---------
Co-authored-by: Tao Luo <luotao02@baidu.com >
2025-09-10 14:57:49 +08:00
RAM
d3e4ae3d49
[Executor] Adjust signal sending order in RL training ( #3773 )
...
* Adjust processing order
* fix bug
* fix update_parameters bug
* refine code
2025-09-10 13:24:20 +08:00
Ayakouji
453487d5b0
[Feat] ernie4_5_vl_moe support CudaGraph ( #3226 )
...
* delete dynamic control flow for decode
* coda-style
* fix scatter/gather typos and use input stream instead default stream
* support 0-Size Tensor
* update runner and model
* using static mem address as input
* fix mem leak
* refine code
* update mm_buffer
* fix typo
* fix buffersize
* fix unk token
* refine code
* refine
* support other arch
* open cudagraph in vlci
* fix
* update
* update
* update
* fix cmd
* update
---------
Co-authored-by: aquagull <hongyuh@qq.com >
Co-authored-by: Yuanle Liu <yuanlehome@163.com >
2025-09-10 13:11:57 +08:00
zhupengyang
9d0074a91a
[xpu] add ep custom ops ( #3911 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-10 12:22:50 +08:00
Yuanle Liu
c3b2a60fb8
[BugFix] Fix the abnormal memory usage caused by shape errors in the triton moe backend ( #4026 )
...
* fix device_id to in
* fix triton_moe bug
2025-09-09 20:05:54 -07:00
周周周
dbab579299
clean code ( #4020 )
2025-09-10 10:56:15 +08:00