Echo-Nie
ee1ea43e36
[Docs] Fix broken commitID ( #5008 )
...
* fix commitID
* Update nvidia_gpu.md
2025-11-14 10:39:41 +08:00
essos
191a597d9f
[CI]【Hackathon 9th Sprint No.56】NO.56 Add unit tests for the fastdeploy/multimodal/utils.py module ( #4954 )
...
* update test utils
* update test utils code
* update test file name
2025-11-14 10:37:27 +08:00
Juncai
36822fa49c
[PD Disaggregation] remove splitwise deployment on single node and refine the code ( #4891 )
...
* remove splitwise deployment on single node and refine the code
* up
* up
* up
* add test
* up
2025-11-14 09:56:53 +08:00
kxz2002
9703108c28
[BugFix] adjust max_tokens and min_tokens when continue to generate tokens ( #5010 )
...
* fix max and min tokens initial commit
* fix double subtraction
* add unit tests
2025-11-13 23:52:54 +08:00
carryyu
6c3d1da62f
fix conflicts
2025-11-13 20:30:29 +08:00
yangjianfengo1
ae7bee8122
【New Feature】W4afp8 supports per group quantization ( #4987 )
...
* w4afp8 supports per group
* code style
* fix transpose
* revert fast hadamard
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com >
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:17:27 +08:00
Echo-Nie
a5e949d9d0
[Feature] Enhance build script, add pre_wheel logic ( #4729 )
...
* Enhance build script, add pre_wheel logic
Updated copyright year and added precompiled wheel installation logic.
* update the nvidia_gpu.md, add pre_wheel description
* fix zh .md
* update the url, automatically detect CUDA and SM
* Fix GPU architecture string formatting in build.sh
* Change default for FD_USE_PRECOMPILED to 0
* fix build.sh
* add ./dist, pre-wheel path
* simplify the process, just save the whl
* del pre_wheel dir
* fix function name, extract_ops_from_precompiled_wheel
* fix docs
* add default commitID in docs
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 19:03:52 +08:00
Sunny-bot1
05da8e34c0
[BugFix][Metax] Fix metax compile issue in get_block_shape_and_split_kv_block ( #5000 )
...
* fix metax compile
* fix
2025-11-13 00:55:06 -08:00
zccjjj
88da9d9788
[XPU] [CI] Change CI ep test from offline to online ( #4885 )
...
* change CI ep test from offline to online
* add ep all2all ci's changes, from offline to online
* change env var in ep-all2all ci test
* add expected response for ep8tp8 all2all
* Adapt to CI refactoring and support dual-concurrent code execution
* Adapt to CI refactoring and support dual-concurrent, second
* Explicitly specify the #port
* change the startup method of all2all
* Modify the command of all2all
* Update assertion to check multiple keywords
* Update assertion to check multiple keywords
* Update run_w4a8.py
* Update run_w4a8.py
---------
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-13 16:15:45 +08:00
bukejiyu
4a0d881e15
update ( #4985 )
2025-11-13 15:58:01 +08:00
周周周
6c4ebc5fee
[worker_process.py]modify some var name ( #4749 )
2025-11-13 14:21:27 +08:00
Yonghua Li
6c5ab727c1
[BugFix] fix num_requests_running after clear_data ( #4927 )
...
* [BugFix] fix num_requests_running after clear_data
* [fix] fix tasks_list and stop flags not cleared when _free_blocks failed
2025-11-13 13:50:21 +08:00
Sunny-bot1
5b24013d46
skip DtoH capture ( #4988 )
2025-11-13 10:57:44 +08:00
Jiang-Jia-Jun
8329338d37
Update nvidia_gpu.md
2025-11-13 10:25:22 +08:00
ltd0924
303c986cc7
[FDConfig] add block number verification ( #4983 )
...
* Update config.py
* fix
* update unit test
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com >
2025-11-13 09:48:44 +08:00
YuBaoku
1c0b0b08b7
[CI] set DG_NVCC_OVERRIDE_CPP_STANDARD in test_quantized_linear ( #4995 )
2025-11-12 23:03:21 +08:00
Yuanle Liu
2272160faf
fix mtp tsp ( #4990 )
2025-11-12 22:05:19 +08:00
ming1753
3148dbca06
[BugFix] fix VL fp8 bug when moe token_num is 0 ( #4928 )
...
* [BugFix] fix VL fp8 bug when moe token_num is 0
* fix bug
* format
* fix bug
2025-11-12 21:19:36 +08:00
Jiang-Jia-Jun
c8140326fa
Update nvidia_gpu.md
2025-11-12 20:50:09 +08:00
bukejiyu
f0189292df
[CI] fix test_model_cache ( #4982 )
...
* ci
* update
2025-11-12 20:26:49 +08:00
qwes5s5
a2d06118e1
[Logprobs]Support prompt_logprobs and max_logprobs ( #4897 )
...
* add prompt logprobs
* trigger ci
* fix unittest
* Update fastdeploy/config.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/llm.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix max_logprobs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 19:29:48 +08:00
Lucas
da7863ae85
[XPU] fix text_image_gather_scatter when image_token_num == token_num && text_token_num == 1 ( #4882 )
2025-11-12 17:13:22 +08:00
JYChen
a1218076dc
remove load default_v1 since it is already the default ( #4980 )
2025-11-12 16:49:48 +08:00
xiaozude
c45b3ccb52
[Metax] optimize flash mla ( #4915 )
2025-11-12 16:43:46 +08:00
MingkunZhang
9d9f5df8d0
[Metax] support default_v1 loader & thinking model ( #4956 )
...
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-12 16:32:26 +08:00
BossPi
bde6e2f931
[BugFix] Avoid loading training file ( #4966 )
...
* bug fix
don't put scheduler.pdparams into model weights
* run pre-commit
2025-11-12 15:49:14 +08:00
plusNew001
c7b589d75b
[CI][XPU] Fix EP Case Bug ( #4976 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update installation method for paddlepaddle-xpu
Revert to installing paddlepaddle-xpu from the official repository.
* Modify XPU_VISIBLE_DEVICES based on GPU_ID
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 15:23:28 +08:00
bukejiyu
6e2e2fcd29
xpu ( #4969 )
2025-11-12 15:12:59 +08:00
ltd0924
5bf48de999
[KVCache] support unified cache backend ( #4903 )
...
* [Feature] support unified cache backend
* fix
* fix
* fix
* fix
* Update metax_model_runner.py
* fix
* update
* Update test_moba_attention_backend.py
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com >
2025-11-12 14:54:52 +08:00
yzwu
76e60e98f8
[Iluvatar][CI] fix safetensors_rust.SafetensorError: framework paddle is invalid ( #4972 )
2025-11-12 14:13:40 +08:00
Sunny-bot1
35bd2afab3
[Benchmark] Add GEMM & MoE kernel bench ( #4809 )
2025-11-12 11:56:40 +08:00
YuBaoku
8a96944a0a
[CI] Update PORT range to avoid conflict with system ports ( #4953 )
2025-11-12 11:17:49 +08:00
Jiang-Jia-Jun
09cd6c5d3e
Modify README
2025-11-12 11:03:23 +08:00
YuBaoku
9c52d9eb8f
[CI] remove useless tests in docker_build ( #4974 )
...
* [CI] fix
* [CI] fix apt_sources error of focal in docker_build
* [CI] remove useless tests in docker_build
2025-11-12 10:55:09 +08:00
Echo-Nie
ff653503ff
[Docs] Add License in Unittest ( #4957 )
...
* add copyright
* add CopyRight
2025-11-12 10:44:09 +08:00
Echo-Nie
2aabaecbc2
[CI] Add five unittest ( #4958 )
...
* add unittest
* Update test_logger.py
2025-11-12 10:43:33 +08:00
plusNew001
a5103eb198
[CI][XPU] Change Paddle Version to Nightly ( #4973 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update installation method for paddlepaddle-xpu
Revert to installing paddlepaddle-xpu from the official repository.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 10:29:16 +08:00
bukejiyu
b09ebb2813
refactor pt loading ( #4532 )
2025-11-11 21:30:39 +08:00
YuBaoku
4c911ecb74
[CI] fix apt_sources error of focal in docker_build ( #4961 )
...
* [CI] fix
* [CI] fix apt_sources error of focal in docker_build
2025-11-11 20:35:06 +08:00
plusNew001
f20f29fc79
[CI][XPU]Update health check endpoint to use port variable ( #4965 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-11 20:19:53 +08:00
周周周
da6b4c10e5
[ATTENTION] make buffer alloc as a function ( #4945 )
2025-11-11 19:17:08 +08:00
yzwu
08b96baa4a
[Iluvatar][Doc] Add ERNIE-4.5-VL-28B-A3B-Thinking doc ( #4955 )
2025-11-11 19:15:19 +08:00
chen
896ef565cc
[Others] Add Tests for GPU Model Runner and Logprobs Output ( #4913 )
2025-11-11 18:37:33 +08:00
kxz2002
a83250ae3f
[CI] Update test_api_key.py ( #4948 )
...
* fix test_api_key
* fix test_api_key
2025-11-11 16:49:54 +08:00
K11OntheBoat
76be598129
replace paddle.max by numpy to avoid useless error log ( #4893 )
...
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com ”>
2025-11-11 16:28:05 +08:00
SunLei
3098aee05f
[Perf] Support tensor transmission between worker and engine with zero-copy to improve efficiency ( #4839 )
...
* feat(zmq): support tensor transmission with zero-copy for improved efficiency
* perf: zmq.send disable copy
* zmq recv data for debug
* convert logprobs tensor to cpu
2025-11-11 15:43:11 +08:00
plusNew001
8b61f01c68
[CI][XPU]Update run_ci_xpu.sh to lock paddlepaddle-xpu version ( #4949 )
...
Temporarily lock paddlepaddle-xpu version due to framework update.
2025-11-11 15:38:05 +08:00
Lucas
5280b9e0b4
[XPU] fix xpu deployment md ( #4941 )
2025-11-11 14:39:52 +08:00
yinwei
215cda2f80
[XPU][Doc]Update XPU release2.3 note ( #4939 )
...
* update doc
* update
* update
* update
2025-11-11 11:57:49 +08:00
Jiang-Jia-Jun
3f09ebf3da
Update model names in FastDeploy v2.3 release notes
2025-11-11 11:53:26 +08:00