Jiang-Jia-Jun
c8140326fa
Update nvidia_gpu.md
2025-11-12 20:50:09 +08:00
bukejiyu
f0189292df
[CI] fix test_model_cache ( #4982 )
...
* ci
* update
2025-11-12 20:26:49 +08:00
qwes5s5
a2d06118e1
[Logprobs]Support prompt_logprobs and max_logprobs ( #4897 )
...
* add prompt logprobs
* trigger ci
* fix unittest
* Update fastdeploy/config.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/entrypoints/llm.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/engine/sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update tests/engine/test_sampling_params.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* fix max_logprobs
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 19:29:48 +08:00
Lucas
da7863ae85
[XPU] fix text_image_gather_scatter when image_token_num == token_num && text_token_num == 1 ( #4882 )
2025-11-12 17:13:22 +08:00
JYChen
a1218076dc
remove load default_v1 since already been as default ( #4980 )
2025-11-12 16:49:48 +08:00
xiaozude
c45b3ccb52
[Metax] optimize flash mla ( #4915 )
2025-11-12 16:43:46 +08:00
MingkunZhang
9d9f5df8d0
[Metax] support default_v1 loader & thinking model ( #4956 )
...
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com >
2025-11-12 16:32:26 +08:00
BossPi
bde6e2f931
[BugFix] Avoid loading training file ( #4966 )
...
* bug fix
don't put scheduler.pdparams into model weights
* run pre-commit
2025-11-12 15:49:14 +08:00
plusNew001
c7b589d75b
[CI][XPU] Fix EP Case Bug ( #4976 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update installation method for paddlepaddle-xpu
Revert to installing paddlepaddle-xpu from the official repository.
* Modify XPU_VISIBLE_DEVICES based on GPU_ID
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 15:23:28 +08:00
bukejiyu
6e2e2fcd29
xpu ( #4969 )
2025-11-12 15:12:59 +08:00
ltd0924
5bf48de999
[KVCache] support unified cache backend ( #4903 )
...
* [Feature] support unified cache backend
* fix
* fix
* fix
* fix
* Update metax_model_runner.py
* fix
* update
* Update test_moba_attention_backend.py
---------
Co-authored-by: ltd0924 <luotingdan@baidu.com >
2025-11-12 14:54:52 +08:00
yzwu
76e60e98f8
[Iluvatar][CI] fix safetensors_rust.SafetensorError: framework paddle is invalid ( #4972 )
2025-11-12 14:13:40 +08:00
Sunny-bot1
35bd2afab3
[Benchmark] Add GEMM & MoE kernel bench ( #4809 )
2025-11-12 11:56:40 +08:00
YuBaoku
8a96944a0a
[CI] Update PORT range to avoid conflict with system ports ( #4953 )
2025-11-12 11:17:49 +08:00
Jiang-Jia-Jun
09cd6c5d3e
Modify README
2025-11-12 11:03:23 +08:00
YuBaoku
9c52d9eb8f
[CI] remove useless tests in docker_build ( #4974 )
...
* [CI] fix
* [CI] fix apt_sources error of focal in docker_build
* [CI] remove useless tests in docker_build
2025-11-12 10:55:09 +08:00
Echo-Nie
ff653503ff
[Docs] Add License in Unittest ( #4957 )
...
* add copyright
* add CopyRight
2025-11-12 10:44:09 +08:00
Echo-Nie
2aabaecbc2
[CI] Add five unittest ( #4958 )
...
* add unittest
* Update test_logger.py
2025-11-12 10:43:33 +08:00
plusNew001
a5103eb198
[CI][XPU] Change Paddle Version to Nightly ( #4973 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update installation method for paddlepaddle-xpu
Revert to installing paddlepaddle-xpu from the official repository.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-12 10:29:16 +08:00
bukejiyu
b09ebb2813
refactor pt loading ( #4532 )
2025-11-11 21:30:39 +08:00
YuBaoku
4c911ecb74
[CI] fix apt_sources error of focal in docker_build ( #4961 )
...
* [CI] fix
* [CI] fix apt_sources error of focal in docker_build
2025-11-11 20:35:06 +08:00
plusNew001
f20f29fc79
[CI][XPU]Update health check endpoint to use port variable ( #4965 )
...
* Update health check endpoint to use port variable
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-11 20:19:53 +08:00
周周周
da6b4c10e5
[ATTENTION] make buffer alloc as a function ( #4945 )
2025-11-11 19:17:08 +08:00
yzwu
08b96baa4a
[Iluvatar][Doc] Add ERNIE-4.5-VL-28B-A3B-Thinking doc ( #4955 )
2025-11-11 19:15:19 +08:00
chen
896ef565cc
[Others] Add Tests for GPU Model Runner and Logprobs Output ( #4913 )
2025-11-11 18:37:33 +08:00
kxz2002
a83250ae3f
[CI] Update test_api_key.py ( #4948 )
...
* fix test_api_key
* fix test_api_key
2025-11-11 16:49:54 +08:00
K11OntheBoat
76be598129
replace paddle.max by numpy to avoid useless error log ( #4893 )
...
Co-authored-by: K11OntheBoat <ruianmaidanglao@163.com>
2025-11-11 16:28:05 +08:00
SunLei
3098aee05f
[Perf] Support tensor transmission between work and engine with zero-copy to improve efficiency ( #4839 )
...
* feat(zmq): support tensor transmission with zero-copy for improved efficiency
* perf: zmq.send disable copy
* zmq recv data for debug
* convert logprobs tensor to cpu
2025-11-11 15:43:11 +08:00
plusNew001
8b61f01c68
[CI][XPU]Update run_ci_xpu.sh to lock paddlepaddle-xpu version ( #4949 )
...
Temporarily lock paddlepaddle-xpu version due to framework update.
2025-11-11 15:38:05 +08:00
Lucas
5280b9e0b4
[XPU] fix xpu deployment md ( #4941 )
2025-11-11 14:39:52 +08:00
yinwei
215cda2f80
[XPU][Doc]Update XPU release2.3 note ( #4939 )
...
* update doc
* update
* update
* update
2025-11-11 11:57:49 +08:00
Jiang-Jia-Jun
3f09ebf3da
Update model names in FastDeploy v2.3 release notes
2025-11-11 11:53:26 +08:00
LiqinruiG
75294bcfb1
[Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction ( #4944 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-11 11:40:52 +08:00
Jiang-Jia-Jun
c0a4e2b63b
Update README.md
2025-11-11 11:38:30 +08:00
Jiang-Jia-Jun
7bedf2041a
Update README.md
2025-11-11 11:37:31 +08:00
yzwu
3707af7a4f
[Iluvatar] add vl into ci and support v1 loader ( #4774 )
2025-11-11 10:50:17 +08:00
Ryan
07a82afcae
add tie_word_embeddings for lmhead ( #4916 )
2025-11-11 10:46:35 +08:00
LiqinruiG
3f74281496
[Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction ( #4937 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-11 10:43:44 +08:00
yangjianfengo1
d7f14dba8b
uodate docx ( #4938 )
...
Co-authored-by: root <root@yq02-inf-sci-k8s-a100-aa2ni5-0018.yq02.baidu.com >
2025-11-11 10:28:46 +08:00
Yuanle Liu
3dc0ffa46d
[TSP] Support qwen3 moe tsp + cudagraph ( #4871 )
...
* support qwen3_moe tsp mode
* fix
* fix
* update
* update
* update
* fix
* support external_rmsnorm
* update
* fix
2025-11-10 23:37:51 +08:00
chenjian
fb2eb403ab
[Opti] Unlimit zmq message lens limit ( #4465 )
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-11-10 21:38:02 +08:00
chen
927bd74075
[Docs] add doc for glm ( #4933 )
...
* add doc for glm
* del v1 loader
* delete mtp
2025-11-10 21:21:33 +08:00
plusNew001
3665c283b5
[XPU] [CI]Change CI to multi-concurrency ( #4866 )
...
* Refactor GPU ID logic in CI workflow
Updated GPU ID assignment logic and removed unused port calculations.
* Refactor GPU device and port configuration
* Update engine_worker_queue_port calculation logic
* Refactor XPU_VISIBLE_DEVICES export logic
* Adjust service port based on GPU ID
* Adjust service HTTP port based on GPU ID
* Adjust service_http_port based on GPU_ID
* Add import for os module in run_45T.py
* Update run_45vl.py
* Import os module in run_w4a8.py
Added import for os module to use environment variables.
* Remove duplicate import of os module
* Remove duplicate import of os module
* Update run_45T.py
* Update run_w4a8.py
* fix bug
* fix bug
* Update run_w4a8.py
* Fix directory change command in run_ci_xpu.sh
2025-11-10 21:09:48 +08:00
Sunny-bot1
59d2edde29
[BugFix] Add support for weight shape constraints and group size selection in Machete ( #4911 )
2025-11-10 20:57:35 +08:00
kxz2002
2dfbcf3cc9
[BugFix] Fix inference_start_time ( #4922 )
...
* fix inference_start_time
* fix inference_start_time
2025-11-10 19:28:44 +08:00
LiqinruiG
aa79e6185a
[Docs] Improve reasoning_out docs ( #4901 )
...
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
* [Docs] Improve reasoning_out docs
---------
Co-authored-by: liqinrui <liqinrui@baidu.com >
2025-11-10 19:20:38 +08:00
qw86972190
07b21d241d
[XPU]Update documentation ( #4917 )
...
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU]Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
* [XPU][Docs] Update documentation
2025-11-10 19:11:42 +08:00
周周周
54536267db
[DeepEP] support P async_finish ( #4899 )
2025-11-10 18:24:02 +08:00
chenjian
78895e2c7d
[Bug Fix] fix bug for PD EP ( #4823 )
...
* fix bug for PD EP
* fix
* optimize perf for engine worker queue
* fix bug
* fix internode ll two stage
* fix for ci
* fix bug
2025-11-10 15:33:29 +08:00
Echo-Nie
112623e33e
init version, exist some bugs, waiting fix ( #4906 )
2025-11-10 14:16:09 +08:00