Commit Graph

3824 Commits

Author SHA1 Message Date
Jiang-Jia-Jun
c8140326fa Update nvidia_gpu.md 2025-11-12 20:50:09 +08:00
bukejiyu
f0189292df [CI] fix test_model_cache (#4982)
* ci

* update
2025-11-12 20:26:49 +08:00
qwes5s5
a2d06118e1 [Logprobs]Support prompt_logprobs and max_logprobs (#4897)
* add prompt logprobs

* trigger ci

* fix unitest

* Update fastdeploy/config.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/entrypoints/llm.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/engine/sampling_params.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/engine/test_sampling_params.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/engine/test_sampling_params.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix max_logprobs

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-12 19:29:48 +08:00
Lucas
da7863ae85 [XPU] fix text_image_gather_scatter when image_token_num == token_num && text_token_num == 1 (#4882) 2025-11-12 17:13:22 +08:00
JYChen
a1218076dc remove load default_v1 since already been as default (#4980) 2025-11-12 16:49:48 +08:00
xiaozude
c45b3ccb52 [Metax] optimize flash mla (#4915) 2025-11-12 16:43:46 +08:00
MingkunZhang
9d9f5df8d0 [Metax] support default_v1 loader & thinking model (#4956)
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
2025-11-12 16:32:26 +08:00
BossPi
bde6e2f931 [BugFix] Avoid loading training file (#4966)
* bug fix

don't put scheduler.pdparams into model weights

* run pre-commit
2025-11-12 15:49:14 +08:00
plusNew001
c7b589d75b [CI][XPU] Fix EP Case Bug (#4976)
* Update health check endpoint to use port variable

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update installation method for paddlepaddle-xpu

Revert to installing paddlepaddle-xpu from the official repository.

* Modify XPU_VISIBLE_DEVICES based on GPU_ID

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-12 15:23:28 +08:00
bukejiyu
6e2e2fcd29 xpu (#4969) 2025-11-12 15:12:59 +08:00
ltd0924
5bf48de999 [KVCache] support unified cache backend (#4903)
* [Feature] support unified cache backend

* fix

* fix

* fix

* fix

* Update metax_model_runner.py

* fix

* update

* Update test_moba_attention_backend.py

---------

Co-authored-by: ltd0924 <luotingdan@baidu.com>
2025-11-12 14:54:52 +08:00
yzwu
76e60e98f8 [Iluvatar][CI] fix safetensors_rust.SafetensorError: framework paddle is invalid (#4972) 2025-11-12 14:13:40 +08:00
Sunny-bot1
35bd2afab3 [Benchmark] Add GEMM & MoE kernel bench (#4809) 2025-11-12 11:56:40 +08:00
YuBaoku
8a96944a0a [CI] Update PORT range to avoid conflict with system ports (#4953) 2025-11-12 11:17:49 +08:00
Jiang-Jia-Jun
09cd6c5d3e Modify README 2025-11-12 11:03:23 +08:00
YuBaoku
9c52d9eb8f [CI] remove useless tests in docker_build (#4974)
* [CI] fix

* [CI] fix apt_sources error of focal in docker_build

* [CI] remove useless tests in docker_build
2025-11-12 10:55:09 +08:00
Echo-Nie
ff653503ff [Docs] Add License in Unittest (#4957)
* add copyright

* add CopyRight
2025-11-12 10:44:09 +08:00
Echo-Nie
2aabaecbc2 [CI] Add five unittest (#4958)
* add unittest

* Update test_logger.py
2025-11-12 10:43:33 +08:00
plusNew001
a5103eb198 [CI][XPU] Change Paddle Version to Nightly (#4973)
* Update health check endpoint to use port variable

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update installation method for paddlepaddle-xpu

Revert to installing paddlepaddle-xpu from the official repository.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-12 10:29:16 +08:00
bukejiyu
b09ebb2813 refactor pt loading (#4532)
2025-11-11 21:30:39 +08:00
YuBaoku
4c911ecb74 [CI] fix apt_sources error of focal in docker_build (#4961)
* [CI] fix

* [CI] fix apt_sources error of focal in docker_build
2025-11-11 20:35:06 +08:00
plusNew001
f20f29fc79 [CI][XPU]Update health check endpoint to use port variable (#4965)
* Update health check endpoint to use port variable

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update scripts/run_ci_xpu.sh

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-11 20:19:53 +08:00
周周周
da6b4c10e5 [ATTENTION] make buffer alloc as a function (#4945) 2025-11-11 19:17:08 +08:00
yzwu
08b96baa4a [Iluvatar][Doc] Add ERNIE-4.5-VL-28B-A3B-Thinking doc (#4955) 2025-11-11 19:15:19 +08:00
chen
896ef565cc [Others] Add Tests for GPU Model Runner and Logprobs Output (#4913) 2025-11-11 18:37:33 +08:00
kxz2002
a83250ae3f [CI] Update test_api_key.py (#4948)
* fix test_api_key

* fix test_api_key
2025-11-11 16:49:54 +08:00
K11OntheBoat
76be598129 replace paddle.max by numpy to avoid useless error log (#4893)
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>
2025-11-11 16:28:05 +08:00
SunLei
3098aee05f [Perf] Support tensor transmission between work and engine with zero-copy to improve efficiency (#4839)
* feat(zmq): support tensor transmission with zero-copy for improved efficiency

* perf: zmq.send disable copy

* zmq recv data for debug

* convert logprobs tensor to cpu
2025-11-11 15:43:11 +08:00
plusNew001
8b61f01c68 [CI][XPU]Update run_ci_xpu.sh to lock paddlepaddle-xpu version (#4949)
Temporarily lock paddlepaddle-xpu version due to framework update.
2025-11-11 15:38:05 +08:00
Lucas
5280b9e0b4 [XPU] fix xpu deployment md (#4941)
2025-11-11 14:39:52 +08:00
yinwei
215cda2f80 [XPU][Doc]Update XPU release2.3 note (#4939)
* update doc

* update

* update

* udpate
2025-11-11 11:57:49 +08:00
Jiang-Jia-Jun
3f09ebf3da Update model names in FastDeploy v2.3 release notes 2025-11-11 11:53:26 +08:00
LiqinruiG
75294bcfb1 [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction (#4944)
* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

---------

Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-11 11:40:52 +08:00
Jiang-Jia-Jun
c0a4e2b63b Update README.md 2025-11-11 11:38:30 +08:00
Jiang-Jia-Jun
7bedf2041a Update README.md 2025-11-11 11:37:31 +08:00
yzwu
3707af7a4f [Iluvatar] add vl into ci and support v1 loader (#4774) 2025-11-11 10:50:17 +08:00
Ryan
07a82afcae add tie_word_embeddings for lmhead (#4916) 2025-11-11 10:46:35 +08:00
LiqinruiG
3f74281496 [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking instruction (#4937)
* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

* [Docs] add ERNIE-4.5-VL-28B-A3B-Thinking  instruction

---------

Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-11 10:43:44 +08:00
yangjianfengo1
d7f14dba8b uodate docx (#4938)
Co-authored-by: root <root@yq02-inf-sci-k8s-a100-aa2ni5-0018.yq02.baidu.com>
2025-11-11 10:28:46 +08:00
Yuanle Liu
3dc0ffa46d [TSP] Support qwen3 moe tsp + cudagraph (#4871)
* support qwen3_moe tsp mode

* fix

* fix

* update

* update

* update

* fix

* support external_rmsnorm

* update

* fix
2025-11-10 23:37:51 +08:00
chenjian
fb2eb403ab [Opti] Unlimit zmq message lens limit (#4465)
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-11-10 21:38:02 +08:00
chen
927bd74075 [Docs] add doc for glm (#4933)
* add doc for glm

* del v1 loader

* delete mtp
2025-11-10 21:21:33 +08:00
plusNew001
3665c283b5 [XPU] [CI]Change CI to multi-concurrency (#4866)
* Refactor GPU ID logic in CI workflow

Updated GPU ID assignment logic and removed unused port calculations.

* Refactor GPU device and port configuration

* Update engine_worker_queue_port calculation logic

* Refactor XPU_VISIBLE_DEVICES export logic

* Adjust service port based on GPU ID

* Adjust service HTTP port based on GPU ID

* Adjust service_http_port based on GPU_ID

* Add import for os module in run_45T.py

* Update run_45vl.py

* Import os module in run_w4a8.py

Added import for os module to use environment variables.

* Remove duplicate import of os module

* Remove duplicate import of os module

* Update run_45T.py

* Update run_w4a8.py

* fix bug

* fix bug

* Update run_w4a8.py

* Fix directory change command in run_ci_xpu.sh
2025-11-10 21:09:48 +08:00
Sunny-bot1
59d2edde29 [BugFix] Add support for weight shape constraints and group size selection in Machete (#4911) 2025-11-10 20:57:35 +08:00
kxz2002
2dfbcf3cc9 [BugFix] Fix inference_start_time (#4922)
* fix inference_start_time

* fix inference_start_time
2025-11-10 19:28:44 +08:00
LiqinruiG
aa79e6185a [Docs] Improve reasoning_out docs (#4901)
* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

* [Docs] Improve reasoning_out docs

---------

Co-authored-by: liqinrui <liqinrui@baidu.com>
2025-11-10 19:20:38 +08:00
qw86972190
07b21d241d [XPU]Update documentation (#4917)
* [XPU]Update documentation

* [XPU]Update documentation

* [XPU]Update documentation

* [XPU]Update documentation

* [XPU][Docs] Update documentation

* [XPU][Docs] Update documentation

* [XPU][Docs] Update documentation

* [XPU][Docs] Update documentation

* [XPU][Docs] Update documentation

* [XPU][Docs] Update documentation
2025-11-10 19:11:42 +08:00
周周周
54536267db [DeepEP] support P async_finish (#4899) 2025-11-10 18:24:02 +08:00
chenjian
78895e2c7d [Bug Fix] fix bug for PD EP (#4823)
* fix bug for PD EP

* fix

* optimize perf for engine worker queue

* fix bug

* fix internode ll two stage

* fix for ci

* fix bug
2025-11-10 15:33:29 +08:00
Echo-Nie
112623e33e init version, exist some bugs, waiting fix (#4906)
2025-11-10 14:16:09 +08:00