RuohengMa
8178e3fc6a
[XPU] add speculate_step_system_cache ( #5397 )
...
* [XPU] add speculate_step_system_cache
* [XPU] add speculate_step_system_cache
---------
Co-authored-by: cmcamdy <1027740945@qq.com >
2025-12-09 14:40:11 +08:00
Lucas
8f2b85362d
[XPU] support moe_expert_ffn TGEMM selection ( #5375 )
2025-12-05 17:49:40 +08:00
cmcamdy
86b6430582
fix split_rope_cache_kv_encoder in mix mtp ( #5384 )
2025-12-05 14:33:17 +08:00
qw86972190
6048ea37bd
[XPU]add enable_logprob ( #5279 )
...
* [XPU]Update document
* [XPU]Update documentation
* [XPU]add enable_logprob
* Fix code style issues
* “doc”
* “docs”
* “doc”
* Fix code style via pre-commit
---------
Co-authored-by: root <root@gajl-bbc-onlinec-com-1498354.gajl.baidu.com >
2025-12-02 15:32:28 +08:00
cmcamdy
3149aed750
fix_gather_next_token ( #5311 )
2025-12-01 18:00:30 +08:00
cmcamdy
5a67a6d960
[XPU] support kernel for mtp(base) ( #4748 )
...
* [XPU] support kernel for mtp(base)
* [XPU] support kernel for mtp(base)
* format
* format
* format
* fix gather next token
* fix step && add test
* fix
* mv pre/post process
* add adjust batch / gather next token for mtp
* fix code style
* fix mtp kenrel name
* fix mtp kernel test
* mv xpu pre/post process
* mv xpu pre/post process
2025-11-27 15:05:44 +08:00
zccjjj
ea3bc5b4ca
[XPU] Fix the error in MoeExpertFFN operator when valid_token_num=0 ( #5196 )
2025-11-25 10:07:20 +08:00
ddchenhao66
e70e2279ce
[PD Disaggregation][XPU] Add XPU support for PD disaggregation ( #5113 )
...
* [XPU] xpu support PD disaggregation
* [XPU] fix the issue of cache KV transfer process startup failure on non-zero XPU cards
* [XPU] xpu support PD disaggregation in v1 scheduler
---------
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-11-21 14:09:01 +08:00
Yonghua Li
43097a512a
[BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol ( #5132 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [fix] fix v1 scheduler profile run for append attention in prefill node
* [fix] skip send_signal if kv signal not inited for gpu and xpu
* [fix] extend fix to flash_attn & mla_attn
* [fix] fix v1 pd run in ipc transfer protocol
* [ci] add test for v1 pd profile run using ipc transfer protocol
* [style] fix code style check
* [style] fix code style again
* [fix] fix profile run
* [update] remove --num-gpu-blocks-override in example script
* [chore] rename forward_meta is_profiling to is_dummy_or_profile_run
2025-11-20 21:39:22 +08:00
Lucas
da7863ae85
[XPU] fix text_image_gather_scatter when image_token_num == token_num && text_token_num == 1 ( #4882 )
2025-11-12 17:13:22 +08:00
ddchenhao66
bffa08b74b
[XPU] fix thinking bug where output only contains reasoning_content ( #4761 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-11-04 14:32:35 +08:00
ddchenhao66
3cbca75cc8
[XPU] xpu support neox style ROPE ( #4719 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-31 18:14:25 +08:00
ddchenhao66
c92eeed45d
[XPU] Update the return value of TextImageGatherScatter ( #4636 )
...
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-29 16:17:01 +08:00
Lucas
5c6105f4a2
[XPU] bind some OPs for VL model with pybind ( #4522 )
2025-10-27 10:50:08 +08:00
zhupengyang
3a43dbf82d
[XPU] merge apply_tp, ops support token_num = 0 ( #4507 )
2025-10-23 19:09:58 +08:00
ddchenhao66
5443b2cffb
[XPU] xpu support think length limit ( #4539 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [XPU] xpu support think length limit
* [XPU] xpu c++ code files format
---------
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-23 15:58:11 +08:00
zhupengyang
3a6883ac1a
c++ code format ( #4527 )
2025-10-22 17:59:50 +08:00
Lucas
99564349a7
[XPU] bind block_attn kernel with pybind ( #4499 )
2025-10-21 10:58:52 +08:00
zhupengyang
26ff2f8683
[XPU] refine fused moe ( #4219 )
2025-10-16 19:04:07 +08:00
Lucas
bdc0207277
[XPU] fix VL multi-batch accuracy issue ( #4394 )
2025-10-15 17:27:43 +08:00
yinwei
20c7b741f4
[XPU] Support W4A8C8-TP4-300B Model ( #4068 )
...
* support w4a8
* delete ep block attn
* delete moe_topk_select
* update note
* update
* delte useless info
* update
* add some note
* fix some format
* update scale info
* add ans baseline
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-10-10 15:41:32 +08:00
Lucas
87179cb744
[XPU] support XPU VL model inference ( #4030 )
...
* [XPU] support XPU VL model inference
* fix image op import and device check
* rebase develop
* fix perf
2025-09-25 14:34:15 +08:00
yyssys
d6e59447f5
[XPU] Enable XPU V1 mode based on environment variable ( #4213 )
...
* Enable XPU V1 mode based on environment variable
* add default param to xft_moe_fc_block_eb for latest xvllm compatibility; update run_ci_xpu to use latest xvllm
2025-09-24 10:29:48 +08:00
zhupengyang
9409665713
[xpu] support ep ( #4067 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-15 13:53:11 +08:00
co63oc
8466219ec8
fix typos ( #3840 )
...
* fix typos
* ci
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-09-12 11:04:38 +08:00
zhupengyang
9d0074a91a
[xpu] add ep custom ops ( #3911 )
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-09-10 12:22:50 +08:00
plusNew001
3790505319
[XPU] Update XPU stable xvllm and xtdk version for 2.2 ( #3853 )
...
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Run Accuracy Tests (push) Has been cancelled
CI Images Build / Run Stable Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
* Add debug environment variable exports
Added debug environment variable exports for CLANG_PATH and XVLLM_PATH.
* Lock paddlepaddle-xpu version in CI script
Temporarily lock paddlepaddle-xpu version due to framework update issues.
* Update no_proxy environment variable in CI workflow
* Install lsof tool in run_ci_xpu.sh
* Update dependency versions for stable release
* Update paddlepaddle-xpu installation command
2025-09-03 23:21:00 +08:00
co63oc
d6369b4d51
fix typos ( #3684 )
2025-09-01 17:50:17 +08:00
lengxia
137e539456
[Feature][XPU] add custom kernels for mtp ( #3537 )
2025-08-25 10:14:17 +08:00
yinwei
f2a528f9ae
[XPU] Support kvblock centralized management ( #3017 )
2025-07-29 10:40:55 +08:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
周周周
1339e56282
[XPU] Remove padding_offsets from get_padding_offset.cu ( #2911 )
2025-07-18 14:16:44 +08:00
yulangz
17314ee126
[XPU] Update doc and add scripts for downloading dependencies ( #2845 )
...
* [XPU] update xvllm download
* update supported models
* fix xpu model runner in huge memory with small model
* update doc
2025-07-16 11:05:56 +08:00
Yuanle Liu
61b3997b85
refactor rl get_name_mappings_to_training ( #2847 )
...
Deploy GitHub Pages / deploy (push) Has been cancelled
* refactor rl get_name_mappings_to_training
* fix tp>1
* change variable name(ffn1->up_gate_proj/ffn2->down_proj)
* change variable name(linear_weight->weight/linear_bias->bias)
* add rl names mapping for vl
* fix ernie 0.3B error
* fix develop code
* fix
2025-07-15 07:31:42 -07:00
Jiang-Jia-Jun
92c2cfa2e7
Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00