6 Commits

Author SHA1 Message Date
Juncai
d67388a479 [PD Disaggregation] Distinguish the pipelines for sending kv signal in different prefill (#5514)
* Distinguish the pipelines for sending kv signal in different prefill

* up
2025-12-12 14:05:36 +08:00
Yonghua Li
43097a512a [BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132)
* [fix] fix v1 scheduler profile run for append attention in prefill node

* [fix] skip send_signal if kv signal not inited for gpu and xpu

* [fix] extend fix to flash_attn & mla_attn

* [fix] fix v1 pd run in ipc transfer protocol

* [ci] add test for v1 pd profile run using ipc transfer protocol

* [style] fix code style check

* [style] fix code style again

* [fix] fix profile run

* [update] remove --num-gpu-blocks-override in example script

* [chore] rename forward_meta is_profiling to is_dummy_or_profile_run
2025-11-20 21:39:22 +08:00
Zero Rains
25698d56d1 polish code with new pre-commit rule (#2923)
2025-07-19 23:19:27 +08:00
freeliuzc
d49f8fb30a [Feature][MTP] Support cacheKV transfer in per_chunk mode (#2890)
* support chunk_prefill both normal and speculative_decoding(mtp)

* optimize pd-disaggregation config

* fix bug
2025-07-17 17:58:08 +08:00
Jiang-Jia-Jun
92c2cfa2e7 Sync v2.0 version of code to github repo
2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72 [LLM] First commit the llm deployment code
2025-06-09 19:20:15 +08:00