Yonghua Li
|
f4119d51b4
|
[PD Disaggregation] support DP via v1 router and decouple DP and EP (#5197)
* [fix] support DP via v1 router and decouple DP and EP
* [fix] fix scripts
* [fix] reset model path
* [fix] dp use get_output_ep, fix router port type, update scripts
* [merge] merge with latest code
* [chore] remove some debug log
* [fix] fix code style check
* [fix] fix test_multi_api_server for log_dir name
* [chore] reduce logs
* Apply suggestions from code review
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-12-04 15:38:43 +08:00 |
|
Juncai
|
0925d44f18
|
[PD Disaggregation] support different tp_size for prefill and decode (#5296)
* up
* up
* up
* fix
|
2025-12-01 17:50:20 +08:00 |
|
Daci
|
7dc06cac6e
|
[BugFix] race condition [is_fetching] causing multiple fetch requests (#5238)
* RouterArgs port str -> int
* fix race condition [is_fetching] causing multiple fetch requests
* bugfix: Delete duplicate input_ids tensor creation
|
2025-11-28 13:41:36 +08:00 |
|
Juncai
|
f9b0545a7f
|
[PD Disaggregation] [Refine] Refine splitwise deployment (#5151)
* Refine splitwise deployment
* up
|
2025-11-21 15:30:24 +08:00 |
|
Juncai
|
08ca0f6aea
|
[Feature] [PD] add simple router and refine splitwise deployment (#4709)
* add simple router and refine splitwise deployment
* fix
|
2025-11-06 14:56:02 +08:00 |
|