FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

Author	SHA1	Message	Date
李泳桦	d18a637a17	[feat] add metrics for yiyan adapter (#3219) * [feat] add metrics for yiyan adapter * [fix] fix metrics num_requests_waiting and num_requests_running * [fix] fix metrics gpu_cache_usage_perc * [refactor] change where requests_number increases * [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly * [chore] delete useless code	2025-08-21 16:58:10 +08:00
chenjian	aba94169dc	[Feature] Support batched tokens for EP (#3415) * Support batched tokens for EP * Support batched tokens for EP * Support batched tokens for EP * Support batched tokens for EP * Support batched tokens for EP and fix bug * Support batched tokens for EP and fix bug * Support batched tokens for EP and fix bug * Support batched tokens for EP and fix bug	2025-08-18 11:43:36 +08:00
chenjian	7573802a88	[Feature] Support mtp ep in fd (#3340) * [Optimize] Add metrics for analysing perf * Fix bug in mtp	2025-08-11 21:49:44 +08:00
chenjian	110f33a530	[Bug fix] Test td cache messager (#3242) * support disable cache task in decode node * fix busg * Update engine.py * Update expert_service.py * Update splitwise_connector.py * Optimize log for debug * Optimize log for debug * fix bug --------- Co-authored-by: ltd0924 <ltd0924@sina.com> Co-authored-by: ltd0924 <32387785+ltd0924@users.noreply.github.com>	2025-08-06 15:52:45 +08:00
ltd0924	b20ffe3697	[Feature] optimize expert parallel (#3196) * optimize * Update expert_service.py * Update worker_process.py * optimize	2025-08-05 17:34:24 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923)	2025-07-19 23:19:27 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00

7 Commits