FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-06 00:57:33 +08:00

Author	SHA1	Message	Date
李泳桦	d18a637a17	[feat] add metrics for yiyan adapter (#3219) * [feat] add metrics for yiyan adapter * [fix] fix metrics num_requests_waiting and num_requests_running * [fix] fix metrics gpu_cache_usage_perc * [refactor] change where requests_number increases * [chore] rename xxx_block_num as xxx_gpu_block_num, and update their values accordingly * [chore] delete useless code	2025-08-21 16:58:10 +08:00
chenjian	d2f6c3b998	[Bug fix] Fix bug for seq_len_encoder is 1 (#3467)	2025-08-19 15:21:32 +08:00
ltd0924	b20ffe3697	[Feature] optimize expert parallel (#3196) * optimize * Update expert_service.py * Update worker_process.py * optimize	2025-08-05 17:34:24 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923)	2025-07-19 23:19:27 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00