FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-31 11:56:44 +08:00

Author	SHA1	Message	Date
EnflameGCU	d0f4d6ba3a	[GCU] Support gcu platform (#2702 ) baseline: `e7fa57ebae` Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-08 13:00:52 +08:00
gaoziyuan	26d5d737dd	【Feature】support qwen2 some func (#2740 ) * add rl qwen model support * fix * fix	2025-07-08 12:03:04 +08:00
Ryan	fefbd65cf8	[SOT] Remove BreakGraph with `paddle.maximum` (#2731 ) * rm if with clip * clip -> maximum * int64 -> int32	2025-07-08 11:44:25 +08:00
ming1753	1eb8ea7328	[Bug fix] fix compile bug when sm < 89 (#2738 )	2025-07-08 11:24:52 +08:00
ming1753	ef6649a577	[Optimize] Optimize tensorwise fp8 performance (#2729 ) * [Optimize] Optimize tensorwise fp8 performance	2025-07-07 20:06:28 +08:00
liddk1121	1b54a2831e	Adapt for iluvatar gpu (#2684 )	2025-07-07 16:53:14 +08:00
YUNSHEN XIE	2579e8fea8	support FastDeploy version setting (#2725 )	2025-07-07 14:50:11 +08:00
Yuanle Liu	91528f1af9	remove redundant install whl of fastdeploy (#2726 ) * remove redundant install * remove redundant install	2025-07-06 23:49:37 -07:00
lddfym	4e293e50fa	Check if the controller port is available (#2724 )	2025-07-07 13:24:55 +08:00
chen	66b321d9ec	Update eb45-0.3B cuda memory (#2686 )	2025-07-07 11:31:15 +08:00
ltd0924	68b4755587	[LLM] support multi node deploy (#2708 ) * [LLM] support multi node deploy * Update engine.py * fix bugs * fix * [LLM] support multi node deploy * [LLM] support multi node deploy --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-07-06 10:33:51 +08:00
LQX	04a8e1ef2b	Modify XPU CI, test=model (#2721 )	2025-07-06 10:19:04 +08:00
Ting	a6e9161045	fix bug. (#2718 )	2025-07-05 08:19:19 +08:00
Ting	90ef28d982	spec token map lazy. (#2715 )	2025-07-05 00:14:54 +08:00
YuBaoku	b37585e693	[BugFix] fix paddle_git_commit_id error (#2714 ) * set git identity to avoid merge failure in CI * add ci cases * [CI] Add validation for MTP and CUDAGraph * [BugFix] fix paddle_git_commit_id error	2025-07-04 22:16:37 +08:00
lizexu123	9cb08e71e8	add support QWQ enable_thinking (#2706 ) * add support QWQ enable_thinking * add stream=True * fix stream=true * fix qwen --------- Co-authored-by: lizexu <lizexu@baidu.com>	2025-07-04 20:55:23 +08:00
YuBaoku	dacc46f04c	[CI] Add validation for MTP and CUDAGraph (#2710 ) * set git identity to avoid merge failure in CI * add ci cases * [CI] Add validation for MTP and CUDAGraph	2025-07-04 18:13:54 +08:00
Jiang-Jia-Jun	09ded7715f	Update mkdocs.yml	2025-07-04 17:55:52 +08:00
LQX	11cfdf5d89	Add XPU CI, test=model (#2701 ) * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model * Add XPU CI, test=model	2025-07-04 16:16:06 +08:00
GoldPancake	e7fa57ebae	Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707 ) * fix mtp eh_proj layer * fix mtp update_cfg function * fix stringdoc * simplify class name	2025-07-04 14:15:04 +08:00
gaoziyuan	a5ae88ded9	[feature]add fd whl version info (#2698 )	2025-07-04 14:12:42 +08:00
ltd0924	87e638498c	[RL] update reschedule finish reason (#2709 )	2025-07-04 13:47:36 +08:00
freeliuzc	667547be59	support chunk_prefill in MTP (#2705 )	2025-07-04 11:55:48 +08:00
LiqinruiG	b38823bc66	modify reasoning_output docs (#2696 )	2025-07-04 11:30:02 +08:00
Divano	050d9658a5	Update requirements.txt	2025-07-04 09:53:03 +08:00
Divano	be5cabaf80	add quick benchmark (#2703 ) The test script does not need to go through CI	2025-07-04 09:32:36 +08:00
Yuanle Liu	240bdac2a4	[feat] support fa3 backend for pd disaggregated (#2695 ) * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * delete use_fast_ffn	2025-07-03 22:33:27 +08:00
ltd0924	00863c43fd	[Bug] fix logger format (#2689 )	2025-07-03 19:58:03 +08:00
kevin	3d3bccdf79	[doc] update docs (#2690 )	2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun	9fd74f75bd	Update dynamic_weight_manager.py	2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun	05c670e593	[Sync] Update to latest code (#2679 ) * [Sync] Update to latest code * Add new code files * Add new code files * update code * Try to fix build.sh * Try to fix build.sh * Update code * Update requirements.txt * Update code --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun	d222248d00	Update README.md	2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun	e5b94d4117	Update README.md	2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun	87e2e58a22	Update gh-pages.yml	2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun	de20e5a992	Update Dockerfile.xpu	2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun	2f9c0618f0	Update Dockerfile.gpu	2025-07-03 10:14:39 +08:00
Yuanle Liu	9a14ab6572	add --force-reinstall --no-cache-dir when pip install fastdeploy*.whl (#2682 )	2025-07-02 05:32:20 -07:00
Divano	d1cb3ed571	Update gh-pages.yml (#2680 )	2025-07-02 17:36:18 +08:00
handiz	b8a8a19689	add wint2 performance (#2673 )	2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun	97ac82834f	Update nvidia_gpu.md	2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun	685265a97d	Update nvidia_gpu.md	2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun	fc4d643634	Update nvidia_gpu.md	2025-07-02 15:39:48 +08:00
YuBaoku	bb880c8d7c	Update CI test cases (#2671 ) * set git identity to avoid merge failure in CI * add ci cases	2025-07-02 15:08:39 +08:00
liddk1121	865e856a94	update iluvatar gpu fastdeploy whl (#2675 )	2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun	9f4a65d817	Update README.md	2025-07-02 10:04:58 +08:00
YuBaoku	e3aac0c5b8	set git identity to avoid merge failure in CI (#2665 )	2025-07-01 19:06:46 +08:00
AIbin	a197dcd729	【Inference Optimize】Support ERNIE-4_5-300B-A47B-2BITS-Paddle model TP2/TP4 Inference (#2666 ) * Support TP2&TP4 Wint * Support TP2&TP4 Wint2 Inference	2025-07-01 18:29:11 +08:00
freeliuzc	2b7f74d427	fix docs (#2669 ) Co-authored-by: liuzichang01 <liuzichang01@baidu.com>	2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun	164b83ab0b	[Doc] Update nvidia gpu installation description	2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun	01d5d66d95	[Doc] Update nvidia gpu installation description	2025-07-01 15:20:40 +08:00

2780 Commits