FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 00:33:03 +08:00

Author	SHA1	Message	Date
gaoziyuan	4021d66ea5	【Feature】add fd plugins && rm model_classes (#3123 ) * add fd plugins && rm model_classed * fix reviews * add docs * fix * fix unitest ci	2025-08-03 19:53:20 -07:00
ApplEOFDiscord	b71cbb466d	[Feature] remove dependency on enable_mm and refine multimodal's code (#3014 ) * remove dependency on enable_mm * fix codestyle check error * fix codestyle check error * update docs * resolve conflicts on model config * fix unit test error * fix code style check error --------- Co-authored-by: shige <1021937542@qq.com> Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-08-01 20:01:18 +08:00
ming1753	fc5f43c6bc	[Docs] Optimal Deployment (#2768 )	2025-08-01 11:56:27 +08:00
LiqinruiG	25005fee30	[Doc] add chat_template_kwagrs and update params docs (#3103 ) * add chat_template_kwagrs and update params docs * add chat_template_kwagrs and update params docs * update enable_thinking * pre-commit * update test case --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-07-31 19:44:06 +08:00
JYChen	1ef38b1563	[doc] best practice for eb45 text models (#3002 ) * [doc] best practice for eb45 text models * fix docs	2025-07-31 17:21:55 +08:00
Jiang-Jia-Jun	4498058722	Update README.md	2025-07-31 15:33:12 +08:00
Jiang-Jia-Jun	66304cf921	Update sampling.md	2025-07-31 15:02:57 +08:00
yinwei	5b9aec1f10	xpu release 2.0.3 (#3105 )	2025-07-31 14:26:07 +08:00
Jiang-Jia-Jun	998968f1e8	[Doc] Update parameters of serving	2025-07-30 22:35:01 +08:00
JYChen	bd29b2aaca	add stop_seqs doc (#3090 )	2025-07-30 20:36:18 +08:00
李泳桦	b242150f94	[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client (#3058 ) * [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client * [fix] delete ci test case for enable_thinking * [fix] add reasoning_parser when server starts * [fix] fix ci consistency test error with reasoning parser * [doc] update docs related to metadata * [fix] cancel enable_thinking default value	2025-07-30 19:25:20 +08:00
Zero Rains	4dc130c5a9	[Doc] add repetition early stopping doc (#3078 ) * add repetition early stop doc * add the early_stop.md	2025-07-29 22:01:57 -07:00
lddfym	5ca684c762	update doc: load_balance.md (#3008 ) * update doc of load_balance * update doc: load_balance.md	2025-07-30 10:27:56 +08:00
Sunny-bot1	9c962343f2	[Docs] add sampling docs (#2973 ) * add sampling docs * add minp sampling docs * update sample docs * update * update * add bad words desc * update	2025-07-30 02:24:16 +08:00
Jiang-Jia-Jun	286802a070	Update ernie-4.5.md	2025-07-29 10:10:09 +08:00
Jiang-Jia-Jun	6ce3a8a497	Update index.md	2025-07-25 10:32:47 +08:00
Yzc216	980126b83a	[Feature] multi source download (#3005 ) * multi-source download * multi-source download * huggingface download revision * requirement * style * add revision arg * test * pre-commit * Change default download * change requirements.txt * modify English Documentation * documentation	2025-07-24 17:42:09 +08:00
lizexu123	67990e0572	[Feature] support min_p_sampling (#2872 ) * Fastdeploy support min_p * add test_min_p * fix * min_p_sampling * update * delete vl_gpu_model_runner.py * fix * Align usage of min_p with vLLM * fix * modified unit test * fix test_min_sampling * pre-commit all files * fix * fix * fix * fix xpu_model_runner.py	2025-07-20 23:17:59 -07:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923 )	2025-07-19 23:19:27 +08:00
RAM	bbe2c5c968	Update GraphOptimizationBackend docs (#2898 )	2025-07-17 21:38:18 +08:00
yulangz	c8c280c4d3	[XPU][Doc] fix typo (#2892 )	2025-07-17 19:13:54 +08:00
yulangz	7dfd2ea052	[XPU][doc] Update minimal fastdeploy required (#2863 ) * [XPU][doc] update minimal fastdeploy required	2025-07-17 11:33:22 +08:00
yulangz	17314ee126	[XPU] Update doc and add scripts for downloading dependencies (#2845 ) * [XPU] update xvllm download * update supported models * fix xpu model runner in huge memory with small model * update doc	2025-07-16 11:05:56 +08:00
zhenwenDang	5fc659b900	[Docs] add enable_logprob parameter description (#2850 ) * add enable_logprob parameter description * add enable_logprob parameter description * add enable_logprob parameter description * add enable_logprob parameter description * add enable_logprob parameter description * add enable_logprob parameter description --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-07-15 19:47:45 +08:00
AIbin	b7858c22d9	【Update Docs】update supported_models doc (#2836 ) * update supported_models doc	2025-07-14 16:01:34 +08:00
Sunny-bot1	240d6236bc	[Fix]fix top_k_top_p sampling (#2801 ) * fix topk-topp * update * add base_non_truncated	2025-07-10 22:35:10 +08:00
LiqinruiG	ce5adec877	[Doc] modify offline-inerence docs (#2800 ) * modify offline-inerence docs * [bug] remove tool_call_content	2025-07-10 19:41:12 +08:00
yulangz	830de5a925	[XPU] Supports TP4 deployment on 4,5,6,7 (#2794 ) * Support running on cards 4,5,6,7 specified via XPU_VISIBLE_DEVICES * Update the multi-card instructions in the XPU docs	2025-07-10 16:48:08 +08:00
Sunny-bot1	1e2319cbef	Rename top_p_sampling to top_k_top_p_sampling (#2791 )	2025-07-10 00:09:25 -07:00
Sunny-bot1	e45050cae3	[Feature] support top_k_top_p sampling (#2753 ) * support top_k_top_p sampling * fix * add api param * add api para * fix * fix * fix * fix * fix * fix * fix	2025-07-09 20:58:58 -07:00
LiqinruiG	54affdc44b	[Doc] modify offline_inference docs (#2787 ) * modify reasoning_output docs * modify offline inference docs * modify offline inference docs * modify offline_inference docs * modify offline_inference docs	2025-07-10 01:06:14 +08:00
LiqinruiG	4ccd1696ab	[Doc] modify offline inference docs (#2747 ) * modify reasoning_output docs * modify offline inference docs * modify offline inference docs	2025-07-09 20:53:26 +08:00
chen	888780ffde	[Feature] block_wise_fp8 support triton_moe_backend (#2767 )	2025-07-09 19:22:47 +08:00
lifulll	1f28bdf994	dcu adapter ernie45t (#2756 ) Co-authored-by: lifu <lifu@sugon.com> Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-09 18:56:27 +08:00
zhink	b89180f1cd	[Feature] support custom all-reduce (#2758 ) * [Feature] support custom all-reduce * add vllm adapted	2025-07-09 16:00:27 +08:00
EnflameGCU	d0f4d6ba3a	[GCU] Support gcu platform (#2702 ) baseline: `e7fa57ebae` Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-08 13:00:52 +08:00
chen	66b321d9ec	Update eb45-0.3B cuda memory (#2686 )	2025-07-07 11:31:15 +08:00
LiqinruiG	b38823bc66	modify reasoning_output docs (#2696 )	2025-07-04 11:30:02 +08:00
kevin	3d3bccdf79	[doc] update docs (#2690 )	2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun	d222248d00	Update README.md	2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun	e5b94d4117	Update README.md	2025-07-03 15:28:05 +08:00
handiz	b8a8a19689	add wint2 performance (#2673 )	2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun	97ac82834f	Update nvidia_gpu.md	2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun	685265a97d	Update nvidia_gpu.md	2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun	fc4d643634	Update nvidia_gpu.md	2025-07-02 15:39:48 +08:00
liddk1121	865e856a94	update iluvatar gpu fastdeploy whl (#2675 )	2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun	9f4a65d817	Update README.md	2025-07-02 10:04:58 +08:00
freeliuzc	2b7f74d427	fix docs (#2669 ) Co-authored-by: liuzichang01 <liuzichang01@baidu.com>	2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun	164b83ab0b	[Doc] Update nvidia gpu installation description	2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun	01d5d66d95	[Doc] Update nvidia gpu installation description	2025-07-01 15:20:40 +08:00


511 Commits