FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Author	SHA1	Message	Date
Jiang-Jia-Jun	a4fdb3970b	[BugFix] Fix vocab size error for ernie model (#2785) * [BugFix] Fix vocab size error for ernie model * [BugFix] Fix vocab size error for ernie model --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-10 01:05:51 +08:00
Jiang-Jia-Jun	2a86928657	[BugFix Revert] Fix vocab size error for ernie model	2025-07-09 22:14:54 +08:00
Jiang-Jia-Jun	b1c53fa779	[BugFix] Fix vocab size error for ernie model	2025-07-09 22:13:41 +08:00
lizexu123	da20cf681e	[Bug fix] Fixed the garbled text issues in Qwen3-8B (#2783)	2025-07-09 22:03:57 +08:00
lifulll	1f28bdf994	dcu adapter ernie45t (#2756) Co-authored-by: lifu <lifu@sugon.com> Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-09 18:56:27 +08:00
RAM	03a74995b8	Clear dead code And supplementary notes (#2757) * 1.supplementary notes 2.delete dead code * fix bug of forward meta * Global modification of forward meta * fix vl model_runner bug	2025-07-09 16:17:34 +08:00
zhink	b89180f1cd	[Feature] support custom all-reduce (#2758) * [Feature] support custom all-reduce * add vllm adapted	2025-07-09 16:00:27 +08:00
yulangz	0350831c2b	fix xpu offline demo garbled output (#2763)	2025-07-09 14:51:20 +08:00
Ryan	c4718fd693	Enable SOT D2St in Multimodal Model (#2735)	2025-07-09 12:26:18 +08:00
GoldPancake	f7cad30a38	[Feature] Add speculative decoding simulation benchmark. (#2751) * Add speculative decoding simulation benchmark * Fix the name of the parameter	2025-07-09 12:08:43 +08:00
lizexu123	525be243e7	[Bug fix] Fixed the garbled text issues in Qwen3-8B (#2737) * fix qwen3.py * update * update lm_head tie_word_embeddings * update tie_word_embeddings * fix * fix tie_word_embedding not in config.json --------- Co-authored-by: lizexu <lizexu@baidu.com>	2025-07-07 23:15:27 -07:00
EnflameGCU	d0f4d6ba3a	[GCU] Support gcu platform (#2702) baseline: `e7fa57ebae` Co-authored-by: yongqiangma <xing.wo@163.com>	2025-07-08 13:00:52 +08:00
gaoziyuan	26d5d737dd	[Feature] support qwen2 some func (#2740) * add rl qwen model support * fix * fix	2025-07-08 12:03:04 +08:00
liddk1121	1b54a2831e	Adapt for iluvatar gpu (#2684)	2025-07-07 16:53:14 +08:00
ltd0924	68b4755587	[LLM] support multi node deploy (#2708) * [LLM] support multi node deploy * Update engine.py * fix bugs * fix * [LLM] support multi node deploy * [LLM] support multi node deploy --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>	2025-07-06 10:33:51 +08:00
freeliuzc	667547be59	support chunk_prefill in MTP (#2705)	2025-07-04 11:55:48 +08:00
Yuanle Liu	240bdac2a4	[feat] support fa3 backend for pd disaggregated (#2695) * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * delete use_fast_ffn	2025-07-03 22:33:27 +08:00
Jiang-Jia-Jun	05c670e593	[Sync] Update to latest code (#2679) * [Sync] Update to latest code * Add new code files * Add new code files * update code * Try to fix build.sh * Try to fix build.sh * Update code * Update requirements.txt * Update code --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00

20 Commits