FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-05 08:37:06 +08:00

Author	SHA1	Message	Date
lizexu123	525be243e7	[Bug fix] Fixed the garbled text issues in Qwen3-8B (#2737) * fix qwen3.py * update * update lm_head tie_word_embeddings * update tie_word_embeddings * fix * fix tie_word_embedding not in config.json --------- Co-authored-by: lizexu <lizexu@baidu.com>	2025-07-07 23:15:27 -07:00
gaoziyuan	26d5d737dd	[Feature] support qwen2 some func (#2740) * add rl qwen model support * fix * fix	2025-07-08 12:03:04 +08:00
Ryan	fefbd65cf8	[SOT] Remove BreakGraph with `paddle.maximum` (#2731) * rm if with clip * clip -> maximum * int64 -> int32	2025-07-08 11:44:25 +08:00
liddk1121	1b54a2831e	Adapt for iluvatar gpu (#2684)	2025-07-07 16:53:14 +08:00
GoldPancake	e7fa57ebae	Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue (#2707) * fix mtp eh_proj layer * fix mtp update_cfg function * fix stringdoc * simplify class name	2025-07-04 14:15:04 +08:00
Yuanle Liu	240bdac2a4	[feat] support fa3 backend for pd disaggregated (#2695) * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * support fa3 backend run in pd disaggregated * delete use_fast_ffn	2025-07-03 22:33:27 +08:00
Jiang-Jia-Jun	05c670e593	[Sync] Update to latest code (#2679) * [Sync] Update to latest code * Add new code files * Add new code files * update code * Try to fix build.sh * Try to fix build.sh * Update code * Update requirements.txt * Update code --------- Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>	2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00

9 Commits