FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Author	SHA1	Message	Date
bukejiyu	1539fd6056	[BugFix]Set default OMP_NUM_THREADS=3 and fix extra GPU memory usage in DeepSeek (#5219 ) * fix bug * update * update * update * fix copy * update	2025-11-28 14:22:04 +08:00
chen	35f85baf09	[BugFix]fix v1 loader lm head fp32 (#5270 )	2025-11-27 20:12:56 +08:00
Yuanle Liu	7ac25935c7	[Optimization] default compile rdma, reduce cudagraph buffer size in mm, fix some config bug (#5121 ) * default compile rdma, reduce cudagraph buffer size in mm, fix some config logic * update * update * fix bug * enhance rdma compile * fix	2025-11-20 17:19:47 +08:00
bukejiyu	a82f25ea7b	[RL]Resolve shape mismatch problems in RL-related modules (#5032 ) * RL fix * update	2025-11-19 11:12:48 +08:00
bukejiyu	b09ebb2813	refactor pt loading (#4532 ) Some checks failed CE Compile Job / ce_job_pre_check (push) Has been cancelled Details CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled Details CE Compile Job / FD-Clone-Linux (push) Has been cancelled Details CE Compile Job / Show Code Archive Output (push) Has been cancelled Details CE Compile Job / BUILD_SM8090 (push) Has been cancelled Details CE Compile Job / BUILD_SM8689 (push) Has been cancelled Details CE Compile Job / CE_UPLOAD (push) Has been cancelled Details Deploy GitHub Pages / deploy (push) Has been cancelled Details Publish Job / publish_pre_check (push) Has been cancelled Details Publish Job / print_publish_pre_check_outputs (push) Has been cancelled Details Publish Job / FD-Clone-Linux (push) Has been cancelled Details Publish Job / Show Code Archive Output (push) Has been cancelled Details Publish Job / BUILD_SM8090 (push) Has been cancelled Details Publish Job / BUILD_SM8689 (push) Has been cancelled Details Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled Details Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled Details Publish Job / Run FD Image Build (push) Has been cancelled Details Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled Details Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled Details Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled Details Publish Job / Run Base Tests (push) Has been cancelled Details Publish Job / Run Accuracy Tests (push) Has been cancelled Details Publish Job / Run Stable Tests (push) Has been cancelled Details CI Images Build / FD-Clone-Linux (push) Has been cancelled Details CI Images Build / Show Code Archive Output (push) Has been cancelled Details CI Images Build / CI Images Build (push) Has been cancelled Details CI Images Build / BUILD_SM8090 (push) Has been cancelled Details CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled Details CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled Details CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled Details CI Images Build / Run Base Tests (push) Has been cancelled Details CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled Details	2025-11-11 21:30:39 +08:00
Yuanle Liu	3dc0ffa46d	[TSP] Support qwen3 moe tsp + cudagraph (#4871 ) Some checks failed CE Compile Job / ce_job_pre_check (push) Has been cancelled Details CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled Details CE Compile Job / FD-Clone-Linux (push) Has been cancelled Details CE Compile Job / Show Code Archive Output (push) Has been cancelled Details CE Compile Job / BUILD_SM8090 (push) Has been cancelled Details CE Compile Job / BUILD_SM8689 (push) Has been cancelled Details CE Compile Job / CE_UPLOAD (push) Has been cancelled Details Deploy GitHub Pages / deploy (push) Has been cancelled Details * support qwen3_moe tsp mode * fix * fix * update * update * update * fix * support external_rmsnorm * update * fix	2025-11-10 23:37:51 +08:00
CSWYF3634076	acd331780c	[V1 loader] Qwen25 VL support v1 loader and torch style safetensors load (#4388 ) * [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix * [Docs]offine infer add apply_chat_template add_generation_prompt parameter * [Model]qwen2.5VL support --use-cudagraph * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v2 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v3 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v4 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v5 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v6 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v7 * qwen25vl v1 loader * qwen25vl v1 loader v2 * qwen25vl v1 loader v3 * qwen25vl v1 loader fix tp2 weight PySafeSlice * qwen25vl v1 loader no test * qwen25vl v1 loader add unit test * qwen25vl v1 loader add unit test v2 * qwen25vl v1 loader add torch unit test v3 * qwen25vl v1 loader add torch unit test v4 * qwen25vl v1 loader add torch unit test v5 * qwen25vl v1 loader add torch unit test v6	2025-10-27 10:54:15 +08:00
Ayakouji	a3e0a15495	fix seqlen sync (#4442 )	2025-10-17 14:37:52 +08:00
YuanRisheng	a2ec2c4152	[FDConfig]Remove max_model_len in FDConfig (#4350 ) * modify max_model_len * fix unittest * fix unittest --------- Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>	2025-10-11 14:04:17 +08:00
CSWYF3634076	5ff10c8ced	[Model] Qwen2.5VL support --use-cudagraph and unit testing (#4087 ) * [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix * [Docs]offine infer add apply_chat_template add_generation_prompt parameter * [Model]qwen2.5VL support --use-cudagraph * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v2 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v3 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v4 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v5 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v6 * [Model]qwen2.5VL support --use-cudagraph buffer and qwenvl test v7	2025-09-24 19:45:01 +08:00
Yuanle Liu	b455fd39f3	register_model_class compatible with plugins (#4236 ) Some checks failed CE Compile Job / ce_job_pre_check (push) Has been cancelled Details CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled Details CE Compile Job / FD-Clone-Linux (push) Has been cancelled Details CE Compile Job / Show Code Archive Output (push) Has been cancelled Details CE Compile Job / BUILD_SM8090 (push) Has been cancelled Details CE Compile Job / BUILD_SM8689 (push) Has been cancelled Details CE Compile Job / CE_UPLOAD (push) Has been cancelled Details Deploy GitHub Pages / deploy (push) Has been cancelled Details	2025-09-24 11:17:12 +08:00
lizexu123	c86945ef49	[Feature] support pool (#3827 ) * support pool * update pooling * add pooler_config and check * update * support AutoWeightsLoader load weight * fix * update * delete print * update pre-commit * fix * fix xpu * fix ModelRegistry->model_registry * fix Copilot review * fix pooler.py * delete StepPooler * fix abstract * fix default_loader_v1 * fix Pre Commit * support torch qwen3 dense * add test and fix torch-qwen * fix * fix * adapter ci: * fix review * fix pooling_params.py * fix * fix tasks.py 2025 * fix print and logger * Modefy ModelRegistry and delete AutoWeightsLoader * fix logger * fix test_embedding * fix ci bug * ernie4_5 model_registry * fix test * support Qwen3-Embedding-0.6B tp=1 load * fix extra code * fix * delete fix vocab_size * delete prepare_params_dict * fix:	2025-09-22 14:09:09 +08:00
zhouchong	ccd52b5596	[Model]support qwen2_5_vl (#3557 ) * adapt qwen_2_5_vl model * adapt qwen_2_5_vl VIT model * adapt qwen2_5_vl images_embeds * adapt qwen2_5_vl 3D rope * adapt qwen2_5_vl 3D rope v2 * adapt qwen2_5_vl processor * adapt qwen2_5_vl bypass resampler_model * adapt qwen2_5_vl 绕过部分ernie逻辑 * adapt qwen2_5_vl 绕过部分ernie逻辑 v2 * adapt qwen2_5_vl 权重加载与命名修改 * adapt qwen2_5_vl 非必须think_end_id * adapt qwen2_5_vl 区分多种模型的extract_vision_features * fix:adapt qwen2_5_vl model * adapt qwen2_5_vl norm * adapt qwen2_5_vl processor 更新 * adapt qwen2_5_vl image and video success * adapt qwen2_5_vl 部分整理代码 * adapt qwen2_5_vl 支持多卡 * adapt qwen2_5_vl on latest develop * adapt qwen2_5_vl RL * adapt qwen2_5_vl 整理代码 * support noex rope3d * adapt qwen2_5_vl add init.py * adapt qwen2_5_vl add init.py v2 * adapt qwen2_5_vl remove space * adapt qwen2_5_vl remove space v2 * adapt qwen2_5_vl pre-commit * adapt qwen2_5_vl update * adapt qwen2_5_vl pre-commit v2 * adapt qwen2_5_vl modify comments * adapt qwen2_5_vl fix indentation * adapt qwen2_5_vl fix indentation v2 --------- Co-authored-by: wangyafeng <wangyafeng@baidu.com> Co-authored-by: xiaoxiaohehe001 <49090790+xiaoxiaohehe001@users.noreply.github.com> Co-authored-by: CSWYF3634076 <58356743+CSWYF3634076@users.noreply.github.com>	2025-08-29 18:28:39 +08:00

13 Commits