【Feature】add fd plugins && rm model_classes (#3123)

* add fd plugins && rm model_classes

* fix reviews

* add docs

* fix

* fix unittest CI
gaoziyuan
2025-08-04 10:53:20 +08:00
committed by GitHub
parent 1582814905
commit 4021d66ea5
25 changed files with 524 additions and 59 deletions

View File

@@ -103,6 +103,13 @@ jobs:
python -m pip install coverage
python -m pip install diff-cover
python -m pip install ${fd_wheel_url}
+if [ -d "test/plugins" ]; then
+    cd test/plugins
+    python setup.py install
+    cd ../..
+else
+    echo "Warning: test/plugins directory not found, skipping setup.py install"
+fi
export COVERAGE_FILE=/workspace/FastDeploy/coveragedata/.coverage
export COVERAGE_RCFILE=/workspace/FastDeploy/scripts/.coveragerc
TEST_EXIT_CODE=0

docs/features/plugins.md Normal file
View File

@@ -0,0 +1,85 @@
# FastDeploy Plugin Mechanism Documentation
FastDeploy supports a plugin mechanism that allows users to extend functionality without modifying the core code. Plugins are automatically discovered and loaded through Python's `entry_points` mechanism.
## How Plugins Work
Plugins are essentially registration functions that are automatically called when FastDeploy starts. The system uses the `load_plugins_by_group` function to ensure that all processes (including child processes in distributed training scenarios) have loaded the required plugins before official operations begin.
## Plugin Discovery Mechanism
FastDeploy uses Python's `entry_points` mechanism to discover and load plugins. Developers need to register their plugins in the specified entry point group in their project.
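Concretely, loading a plugin group boils down to discovering entry points and invoking whatever each one resolves to. A minimal sketch of that flow using only the standard `importlib.metadata` API (Python 3.10+); the production logic, with logging and `FD_PLUGINS` filtering, lives in `fastdeploy/plugins/utils.py`:
```python
from importlib.metadata import entry_points

# Find every plugin registered under the group and run its hook.
for ep in entry_points(group="fastdeploy.model_register_plugins"):
    register_func = ep.load()  # resolves the "module_name:function_name" target
    register_func()            # executes the plugin's registration logic
```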
### Example: Creating a Plugin
#### 1. Write the Plugin Logic
Assume you have a custom model class `MyModelForCasualLM` and a pretrained class `MyPretrainedModel`; you can then write the following registration function:
```python
# File: fd_add_dummy_model/__init__.py or fd_add_dummy_model/register.py
from fastdeploy import ModelRegistry

from my_custom_model import MyModelForCasualLM, MyPretrainedModel


def register():
    if "MyModelForCasualLM" not in ModelRegistry.get_supported_archs():
        ModelRegistry.register_model_class(MyModelForCasualLM)
        ModelRegistry.register_pretrained_model(MyPretrainedModel)
```
#### 2. Register Plugin in `setup.py`
```python
# setup.py
from setuptools import setup

setup(
    name="fastdeploy-plugins",
    version="0.1",
    packages=["fd_add_dummy_model"],
    entry_points={
        "fastdeploy.model_register_plugins": [
            "fd_add_dummy_model = fd_add_dummy_model:register",
        ],
    },
)
```
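After installing the plugin package (e.g., `pip install .` in the plugin project), you can verify the entry point is discoverable before launching FastDeploy. A quick check using only the standard library:
```python
from importlib.metadata import entry_points

eps = entry_points(group="fastdeploy.model_register_plugins")
print([(ep.name, ep.value) for ep in eps])
# Should include: ('fd_add_dummy_model', 'fd_add_dummy_model:register')
```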
## Plugin Structure
Plugins consist of three components:

| Component | Description |
|-----------|-------------|
| **Plugin Group** | The functional group to which the plugin belongs, for example:<br> - `fastdeploy.model_register_plugins`: for model registration<br> - `fastdeploy.model_runner_plugins`: for model runner registration<br> Users can define custom groups as needed. |
| **Plugin Name** | The unique identifier of each plugin (e.g., `fd_add_dummy_model`); the `FD_PLUGINS` environment variable matches against this name to decide whether the plugin is loaded. |
| **Plugin Value** | Formatted as `module_name:function_name`, pointing to the entry function that performs the registration logic. |
## Controlling Plugin Loading Behavior
By default, FastDeploy loads all registered plugins. To load only specific plugins, you can set the environment variable:
```bash
export FD_PLUGINS=fd_add_dummy_model
```
Multiple plugin names can be separated by commas:
```bash
export FD_PLUGINS=plugin_a,plugin_b
```
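The names are matched against entry-point names (the plugin name column above), not pip package names. A sketch of the filtering rule, condensed from the `envs.py` and `plugins/utils.py` changes in this commit:
```python
import os

# FD_PLUGINS unset -> None -> no filter; otherwise a list of allowed names.
allowed = os.environ["FD_PLUGINS"].split(",") if "FD_PLUGINS" in os.environ else None

def should_load(plugin_name: str) -> bool:
    return allowed is None or plugin_name in allowed
```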
## Reference Example
Please refer to the example plugin implementation in the project directory:
```
./test/plugins/
```
It contains a complete plugin structure and `setup.py` configuration example.
## Summary
Through the plugin mechanism, users can easily add custom models or functional modules to FastDeploy without modifying the core source code. This not only enhances system extensibility but also facilitates third-party developers in extending functionality.
For further plugin development, please refer to the `model_registry` and `plugin_loader` modules in the FastDeploy source code.

View File

@@ -0,0 +1,85 @@
# FastDeploy Plugin Mechanism Documentation
FastDeploy supports a plugin mechanism that allows users to extend functionality without modifying the core code. Plugins are discovered and loaded automatically through Python's `entry_points` mechanism.
## How Plugins Work
Plugins are essentially registration functions that are called automatically when FastDeploy starts. The system uses the `load_plugins_by_group` function to ensure that all processes (including child processes in distributed training scenarios) have loaded the required plugins before execution formally begins.
## Plugin Discovery Mechanism
FastDeploy uses Python's `entry_points` mechanism to discover and load plugins. Developers need to register their plugins under the designated entry point group in their own project.
### Example: Creating a Plugin
#### 1. Write the Plugin Logic
Suppose you have a custom model class `MyModelForCasualLM` and a pretrained class `MyPretrainedModel`; you can write the following registration function:
```python
# File: fd_add_dummy_model/__init__.py
from fastdeploy import ModelRegistry

from my_custom_model import MyModelForCasualLM, MyPretrainedModel


def register():
    if "MyModelForCasualLM" not in ModelRegistry.get_supported_archs():
        ModelRegistry.register_model_class(MyModelForCasualLM)
        ModelRegistry.register_pretrained_model(MyPretrainedModel)
```
#### 2. Register the Plugin in `setup.py`
```python
# setup.py
from setuptools import setup

setup(
    name="fastdeploy-plugins",
    version="0.1",
    packages=["fd_add_dummy_model"],
    entry_points={
        "fastdeploy.model_register_plugins": [
            "fd_add_dummy_model = fd_add_dummy_model:register",
        ],
    },
)
```
## Plugin Structure
Plugins consist of three components:

| Component | Description |
|-----------|-------------|
| **Plugin Group** | The functional group the plugin belongs to, for example:<br> - `fastdeploy.model_register_plugins`: for registering models<br> - `fastdeploy.model_runner_plugins`: for registering model runners<br> Users can define custom groups as needed. |
| **Plugin Name** | The unique identifier of each plugin (e.g., `fd_add_dummy_model`); the `FD_PLUGINS` environment variable matches against this name to decide whether the plugin is loaded. |
| **Plugin Value** | Formatted as `module_name:function_name`, pointing to the entry function that actually performs the registration logic. |
## Controlling Plugin Loading Behavior
By default, FastDeploy loads all registered plugins. To load only specific plugins, set the environment variable:
```bash
export FD_PLUGINS=fd_add_dummy_model
```
Multiple plugin names can be separated by commas:
```bash
export FD_PLUGINS=plugin_a,plugin_b
```
## Reference Example
See the example plugin implementation in the project directory:
```
./test/plugins/
```
It contains a complete plugin structure and a `setup.py` configuration example.
## Summary
Through the plugin mechanism, users can easily add custom models or functional modules to FastDeploy without modifying the core source code. This not only improves the system's extensibility but also makes it easier for third-party developers to extend functionality.
For further plugin development, refer to the `model_registry` and `plugin_loader` modules in the FastDeploy source code.

View File

@@ -22,11 +22,10 @@ import sys
os.environ["GLOG_minloglevel"] = "2" os.environ["GLOG_minloglevel"] = "2"
# suppress log from aistudio # suppress log from aistudio
os.environ["AISTUDIO_LOG"] = "critical" os.environ["AISTUDIO_LOG"] = "critical"
import typing
from fastdeploy.engine.sampling_params import SamplingParams from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM from fastdeploy.entrypoints.llm import LLM
from fastdeploy.utils import version
__all__ = ["LLM", "SamplingParams", "version"]
try: try:
import use_triton_in_paddle import use_triton_in_paddle
@@ -86,3 +85,27 @@ def _patch_fastsafetensors():
_patch_fastsafetensors()
+MODULE_ATTRS = {"ModelRegistry": ".model_executor.models.model_base:ModelRegistry", "version": ".utils:version"}
+if typing.TYPE_CHECKING:
+    from fastdeploy.model_executor.models.model_base import ModelRegistry
+else:
+
+    def __getattr__(name: str) -> typing.Any:
+        from importlib import import_module
+
+        if name in MODULE_ATTRS:
+            try:
+                module_name, attr_name = MODULE_ATTRS[name].split(":")
+                module = import_module(module_name, __package__)
+                return getattr(module, attr_name)
+            except ModuleNotFoundError:
+                print(f"Module {MODULE_ATTRS[name]} not found.")
+        else:
+            print(f"module {__package__} has no attribute {name}")
+
+__all__ = ["LLM", "SamplingParams", "ModelRegistry", "version"]
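With this hook, `fastdeploy.ModelRegistry` is resolved through `__getattr__` on first access rather than being imported eagerly at package-import time. A short illustration (sketch):
```python
import fastdeploy

# The attribute lookup below triggers __getattr__, which imports
# fastdeploy.model_executor.models.model_base lazily and returns the class.
ModelRegistry = fastdeploy.ModelRegistry
print(ModelRegistry.get_supported_archs())
```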

View File

@@ -80,6 +80,8 @@ environment_variables: dict[str, Callable[[], Any]] = {
"EXPORTER_OTLP_HEADERS": lambda: os.getenv("EXPORTER_OTLP_HEADERS"), "EXPORTER_OTLP_HEADERS": lambda: os.getenv("EXPORTER_OTLP_HEADERS"),
# enable kv cache block scheduler v1 (no need for kv_cache_ratio) # enable kv cache block scheduler v1 (no need for kv_cache_ratio)
"ENABLE_V1_KVCACHE_SCHEDULER": lambda: int(os.getenv("ENABLE_V1_KVCACHE_SCHEDULER", "0")), "ENABLE_V1_KVCACHE_SCHEDULER": lambda: int(os.getenv("ENABLE_V1_KVCACHE_SCHEDULER", "0")),
# Whether to use PLUGINS.
"FD_PLUGINS": lambda: None if "FD_PLUGINS" not in os.environ else os.environ["FD_PLUGINS"].split(","),
} }

View File

@@ -24,7 +24,6 @@ from fastdeploy.model_executor.load_weight_utils import (
    measure_time,
)
from fastdeploy.model_executor.model_loader.base_loader import BaseModelLoader
-from fastdeploy.model_executor.model_loader.utils import get_pretrain_cls
from fastdeploy.model_executor.models.model_base import ModelRegistry
from fastdeploy.platforms import current_platform
@@ -52,7 +51,7 @@ class DefaultModelLoader(BaseModelLoader):
    @measure_time
    def load_weights(self, model, fd_config: FDConfig, architectures: str) -> None:
-        model_class = get_pretrain_cls(architectures)
+        model_class = ModelRegistry.get_pretrain_cls(architectures)
        state_dict = load_composite_checkpoint(
            fd_config.model_config.model,
            model_class,

View File

@@ -1,43 +0,0 @@
"""
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
from paddleformers.transformers import PretrainedModel

from fastdeploy.model_executor.models.deepseek_v3 import DeepSeekV3PretrainedModel
from fastdeploy.model_executor.models.ernie4_5_moe import Ernie4_5_PretrainedModel
from fastdeploy.model_executor.models.ernie4_5_mtp import Ernie4_5_MTPPretrainedModel
from fastdeploy.model_executor.models.ernie4_5_vl.ernie4_5_vl_moe import (
    Ernie4_5_VLPretrainedModel,
)
from fastdeploy.model_executor.models.qwen2 import Qwen2PretrainedModel
from fastdeploy.model_executor.models.qwen3 import Qwen3PretrainedModel
from fastdeploy.model_executor.models.qwen3moe import Qwen3MoePretrainedModel

MODEL_CLASSES = {
    "Ernie4_5_MoeForCausalLM": Ernie4_5_PretrainedModel,
    "Ernie4_5_MTPForCausalLM": Ernie4_5_MTPPretrainedModel,
    "Qwen2ForCausalLM": Qwen2PretrainedModel,
    "Qwen3ForCausalLM": Qwen3PretrainedModel,
    "Qwen3MoeForCausalLM": Qwen3MoePretrainedModel,
    "Ernie4_5_ForCausalLM": Ernie4_5_PretrainedModel,
    "DeepseekV3ForCausalLM": DeepSeekV3PretrainedModel,
    "Ernie4_5_VLMoeForConditionalGeneration": Ernie4_5_VLPretrainedModel,
}


def get_pretrain_cls(architectures: str) -> PretrainedModel:
    """get_pretrain_cls"""
    return MODEL_CLASSES[architectures]

View File

@@ -19,6 +19,8 @@ import inspect
import os
from pathlib import Path

+from paddleformers.transformers import PretrainedModel
+
from .model_base import ModelForCasualLM, ModelRegistry
@@ -44,7 +46,14 @@ def auto_models_registry(dir_path, register_path="fastdeploy.model_executor.mode
        for attr_name in dir(module):
            attr = getattr(module, attr_name)
            if inspect.isclass(attr) and issubclass(attr, ModelForCasualLM) and attr is not ModelForCasualLM:
-                ModelRegistry.register(attr)
+                ModelRegistry.register_model_class(attr)
+            if (
+                inspect.isclass(attr)
+                and issubclass(attr, PretrainedModel)
+                and attr is not PretrainedModel
+                and hasattr(attr, "arch_name")
+            ):
+                ModelRegistry.register_pretrained_model(attr)
    except ImportError:
        raise ImportError(f"{module_file=} import error")

View File

@@ -673,6 +673,10 @@ class DeepSeekV3PretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "DeepseekV3ForCausalLM"
@classmethod @classmethod
def _get_tensor_parallel_mappings(cls, config, is_split=True): def _get_tensor_parallel_mappings(cls, config, is_split=True):

View File

@@ -460,9 +460,9 @@ class Ernie4_5_ForCausalLM(Ernie4_5_MoeForCausalLM):
return "Ernie4_5_ForCausalLM" return "Ernie4_5_ForCausalLM"
class Ernie4_5_PretrainedModel(PretrainedModel): class Ernie4_5_MoePretrainedModel(PretrainedModel):
""" """
Ernie4_5_PretrainedModel Ernie4_5_MoePretrainedModel
""" """
config_class = FDConfig config_class = FDConfig
@@ -473,6 +473,10 @@ class Ernie4_5_PretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Ernie4_5_MoeForCausalLM"
weight_infos = [ weight_infos = [
WeightMeta( WeightMeta(
f".layers.{{{layerid.LAYER_ID}}}.self_attn.qkv_proj.weight", f".layers.{{{layerid.LAYER_ID}}}.self_attn.qkv_proj.weight",
@@ -594,3 +598,16 @@ class Ernie4_5_PretrainedModel(PretrainedModel):
                config.prefix_name,
            )
        return mappings
+
+
+class Ernie4_5_PretrainedModel(Ernie4_5_MoePretrainedModel):
+    """
+    Ernie4_5_PretrainedModel
+    """
+
+    @classmethod
+    def arch_name(cls):
+        """
+        Model Architecture Name
+        """
+        return "Ernie4_5_ForCausalLM"

View File

@@ -46,6 +46,10 @@ class Ernie4_5_MTPPretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Ernie4_5_MTPForCausalLM"
@classmethod @classmethod
def _get_tensor_parallel_mappings(cls, config, is_split=True): def _get_tensor_parallel_mappings(cls, config, is_split=True):
""" """

View File

@@ -605,7 +605,7 @@ class Ernie4_5_VLMoeForConditionalGeneration(ModelForCasualLM):
class Ernie4_5_VLPretrainedModel(PretrainedModel):
    """
-    Ernie4_5_PretrainedModel
+    Ernie4_5_MoePretrainedModel
    """

    config_class = FDConfig
@@ -616,6 +616,10 @@ class Ernie4_5_VLPretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Ernie4_5_VLMoeForConditionalGeneration"
from fastdeploy.model_executor.models.tp_utils import TensorSplitMode as tsm from fastdeploy.model_executor.models.tp_utils import TensorSplitMode as tsm
from fastdeploy.model_executor.models.utils import LayerIdPlaceholder as layerid from fastdeploy.model_executor.models.utils import LayerIdPlaceholder as layerid
from fastdeploy.model_executor.models.utils import WeightMeta from fastdeploy.model_executor.models.utils import WeightMeta

View File

@@ -20,6 +20,7 @@ from typing import Dict, Union
import numpy as np
import paddle
from paddle import nn
+from paddleformers.transformers import PretrainedModel


class ModelRegistry:
@@ -27,21 +28,46 @@ class ModelRegistry:
    Used to register and retrieve model classes.
    """

-    _registry = {}
+    _arch_to_model_cls = {}
+    _arch_to_pretrained_model_cls = {}

    @classmethod
-    def register(cls, model_class):
+    def register_model_class(cls, model_class):
        """register model class"""
        if issubclass(model_class, ModelForCasualLM) and model_class is not ModelForCasualLM:
-            cls._registry[model_class.name()] = model_class
+            cls._arch_to_model_cls[model_class.name()] = model_class
        return model_class

+    @classmethod
+    def register_pretrained_model(cls, pretrained_model):
+        """register pretrained model class"""
+        if (
+            issubclass(pretrained_model, PretrainedModel)
+            and pretrained_model is not PretrainedModel
+            and hasattr(pretrained_model, "arch_name")
+        ):
+            cls._arch_to_pretrained_model_cls[pretrained_model.arch_name()] = pretrained_model
+        return pretrained_model
+
+    @classmethod
+    def get_pretrain_cls(cls, architectures: str):
+        """get_pretrain_cls"""
+        return cls._arch_to_pretrained_model_cls[architectures]
+
    @classmethod
    def get_class(cls, name):
        """get model class"""
-        if name not in cls._registry:
+        if name not in cls._arch_to_model_cls:
            raise ValueError(f"Model '{name}' is not registered!")
-        return cls._registry[name]
+        return cls._arch_to_model_cls[name]

+    @classmethod
+    def get_supported_archs(cls):
+        assert len(cls._arch_to_model_cls) == len(
+            cls._arch_to_pretrained_model_cls
+        ), "model class / pretrained model registry num is not same"
+        return list(cls._arch_to_model_cls.keys())


class ModelForCasualLM(nn.Layer, ABC):
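The renamed registry now keeps two parallel maps keyed by architecture name. A minimal usage sketch of the new API, reusing the dummy classes from `test/plugins` further below:
```python
from fastdeploy.model_executor.models.model_base import ModelRegistry
from fd_add_dummy_model import MyModelForCasualLM, MyPretrainedModel  # test/plugins package

ModelRegistry.register_model_class(MyModelForCasualLM)      # keyed by MyModelForCasualLM.name()
ModelRegistry.register_pretrained_model(MyPretrainedModel)  # keyed by MyPretrainedModel.arch_name()

assert ModelRegistry.get_class("MyModelForCasualLM") is MyModelForCasualLM
assert ModelRegistry.get_pretrain_cls("MyModelForCasualLM") is MyPretrainedModel
```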

View File

@@ -355,6 +355,10 @@ class Qwen2PretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Qwen2ForCausalLM"
@classmethod @classmethod
def _get_tensor_parallel_mappings(cls, config: ModelConfig, is_split=True): def _get_tensor_parallel_mappings(cls, config: ModelConfig, is_split=True):

View File

@@ -334,6 +334,10 @@ class Qwen3PretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Qwen3ForCausalLM"
@classmethod @classmethod
def _get_tensor_parallel_mappings(cls, config, is_split=True): def _get_tensor_parallel_mappings(cls, config, is_split=True):

View File

@@ -324,6 +324,10 @@ class Qwen3MoePretrainedModel(PretrainedModel):
""" """
return None return None
@classmethod
def arch_name(self):
return "Qwen3MoeForCausalLM"
@classmethod @classmethod
def _get_tensor_parallel_mappings(cls, config, is_split=True): def _get_tensor_parallel_mappings(cls, config, is_split=True):
# TODO not support TP split now, next PR will support TP. # TODO not support TP split now, next PR will support TP.

View File

@@ -0,0 +1,20 @@
"""
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
from .model_register import load_model_register_plugins
from .model_runner import load_model_runner_plugins
__all__ = ["load_model_register_plugins", "load_model_runner_plugins"]

View File

@@ -0,0 +1,33 @@
"""
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
from fastdeploy.plugins.utils import load_plugins_by_group, plugins_loaded

# make sure one process only loads plugins once
PLUGINS_GROUP = "fastdeploy.model_register_plugins"


def load_model_register_plugins():
    """load_model_register_plugins"""
    global plugins_loaded
    if plugins_loaded:
        return
    plugins_loaded = True

    plugins = load_plugins_by_group(group=PLUGINS_GROUP)
    # general plugins, we only need to execute the loaded functions
    for func in plugins.values():
        func()

View File

@@ -0,0 +1,32 @@
"""
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
from fastdeploy.plugins.utils import load_plugins_by_group, plugins_loaded

# used for the model runner
PLUGINS_GROUP = "fastdeploy.model_runner_plugins"


def load_model_runner_plugins():
    """load_model_runner_plugins"""
    global plugins_loaded
    if plugins_loaded:
        return
    plugins_loaded = True

    plugins = load_plugins_by_group(group=PLUGINS_GROUP)
    assert len(plugins) == 1, "Only one plugin is allowed to be loaded."
    return next(iter(plugins.values()))
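Unlike the register group, this group must resolve to exactly one provider, and the provider's return value (rather than a side effect) is what the caller consumes. A hypothetical runner plugin matching the commented-out `model_runner = model_runner:get_runner` entry in `test/plugins/setup.py`:
```python
# Hypothetical module "model_runner"; the names below are illustrative only,
# not part of this commit.
def get_runner():
    from my_project.runner import MyModelRunner  # assumed custom runner class
    return MyModelRunner
```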

View File

@@ -0,0 +1,61 @@
"""
# Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
from typing import Any, Callable

from fastdeploy import envs
from fastdeploy.utils import llm_logger as logger

plugins_loaded = False


def load_plugins_by_group(group: str) -> dict[str, Callable[[], Any]]:
    import sys

    if sys.version_info < (3, 10):
        from importlib_metadata import entry_points
    else:
        from importlib.metadata import entry_points

    allowed_plugins = envs.FD_PLUGINS

    discovered_plugins = entry_points(group=group)
    if len(discovered_plugins) == 0:
        logger.info("No plugins for group %s found.", group)
        return {}

    logger.info("Available plugins for group %s:", group)
    for plugin in discovered_plugins:
        logger.info("- %s -> %s", plugin.name, plugin.value)

    if allowed_plugins is None:
        logger.info(
            "All plugins in this group will be loaded. " "You can set `FD_PLUGINS` to control which plugins to load."
        )

    plugins = dict[str, Callable[[], Any]]()
    for plugin in discovered_plugins:
        if allowed_plugins is None or plugin.name in allowed_plugins:
            if allowed_plugins is not None:
                logger.info("Loading plugin %s", plugin.name)
            try:
                func = plugin.load()
                plugins[plugin.name] = func
            except Exception:
                logger.exception("Failed to load plugin %s", plugin.name)
    return plugins
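For a registration-style group the returned callables are simply executed, which is exactly what `load_model_register_plugins` above does. A condensed usage sketch:
```python
from fastdeploy.plugins.utils import load_plugins_by_group

# Run every registration hook discovered under the group.
for name, register_func in load_plugins_by_group("fastdeploy.model_register_plugins").items():
    register_func()
```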

View File

@@ -22,7 +22,7 @@ from paddle import nn
from fastdeploy.config import FDConfig
from fastdeploy.model_executor.models.ernie4_5_moe import (
    Ernie4_5_MoeForCausalLM,
-    Ernie4_5_PretrainedModel,
+    Ernie4_5_MoePretrainedModel,
)
from fastdeploy.model_executor.models.ernie4_5_vl.ernie4_5_vl_moe import (
    Ernie4_5_VLMoeForConditionalGeneration,
@@ -126,7 +126,7 @@ class Ernie4_5_MoeForCausalLMRL(Ernie4_5_MoeForCausalLM, BaseRLModel):
    Ernie4_5_MoeForCausalLMRL
    """

-    _get_tensor_parallel_mappings = Ernie4_5_PretrainedModel._get_tensor_parallel_mappings
+    _get_tensor_parallel_mappings = Ernie4_5_MoePretrainedModel._get_tensor_parallel_mappings

    def __init__(self, fd_config: FDConfig):
        """

View File

@@ -748,4 +748,7 @@ def run_worker_proc() -> None:
if __name__ == "__main__":
+    from fastdeploy.plugins.model_register import load_model_register_plugins
+
+    load_model_register_plugins()
    run_worker_proc()

View File

@@ -0,0 +1,35 @@
from paddleformers.transformers import PretrainedModel

from fastdeploy import ModelRegistry
from fastdeploy.model_executor.models.model_base import ModelForCasualLM


class MyPretrainedModel(PretrainedModel):
    @classmethod
    def arch_name(cls):
        return "MyModelForCasualLM"


class MyModelForCasualLM(ModelForCasualLM):
    def __init__(self, fd_config):
        """
        Args:
            fd_config: Configurations for the LLM model.
        """
        super().__init__(fd_config)
        print("init done")

    @classmethod
    def name(cls):
        return "MyModelForCasualLM"

    def compute_logits(self, logits):
        logits[:, 0] += 1.0
        return logits


def register():
    if "MyModelForCasualLM" not in ModelRegistry.get_supported_archs():
        ModelRegistry.register_model_class(MyModelForCasualLM)
        ModelRegistry.register_pretrained_model(MyPretrainedModel)

test/plugins/setup.py Normal file
View File

@@ -0,0 +1,15 @@
from setuptools import setup

setup(
    name="fastdeploy-plugins",
    version="0.1",
    packages=["fd_add_dummy_model"],
    entry_points={
        "fastdeploy.model_register_plugins": [
            "fd_add_dummy_model = fd_add_dummy_model:register",
        ],
        # "fastdeploy.model_runner_plugins": [
        #     "model_runner = model_runner:get_runner"
        # ],
    },
)

View File

@@ -0,0 +1,32 @@
import unittest

from fastdeploy import ModelRegistry
from fastdeploy.plugins import load_model_register_plugins


class TestModelRegistryPlugins(unittest.TestCase):
    def test_plugin_registers_one_architecture(self):
        """Test that loading plugins registers exactly one new architecture."""
        initial_archs = set(ModelRegistry.get_supported_archs())
        print("Supported architectures before loading plugins:", sorted(initial_archs))

        # Load plugins
        load_model_register_plugins()

        final_archs = set(ModelRegistry.get_supported_archs())
        print("Supported architectures after loading plugins:", sorted(final_archs))

        added_archs = final_archs - initial_archs
        added_count = len(added_archs)

        # verify
        self.assertEqual(
            added_count,
            1,
            f"Expected exactly 1 new architecture to be registered by plugins, "
            f"but {added_count} were added: {sorted(added_archs)}",
        )


if __name__ == "__main__":
    unittest.main()