[Optimization] support mm prefill batch (#5313)

* support mm prefill batch

* update code

* update code

* update code

* update code

* fix encoder cache bug

* update code

* update code

* fix bug

* fix paddle ocr bug

* fix xpu bug

* update code
kevin
2025-12-11 22:21:14 +08:00
committed by GitHub
parent 7116982995
commit 954a145d57
14 changed files with 769 additions and 296 deletions


@@ -21,18 +21,6 @@ from fastdeploy.utils import get_logger
 logger = get_logger("prefix_cache_manager", "cache_manager.log")
-DISABLE_PREFIX_CACHE_MM_MODEL: set[str] = {
-    "Ernie5ForCausalLM",
-}
-def is_mm_model_disable_prefix_cache(model_config):
-    """
-    check if the model architecture is in DISABLE_PREFIX_CACHE_MM_MODEL
-    """
-    return model_config._architecture in DISABLE_PREFIX_CACHE_MM_MODEL
 class CacheStatus(Enum):
     """
     cache status enum class