mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-12-24 13:28:13 +08:00
[Optim] Remove limitation of number of kvcache blocks (#5612)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* [Optim] Remove limitation of number of kvcache blocks
* Update fastdeploy/envs.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update fastdeploy/worker/iluvatar_worker.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add docs
* Update fastdeploy/worker/worker_process.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix ci case
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@@ -88,4 +88,7 @@ environment_variables: dict[str, Callable[[], Any]] = {
# Threshold for consecutive errors when a cache_transfer_manager process is left over
"FD_CACHE_PROC_ERROR_COUNT": lambda: int(os.getenv("FD_CACHE_PROC_ERROR_COUNT", "10")),}
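# Upper limit on the number of KVCache blocks allocated. This variable caps how many blocks the engine allocates. The default value of -1 means no limit.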
"FD_MAX_KVCACHE_BLOCKS": lambda: int(os.getenv("FD_MAX_KVCACHE_BLOCKS", "-1")),
```
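The documented semantics of `FD_MAX_KVCACHE_BLOCKS` (a cap on allocated blocks, with -1 meaning no limit) can be illustrated with a minimal sketch. This is not the FastDeploy implementation: the helper names `read_max_kvcache_blocks` and `cap_num_blocks` are hypothetical, and treating every negative value like -1 is an assumption made here for simplicity.

```python
import os
from typing import Mapping


def read_max_kvcache_blocks(environ: Mapping[str, str] = os.environ) -> int:
    """Read the cap the same way the documented lambda does (default "-1")."""
    return int(environ.get("FD_MAX_KVCACHE_BLOCKS", "-1"))


def cap_num_blocks(available_blocks: int, max_blocks: int) -> int:
    """Hypothetical helper: apply the cap to a computed block count.

    Assumption: any negative value is treated as "no limit"; the docs
    only define this behavior for the default value -1.
    """
    if max_blocks < 0:
        return available_blocks
    return min(available_blocks, max_blocks)


if __name__ == "__main__":
    # With the variable unset, the computed block count passes through unchanged.
    print(cap_num_blocks(5000, read_max_kvcache_blocks({})))
    # With an explicit cap, the smaller of the two values is used.
    print(cap_num_blocks(5000, read_max_kvcache_blocks({"FD_MAX_KVCACHE_BLOCKS": "2048"})))
```

Passing the environment as a mapping keeps the sketch testable without mutating the real process environment.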