[Optim] Remove limitation of number of kvcache blocks (#5612)

* [Optim] Remove limitation of number of kvcache blocks

* Update fastdeploy/envs.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/worker/iluvatar_worker.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Add docs

* Update fastdeploy/worker/worker_process.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix ci case

---------

Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This commit is contained in:
Jiang-Jia-Jun
2025-12-23 11:18:29 +08:00
committed by GitHub
parent 4a74f5ab9b
commit 9da89a374b
7 changed files with 20 additions and 12 deletions


@@ -879,7 +879,7 @@ def test_structured_outputs_grammar(openai_client):
 def test_profile_reset_block_num():
     """Test the profile reset_block_num feature; the diff from the baseline must not exceed 5%."""
     log_file = "./log/config.log"
-    baseline = 40000
+    baseline = 65565
     if not os.path.exists(log_file):
         pytest.fail(f"Log file not found: {log_file}")


@@ -636,7 +636,7 @@ def test_chat_with_reasoning_max_tokens(openai_client):
 def test_profile_reset_block_num():
     """Test the profile reset_block_num feature; the diff from the baseline must not exceed 5%."""
     log_file = "./log/config.log"
-    baseline = 40000
+    baseline = 65565
     if not os.path.exists(log_file):
         pytest.fail(f"Log file not found: {log_file}")
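
The hunks above stop before the comparison itself, so the remainder of the test body is not shown here. As a rough illustration of the 5% check the docstring describes, the following is a minimal sketch that assumes the measured block count has already been parsed from the log. The helper name `within_tolerance` and its parameters are hypothetical and are not taken from the FastDeploy test suite.

```python
def within_tolerance(actual: int, baseline: int, tolerance: float = 0.05) -> bool:
    """Return True when `actual` deviates from `baseline` by at most `tolerance`.

    Hypothetical helper for illustration only; the real test may compute
    the difference differently.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    # Relative difference, measured against the baseline value.
    relative_diff = abs(actual - baseline) / baseline
    return relative_diff <= tolerance


if __name__ == "__main__":
    new_baseline = 65565  # value introduced by this commit
    old_baseline = 40000  # value removed by this commit
    # The old baseline is far outside a 5% window around the new one,
    # which is why the expected value in the test had to be updated.
    print(within_tolerance(old_baseline, new_baseline))
```

Measuring the difference relative to the baseline rather than as an absolute count keeps the tolerance meaningful when the expected number of blocks changes, as it does in this commit.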