[Docx] add language (en/cn) switch links (#4470)

* add install docs

* update docs

* update docs
yangjianfengo1
2025-10-17 15:47:41 +08:00
committed by GitHub
parent a3e0a15495
commit ba5c2b7e37
106 changed files with 206 additions and 0 deletions


@@ -1,3 +1,5 @@
[简体中文](../zh/features/chunked_prefill.md)
# Chunked Prefill
Chunked Prefill splits each Prefill request into smaller chunks, which are then batched together with Decode requests. This better balances compute-intensive (Prefill) and memory-intensive (Decode) operations and improves GPU resource utilization. Because each step processes only one chunk of a prompt, the compute and memory cost per Prefill step is smaller, which lowers peak memory usage and helps avoid out-of-memory errors.
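
The batching idea can be illustrated with a minimal sketch. This is not FastDeploy's scheduler: the `Request` class, the `schedule_step` function, and the `token_budget` parameter are hypothetical names introduced only for illustration. It assumes a fixed per-step token budget, gives each Decode request one token first, and fills whatever budget remains with Prefill chunks.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; not a FastDeploy type."""
    name: str
    prompt_len: int      # total prompt tokens that must be prefilled
    prefilled: int = 0   # prompt tokens already processed

    @property
    def in_decode(self) -> bool:
        # A request starts decoding once its whole prompt is prefilled.
        return self.prefilled >= self.prompt_len


def schedule_step(requests, token_budget):
    """Plan one forward pass as a list of (name, phase, num_tokens)."""
    plan = []
    budget = token_budget
    # Decode requests go first; each one consumes a single token slot.
    for req in requests:
        if req.in_decode and budget > 0:
            plan.append((req.name, "decode", 1))
            budget -= 1
    # The leftover budget is filled with chunks of pending prompts.
    for req in requests:
        if budget == 0:
            break
        if not req.in_decode:
            chunk = min(budget, req.prompt_len - req.prefilled)
            plan.append((req.name, "prefill", chunk))
            req.prefilled += chunk
            budget -= chunk
    return plan


if __name__ == "__main__":
    reqs = [Request("a", prompt_len=10, prefilled=10), Request("b", prompt_len=20)]
    for step in range(4):
        print(step, schedule_step(reqs, token_budget=8))
```

With a budget of 8 tokens, request `b`'s 20-token prompt is spread over three steps while request `a` keeps decoding in every step, so no single step has to process the whole prompt at once.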