mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-12-24 13:28:13 +08:00
[Docx] add language (en/cn) switch links (#4470)
* add install docs * 修改文档 * 修改文档
This commit is contained in:
@@ -1,3 +1,5 @@
|
||||
[简体中文](../zh/features/prefix_caching.md)
|
||||
|
||||
# Prefix Caching
|
||||
|
||||
Prefix Caching is a technique to optimize the inference efficiency of generative models. Its core idea is to cache intermediate computation results (KV Cache) of input sequences, avoiding redundant computations and thereby accelerating response times for multiple requests sharing the same prefix.
|
||||
|
||||
Reference in New Issue
Block a user