[LLM] First commit the llm deployment code

2025-12-24 13:28:13 +08:00 · 2025-06-09 19:20:15 +08:00
parent 8513414112
commit 149c79699d
11814 changed files with 127294 additions and 1293102 deletions
--- a/docs/code_guide.md
+++ b/docs/code_guide.md
@@ -0,0 +1,22 @@
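+# Code Guide
+This document walks through the FastDeploy code layout directory by directory and describes what each part does.
+- custom_ops: C++ operators that FastDeploy uses to run large models. Operators for each hardware target live in the matching subdirectory (cpu_ops/gpu_ops), and the setup_*.py files in the root directory compile these C++ operators.
+- dockerfiles: Dockerfiles for the environment images that run FastDeploy.
+- docs: documentation for the FastDeploy codebase.
+- fastdeploy
+  - agent: scripts used to launch the large-model service
+  - engine: code for the classes that manage the overall large-model execution engine
+  - entrypoints: user-facing entry-point interfaces
+  - input: user input processing, including preprocessing, multimodal input handling, and tokenization
+  - metrics: monitoring of system metrics such as latency
+  - model_executor
+    -
+    - layers: layer modules used to build large-model networks
+    - model_runner: model inference execution module
+    - models: large-model classes built into FastDeploy
+    - ops: operator modules compiled from custom_ops and callable from Python; operators for each hardware platform live in the matching directory
+  - output: processing of large-model output
+  - platforms: platform modules for underlying hardware support
+  - scheduler: large-model request scheduling module
+- scripts: helper scripts for FastDeploy tasks such as building, running unit tests, and fixing code style
+- test: code used for the project's unit tests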