mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-10-06 17:17:14 +08:00
Graceful shut down (#3785)
* feat(log):add_request_and_response_log
* Graceful shutdown: add a shutdown-duration parameter to the API
71
docs/best_practices/graceful_shutdown_service.md
Normal file
@@ -0,0 +1,71 @@
# Graceful Service Node Shutdown Solution

## 1. Core Objective

Shut down service nodes gracefully so that no in-flight user requests are lost when a node stops, while keeping the cluster as a whole available.

## 2. Solution Overview

The solution relies on four components working together: an **Nginx reverse proxy**, a **Gunicorn server**, a **Uvicorn server**, and **FastAPI**.

![graceful_shutdown](images/graceful_shutdown.png)

## 3. Component Introduction

### 1. Nginx: Traffic Entry Point and Load Balancer

- **Functions**:
  - Acts as a reverse proxy: receives all external client requests and distributes them to the upstream Gunicorn worker nodes according to the load-balancing policy.
  - Monitors the health of backend nodes through its health check mechanism.
  - Removes a problematic node from the service pool through a configuration change, so that traffic is switched away from it immediately.
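
As an illustration of the configuration-driven removal described above, the following Python sketch marks one server in an `upstream` block as `down`, the Nginx parameter for taking a server out of rotation. The helper name `mark_server_down`, the sample addresses, and the idea of editing the configuration programmatically are assumptions made for this example and are not part of FastDeploy. After the result is written back, the configuration still has to be reloaded (e.g., `nginx -s reload`).

```python
import re

# Hypothetical upstream block used only for this example.
SAMPLE_CONF = """upstream backend {
    server 10.0.0.1:8000;
    server 10.0.0.2:8000 weight=2;
}
"""


def mark_server_down(conf_text: str, server_addr: str) -> str:
    """Return conf_text with the matching upstream `server` line marked `down`."""
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)server[ \t]+"
        + re.escape(server_addr)
        + r"(?=[\s;])(?P<params>[^;]*);",
        re.MULTILINE,
    )

    def _add_down(match: re.Match) -> str:
        params = match.group("params")
        if re.search(r"\bdown\b", params):
            return match.group(0)  # already marked, leave the line untouched
        return f"{match.group('indent')}server {server_addr}{params} down;"

    return pattern.sub(_add_down, conf_text)
```

Calling `mark_server_down(SAMPLE_CONF, "10.0.0.2:8000")` rewrites only the second `server` line and keeps its existing parameters; calling it again on the result changes nothing, which makes the step safe to repeat.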

### 2. Gunicorn: WSGI HTTP Server (Process Manager)

- **Functions**:
  - Serves as the master process, managing multiple Uvicorn worker child processes.
  - Receives external signals (e.g., `SIGTERM`) and coordinates the graceful shutdown of all child processes.
  - Supervises the worker processes and automatically restarts any worker that exits abnormally, keeping the service robust.
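
Gunicorn reads its settings from a Python file, so the process-manager role described above can be sketched as a small `gunicorn.conf.py`. The setting names below (`bind`, `workers`, `worker_class`, `graceful_timeout`) are standard Gunicorn settings, and `uvicorn.workers.UvicornWorker` is the worker class that Uvicorn has shipped for running under Gunicorn. The concrete values are assumptions chosen for illustration, not values taken from FastDeploy.

```python
# gunicorn.conf.py: a minimal sketch, all values are illustrative assumptions.

# Address the master process listens on (hypothetical port).
bind = "0.0.0.0:8000"

# Number of Uvicorn worker processes managed by the Gunicorn master.
workers = 4

# Run each worker as an ASGI server so it can host a FastAPI application.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds a worker gets to finish in-flight requests after the master
# receives a stop signal, before the master kills it.
graceful_timeout = 30
```

Such a file would typically be passed with `gunicorn -c gunicorn.conf.py <module>:<app>`, where the module and application names depend on the deployment.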

### 3. Uvicorn: ASGI Server (Worker Process)

- **Functions**:
  - Runs as a Gunicorn-managed worker and is the process that actually handles HTTP requests.
  - Runs the FastAPI application instance, which executes the business logic.
  - Implements the ASGI protocol, so requests are processed asynchronously for high performance.

---

## Advantages

1. **Nginx**:
   - Can quickly isolate faulty nodes, preserving overall service availability.
   - Allows configuration updates without downtime via `nginx -s reload`, so the change is transparent to users.

2. **Gunicorn** (compared with Uvicorn's native multi-worker mode):
   - **Mature Process Management**: Process spawning, recycling, and management are built in, so no custom implementation is needed.
   - **Process Supervision**: The Gunicorn master automatically forks a new worker when one crashes, whereas Uvicorn's `--workers` mode, at least in older releases, does not restart a crashed worker and relies on an external supervisor.
   - **Rich Configuration**: Offers many parameters for tuning timeouts, the number of workers, restart policies, and more.

3. **Uvicorn**:
   - Very fast, built on uvloop and httptools.
   - Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting.
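
The graceful shutdown behavior described in the last point (stop accepting new work, wait for in-flight requests, give up after a timeout) can be modeled with plain `asyncio`. This is a toy sketch under stated assumptions, not Uvicorn's implementation: the class `DrainingServer`, the `handle` coroutine, and the delays are all hypothetical names and values introduced for the example.

```python
import asyncio


class DrainingServer:
    """Toy model of a server that drains in-flight requests on shutdown."""

    def __init__(self) -> None:
        self.accepting = True
        self.in_flight: set[asyncio.Task] = set()

    def submit(self, coro) -> asyncio.Task:
        """Accept a request unless shutdown has started."""
        if not self.accepting:
            coro.close()  # avoid a "never awaited" warning
            raise RuntimeError("server is shutting down")
        task = asyncio.create_task(coro)
        self.in_flight.add(task)
        task.add_done_callback(self.in_flight.discard)
        return task

    async def shutdown(self, timeout: float) -> int:
        """Stop accepting, wait up to `timeout` seconds, cancel the rest.

        Returns the number of requests that had to be cancelled.
        """
        self.accepting = False
        if not self.in_flight:
            return 0
        _, pending = await asyncio.wait(set(self.in_flight), timeout=timeout)
        for task in pending:
            task.cancel()
        if pending:
            await asyncio.gather(*pending, return_exceptions=True)
        return len(pending)


async def handle(delay: float) -> str:
    """Stand-in for a request handler that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return "ok"


async def demo():
    server = DrainingServer()
    fast = server.submit(handle(0.01))  # finishes within the grace period
    slow = server.submit(handle(5))     # exceeds the grace period
    cancelled = await server.shutdown(timeout=0.1)
    return fast.result(), slow.cancelled(), cancelled


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ('ok', True, 1)
```

The sketch also shows why the grace period matters: a request that outlives it is cut off, so the timeout should be sized to the longest request the node is expected to serve.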

---

## Graceful Shutdown Procedure

To take a specific node offline, proceed as follows:

1. **Nginx Monitors Node Health Status**:
   - Nginx periodically sends health check requests to the node to monitor its health.

2. **Removal from Load Balancing**:
   - Mark the target node as `down` in the Nginx configuration, then reload the configuration.
   - From then on, no new requests are routed to the target node.

3. **Gunicorn Server**:
   - Gunicorn listens for stop signals. When it receives one (e.g., `SIGTERM`), it relays the signal to all Uvicorn child processes.

4. **Sending the Stop Signal**:
   - Send a `SIGTERM` signal to the Uvicorn process on the target node to trigger Uvicorn's graceful shutdown.

5. **Waiting for Request Processing**:
   - Wait slightly longer than `timeout_graceful_shutdown` before forcibly terminating the service, so that the node has enough time to finish every request it has already accepted.

6. **Shutdown Completion**:
   - The node has now processed all remaining requests and exited safely.
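
Steps 4 and 5 above can be sketched from the operator's side as "send `SIGTERM`, wait a little longer than the grace period, then force-kill". The function name `stop_gracefully` and the `margin` parameter are assumptions introduced for this example, and the sketch assumes a POSIX system and that the caller holds a `subprocess.Popen` handle for the server process.

```python
import signal
import subprocess


def stop_gracefully(
    proc: subprocess.Popen,
    timeout_graceful_shutdown: float,
    margin: float = 5.0,
) -> str:
    """Ask `proc` to stop and force-kill it only if it overruns the grace period.

    Returns "graceful" if the process exited on its own, "forced" otherwise.
    """
    # Step 4: trigger the server's graceful shutdown.
    proc.send_signal(signal.SIGTERM)
    try:
        # Step 5: wait slightly longer than the server's own timeout.
        proc.wait(timeout=timeout_graceful_shutdown + margin)
        return "graceful"
    except subprocess.TimeoutExpired:
        # The process is still alive after the grace period: terminate it.
        proc.kill()
        proc.wait()
        return "forced"
```

Waiting for `timeout_graceful_shutdown + margin` rather than exactly the timeout gives the server a chance to finish its own cleanup before the operator resorts to a forced kill.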
BIN
docs/best_practices/images/graceful_shutdown.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 180 KiB |
71
docs/zh/best_practices/graceful_shutdown_service.md
Normal file
@@ -0,0 +1,71 @@
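# Graceful Service Node Shutdown Solution

## 1. Core Objective

Shut down service nodes gracefully so that no user request being processed is lost when the service stops, and the availability of the whole cluster is not affected.

## 2. Solution Overview

The solution relies on an **Nginx reverse proxy**, a **Gunicorn server**, a **Uvicorn server**, and **FastAPI** working together.

![graceful_shutdown](images/graceful_shutdown.png)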
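
## 3. Component Introduction

### 1. Nginx: Traffic Entry Point and Load Balancer

- **Functions**:
  - Acts as a reverse proxy: receives all external client requests and distributes them to the upstream Gunicorn worker nodes according to the load-balancing policy.
  - Monitors the health of backend nodes through its health check mechanism.
  - Removes a problematic node from the service pool through a configuration change, so that traffic is switched away from it immediately.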
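
### 2. Gunicorn: WSGI HTTP Server (Process Manager)

- **Functions**:
  - Serves as the master process, managing multiple Uvicorn worker child processes.
  - Receives external signals (e.g., `SIGTERM`) and coordinates the graceful shutdown of all child processes.
  - Supervises the worker processes and automatically restarts any worker that exits abnormally, keeping the service robust.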

### 3. Uvicorn: ASGI Server (Worker Process)

- **Functions**:
  - Runs as a Gunicorn-managed worker and is the process that actually handles HTTP requests.
  - Runs the FastAPI application instance, which executes the business logic.
  - Implements the ASGI protocol, so requests are processed asynchronously for high performance.

---
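
## Advantages

1. **Nginx**:
   - Can quickly isolate faulty nodes, preserving overall service availability.
   - Allows configuration updates without downtime via `nginx -s reload`, so the change is transparent to users.

2. **Gunicorn** (compared with Uvicorn's native multi-worker mode):
   - **Mature Process Management**: Process spawning, recycling, and management are built in, so no custom implementation is needed.
   - **Process Supervision**: The Gunicorn master automatically forks a new worker after one exits abnormally, whereas in Uvicorn's `--workers` mode a crashed process is not brought back up and an external supervisor is required.
   - **Rich Configuration**: Offers many parameters for tuning timeouts, the number of workers, restart policies, and more.

3. **Uvicorn**:
   - Very fast, built on uvloop and httptools.
   - Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting.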

---

## Graceful Shutdown Procedure

To take a specific node offline, proceed as follows:

1. **Nginx Monitors Node Health Status**:
   - Nginx periodically sends health requests to the node to monitor its health.

2. **Removal from Load Balancing**:
   - Mark the node as `down` in the Nginx configuration, then reload the configuration.
   - From then on, no new requests are sent to the target node.

3. **Gunicorn Server**:
   - Gunicorn listens for stop signals. When it receives one (e.g., `SIGTERM`), it sends the signal to all Uvicorn child processes.

4. **Sending the Stop Signal**:
   - Send a `SIGTERM` signal to the Uvicorn process on the target node to trigger Uvicorn's graceful shutdown.

5. **Waiting for Request Processing**:
   - Wait slightly longer than `timeout_graceful_shutdown` before forcibly terminating the service, so that the node has enough time to finish every request it has already accepted.

6. **Shutdown Completion**:
   - The node has now processed all remaining requests and exited safely.
BIN
docs/zh/best_practices/images/graceful_shutdown.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 180 KiB |
@@ -77,9 +77,17 @@ parser.add_argument(
     help="max waiting time for connection, if set value -1 means no waiting time limit",
 )
 parser.add_argument("--max-concurrency", default=512, type=int, help="max concurrency")
+
 parser.add_argument(
     "--enable-mm-output", action="store_true", help="Enable 'multimodal_content' field in response output. "
 )
+parser.add_argument(
+    "--timeout-graceful-shutdown",
+    default=0,
+    type=int,
+    help="timeout for graceful shutdown in seconds (used by uvicorn)",
+)
+
 parser = EngineArgs.add_cli_args(parser)
 args = parser.parse_args()
 
@@ -431,6 +439,7 @@ def launch_api_server() -> None:
             workers=args.workers,
             log_config=UVICORN_CONFIG,
             log_level="info",
+            timeout_graceful_shutdown=args.timeout_graceful_shutdown,
         ) # set log level to error to avoid log
     except Exception as e:
         api_server_logger.error(f"launch sync http server error, {e}, {str(traceback.format_exc())}")
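
To make the effect of the two hunks above easier to see in isolation, here is a self-contained sketch of how the new flag is parsed and turned into the `timeout_graceful_shutdown` keyword argument. The flag name, default, type, and help text are copied from the diff; the helper names `build_parser` and `server_kwargs` are hypothetical and exist only for this example, which deliberately does not start a server.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser containing only the flag added by this commit."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--timeout-graceful-shutdown",
        default=0,
        type=int,
        help="timeout for graceful shutdown in seconds (used by uvicorn)",
    )
    return parser


def server_kwargs(argv: list[str]) -> dict:
    """Map parsed CLI arguments to the keyword argument passed to the server."""
    args = build_parser().parse_args(argv)
    # argparse converts the dashes in the flag name to underscores.
    return {"timeout_graceful_shutdown": args.timeout_graceful_shutdown}
```

For example, `server_kwargs(["--timeout-graceful-shutdown", "30"])` yields `{"timeout_graceful_shutdown": 30}`, and omitting the flag yields the default of `0`.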