Graceful shut down (#3785)

* feat(log):add_request_and_response_log

* Graceful shutdown: add a shutdown-timeout parameter to the API
This commit is contained in:
xiaolei373
2025-09-04 19:33:50 +08:00
committed by GitHub
parent 88d44a2c93
commit ed97cf8396
5 changed files with 151 additions and 0 deletions

@@ -0,0 +1,71 @@
# Graceful Service Node Shutdown Solution
## 1. Core Objective
Achieve graceful shutdown of service nodes, ensuring no in-flight user requests are lost during service termination while maintaining overall cluster availability.
## 2. Solution Overview
This solution combines an **Nginx reverse proxy**, a **Gunicorn server**, a **Uvicorn server**, and **FastAPI**, which work together to achieve the objective.
![graceful_shutdown](images/graceful_shutdown.png)
## 3. Component Introduction
### 1. Nginx: Traffic Entry Point and Load Balancer
- **Functions**:
- Acts as a reverse proxy, receiving all external client requests and distributing them to upstream Gunicorn worker nodes according to load balancing policies.
- Actively monitors backend node health status through health check mechanisms.
- Lets operators remove a problematic node from the service pool almost immediately through a configuration change, shifting traffic to the remaining nodes.
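
As an illustration of the configuration change described above, the following sketch appends Nginx's `down` parameter to one `server` line of an upstream block. The helper name `mark_server_down`, the upstream name `api_nodes`, and the addresses are hypothetical and not part of this commit; only the `server <address> [parameters];` syntax and the `down` parameter come from Nginx itself.

```python
import re


def mark_server_down(conf_text: str, address: str) -> str:
    """Append the `down` parameter to the upstream `server` line for `address`."""
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)server\s+"
        + re.escape(address)
        + r"(?P<params>(?:\s[^;]*)?);",
        re.MULTILINE,
    )

    def add_flag(match: re.Match) -> str:
        params = match.group("params")
        if re.search(r"\bdown\b", params):
            return match.group(0)  # already marked, leave the line unchanged
        return f"{match.group('indent')}server {address}{params} down;"

    return pattern.sub(add_flag, conf_text)


# Hypothetical upstream block used only for this example.
EXAMPLE_CONF = """\
upstream api_nodes {
    server 10.0.0.1:8000;
    server 10.0.0.2:8000 max_fails=3;
}
"""

updated = mark_server_down(EXAMPLE_CONF, "10.0.0.2:8000")
print(updated)
```

After writing the modified file, an operator would typically validate it with `nginx -t` and apply it with `nginx -s reload`.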
### 2. Gunicorn: WSGI HTTP Server (Process Manager)
- **Functions**:
- Serves as the master process, managing multiple Uvicorn worker child processes.
- Receives external signals (e.g., `SIGTERM`) and coordinates the graceful shutdown process for all child processes.
- Supervises worker processes and automatically restarts them upon abnormal termination, ensuring service robustness.
### 3. Uvicorn: ASGI Server (Worker Process)
- **Functions**:
- Runs as a Gunicorn-managed worker and does the actual HTTP request handling.
- Runs the FastAPI application instance, processing specific business logic.
- Implements the ASGI protocol, supporting asynchronous request processing for high performance.
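
To make the ASGI contract concrete, here is a minimal sketch of what a worker calls for each request. It uses a bare ASGI callable in place of FastAPI so that it runs with the standard library alone; `app` and `call_once` are illustrative names, and the fake request is driven by hand rather than by a real server.

```python
import asyncio


async def app(scope, receive, send):
    """Smallest useful ASGI HTTP application: always answers 200 'ok'."""
    if scope["type"] != "http":
        return  # ignore non-HTTP scopes in this sketch
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})


async def call_once(asgi_app):
    """Drive the app with one fake request and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/health", "headers": []}
    await asgi_app(scope, receive, send)
    return sent


messages = asyncio.run(call_once(app))
print(messages[0]["status"], messages[1]["body"])
```

A server such as Uvicorn invokes the application in the same way, once per request, which is why it can track which requests are still in flight when a shutdown signal arrives.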
---
## Advantages
1. **Nginx**:
- Can quickly isolate faulty nodes, ensuring overall service availability.
- Allows configuration updates without downtime using `nginx -s reload`, making it transparent to users.
2. **Gunicorn** (compared with Uvicorn's native multi-worker mode):
- **Mature Process Management**: Built-in logic for spawning, recycling, and managing processes, so none of it has to be implemented by hand.
- **Process Supervision**: The Gunicorn master automatically forks a new worker when one crashes, whereas in Uvicorn's `--workers` mode a crashed process is not restarted and an external supervisor is required.
- **Rich Configuration**: Offers numerous parameters for adjusting timeouts, number of workers, restart policies, etc.
3. **Uvicorn**:
- Extremely fast, built on uvloop and httptools.
- Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting.
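
The Gunicorn parameters mentioned above can be set in a Python configuration file. The sketch below is an assumption about how such a file might look, not something shipped in this commit: the values are placeholders, and it assumes the `uvicorn.workers.UvicornWorker` worker class is available in the installed Uvicorn version.

```python
# gunicorn.conf.py: illustrative values only, not taken from this commit
import multiprocessing

# Address the Gunicorn master listens on; Nginx forwards traffic here.
bind = "0.0.0.0:8000"

# Number of worker processes supervised by the master.
workers = max(2, multiprocessing.cpu_count() // 2)

# Run each worker as a Uvicorn ASGI worker instead of the default sync worker.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds a worker gets to finish in-flight requests after a stop signal.
graceful_timeout = 30

# Seconds of silence after which the master kills and restarts a worker.
timeout = 120
```

Such a file would be passed with `gunicorn -c gunicorn.conf.py <module>:<app>`, where the module and application names depend on the deployment.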
---
## Graceful Shutdown Procedure
When a specific node needs to be taken offline, the steps are as follows:
1. **Nginx Monitors Node Health Status**:
- Monitors the node's health status by periodically sending health check requests to it.
2. **Removal from Load Balancing**:
- Modify the Nginx configuration to mark the target node as `down` and reload the Nginx configuration.
- Subsequently, all new requests will no longer be sent to the target node.
3. **Gunicorn Server**:
- Listens for stop signals. When it receives one (e.g., `SIGTERM`), it relays the signal to all Uvicorn child processes.
4. **Sending the Stop Signal**:
- Send a `SIGTERM` signal to the Uvicorn process on the target node, triggering Uvicorn's graceful shutdown process.
5. **Waiting for Request Processing**:
- Wait slightly longer than `timeout_graceful_shutdown` before forcefully terminating the service, so the node has enough time to finish every request it has already accepted.
6. **Shutdown Completion**:
- The node has now processed all remaining requests and exited safely.
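
Steps 4 to 6 can be sketched as a small operator script: send `SIGTERM`, wait a little longer than the graceful-shutdown timeout, and force-kill only if the process is still alive. The function name `stop_gracefully`, the `margin` parameter, and the sleeping child process that stands in for the server are all assumptions made for this example.

```python
import subprocess
import sys


def stop_gracefully(
    proc: subprocess.Popen,
    timeout_graceful_shutdown: float,
    margin: float = 5.0,
) -> bool:
    """Ask `proc` to stop, wait, and force-kill as a last resort.

    Returns True if the process exited within the grace period,
    False if it had to be killed.
    """
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=timeout_graceful_shutdown + margin)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX systems
        proc.wait()
        return False


# A sleeping child process stands in for the real server in this sketch.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exited_in_time = stop_gracefully(child, timeout_graceful_shutdown=2, margin=1)
print(exited_in_time, child.returncode)
```

In a real deployment the script would run only after Nginx has stopped routing new requests to the node, as described in step 2.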

Binary file not shown. Size: 180 KiB

@@ -0,0 +1,71 @@
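# Graceful Service Node Shutdown Solution
## 1. Core Objective
Shut down service nodes gracefully, so that no in-flight user request is lost when a service stops and the availability of the overall cluster is unaffected.
## 2. Solution Overview
The solution achieves this objective by combining an **Nginx reverse proxy**, a **Gunicorn server**, a **Uvicorn server**, and **FastAPI**.
![graceful_shutdown](images/graceful_shutdown.png)
## 3. Component Introduction
### 1. Nginx: Traffic Entry Point and Load Balancer
- **Functions**:
- Acts as a reverse proxy, receiving all external client requests and distributing them to the upstream Gunicorn worker nodes according to the load balancing policy.
- Actively monitors the health of backend nodes through health checks.
- Lets operators remove a problematic node from the service pool almost immediately through a configuration change, shifting traffic to the remaining nodes.
### 2. Gunicorn: WSGI HTTP Server (Process Manager)
- **Functions**:
- Serves as the master process, managing multiple Uvicorn worker child processes.
- Receives external signals (e.g., `SIGTERM`) and coordinates the graceful shutdown of all child processes.
- Supervises worker processes and automatically restarts them when they exit abnormally, ensuring service robustness.
### 3. Uvicorn: ASGI Server (Worker Process)
- **Functions**:
- Runs as a Gunicorn-managed worker and does the actual HTTP request handling.
- Runs the FastAPI application instance, processing the specific business logic.
- Implements the ASGI protocol, supporting asynchronous request processing for high performance.
---
## Advantages
1. **Nginx**:
- Can quickly isolate faulty nodes, ensuring overall service availability.
- Allows configuration updates without downtime using `nginx -s reload`, making them transparent to users.
2. **Gunicorn** (compared with Uvicorn's native multi-worker mode):
- **Mature Process Management**: Built-in logic for spawning, recycling, and managing processes, so none of it has to be implemented by hand.
- **Process Supervision**: The Gunicorn master automatically forks a new worker when one exits abnormally, whereas in Uvicorn's `--workers` mode a crashed process is not restarted and an external supervisor is required.
- **Rich Configuration**: Offers numerous parameters for adjusting timeouts, the number of workers, restart policies, etc.
3. **Uvicorn**:
- Extremely fast, built on uvloop and httptools.
- Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting.
---
## Graceful Shutdown Procedure
When a specific node needs to be taken offline, the steps are as follows:
1. **Nginx Monitors Node Health Status**:
- Monitors the node's health by periodically sending health check requests to it.
2. **Removal from Load Balancing**:
- Modify the Nginx configuration to mark the node as `down` and reload the Nginx configuration.
- From then on, no new requests are sent to the target node.
3. **Gunicorn Server**:
- Listens for stop signals. When it receives one (e.g., `SIGTERM`), it relays the signal to all Uvicorn child processes.
4. **Sending the Stop Signal**:
- Send a `SIGTERM` signal to the Uvicorn process on the target node, triggering Uvicorn's graceful shutdown process.
5. **Waiting for Request Processing**:
- Wait slightly longer than `timeout_graceful_shutdown` before forcefully terminating the service, so the node has enough time to finish every request it has already accepted.
6. **Shutdown Completion**:
- The node has now processed all remaining requests and exited safely.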

Binary file not shown. Size: 180 KiB

@@ -77,9 +77,17 @@ parser.add_argument(
     help="max waiting time for connection, if set value -1 means no waiting time limit",
 )
 parser.add_argument("--max-concurrency", default=512, type=int, help="max concurrency")
 parser.add_argument(
     "--enable-mm-output", action="store_true", help="Enable 'multimodal_content' field in response output. "
 )
+parser.add_argument(
+    "--timeout-graceful-shutdown",
+    default=0,
+    type=int,
+    help="timeout for graceful shutdown in seconds (used by uvicorn)",
+)
 parser = EngineArgs.add_cli_args(parser)
 args = parser.parse_args()
@@ -431,6 +439,7 @@ def launch_api_server() -> None:
             workers=args.workers,
             log_config=UVICORN_CONFIG,
             log_level="info",
+            timeout_graceful_shutdown=args.timeout_graceful_shutdown,
         )  # set log level to error to avoid log
     except Exception as e:
         api_server_logger.error(f"launch sync http server error, {e}, {str(traceback.format_exc())}")
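
The new `timeout_graceful_shutdown` argument bounds how long in-flight work may continue after a stop signal. The sketch below imitates that idea with the standard library only. It is not Uvicorn's implementation, and `handle_request` and `drain` are hypothetical names.

```python
import asyncio


async def handle_request(seconds: float) -> str:
    """Stand-in for a request handler that takes `seconds` to finish."""
    await asyncio.sleep(seconds)
    return "done"


async def drain(tasks, timeout):
    """Wait up to `timeout` seconds for in-flight tasks, then cancel the rest."""
    done, pending = await asyncio.wait(tasks, timeout=timeout)
    for task in pending:
        task.cancel()
    # Let the cancelled tasks unwind before reporting.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def main():
    # Two short requests and one that outlives the grace period.
    in_flight = [asyncio.create_task(handle_request(s)) for s in (0.05, 0.1, 5.0)]
    return await drain(in_flight, timeout=0.5)


finished, cancelled = asyncio.run(main())
print(finished, cancelled)
```

In this sketch a timeout of 0 would cancel every unfinished task almost immediately. Whether the flag's default of 0 has the same effect depends on how the server applies the value, which may be worth confirming before relying on the default.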