[Others] api server exits when worker process is dead (#3271)

* [fix] fix terminal hangs when worker process is dead

* [chore] change sleep time of monitor

* [chore] remove redundant comments
李泳桦
2025-10-27 10:23:48 +08:00
committed by GitHub
parent ebae69b1f8
commit cdc40cdc2a
2 changed files with 26 additions and 0 deletions
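
The commit message describes a monitor that notices when the worker process has died so the API server can exit instead of hanging. The added lines themselves are not visible in the hunks shown here, so the following is only a minimal sketch of that general pattern, not the code from this commit. The names `monitor_worker`, `start_monitor`, `on_dead`, and `poll_interval` are hypothetical, and the worker is assumed to be a `subprocess.Popen` object.

```python
import subprocess
import sys
import threading
import time


def monitor_worker(worker_proc, on_dead, poll_interval=0.1):
    """Poll worker_proc until it exits, then call on_dead(returncode).

    Hypothetical helper: worker_proc is assumed to be a subprocess.Popen.
    """
    while True:
        returncode = worker_proc.poll()  # None while the process is still alive
        if returncode is not None:
            on_dead(returncode)
            return
        time.sleep(poll_interval)  # the monitor's sleep time between checks


def start_monitor(worker_proc, on_dead, poll_interval=0.1):
    """Run monitor_worker in a daemon thread so it never blocks interpreter exit."""
    thread = threading.Thread(
        target=monitor_worker,
        args=(worker_proc, on_dead, poll_interval),
        daemon=True,
    )
    thread.start()
    return thread


if __name__ == "__main__":
    # Demo: a stand-in "worker" that exits immediately with a nonzero status.
    proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
    monitor = start_monitor(
        proc, lambda rc: print(f"worker exited with code {rc}"), poll_interval=0.05
    )
    monitor.join(timeout=10)
```

In a real server the `on_dead` callback would presumably trigger the existing shutdown path rather than print a message. The poll interval trades detection latency against idle wake-ups.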

@@ -376,6 +376,7 @@ class LLMEngine:
        exit sub services
        """
        self.running = False
        llm_logger.info("Engine shut down, exiting sub services...")
        if hasattr(self, "cache_manager_processes"):
            self.engine.resource_manager.cache_manager.shm_cache_task_flag_broadcast.clear()
@@ -394,6 +395,7 @@ class LLMEngine:
        if hasattr(self, "get_profile_block_num_signal"):
            self.get_profile_block_num_signal.clear()
        if hasattr(self, "worker_proc") and self.worker_proc is not None:
            try:
                pgid = os.getpgid(self.worker_proc.pid)
@@ -403,6 +405,7 @@ class LLMEngine:
        if hasattr(self, "zmq_server") and self.zmq_server is not None:
            self.zmq_server.close()
        if hasattr(self, "dp_processed"):
            for p in self.dp_processed:
                console_logger.info(f"Waiting for worker {p.pid} to exit")