[4/6,v2] server/process: Avoid hanging if a parser process is terminated

Message ID 20220328103809.1423865-1-richard.purdie@linuxfoundation.org
State New

Commit Message

Richard Purdie March 28, 2022, 10:38 a.m. UTC
If a parser process is terminated while holding the write lock, the lock
is never released and any other process waiting on it will deadlock (see
https://docs.python.org/3/library/multiprocessing.html#multiprocessing.Process.terminate).

With SIGTERM, we don't want to terminate holding the lock. We also don't
want a SIGINT to cause a partial write to the event stream.

I tried using signal masks to avoid this but it doesn't work, see
https://bugs.python.org/issue47139

Instead, install a signal handler around the critical section that records
SIGTERM and SIGINT, then re-raise them once the lock has been released. This
is heavier but safe, and seems to avoid all the deadlocks I was able to
trigger.

Some ideas from Peter Kjellerstedt <peter.kjellerstedt@axis.com>

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/server/process.py | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Patch

diff --git a/lib/bb/server/process.py b/lib/bb/server/process.py
index efc3f04b4c..f59a0ed5f6 100644
--- a/lib/bb/server/process.py
+++ b/lib/bb/server/process.py
@@ -20,6 +20,7 @@  import os
 import sys
 import time
 import select
+import signal
 import socket
 import subprocess
 import errno
@@ -736,11 +737,25 @@  class ConnectionWriter(object):
         self.wlock = multiprocessing.Lock()
         # Why bb.event needs this I have no idea
         self.event = self
+        self.signal_received = []
+
+    def catch_sig(self, signum, frame):
+        self.signal_received.append(signum)
 
     def send(self, obj):
         obj = multiprocessing.reduction.ForkingPickler.dumps(obj)
+        # We must not terminate holding this lock. For SIGTERM, raising afterwards avoids this.
+        # For SIGINT, we don't want to have written partial data to the pipe.
+        # pthread_sigmask block/unblock would be nice but doesn't work, https://bugs.python.org/issue47139
+        oldsigterm = signal.signal(signal.SIGTERM, self.catch_sig)
+        oldsigint = signal.signal(signal.SIGINT, self.catch_sig)
         with self.wlock:
             self.writer.send_bytes(obj)
+        signal.signal(signal.SIGTERM, oldsigterm)
+        signal.signal(signal.SIGINT, oldsigint)
+        for sig in [signal.SIGTERM, signal.SIGINT]:
+            if sig in self.signal_received:
+                os.kill(os.getpid(), sig)
 
     def fileno(self):
         return self.writer.fileno()
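
The deferral pattern used in send() above can also be sketched as a standalone context manager. This is only an illustration, not part of the patch: the name defer_signals is hypothetical, and unlike the patch it keeps a fresh list per call and restores the old handlers in a finally block, so a signal is not re-raised on a later call and the handlers are restored even if the body throws. It assumes it is called from the main thread, since signal.signal() is only allowed there.

```python
import os
import signal
from contextlib import contextmanager


@contextmanager
def defer_signals(signums=(signal.SIGTERM, signal.SIGINT)):
    """Hypothetical helper: hold back the given signals until the block exits."""
    received = []

    def catch_sig(signum, frame):
        # Only record the signal; acting on it here could interrupt the
        # critical section while a lock is held.
        received.append(signum)

    # Swap in the recording handler, remembering the previous handlers.
    old_handlers = {s: signal.signal(s, catch_sig) for s in signums}
    try:
        yield received
    finally:
        # Restore the previous handlers first, then re-send anything that
        # arrived so those handlers deal with it outside the critical section.
        for s, handler in old_handlers.items():
            signal.signal(s, handler)
        for s in signums:
            if s in received:
                os.kill(os.getpid(), s)
```

With such a helper, the body of send() could be written as "with defer_signals():" wrapping the existing "with self.wlock:" block, so that the lock is always released before a SIGTERM or SIGINT takes effect.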