[4/6,v2] server/process: Avoid hanging if a parser process is terminated

Richard Purdie March 28, 2022, 10:38 a.m. UTC
If a parser process is terminated while holding a write lock, then it
will lead to a deadlock (see

With SIGTERM, we don't want to terminate holding the lock. We also don't
want a SIGINT to cause a partial write to the event stream.

I tried using signal masks to avoid this but it doesn't work, see
https://bugs.python.org/issue47139

Instead, add a signal handler and catch the signals around the critical
section. This is heavier, but it is safe and seems to avoid all the deadlocks
I was able to reproduce.
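
The deferral described above can be sketched on its own, outside BitBake.
This is a minimal illustration under stated assumptions, not the code from
this patch: DeferredSignalWriter and its write parameter are hypothetical
names, a threading.Lock and a plain pipe descriptor stand in for the
multiprocessing lock and connection, and the sketch additionally clears the
recorded signals on each call and restores the handlers in a finally block.

```python
import os
import signal
import threading


class DeferredSignalWriter:
    """Hypothetical stand-in for a locked writer; not BitBake's class."""

    def __init__(self, fd, write=os.write):
        self.fd = fd
        self.write = write  # injectable so a test can simulate a signal mid-write
        self.wlock = threading.Lock()
        self.signal_received = []

    def catch_sig(self, signum, frame):
        # Only record the signal; acting on it waits until the lock is released.
        self.signal_received.append(signum)

    def send(self, data):
        # signal.signal() may only be called from the main thread.
        self.signal_received.clear()
        oldsigterm = signal.signal(signal.SIGTERM, self.catch_sig)
        oldsigint = signal.signal(signal.SIGINT, self.catch_sig)
        try:
            with self.wlock:
                self.write(self.fd, data)
        finally:
            # Put the previous handlers back before re-raising anything.
            signal.signal(signal.SIGTERM, oldsigterm)
            signal.signal(signal.SIGINT, oldsigint)
        # Re-deliver whatever arrived, now that the lock is no longer held.
        for sig in (signal.SIGTERM, signal.SIGINT):
            if sig in self.signal_received:
                os.kill(os.getpid(), sig)
```

With this shape, a SIGTERM that arrives during the write is acted on by the
previously installed handler (or the default action) only after the lock has
been released and the payload has been written in full.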

Some ideas from Peter Kjellerstedt <peter.kjellerstedt@axis.com>

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/server/process.py | 15 +++++++++++++++
 1 file changed, 15 insertions(+)


diff --git a/lib/bb/server/process.py b/lib/bb/server/process.py
index efc3f04b4c..f59a0ed5f6 100644
--- a/lib/bb/server/process.py
+++ b/lib/bb/server/process.py
@@ -20,6 +20,7 @@  import os
 import sys
 import time
 import select
+import signal
 import socket
 import subprocess
 import errno
@@ -736,11 +737,25 @@  class ConnectionWriter(object):
         self.wlock = multiprocessing.Lock()
         # Why bb.event needs this I have no idea
         self.event = self
+        self.signal_received = []
+
+    def catch_sig(self, signum, frame):
+        self.signal_received.append(signum)
 
     def send(self, obj):
         obj = multiprocessing.reduction.ForkingPickler.dumps(obj)
+        # We must not terminate holding this lock. For SIGTERM, raising afterwards avoids this.
+        # For SIGINT, we don't want to have written partial data to the pipe.
+        # pthread_sigmask block/unblock would be nice but doesn't work, https://bugs.python.org/issue47139
+        oldsigterm = signal.signal(signal.SIGTERM, self.catch_sig)
+        oldsigint = signal.signal(signal.SIGINT, self.catch_sig)
         with self.wlock:
             self.writer.send_bytes(obj)
+        signal.signal(signal.SIGTERM, oldsigterm)
+        signal.signal(signal.SIGINT, oldsigint)
+        for sig in [signal.SIGTERM, signal.SIGINT]:
+            if sig in self.signal_received:
+                os.kill(os.getpid(), sig)
 
     def fileno(self):
         return self.writer.fileno()