| Message ID | 20250718135437.247096-1-richard.purdie@linuxfoundation.org |
|---|---|
| State | Accepted, archived |
| Commit | ee5fce67ce35b025c68aa61e2e758903269ee346 |
| Series | [1/2] utils: Optimise signal/sigmask performance |
LGTM

Reviewed-by: Joshua Watt <JPEWhacker@gmail.com>

On Fri, Jul 18, 2025 at 7:54 AM Richard Purdie via lists.openembedded.org
<richard.purdie=linuxfoundation.org@lists.openembedded.org> wrote:
>
> Running "time bitbake -pP idle" with a valid cache shows around 800,000
> calls to enum creation from python's signal.py. We don't care about this
> overhead and it adversely affects cache load time quite badly.
>
> Try and use _signal directly, falling back to signal, which avoids
> this overhead we don't need and makes cache loading much faster.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  lib/bb/utils.py | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/lib/bb/utils.py b/lib/bb/utils.py
> index f688f7dd687..016036dbce4 100644
> --- a/lib/bb/utils.py
> +++ b/lib/bb/utils.py
> @@ -2226,6 +2226,15 @@ def path_is_descendant(descendant, ancestor):
>
>      return False
>
> +# Recomputing the sets in signal.py is expensive (bitbake -pP idle)
> +# so try and use _signal directly to avoid it
> +valid_signals = signal.valid_signals()
> +try:
> +    import _signal
> +    sigmask = _signal.pthread_sigmask
> +except ImportError:
> +    sigmask = signal.pthread_sigmask
> +
> # If we don't have a timeout of some kind and a process/thread exits badly (for example
> # OOM killed) and held a lock, we'd just hang in the lock futex forever. It is better
> # we exit at some point than hang. 5 minutes with no progress means we're probably deadlocked.
> @@ -2235,7 +2244,7 @@ def path_is_descendant(descendant, ancestor):
> @contextmanager
> def lock_timeout(lock):
>     try:
> -        s = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
> +        s = sigmask(signal.SIG_BLOCK, valid_signals)
>         held = lock.acquire(timeout=5*60)
>         if not held:
>             bb.server.process.serverlog("Couldn't get the lock for 5 mins, timed out, exiting.\n%s" % traceback.format_stack())
> @@ -2243,17 +2252,17 @@ def lock_timeout(lock):
>         yield held
>     finally:
>         lock.release()
> -        signal.pthread_sigmask(signal.SIG_SETMASK, s)
> +        sigmask(signal.SIG_SETMASK, s)
>
> # A version of lock_timeout without the check that the lock was locked and a shorter timeout
> @contextmanager
> def lock_timeout_nocheck(lock):
>     l = False
>     try:
> -        s = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
> +        s = sigmask(signal.SIG_BLOCK, valid_signals)
>         l = lock.acquire(timeout=10)
>         yield l
>     finally:
>         if l:
>             lock.release()
> -        signal.pthread_sigmask(signal.SIG_SETMASK, s)
> +        sigmask(signal.SIG_SETMASK, s)
Hi Richard,

> -----Original Message-----
> From: bitbake-devel@lists.openembedded.org <bitbake-devel@lists.openembedded.org>
> On Behalf Of Richard Purdie via lists.openembedded.org
> Sent: Friday, July 18, 2025 9:55 AM
> To: bitbake-devel@lists.openembedded.org
> Subject: [bitbake-devel] [PATCH 1/2] utils: Optimise signal/sigmask performance
>
> Running "time bitbake -pP idle" with a valid cache shows around 800,000 calls
> to enum creation from python's signal.py. We don't care about this overhead
> and it adversely affects cache load time quite badly.
>
> Try and use _signal directly, falling back to signal, which avoids this
> overhead we don't need and makes cache loading much faster.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>

Coincidentally, for the past couple months I have occasionally been playing
around with VizTracer (https://github.com/gaogaotiantian/viztracer) for
profiling BitBake. Just today (on Scarthgap) while profiling the cache parsing,
I also observed the issue with the sigmask. In case anyone is curious, here is
what the results look like: https://i.imgur.com/BGdzSe5.png

Two questions:

1. Would it be OK to backport this change to Scarthgap? (CC'd Steve as well)
2. Is there any interest in incorporating VizTracer into BitBake?

Thanks,
Chris
> > Running "time bitbake -pP idle" with a valid cache shows around
> > 800,000 calls to enum creation from python's signal.py. We don't care
> > about this overhead and it adversely affects cache load time quite badly.
> >
> > Try and use _signal directly, falling back to signal, which avoids
> > this overhead we don't need and makes cache loading much faster.
> >
> > Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
>
> Coincidentally, for the past couple months I have occasionally been playing
> around with VizTracer (https://github.com/gaogaotiantian/viztracer) for
> profiling BitBake. Just today (on Scarthgap) while profiling the cache
> parsing, I also observed the issue with the sigmask. In case anyone is
> curious, here is what the results look like: https://i.imgur.com/BGdzSe5.png

By the way, at least on Scarthgap, this patch turned what used to be a 30s
cache load (for a build with about 7 multiconfigs) into a near-instant
process. So thanks :)

Thanks,
Chris
On 28 Jul 2025, at 17:30, Chris Laplante via lists.openembedded.org <chris.laplante=agilent.com@lists.openembedded.org> wrote:
> Coincidentally, for the past couple months I have occasionally been playing around with VizTracer (https://github.com/gaogaotiantian/viztracer) for profiling BitBake. Just today (on Scarthgap) while profiling the cache parsing, I also observed the issue with the sigmask. In case anyone is curious, here is what the results look like: https://i.imgur.com/BGdzSe5.png
Me too!
Did you manage to get “viztracer bitbake …” to actually follow all of the forks and children correctly, or were you just wrapping the parse itself with the tracing?
Cheers,
Ross
Hi Ross,

> Me too!
>
> Did you manage to get "viztracer bitbake ..." to actually follow all of the
> forks and children correctly, or were you just wrapping the parse itself
> with the tracing?

Neat! :)

I have never gotten the command-line 'viztracer' to work well for bitbake. As
I'm guessing you also found, it just can't follow what BitBake is doing despite
all the documented support for concurrency. However, you can sort-of get it to
"see everything" by manually wrapping the individual parts with unique file
names, then post-processing to combine the results into a single file.

For example, in bin/bitbake-server, I just wrap the entire file with:

    import os
    from viztracer import VizTracer

    with VizTracer(output_file=f"server-{os.getpid()}.json") as tracer:

Then for bin/bitbake-worker:

    with VizTracer(output_file=f"worker-{os.getpid()}.json") as tracer:

etc. When bitbake finishes running, merge with:

    viztracer --combine *.json

My captures are sometimes multiple GB in size, which the web viewer chokes on.
For those cases, I use TraceProcessor to handle them:
https://perfetto.dev/docs/visualization/large-traces

By the way, prior to using this capture method, I tried using their 'generic
multi process support':
https://viztracer.readthedocs.io/en/latest/concurrency.html#generic-multi-process-support
It was not worth the effort, because there are so many processes/threads in
bitbake that it is difficult to propagate the init_kwargs through all the
layers.

Thanks,
Chris
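The per-process wrapping Chris describes can be packaged as a small helper. This is a hypothetical sketch (the `maybe_trace` name and no-op fallback are mine, not from the thread): it produces a per-PID trace file when viztracer is installed and degrades to a no-op context manager otherwise, so the wrapped script still runs on machines without the profiler.

```python
import contextlib
import os

def maybe_trace(role, output_dir="."):
    """Return a context manager that records a VizTracer trace for this
    process when viztracer is installed, else a do-nothing context.
    The per-PID file name mirrors the wrapping scheme described above,
    so traces from many processes can later be merged with
    'viztracer --combine *.json'."""
    try:
        from viztracer import VizTracer
        return VizTracer(
            output_file=os.path.join(output_dir, f"{role}-{os.getpid()}.json"))
    except ImportError:
        return contextlib.nullcontext()

# Usage, e.g. near the top of bin/bitbake-server:
# with maybe_trace("server"):
#     ...rest of the script...
```

Because the fallback is `contextlib.nullcontext()`, the `with` statement is safe either way; only the presence of the viztracer package decides whether a trace file appears.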
On 29 Jul 2025, at 14:05, Chris Laplante via lists.openembedded.org <chris.laplante=agilent.com@lists.openembedded.org> wrote:
> I have never gotten the command-line 'viztracer' to work well for bitbake. As I'm guessing you also found, it just can't follow what BitBake is doing despite all the documented support for concurrency. However, you can sort-of get it to "see everything" by manually wrapping the individual parts with unique file names, then post-processing to combine the results into a single file.
I added the code to pass the state to the bitbake-workers (see poky-contrib:ross/viztracer, absolutely not mergeable!) but it won't follow into the parser pool; I'd not figured out why it wasn't doing that yet.
Ross
> On 29 Jul 2025, at 14:05, Chris Laplante via lists.openembedded.org
> <chris.laplante=agilent.com@lists.openembedded.org> wrote:
> > I have never gotten the command-line 'viztracer' to work well for bitbake.
> > As I'm guessing you also found, it just can't follow what BitBake is doing
> > despite all the documented support for concurrency. However, you can
> > sort-of get it to "see everything" by manually wrapping the individual
> > parts with unique file names, then post-processing to combine the results
> > into a single file.
>
> I added the code to pass the state to the bitbake-workers (see
> poky-contrib:ross/viztracer, absolutely not mergeable!) but it won't follow
> into the parser pool; I'd not figured out why it wasn't doing that yet.
>
> Ross

I went searching for bug reports about multiprocessing.Process on viztracer's
github and I found your issue report lol. Easiest thing to do here is pass
down the tracer args through all the layers manually. Give the below patch a
try.

Thanks,
Chris

From e368baedb329d3a6f11f13bf930553faa6e692d2 Mon Sep 17 00:00:00 2001
From: Chris Laplante <chris.laplante@agilent.com>
Date: Tue, 29 Jul 2025 11:54:33 -0400
Subject: [PATCH] propogate viztracer traceargs down to the Parse processes

---
 bitbake/bin/bitbake-server       |  2 +-
 bitbake/lib/bb/cooker.py         | 20 +++++++++++++++-----
 bitbake/lib/bb/server/process.py |  4 ++--
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/bitbake/bin/bitbake-server b/bitbake/bin/bitbake-server
index 8e430e81ea0..f5c643b2834 100755
--- a/bitbake/bin/bitbake-server
+++ b/bitbake/bin/bitbake-server
@@ -63,5 +63,5 @@ logger = logging.getLogger("BitBake")
 handler = bb.event.LogHandler()
 logger.addHandler(handler)
 
-bb.server.process.execServer(lockfd, readypipeinfd, lockname, sockname, timeout, xmlrpcinterface, profile)
+bb.server.process.execServer(lockfd, readypipeinfd, lockname, sockname, timeout, xmlrpcinterface, profile, tracerargs)
diff --git a/bitbake/lib/bb/cooker.py b/bitbake/lib/bb/cooker.py
index 2eb64ef2372..edb4018cb46 100644
--- a/bitbake/lib/bb/cooker.py
+++ b/bitbake/lib/bb/cooker.py
@@ -129,7 +129,7 @@ class BBCooker:
     Manages one bitbake build run
     """
 
-    def __init__(self, featureSet=None, server=None):
+    def __init__(self, featureSet=None, server=None, tracerargs=None):
         self.recipecaches = None
         self.baseconfig_valid = False
         self.parsecache_valid = False
@@ -190,6 +190,7 @@ class BBCooker:
         signal.signal(signal.SIGHUP, self.sigterm_exception)
 
         bb.debug(1, "BBCooker startup complete %s" % time.time())
+        self.tracerargs = tracerargs
 
     def init_configdata(self):
         if not hasattr(self, "data"):
@@ -1631,7 +1632,7 @@ You can also remove the BB_HASHSERVE_UPSTREAM setting, but this may result in si
         for dirent in searchdirs:
             self.add_filewatch([(dirent, bb.parse.cached_mtime_noerror(dirent))])
 
-        self.parser = CookerParser(self, mcfilelist, total_masked)
+        self.parser = CookerParser(self, mcfilelist, total_masked, self.tracerargs)
         self._parsecache_set(True)
 
         self.state = State.PARSING
@@ -1999,7 +2000,7 @@ class ParsingFailure(Exception):
         Exception.__init__(self, realexception, recipe)
 
 class Parser(multiprocessing.Process):
-    def __init__(self, jobs, next_job_id, results, quit, profile):
+    def __init__(self, jobs, next_job_id, results, quit, profile, tracer_args):
         self.jobs = jobs
         self.next_job_id = next_job_id
         self.results = results
@@ -2012,6 +2013,7 @@ class Parser(multiprocessing.Process):
         self.signal_received = []
         self.signal_threadlock = threading.Lock()
         self.exit = False
+        self.tracer_args = tracer_args
 
     def catch_sig(self, signum, frame):
         if self.queue_signals:
@@ -2027,6 +2029,12 @@ class Parser(multiprocessing.Process):
         self.exit = True
 
     def run(self):
+        if self.tracer_args:
+            import viztracer
+            init_kwargs = json.loads(self.tracer_args)
+            tracer = viztracer.VizTracer(**init_kwargs)
+            tracer.register_exit()
+            tracer.start()
         bb.utils.profile_function("parsing" in self.profile, self.realrun, "profile-parse-%s.log" % multiprocessing.current_process().name,
                                   process=False)
 
     def realrun(self):
@@ -2105,7 +2113,7 @@ class Parser(multiprocessing.Process):
             bb.event.LogHandler.filter = origfilter
 
 class CookerParser(object):
-    def __init__(self, cooker, mcfilelist, masked):
+    def __init__(self, cooker, mcfilelist, masked, tracerargs):
         self.mcfilelist = mcfilelist
         self.cooker = cooker
         self.cfgdata = cooker.data
@@ -2144,6 +2152,8 @@ class CookerParser(object):
                                     multiprocessing.cpu_count()), self.toparse)
         bb.cache.SiggenRecipeInfo.reset()
 
+        self.tracerargs = tracerargs
+
         self.start()
         self.haveshutdown = False
         self.syncthread = None
@@ -2162,7 +2172,7 @@ class CookerParser(object):
         # Have to pass in willparse at fork time so all parsing processes have the unpickleable data
         # then access it by index from the parse queue.
         for i in range(0, self.num_processes):
-            parser = Parser(self.willparse, next_job_id, self.result_queue, self.parser_quit, self.cooker.configuration.profile)
+            parser = Parser(self.willparse, next_job_id, self.result_queue, self.parser_quit, self.cooker.configuration.profile, self.tracerargs)
             parser.start()
             self.process_names.append(parser.name)
             self.processes.append(parser)
diff --git a/bitbake/lib/bb/server/process.py b/bitbake/lib/bb/server/process.py
index a5ef30cb88b..2f5d9436a63 100644
--- a/bitbake/lib/bb/server/process.py
+++ b/bitbake/lib/bb/server/process.py
@@ -613,7 +613,7 @@ class BitBakeServer(object):
             os.execl(sys.executable, sys.executable, serverscript, "decafbad", str(self.bitbake_lock.fileno()), str(self.readypipein), self.logfile, self.bitbake_lock.name, self.sockname, str(self.server_timeout or 0), str(list(self.profile)), str(self.xmlrpcinterface[0]), str(self.xmlrpcinterface[1]), tracer_args)
 
-def execServer(lockfd, readypipeinfd, lockname, sockname, server_timeout, xmlrpcinterface, profile):
+def execServer(lockfd, readypipeinfd, lockname, sockname, server_timeout, xmlrpcinterface, profile, tracerargs):
 
     import bb.cookerdata
     import bb.cooker
@@ -642,7 +642,7 @@ def execServer(lockfd, readypipeinfd, lockname, sockname, server_timeout, xmlrpc
     writer = ConnectionWriter(readypipeinfd)
     try:
         featureset = []
-        cooker = bb.cooker.BBCooker(featureset, server)
+        cooker = bb.cooker.BBCooker(featureset, server, tracerargs)
         cooker.configuration.profile = profile
     except bb.BBHandledException:
         return None
--
2.43.0
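The mechanism Chris's patch relies on is simple: the parent serializes the tracer's init kwargs to a JSON string once, and each child process (however it was spawned) rebuilds its own tracer from that string. A minimal sketch of the round trip, with illustrative kwargs values rather than anything taken from the patch:

```python
import json

# Parent side: serialize the kwargs the child should use to build its
# tracer. A plain string survives exec/fork layers that a live tracer
# object (or an unpicklable callback) would not.
init_kwargs = {"output_file": "worker.json", "tracer_entries": 1000000}
tracer_args = json.dumps(init_kwargs)

# Child side (cf. Parser.run in the patch): rebuild the kwargs and
# construct the tracer from them, e.g. VizTracer(**json.loads(tracer_args)).
restored = json.loads(tracer_args)
assert restored == init_kwargs
```

Passing the string down explicitly through each constructor is tedious, but it avoids depending on viztracer's own multiprocessing hooks, which the thread reports do not follow BitBake's parser pool.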
diff --git a/lib/bb/utils.py b/lib/bb/utils.py
index f688f7dd687..016036dbce4 100644
--- a/lib/bb/utils.py
+++ b/lib/bb/utils.py
@@ -2226,6 +2226,15 @@ def path_is_descendant(descendant, ancestor):
 
     return False
 
+# Recomputing the sets in signal.py is expensive (bitbake -pP idle)
+# so try and use _signal directly to avoid it
+valid_signals = signal.valid_signals()
+try:
+    import _signal
+    sigmask = _signal.pthread_sigmask
+except ImportError:
+    sigmask = signal.pthread_sigmask
+
 # If we don't have a timeout of some kind and a process/thread exits badly (for example
 # OOM killed) and held a lock, we'd just hang in the lock futex forever. It is better
 # we exit at some point than hang. 5 minutes with no progress means we're probably deadlocked.
@@ -2235,7 +2244,7 @@ def path_is_descendant(descendant, ancestor):
 @contextmanager
 def lock_timeout(lock):
     try:
-        s = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
+        s = sigmask(signal.SIG_BLOCK, valid_signals)
         held = lock.acquire(timeout=5*60)
         if not held:
             bb.server.process.serverlog("Couldn't get the lock for 5 mins, timed out, exiting.\n%s" % traceback.format_stack())
@@ -2243,17 +2252,17 @@ def lock_timeout(lock):
         yield held
     finally:
         lock.release()
-        signal.pthread_sigmask(signal.SIG_SETMASK, s)
+        sigmask(signal.SIG_SETMASK, s)
 
 # A version of lock_timeout without the check that the lock was locked and a shorter timeout
 @contextmanager
 def lock_timeout_nocheck(lock):
     l = False
     try:
-        s = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
+        s = sigmask(signal.SIG_BLOCK, valid_signals)
         l = lock.acquire(timeout=10)
         yield l
     finally:
         if l:
             lock.release()
-        signal.pthread_sigmask(signal.SIG_SETMASK, s)
+        sigmask(signal.SIG_SETMASK, s)
Running "time bitbake -pP idle" with a valid cache shows around 800,000
calls to enum creation from python's signal.py. We don't care about this
overhead and it adversely affects cache load time quite badly.

Try and use _signal directly, falling back to signal, which avoids
this overhead we don't need and makes cache loading much faster.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/utils.py | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
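The patch boils down to two standard-library tricks that can be tried in isolation: compute `signal.valid_signals()` once instead of per call, and bind `pthread_sigmask` from the C-level `_signal` module when it is importable, skipping signal.py's per-call int-to-enum conversion of the returned mask. A standalone sketch of the same pattern (POSIX only; not BitBake code):

```python
import signal

# Compute the full signal set once; signal.valid_signals() rebuilds an
# enum-backed set on every call, which is what showed up in the profile.
valid_signals = signal.valid_signals()

# Prefer the C-level _signal module when available: its pthread_sigmask
# returns raw signal numbers instead of converting each one to an enum.
try:
    import _signal
    sigmask = _signal.pthread_sigmask
except ImportError:
    sigmask = signal.pthread_sigmask

# Block everything around a critical section, then restore the old mask.
# (SIGKILL/SIGSTOP in the set are silently ignored by the kernel.)
old = sigmask(signal.SIG_BLOCK, valid_signals)
try:
    pass  # e.g. acquire a lock without being interrupted by signals
finally:
    sigmask(signal.SIG_SETMASK, old)
```

Since `_signal` is a private CPython module, the `except ImportError` fallback to the public `signal` wrapper keeps the code working on interpreters that lay things out differently; both callables take the same `(how, mask)` arguments.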