[4/9] bitbake/runqueue: rework 'bitbake -S printdiff' logic

Message ID: 20231214134528.1973602-4-alex@linutronix.de
State: New

Headers:
Return-Path: <alex.kanavin@gmail.com>
From: Alexander Kanavin <alex.kanavin@gmail.com>
To: openembedded-core@lists.openembedded.org
Cc: Alexander Kanavin <alex@linutronix.de>
Subject: [PATCH 4/9] bitbake/runqueue: rework 'bitbake -S printdiff' logic
Date: Thu, 14 Dec 2023 14:45:23 +0100
Message-Id: <20231214134528.1973602-4-alex@linutronix.de>
In-Reply-To: <20231214134528.1973602-1-alex@linutronix.de>
References: <20231214134528.1973602-1-alex@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series:
[1/9] oeqa/selftest/sstatetests: re-work CDN tests, add local cache tests
[2/9] gnu-config: delete do_compile task
[3/9] bitbake/runqueue: initialize RunQueueExecute before printdiff rather than after
[4/9] bitbake/runqueue: rework 'bitbake -S printdiff' logic
[5/9] selftest/sstatetests: fix up printdiff test to match rework of printdiff logic
[6/9] sstatesig/find_siginfo: unify a disjointed API
[7/9] bitbake-diffsigs/runqueue: adapt to reworked find_siginfo()
[8/9] bitbake/runqueue: prioritize local stamps over sstate signatures in printdiff
[9/9] bitbake/runqueue: add debugging for find_siginfo() calls

Commit Message

Alexander Kanavin Dec. 14, 2023, 1:45 p.m. UTC

Previously, the printdiff code iterated over tasks that were reported as invalid or
absent, following dependency chains down to the most basic invalid items in the tree.

While this works in tightly controlled local builds, it can lead to bizarre reports
against industrial-sized sstate caches: the code did not consider whether the
overall target can be fulfilled from valid sstate objects, and instead reported
missing sstate signature files that perhaps were never even created, because hash
equivalence provides shortcuts in builds.
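
For illustration only, the previous behaviour can be sketched outside of bitbake
as follows. The loop mirrors the lines removed by this patch; the function name,
the plain dicts/sets and the toy task graph are assumptions made for this sketch
and are not the real RunQueue data structures.

```python
# Hypothetical toy data: task id -> direct dependencies (not real rqdata).
depends = {
    "app:do_build": {"app:do_populate_sysroot"},
    "app:do_populate_sysroot": {"app:do_install"},
    "app:do_install": {"app:do_compile"},
    "app:do_compile": set(),
}
valid = {"app:do_populate_sysroot"}   # available from sstate
noexec = {"app:do_build"}             # nothing to execute


def old_printdiff_candidates(depends, valid, noexec):
    """Standalone mirror of the removed loop (illustrative name)."""
    # Every task that is neither valid nor noexec counts as invalid.
    invalid = {t for t in depends if t not in valid and t not in noexec}
    found = set()
    processed = set()
    for tid in invalid:
        toprocess = {tid}
        while toprocess:
            upcoming = set()
            for t in toprocess:
                for dep in depends[t]:
                    if dep in invalid:
                        # tid sits above another invalid task, so drop it.
                        found.add(tid)
                    if dep not in processed:
                        processed.add(dep)
                        upcoming.add(dep)
            toprocess = upcoming
            if tid in found:
                toprocess = set()
    return invalid - found


old_result = old_printdiff_candidates(depends, valid, noexec)
```

In this toy graph app:do_compile is reported, although app:do_populate_sysroot
is available from sstate and nothing below it would need to run.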

This commit reworks the logic in two ways:

- start the iteration over final targets rather than missing objects
and stop at the first invalid object (1)

- if a given object can be fulfilled from sstate, recurse only into
its setscene dependencies; bitbake wouldn't care if dependencies
for the actual task are absent, and neither should printdiff
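
As a rough sketch of the two points above (again outside of bitbake, with
hypothetical names and toy data standing in for valid_new, noexec,
runq_setscene_tids and sq_deps), the reworked traversal looks like this:

```python
# Hypothetical toy data; the real code reads these from rqdata and sqdata.
depends = {
    "app:do_build": {"app:do_populate_sysroot"},
    "app:do_populate_sysroot": {"app:do_install"},
    "app:do_install": {"app:do_compile"},
    "app:do_compile": set(),
}
setscene_tids = {"app:do_populate_sysroot"}
sq_deps = {"app:do_populate_sysroot": set()}  # setscene dependencies only
noexec = {"app:do_build"}


def find_invalid_from_targets(targets, depends, sq_deps, setscene_tids,
                              valid, noexec):
    """Walk down from the final targets (illustrative name)."""
    invalid = set()
    for tid in targets:
        toprocess = {tid}
        while toprocess:
            upcoming = set()
            for t in toprocess:
                if t not in valid and t not in noexec:
                    # First invalid object on this path: record it and stop.
                    invalid.add(t)
                    continue
                if t in setscene_tids:
                    # Can be fulfilled from sstate: follow only its
                    # setscene dependencies, not the real task's ones.
                    upcoming.update(sq_deps[t])
                    continue
                upcoming.update(depends[t])
            toprocess = upcoming
    return invalid


targets = {"app:do_build"}
# Case 1: do_populate_sysroot is in the cache, nothing is reported.
result_cached = find_invalid_from_targets(
    targets, depends, sq_deps, setscene_tids,
    valid={"app:do_populate_sysroot"}, noexec=noexec)
# Case 2: the cache is empty, the walk stops at do_populate_sysroot.
result_missing = find_invalid_from_targets(
    targets, depends, sq_deps, setscene_tids,
    valid=set(), noexec=noexec)
```

With the cached object present nothing is reported; with an empty cache the
walk stops at app:do_populate_sysroot and does not descend to do_install or
do_compile.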

(1) There is a further improvement I'd like to make here: instead of
stopping, recurse into dependencies, with the goal of finding and reporting
the 'root' invalid objects. Otherwise, tracking down the difference to its root
relies on a different function later finding the most 'recent' signature in
stamps or sstate and recursively comparing that to the current signature,
which is unreliable on real-world caches. To get the test failures fixed,
I'd rather not delay the patchset any further for now.

[YOCTO #15289]

Signed-off-by: Alexander Kanavin <alex@linutronix.de>
---
 bitbake/lib/bb/runqueue.py | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
index 1f59d18b71a..61effe24fae 100644
--- a/bitbake/lib/bb/runqueue.py
+++ b/bitbake/lib/bb/runqueue.py
@@ -1717,35 +1717,34 @@  class RunQueue:
                     valid_new.add(dep)
 
         invalidtasks = set()
-        for tid in self.rqdata.runtaskentries:
-            if tid not in valid_new and tid not in noexec:
-                invalidtasks.add(tid)
 
-        found = set()
-        processed = set()
-        for tid in invalidtasks:
+        toptasks = set(["{}:{}".format(t[3], t[2]) for t in self.rqdata.targets])
+        for tid in toptasks:
             toprocess = set([tid])
             while toprocess:
                 next = set()
                 for t in toprocess:
-                    for dep in self.rqdata.runtaskentries[t].depends:
-                        if dep in invalidtasks:
-                            found.add(tid)
-                        if dep not in processed:
-                            processed.add(dep)
+                    if t not in valid_new and t not in noexec:
+                        invalidtasks.add(t)
+                        continue
+                    if t in self.rqdata.runq_setscene_tids:
+                        for dep in self.rqexe.sqdata.sq_deps[t]:
                             next.add(dep)
+                        continue
+
+                    for dep in self.rqdata.runtaskentries[t].depends:
+                        next.add(dep)
+
                 toprocess = next
-                if tid in found:
-                    toprocess = set()
 
         tasklist = []
-        for tid in invalidtasks.difference(found):
+        for tid in invalidtasks:
             tasklist.append(tid)
 
         if tasklist:
             bb.plain("The differences between the current build and any cached tasks start at the following tasks:\n" + "\n".join(tasklist))
 
-        return invalidtasks.difference(found)
+        return invalidtasks
 
     def write_diffscenetasks(self, invalidtasks):
