Message ID | db083cfe9e33c9fd7ffeead7b8c6023a5d581976.1733344286.git.steve@sakoman.com |
---|---|
State | New |
Series | [scarthgap,2.8,1/3] runqueue: Fix performance of multiconfigs with large overlap |

Hi,

On 04-12-2024 at 21:33, Steve Sakoman wrote:
> From: Richard Purdie <richard.purdie@linuxfoundation.org>
>
> There have been complaints about the performance of large multiconfig builds
> for a while. The key missing data point was that the builds needed to have large
> overlaps in sstate objects. This can be simulated by building the same things with
> just different TMPDIRs. In runqueue/bitbake terms this equates to large numbers of
> deferred tasks.
>
[..]
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> (cherry picked from commit 9c6c506757f2b3e28c8b20513b45da6b4659c95f)
> Signed-off-by: Steve Sakoman <steve@sakoman.com>

Our CI build uses a multiconfig setup to build for 10 machines with 3
different archs. On scarthgap with bitbake 2.8, that resulted in a really
slow build:

NOTE: Executing Tasks
Bitbake still alive (no events for 600s). Active tasks:
Bitbake still alive (no events for 1200s). Active tasks:
Bitbake still alive (no events for 1800s). Active tasks:
Bitbake still alive (no events for 2400s). Active tasks:
Bitbake still alive (no events for 3000s). Active tasks:
Bitbake still alive (no events for 3600s). Active tasks:
Bitbake still alive (no events for 4200s). Active tasks:
NOTE: Running task 1 of 79788 ...
[...]

After almost 14 hours it was at 'Running task 8881 of 79788', and I
canceled the job. The cooker was running at 100% CPU, but there were
hardly any bakers running.

With this patch cherry-picked, the same build completes successfully in
3h and 20min. (I did add INHERIT:remove = "create-spdx", since it triggers
an sstate error that causes the build to fail.)

So this makes a huge difference, at least for a setup like ours.

Regards,
Jeroen

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index 93079a977..744542b08 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -2195,6 +2195,9 @@ class RunQueueExecute:
             # Find the next setscene to run
             for nexttask in self.sorted_setscene_tids:
                 if nexttask in self.sq_buildable and nexttask not in self.sq_running and self.sqdata.stamps[nexttask] not in self.build_stamps.values() and nexttask not in self.sq_harddep_deferred:
+                    if nexttask in self.sq_deferred and self.sq_deferred[nexttask] not in self.runq_complete:
+                        # Skip deferred tasks quickly before the 'expensive' tests below - this is key to performant multiconfig builds
+                        continue
                     if nexttask not in self.sqdata.unskippable and self.sqdata.sq_revdeps[nexttask] and \
                             nexttask not in self.sq_needed_harddeps and \
                             self.sqdata.sq_revdeps[nexttask].issubset(self.scenequeue_covered) and \
@@ -2224,8 +2227,7 @@ class RunQueueExecute:
                             if t in self.runq_running and t not in self.runq_complete:
                                 continue
                     if nexttask in self.sq_deferred:
-                        if self.sq_deferred[nexttask] not in self.runq_complete:
-                            continue
+                        # Deferred tasks that were still deferred were skipped above so we now need to process
                         logger.debug("Task %s no longer deferred" % nexttask)
                         del self.sq_deferred[nexttask]
                         valid = self.rq.validate_hashes(set([nexttask]), self.cooker.data, 0, False, summary=False)
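
For readers who want to see why moving the deferred check matters, here is a minimal toy model of the reordering. It is only a sketch and not bitbake's code: `pick_next`, `count_expensive_calls`, the `mc0:`/`mc1:` task names and the `expensive_check` callback are all hypothetical stand-ins. The callback represents the costlier tests (such as the `sq_revdeps`/`issubset` checks) that previously ran before the scan noticed a task was still deferred.

```python
def pick_next(tasks, deferred, complete, expensive_check, early_skip):
    """Toy scan for the next runnable task (hypothetical, not bitbake's API)."""
    for tid in tasks:
        # Cheap test: one dict lookup and one set lookup.
        still_deferred = tid in deferred and deferred[tid] not in complete
        if early_skip and still_deferred:
            continue  # new position of the check: skip before any costly work
        expensive_check(tid)  # stand-in for the costly per-task tests
        if still_deferred:
            continue  # old position of the check: costly work already done
        return tid
    return None


def count_expensive_calls(early_skip, n_deferred=1000):
    """Return (picked task, number of expensive checks) for one scan."""
    # n_deferred tasks in "mc1" each wait on the matching task in "mc0".
    tasks = [f"mc1:t{i}" for i in range(n_deferred)] + ["mc0:t0"]
    deferred = {f"mc1:t{i}": f"mc0:t{i}" for i in range(n_deferred)}
    complete = set()  # nothing has finished yet, so every mc1 task is deferred
    calls = 0

    def expensive_check(tid):
        nonlocal calls
        calls += 1

    picked = pick_next(tasks, deferred, complete, expensive_check, early_skip)
    return picked, calls


if __name__ == "__main__":
    print("late check :", count_expensive_calls(early_skip=False))
    print("early check:", count_expensive_calls(early_skip=True))
```

In this toy setup both variants pick the same task, but the late check pays for one expensive test per deferred task on every scan, while the early check pays only for tasks that can actually run. With tens of thousands of tasks and many deferred ones, that repeated cost would be consistent with a cooker pinned at 100% CPU while few tasks execute.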