diff mbox series

[3/8] checksum/fetch2: Switch from persist_data to a standard cache file

Message ID 20241008123627.252307-3-richard.purdie@linuxfoundation.org
State Accepted, archived
Commit fdc55bb649cb77456d0ac48a9600ef289a52af18
Headers show
Series [1/8] COW: Fix hardcoded magic numbers and work with python 3.13 | expand

Commit Message

Richard Purdie Oct. 8, 2024, 12:36 p.m. UTC
The sqlite connection handling is causing problems with python 3.13. The
connection can be closed at gc time which causing warnings and those
can appear at 'random' points and break output, causing weird failures
in different tinfoil tools and other tests.

Using sqlite as an IPC was never a great idea so drop that usage entirely
and just use the standard cache mechanism we already have for other
situations.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/cache.py           | 10 ++++++++++
 lib/bb/checksum.py        | 25 +++++++++++++++++++++++++
 lib/bb/cookerdata.py      |  5 +++--
 lib/bb/fetch2/__init__.py | 33 ++++++++++++++++++++-------------
 4 files changed, 58 insertions(+), 15 deletions(-)

Comments

chris.laplante@agilent.com Nov. 20, 2024, 4:09 a.m. UTC | #1
Hi Richard, Steve,

> -----Original Message-----
> From: bitbake-devel@lists.openembedded.org <bitbake-
> devel@lists.openembedded.org> On Behalf Of Richard Purdie via
> lists.openembedded.org
> Sent: Tuesday, October 8, 2024 8:36 AM
> To: bitbake-devel@lists.openembedded.org
> Subject: [bitbake-devel] [PATCH 3/8] checksum/fetch2: Switch from
> persist_data to a standard cache file
> 
> 
> The sqlite connection handling is causing problems with python 3.13. The
> connection can be closed at gc time which causing warnings and those can
> appear at 'random' points and break output, causing weird failures in different
> tinfoil tools and other tests.
> 
> Using sqlite as an IPC was never a great idea so drop that usage entirely and
> just use the standard cache mechanism we already have for other situations.

I revisited these warnings on Scarthgap (which doesn't yet have the persist_data changes), and was able to resolve them. Ultimately, it seemed the context managers were just not set up correctly for managing the Connection.

The __exit__ method of Connection only deals with transactions, and doesn't close any connection to the database: https://docs.python.org/3/library/sqlite3.html#how-to-use-the-connection-context-manager. But we were never using a context manager anyway when dealing with persist_data.

I was able to get rid of the warnings by adding this 'close' method to SQLTable:

    def close(self):
        self.connection.close()

And then changing usages of persist_data in fetch2 from something like this:

revs = bb.persist_data.persist('BB_URI_HEADREVS', d)

to this:

with contextlib.closing(bb.persist_data.persist('BB_URI_HEADREVS', d)) as revs:


I was wondering if you or Steve would be interested in a patch for this to apply to Scarthgap. If so, I'd be happy to send one in. 

Thanks,
Chris
Richard Purdie Nov. 20, 2024, 6:39 p.m. UTC | #2
On Wed, 2024-11-20 at 04:09 +0000, chris.laplante@agilent.com wrote:
> Hi Richard, Steve,
> 
> > -----Original Message-----
> > From: bitbake-devel@lists.openembedded.org <bitbake-
> > devel@lists.openembedded.org> On Behalf Of Richard Purdie via
> > lists.openembedded.org
> > Sent: Tuesday, October 8, 2024 8:36 AM
> > To: bitbake-devel@lists.openembedded.org
> > Subject: [bitbake-devel] [PATCH 3/8] checksum/fetch2: Switch from
> > persist_data to a standard cache file
> > 
> > 
> > The sqlite connection handling is causing problems with python
> > 3.13. The
> > connection can be closed at gc time which causing warnings and
> > those can
> > appear at 'random' points and break output, causing weird failures
> > in different
> > tinfoil tools and other tests.
> > 
> > Using sqlite as an IPC was never a great idea so drop that usage
> > entirely and
> > just use the standard cache mechanism we already have for other
> > situations.
> 
> I revisited these warnings on Scarthgap (which doesn't yet have the
> persist_data changes), and was able to resolve them. Ultimately, it
> seemed the context managers were just not set up correctly for
> managing the Connection.
> 
> The __exit__ method of Connection only deals with transactions, and
> doesn't close any connection to the database:
> https://docs.python.org/3/library/sqlite3.html#how-to-use-the-connection-context-manager
> . But we were never using a context manager anyway when dealing with
> persist_data.
> 
> I was able to get rid of the warnings by adding this 'close' method
> to SQLTable:
> 
>     def close(self):
>         self.connection.close()
> 
> And then changing usages of persist_data in fetch2 from something
> like this:
> 
> revs = bb.persist_data.persist('BB_URI_HEADREVS', d)
> 
> to this:
> 
> with contextlib.closing(bb.persist_data.persist('BB_URI_HEADREVS',
> d)) as revs:
> 
> 
> I was wondering if you or Steve would be interested in a patch for
> this to apply to Scarthgap. If so, I'd be happy to send one in. 

I think scarthgap and older users would probably appreciate having this
fixed so I'd be willing to take a patch.

Cheers,

Richard
chris.laplante@agilent.com Nov. 21, 2024, 5:30 p.m. UTC | #3
Hi Richard,

> > I was wondering if you or Steve would be interested in a patch for
> > this to apply to Scarthgap. If so, I'd be happy to send one in.
> 
> I think scarthgap and older users would probably appreciate having this fixed so
> I'd be willing to take a patch.

OK, will do! 

Question about sending the patches though: I don't think there are stable release branches for BitBake, so do I just send the patches to oe-core with the 'bitbake/' path and 'bitbake: ' commit message prefixes?  (With the [scarthgap] subject prefix, of course)

And if so, do you want me to update the stable release section in the manual to mention this? https://docs.yoctoproject.org/contributor-guide/submit-changes.html#submitting-changes-to-stable-release-branches

Thanks,
Chris
Richard Purdie Nov. 21, 2024, 5:48 p.m. UTC | #4
On Thu, 2024-11-21 at 17:30 +0000, chris.laplante@agilent.com wrote:
> Hi Richard,
> 
> > > I was wondering if you or Steve would be interested in a patch
> > > for
> > > this to apply to Scarthgap. If so, I'd be happy to send one in.
> > 
> > I think scarthgap and older users would probably appreciate having
> > this fixed so
> > I'd be willing to take a patch.
> 
> OK, will do! 
> 
> Question about sending the patches though: I don't think there are
> stable release branches for BitBake, so do I just send the patches to
> oe-core with the 'bitbake/' path and 'bitbake: ' commit message
> prefixes?  (With the [scarthgap] subject prefix, of course)
> 
> And if so, do you want me to update the stable release section in the
> manual to mention this?
> https://docs.yoctoproject.org/contributor-guide/submit-changes.html#submitting-changes-to-stable-release-branches

kirkstone is based on the 2.0 bitbake branch, scarthgap on 2.8 and
styhead on 2.10 so those are LTS branches and are where we'd need
patches. Prefix would be [2.0] with a note on why it wasn't needed for
master in the commit message.

Cheers,

Richard
chris.laplante@agilent.com Nov. 21, 2024, 5:54 p.m. UTC | #5
> 
> kirkstone is based on the 2.0 bitbake branch, scarthgap on 2.8 and styhead on
> 2.10 so those are LTS branches and are where we'd need patches. Prefix would
> be [2.0] with a note on why it wasn't needed for master in the commit
> message.

Ah gotcha, will do. Thanks again!

Chris
diff mbox series

Patch

diff --git a/lib/bb/cache.py b/lib/bb/cache.py
index 958652e0e3..ec7b023fc7 100644
--- a/lib/bb/cache.py
+++ b/lib/bb/cache.py
@@ -847,6 +847,16 @@  class MultiProcessCache(object):
         data = [{}]
         return data
 
+    def clear_cache(self):
+        if not self.cachefile:
+            bb.fatal("Can't clear invalid cachefile")
+
+        self.cachedata = self.create_cachedata()
+        self.cachedata_extras = self.create_cachedata()
+        with bb.utils.fileslocked([self.cachefile + ".lock"]):
+            bb.utils.remove(self.cachefile)
+            bb.utils.remove(self.cachefile + "-*")
+
     def save_extras(self):
         if not self.cachefile:
             return
diff --git a/lib/bb/checksum.py b/lib/bb/checksum.py
index 557793d366..3fb39a303e 100644
--- a/lib/bb/checksum.py
+++ b/lib/bb/checksum.py
@@ -142,3 +142,28 @@  class FileChecksumCache(MultiProcessCache):
 
         checksums.sort(key=operator.itemgetter(1))
         return checksums
+
+class RevisionsCache(MultiProcessCache):
+    cache_file_name = "local_srcrevisions.dat"
+    CACHE_VERSION = 1
+
+    def __init__(self):
+        MultiProcessCache.__init__(self)
+
+    def get_revs(self):
+        return self.cachedata[0]
+
+    def get_rev(self, k):
+        if k in self.cachedata_extras[0]:
+            return self.cachedata_extras[0][k]
+        if k in self.cachedata[0]:
+            return self.cachedata[0][k]
+        return None
+
+    def set_rev(self, k, v):
+        self.cachedata[0][k] = v
+        self.cachedata_extras[0][k] = v
+
+    def merge_data(self, source, dest):
+        for h in source[0]:
+            dest[0][h] = source[0][h]
diff --git a/lib/bb/cookerdata.py b/lib/bb/cookerdata.py
index 3ad5cf3dd0..1f447d30c2 100644
--- a/lib/bb/cookerdata.py
+++ b/lib/bb/cookerdata.py
@@ -1,3 +1,4 @@ 
+
 #
 # Copyright (C) 2003, 2004  Chris Larson
 # Copyright (C) 2003, 2004  Phil Blundell
@@ -267,8 +268,8 @@  class CookerDataBuilder(object):
         try:
             self.data = self.parseConfigurationFiles(self.prefiles, self.postfiles)
 
-            if self.data.getVar("BB_WORKERCONTEXT", False) is None and not worker:
-                bb.fetch.fetcher_init(self.data)
+            servercontext = self.data.getVar("BB_WORKERCONTEXT", False) is None and not worker
+            bb.fetch.fetcher_init(self.data, servercontext)
             bb.parse.init_parser(self.data)
 
             bb.event.fire(bb.event.ConfigParsed(), self.data)
diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
index b194a79be9..add742bfad 100644
--- a/lib/bb/fetch2/__init__.py
+++ b/lib/bb/fetch2/__init__.py
@@ -23,13 +23,14 @@  import collections
 import subprocess
 import pickle
 import errno
-import bb.persist_data, bb.utils
+import bb.utils
 import bb.checksum
 import bb.process
 import bb.event
 
 __version__ = "2"
 _checksum_cache = bb.checksum.FileChecksumCache()
+_revisions_cache = bb.checksum.RevisionsCache()
 
 logger = logging.getLogger("BitBake.Fetcher")
 
@@ -493,18 +494,23 @@  methods = []
 urldata_cache = {}
 saved_headrevs = {}
 
-def fetcher_init(d):
+def fetcher_init(d, servercontext=True):
     """
     Called to initialize the fetchers once the configuration data is known.
     Calls before this must not hit the cache.
     """
 
-    revs = bb.persist_data.persist('BB_URI_HEADREVS', d)
+    _checksum_cache.init_cache(d.getVar("BB_CACHEDIR"))
+    _revisions_cache.init_cache(d.getVar("BB_CACHEDIR"))
+
+    if not servercontext:
+        return
+
     try:
         # fetcher_init is called multiple times, so make sure we only save the
         # revs the first time it is called.
         if not bb.fetch2.saved_headrevs:
-            bb.fetch2.saved_headrevs = dict(revs)
+            bb.fetch2.saved_headrevs = _revisions_cache.get_revs()
     except:
         pass
 
@@ -514,11 +520,10 @@  def fetcher_init(d):
         logger.debug("Keeping SRCREV cache due to cache policy of: %s", srcrev_policy)
     elif srcrev_policy == "clear":
         logger.debug("Clearing SRCREV cache due to cache policy of: %s", srcrev_policy)
-        revs.clear()
+        _revisions_cache.clear_cache()
     else:
         raise FetchError("Invalid SRCREV cache policy of: %s" % srcrev_policy)
 
-    _checksum_cache.init_cache(d.getVar("BB_CACHEDIR"))
 
     for m in methods:
         if hasattr(m, "init"):
@@ -526,9 +531,11 @@  def fetcher_init(d):
 
 def fetcher_parse_save():
     _checksum_cache.save_extras()
+    _revisions_cache.save_extras()
 
 def fetcher_parse_done():
     _checksum_cache.save_merge()
+    _revisions_cache.save_merge()
 
 def fetcher_compare_revisions(d):
     """
@@ -536,7 +543,7 @@  def fetcher_compare_revisions(d):
     when bitbake was started and return true if they have changed.
     """
 
-    headrevs = dict(bb.persist_data.persist('BB_URI_HEADREVS', d))
+    headrevs = _revisions_cache.get_revs()
     return headrevs != bb.fetch2.saved_headrevs
 
 def mirror_from_string(data):
@@ -1662,13 +1669,13 @@  class FetchMethod(object):
         if not hasattr(self, "_latest_revision"):
             raise ParameterError("The fetcher for this URL does not support _latest_revision", ud.url)
 
-        revs = bb.persist_data.persist('BB_URI_HEADREVS', d)
         key = self.generate_revision_key(ud, d, name)
-        try:
-            return revs[key]
-        except KeyError:
-            revs[key] = rev = self._latest_revision(ud, d, name)
-            return rev
+
+        rev = _revisions_cache.get_rev(key)
+        if rev is None:
+            rev = self._latest_revision(ud, d, name)
+            _revisions_cache.set_rev(key, rev)
+        return rev
 
     def sortable_revision(self, ud, d, name):
         latest_rev = self._build_revision(ud, d, name)