diff mbox series

[2/3] fetch2/git: Add offline build support for tags

Message ID 20250129112406.1660522-3-stefan-koch@siemens.com
State New
Headers show
Series fetch2/git: Improve shallow, lfs, and tag support | expand

Commit Message

Koch, Stefan Jan. 29, 2025, 11:24 a.m. UTC
This adds support for offline builds when tags are used within `SRCREV`.
Offline building is enabled when `BB_NO_NETWORK = "1"` is set.

Signed-off-by: Stefan Koch <stefan-koch@siemens.com>
---
 lib/bb/fetch2/git.py | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

Comments

Alexander Kanavin Jan. 29, 2025, 11:34 a.m. UTC | #1
On Wed, 29 Jan 2025 at 12:24, Koch, Stefan via lists.openembedded.org
<stefan-koch=siemens.com@lists.openembedded.org> wrote:
> This adds support for offline builds when tags are used within `SRCREV`.

I'm not sure I understand the use case. Why not just specify commit id
in SRCREV?

We discourage using tags, as they can and do move around, breaking
reproducibility.

Alex
Alexander Kanavin Jan. 29, 2025, 1:20 p.m. UTC | #2
On Wed, 29 Jan 2025 at 12:24, Koch, Stefan via lists.openembedded.org
<stefan-koch=siemens.com@lists.openembedded.org> wrote:
>
> This adds support for offline builds when tags are used within `SRCREV`.
> Offline building is enabled when `BB_NO_NETWORK = "1"` is set.

I also forgot to mention there is this variable for the offline builds
situation:

   :term:`BB_SRCREV_POLICY`
      Defines the behavior of the fetcher when it interacts with source
      control systems and dynamic source revisions. The
      :term:`BB_SRCREV_POLICY` variable is useful when working without a
      network.

      The variable can be set using one of two policies:

      -  *cache* --- retains the value the system obtained previously rather
         than querying the source control system each time.

      -  *clear* --- queries the source controls system every time. With this
         policy, there is no cache. The "clear" policy is the default.

Does it not work for you? How?

Alex
Koch, Stefan Feb. 6, 2025, 12:46 p.m. UTC | #3
Initial fetch (bitbake eventmgr):
BB_SRCREV_POLICY = "cache"
EVENTMGR_URI = "git://git.server/eventmanager"
SRC_URI = "${EVENTMGR_URI};protocol=ssh;nobranch=1"
SRCREV = "v1.0.10-test"
SRCPV = "v1.0.10-test"
PV = "1.0.10-${SRCPV}"

Second fetch (after bitbake -c clean eventmgr) with additional:
BB_NO_NETWORK = "1"

With that patch it seems to be working even without that
BB_SRCREV_POLICY variable.

Without that patch it seems not to be working even with
BB_SRCREV_POLICY set to cache:

WARNING: /build/../repo/recipes-app/eventmgr/eventmgr.bb: Exception
during build_dependencies for PV | ETA: --:--:--
WARNING: /build/../repo/recipes-app/eventmgr/eventmgr.bb: Error during
finalise of /build/../repo/recipes-app/eventmgr/eventmgr.bb
ERROR: ExpansionError during parsing /build/../repo/recipes-
app/eventmgr/eventmgr.bb
Traceback (most recent call last):
 File "/work/isar/bitbake/lib/bb/fetch2/git.py", line 278, in
Git.urldata_init(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
d=<bb.data_smart.DataSmart object at 0x7f873fce9590>):
 ud.unresolvedrev[name] = ud.revisions[name]
 > ud.revisions[name] = self.latest_revision(ud, d, name)
 File "/work/isar/bitbake/lib/bb/fetch2/__init__.py", line 1670, in
Git.latest_revision(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
d=<bb.data_smart.DataSmart object at 0x7f873fce9590>, name='default'):
 except KeyError:
 > revs[key] = rev = self._latest_revision(ud, d, name)
 return rev
 File "/work/isar/bitbake/lib/bb/fetch2/git.py", line 850, in
Git._latest_revision(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
d=<bb.data_smart.DataSmart object at 0x7f873fce9590>, name='default'):
 > output = self._lsremote(ud, d, "")
 # Tags of the form ^{} may not work, need to fallback to other form
 File "/work/isar/bitbake/lib/bb/fetch2/git.py", line 832, in
Git._lsremote(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
d=<bb.data_smart.DataSmart object at 0x7f873fce9590>, search=''):
 if ud.proto.lower() != 'file':
 > bb.fetch2.check_network_access(d, cmd, repourl)
 output = runfetchcmd(cmd, d, True)
 File "/work/isar/bitbake/lib/bb/fetch2/__init__.py", line 970, in
check_network_access(d=<bb.data_smart.DataSmart object at
0x7f873fce9590>, info='git -c gc.autoDetach=false -c core.pager=cat -c
safe.bareRepository=all ls-remote ssh://git.server/eventmanager ',
url='ssh://git.server/eventmanager'):
 if bb.utils.to_boolean(d.getVar("BB_NO_NETWORK")):
 > raise NetworkAccess(url, info)
 elif not trusted_network(d, url):
bb.data_smart.ExpansionError: Failure expanding variable SRCPV,
expression was ${@bb.fetch2.get_srcrev(d)} which triggered exception
NetworkAccess: Network access disabled through BB_NO_NETWORK (or set
indirectly due to use of BB_FETCH_PREMIRRORONLY) but access requested
with command git -c gc.autoDetach=false -c core.pager=cat -c
safe.bareRepository=all ls-remote ssh://git.server/eventmanager (for
url ssh://git.server/eventmanager)
The variable dependency chain for the failure is: SRCPV -> PV ->
PV[vardepvalue]

ERROR: eventmgr-1.0.10-v1.0.10-test-r0 do_fetch: Fetcher failure:
Recipe uses a floating tag/branch 'v1.0.10-test' for repo
'...eventmanager' without a fixed SRCREV yet doesn't call
bb.fetch2.get_srcrev() (use SRCPV in PV for OE).


On Wed, 2025-01-29 at 14:20 +0100, Alexander Kanavin wrote:
> > On Wed, 29 Jan 2025 at 12:24, Koch, Stefan via
> > lists.openembedded.org
> > <stefan-koch=siemens.com@lists.openembedded.org> wrote:
> > > > 
> > > > This adds support for offline builds when tags are used within
> > > > `SRCREV`.
> > > > Offline building is enabled when `BB_NO_NETWORK = "1"` is set.
> > 
> > I also forgot to mention there is this variable for the offline
> > builds
> > situation:
> > 
> >    :term:`BB_SRCREV_POLICY`
> >       Defines the behavior of the fetcher when it interacts with
> > source
> >       control systems and dynamic source revisions. The
> >       :term:`BB_SRCREV_POLICY` variable is useful when working
> > without a
> >       network.
> > 
> >       The variable can be set using one of two policies:
> > 
> >       -  *cache* --- retains the value the system obtained
> > previously
> > rather
> >          than querying the source control system each time.
> > 
> >       -  *clear* --- queries the source controls system every time.
> > With this
> >          policy, there is no cache. The "clear" policy is the
> > default.
> > 
> > Does it not work for you? How?
> > 
> > Alex

-- 
Stefan Koch
Siemens AG
www.siemens.com

-- 
Stefan Koch
Siemens AG
www.siemens.com
Alexander Kanavin Feb. 7, 2025, 10:26 a.m. UTC | #4
On Thu, 6 Feb 2025 at 13:46, Koch, Stefan <stefan-koch@siemens.com> wrote:
>  File "/work/isar/bitbake/lib/bb/fetch2/__init__.py", line 1670, in
> Git.latest_revision(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
> d=<bb.data_smart.DataSmart object at 0x7f873fce9590>, name='default'):
>  except KeyError:
>  > revs[key] = rev = self._latest_revision(ud, d, name)
>  return rev

This is the function:

    def latest_revision(self, ud, d, name):
        """
        Look in the cache for the latest revision, if not present ask the SCM.
        """
        if not hasattr(self, "_latest_revision"):
            raise ParameterError("The fetcher for this URL does not
support _latest_revision", ud.url)

        revs = bb.persist_data.persist('BB_URI_HEADREVS', d)
        key = self.generate_revision_key(ud, d, name)
        try:
            return revs[key]
        except KeyError:
            revs[key] = rev = self._latest_revision(ud, d, name)
            return rev

Something doesn't compute properly here: the cached revision should've
been successfully looked up via BB_URI_HEADREVS, and a call to
_latest_revision (triggering network access) avoided altogether.

Please look into why it isn't happening, or explain what your fix does
that addresses the problem. There's moving code around (changing the
condition for what is considered a revision on the way, which should
be in a separate commit), and adding a BB_NO_NETWORK check, and I
struggle to understand what these changes do :(

Alex
Alexander Kanavin Feb. 7, 2025, 10:48 a.m. UTC | #5
On Fri, 7 Feb 2025 at 11:26, Alexander Kanavin via
lists.openembedded.org <alex.kanavin=gmail.com@lists.openembedded.org>
wrote:
>
> On Thu, 6 Feb 2025 at 13:46, Koch, Stefan <stefan-koch@siemens.com> wrote:
> >  File "/work/isar/bitbake/lib/bb/fetch2/__init__.py", line 1670, in
> > Git.latest_revision(ud=<bb.fetch2.FetchData object at 0x7f873f06eb90>,
> > d=<bb.data_smart.DataSmart object at 0x7f873fce9590>, name='default'):
> >  except KeyError:
> >  > revs[key] = rev = self._latest_revision(ud, d, name)
> >  return rev
>
> This is the function:
>
>     def latest_revision(self, ud, d, name):
>         """
>         Look in the cache for the latest revision, if not present ask the SCM.
>         """
>         if not hasattr(self, "_latest_revision"):
>             raise ParameterError("The fetcher for this URL does not
> support _latest_revision", ud.url)
>
>         revs = bb.persist_data.persist('BB_URI_HEADREVS', d)
>         key = self.generate_revision_key(ud, d, name)
>         try:
>             return revs[key]
>         except KeyError:
>             revs[key] = rev = self._latest_revision(ud, d, name)
>             return rev
>
> Something doesn't compute properly here: the cached revision should've
> been successfully looked up via BB_URI_HEADREVS, and a call to
> _latest_revision (triggering network access) avoided altogether.

This has been recently refactored to use a different mechanism, and
the code above is from an earlier yocto, but it should still be using
a cache, and the look up should succeed.

Alex
Alexander Kanavin Feb. 7, 2025, 11:24 a.m. UTC | #6
On Fri, 7 Feb 2025 at 11:48, Alexander Kanavin via
lists.openembedded.org <alex.kanavin=gmail.com@lists.openembedded.org>
wrote:

> > Something doesn't compute properly here: the cached revision should've
> > been successfully looked up via BB_URI_HEADREVS, and a call to
> > _latest_revision (triggering network access) avoided altogether.
>
> This has been recently refactored to use a different mechanism, and
> the code above is from an earlier yocto, but it should still be using
> a cache, and the look up should succeed.

Okay, I think this may have caused a regression:
https://git.openembedded.org/bitbake/commit/?id=fdc55bb649cb77456d0ac48a9600ef289a52af18

With the refactor, the file needs to be explicitly written out via
function calls, and it doesn't seem to be happening as I can't find
local_srcrevisions.dat in ${TOPDIR}/cache/
:-/

We don't have any tests for it, unfortunately.

Alex
Alexander Kanavin Feb. 7, 2025, 12:15 p.m. UTC | #7
On Fri, 7 Feb 2025 at 12:25, Alexander Kanavin via
lists.openembedded.org <alex.kanavin=gmail.com@lists.openembedded.org>
wrote:

> Okay, I think this may have caused a regression:
> https://git.openembedded.org/bitbake/commit/?id=fdc55bb649cb77456d0ac48a9600ef289a52af18
>
> With the refactor, the file needs to be explicitly written out via
> function calls, and it doesn't seem to be happening as I can't find
> local_srcrevisions.dat in ${TOPDIR}/cache/
> :-/
>
> We don't have any tests for it, unfortunately.

Ticket:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=15739

Alex
Koch, Stefan Feb. 7, 2025, 12:40 p.m. UTC | #8
I have made the test with BitBake 2.8.

So the commit
fdc55bb649cb77456d0ac48a9600ef289a52af18: checksum/fetch2: Switch from
persist_data to a standard cache file

is not present, there...

Stefan

On Fri, 2025-02-07 at 13:15 +0100, Alexander Kanavin wrote:
> On Fri, 7 Feb 2025 at 12:25, Alexander Kanavin via
> lists.openembedded.org
> <alex.kanavin=gmail.com@lists.openembedded.org>
> wrote:
>
> > Okay, I think this may have caused a regression:
> > https://git.openembedded.org/bitbake/commit/?id=fdc55bb649cb77456d0ac48a9600ef289a52af18
> >
> > With the refactor, the file needs to be explicitly written out via
> > function calls, and it doesn't seem to be happening as I can't find
> > local_srcrevisions.dat in ${TOPDIR}/cache/
> > :-/
> >
> > We don't have any tests for it, unfortunately.
>
> Ticket:
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=15739
>
> Alex

--
Stefan Koch
Siemens AG
http://www.siemens.com/
Alexander Kanavin Feb. 7, 2025, 12:59 p.m. UTC | #9
On Fri, 7 Feb 2025 at 13:41, Koch, Stefan <stefan-koch@siemens.com> wrote:
> I have made the test with BitBake 2.8.
>
> So the commit
> fdc55bb649cb77456d0ac48a9600ef289a52af18: checksum/fetch2: Switch from
> persist_data to a standard cache file
>
> is not present, there...

Then you could check what the sqlite database file in cache/ contains,
and if the needed revision is recorded there.

Alex
Koch, Stefan Feb. 7, 2025, 5:19 p.m. UTC | #10
There is not very much in that database:

git:git.server.ebsy.debian.ebsy-qa-
suites.gitnext	e9a98ffb7df033b5204614477b8d3efaccfc920b
git:git.server/eventmanagerv1.0.10-
test	f56e663ec0982e8b300c03b14a95cb67ab2fead5

#SRCREV = "f56e663ec0982e8b300c03b14a95cb67ab2fead5"
SRCREV = "v1.0.10-test"
SRCPV = "v1.0.10-test"

So only the SRCREV as hash. But no tag.

One curious thing:
- Without this PATCH 2/3 build without BB_NO_NETWORK = "1" works
- It fails after bitbake -c clean, with used BB_NO_NETWORK = "1"
- But when applying this PATCH 2/3 again, and let BB_NO_NETWORK = "1",
it works

On Fri, 2025-02-07 at 13:59 +0100, Alexander Kanavin wrote:
> On Fri, 7 Feb 2025 at 13:41, Koch, Stefan <stefan-koch@siemens.com>
> wrote:
> > I have made the test with BitBake 2.8.
> > 
> > So the commit
> > fdc55bb649cb77456d0ac48a9600ef289a52af18: checksum/fetch2: Switch
> > from
> > persist_data to a standard cache file
> > 
> > is not present, there...
> 
> Then you could check what the sqlite database file in cache/
> contains,
> and if the needed revision is recorded there.
> 
> Alex

-- 
Stefan Koch
Siemens AG
www.siemens.com
Alexander Kanavin Feb. 7, 2025, 7:17 p.m. UTC | #11
On Fri, 7 Feb 2025 at 18:19, Koch, Stefan <stefan-koch@siemens.com> wrote:
> git:git.server/eventmanagerv1.0.10-
> test    f56e663ec0982e8b300c03b14a95cb67ab2fead5
>
> #SRCREV = "f56e663ec0982e8b300c03b14a95cb67ab2fead5"
> SRCREV = "v1.0.10-test"
> SRCPV = "v1.0.10-test"
>
> So only the SRCREV as hash. But no tag.

What makes you say that? I see the tag right there:

git:git.server/eventmanagerv1.0.10- test
f56e663ec0982e8b300c03b14a95cb67ab2fead5

> One curious thing:
> - Without this PATCH 2/3 build without BB_NO_NETWORK = "1" works
> - It fails after bitbake -c clean, with used BB_NO_NETWORK = "1"
> - But when applying this PATCH 2/3 again, and let BB_NO_NETWORK = "1",
> it works

Please forget the patch (for now, at least). I don't understand it,
and it should not be needed to begin with. You should figure out why
the cache database lookup doesn't work and the fetcher has to issue
'git ls-remote'. I've already asked you to look into it in a previous
message, showing the place in code where the incorrect branch is
taken.

Alex
Koch, Stefan Feb. 11, 2025, 11:46 a.m. UTC | #12
On Fri, 2025-02-07 at 20:17 +0100, Alexander Kanavin wrote:
> On Fri, 7 Feb 2025 at 18:19, Koch, Stefan <stefan-koch@siemens.com>
> wrote:
> > git:git.server/eventmanagerv1.0.10-
> > test    f56e663ec0982e8b300c03b14a95cb67ab2fead5
> > 
> > #SRCREV = "f56e663ec0982e8b300c03b14a95cb67ab2fead5"
> > SRCREV = "v1.0.10-test"
> > SRCPV = "v1.0.10-test"
> > 
> > So only the SRCREV as hash. But no tag.
> 
> What makes you say that? I see the tag right there:
> 
> git:git.server/eventmanagerv1.0.10- test
> f56e663ec0982e8b300c03b14a95cb67ab2fead5
> 
> > One curious thing:
> > - Without this PATCH 2/3 build without BB_NO_NETWORK = "1" works
> > - It fails after bitbake -c clean, with used BB_NO_NETWORK = "1"
> > - But when applying this PATCH 2/3 again, and let BB_NO_NETWORK =
> > "1",
> > it works
> 
> Please forget the patch (for now, at least). I don't understand it,
> and it should not be needed to begin with. You should figure out why
> the cache database lookup doesn't work and the fetcher has to issue
> 'git ls-remote'. I've already asked you to look into it in a previous
> message, showing the place in code where the incorrect branch is
> taken.

Found the issue:
BB_SRCREV_POLICY = "cache"
must be set global.

If set only for single recipes, fetcher_init
(lib/bb/fetch2/__init__.py) sets srcrev_policy to clear by default.
> 
> Alex

-- 
Stefan Koch
Siemens AG
www.siemens.com
diff mbox series

Patch

diff --git a/lib/bb/fetch2/git.py b/lib/bb/fetch2/git.py
index 6d87c2f18..7ac690bbe 100644
--- a/lib/bb/fetch2/git.py
+++ b/lib/bb/fetch2/git.py
@@ -270,13 +270,6 @@  class Git(FetchMethod):
 
         ud.setup_revisions(d)
 
-        for name in ud.names:
-            # Ensure any revision that doesn't look like a SHA-1 is translated into one
-            if not sha1_re.match(ud.revisions[name] or ''):
-                if ud.revisions[name]:
-                    ud.unresolvedrev[name] = ud.revisions[name]
-                ud.revisions[name] = self.latest_revision(ud, d, name)
-
         gitsrcname = '%s%s' % (ud.host.replace(':', '.'), ud.path.replace('/', '.').replace('*', '.').replace(' ','_').replace('(', '_').replace(')', '_'))
         if gitsrcname.startswith('.'):
             gitsrcname = gitsrcname[1:]
@@ -294,6 +287,13 @@  class Git(FetchMethod):
         ud.clonedir = os.path.join(gitdir, gitsrcname)
         ud.localfile = ud.clonedir
 
+        for name in ud.names:
+            # Ensure anything that doesn't look like a sha256 checksum/revision is translated into one
+            if not ud.revisions[name] or len(ud.revisions[name]) != 40  or (False in [c in "abcdef0123456789" for c in ud.revisions[name]]):
+                if ud.revisions[name]:
+                    ud.unresolvedrev[name] = ud.revisions[name]
+                ud.revisions[name] = self.latest_revision(ud, d, name)
+
         mirrortarball = 'git2_%s.tar.gz' % gitsrcname
         ud.fullmirror = os.path.join(dl_dir, mirrortarball)
         ud.mirrortarballs = [mirrortarball]
@@ -909,7 +909,11 @@  class Git(FetchMethod):
             repourl = self._get_repo_url(ud)
             cmd = "%s ls-remote %s %s" % \
                 (ud.basecmd, shlex.quote(repourl), search)
-            if ud.proto.lower() != 'file':
+
+            if bb.utils.to_boolean(d.getVar("BB_NO_NETWORK")):
+                cmd = "%s ls-remote file://%s %s" % \
+                    (ud.basecmd, shlex.quote(ud.clonedir), search)
+            elif ud.proto.lower() != 'file':
                 bb.fetch2.check_network_access(d, cmd, repourl)
             output = runfetchcmd(cmd, d, True)
             if not output: