[poky,scarthgap,01/23] bitbake: fetch2/git: Use git shallow fetch to implement clone_shallow_local()

Message ID 20250808084931.2156763-1-akash.hadke27@gmail.com
State New
Series [poky,scarthgap,01/23] bitbake: fetch2/git: Use git shallow fetch to implement clone_shallow_local()

Commit Message

Akash Hadke Aug. 8, 2025, 8:49 a.m. UTC
From: Robert Yang <liezhi.yang@windriver.com>

This patch makes fetching much faster with the following settings:
BB_GIT_SHALLOW = "1"
BB_GENERATE_MIRROR_TARBALLS = "1"

* The previous implementation was:
  - Make a full clone for the repo from local ud.clonedir
  - Use git-make-shallow to remove unneeded revs

  It was very slow for recipes which have a lot of SRC_URIs, for example
  vulkan-samples and docker-compose. The docker-compose fetch had still not
  finished after 5 hours.

  $ bitbake vulkan-samples -cfetch
  Before: 12 minutes
  Now: 2 minutes

  $ bitbake docker-compose -cfetch
  Before: More than 300 minutes
  Now: 15 minutes

* The patch uses git shallow fetch to fetch the repo from local
  ud.clonedir:
  - For BB_GIT_SHALLOW_DEPTH: git fetch --depth <depth> rev
  - For BB_GIT_SHALLOW_REVS: git fetch --shallow-exclude=<revs> rev

  Then the git repo will be shallow, and git-make-shallow is not needed any
  more.

  Git shallow fetch also downloads fewer commits than before, since it doesn't
  need "rev^" to parse the dependencies; the previous code always needed 'rev^'.

(Bitbake rev: a5a569c075224fe41707cfa9123c442d1fda2fbf)

Signed-off-by: Robert Yang <liezhi.yang@windriver.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit e2527cf58f4f390712641a99b276b9e0aeae01c6)
Signed-off-by: Akash Hadke <akash.hadke27@gmail.com>
---
 bitbake/lib/bb/fetch2/git.py | 78 ++++++++++++++++++++++++------------
 1 file changed, 52 insertions(+), 26 deletions(-)
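
The fetch command selection described in the commit message can be sketched
in isolation as below. This is a minimal illustration, not the code from
git.py: the helper name build_shallow_fetch_cmd and its parameters are
hypothetical, and ValueError stands in for bb.fetch2.FetchError so the sketch
runs without BitBake.

```python
def build_shallow_fetch_cmd(basecmd, revision, depth=0, shallow_revs=()):
    """Return the 'git fetch' command line for one revision.

    Hypothetical helper: mirrors the choice between --depth and
    --shallow-exclude that the commit message describes.
    """
    # One --shallow-exclude option per entry from BB_GIT_SHALLOW_REVS
    shallow_exclude = "".join(" --shallow-exclude=%s" % r for r in shallow_revs)

    # The patch refuses to combine --depth with --shallow-exclude
    if depth and shallow_exclude:
        raise ValueError("BB_GIT_SHALLOW_REVS is set, but BB_GIT_SHALLOW_DEPTH is not 0.")

    cmd = "%s fetch origin %s" % (basecmd, revision)
    if depth:
        cmd += " --depth %s" % depth
    cmd += shallow_exclude
    return cmd


if __name__ == "__main__":
    print(build_shallow_fetch_cmd("git", "abc123", depth=1))
    print(build_shallow_fetch_cmd("git", "abc123", shallow_revs=["v1.0"]))
```

With a non-zero depth the sketch yields a "--depth" fetch, with shallow revs
it yields one "--shallow-exclude" option per rev, and it raises if both are
given, matching the error path added by the patch.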

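The patch also changes how extra ref wildcards (ud.shallow_extra_refs) are
expanded: candidate refs now come from "git ls-remote origin 'refs/*'" rather
than from for-each-ref in the destination. A rough sketch of that matching
step follows, assuming the usual "<sha><TAB><refname>" ls-remote output. The
function names and sample refs are made up for illustration.

```python
import fnmatch


def parse_ls_remote(output):
    """Extract ref names from 'git ls-remote' output (last field per line)."""
    refs = []
    for line in output.splitlines():
        if line.strip():
            refs.append(line.split()[-1])
    return refs


def expand_extra_refs(patterns, all_refs, bareclone=False):
    """Expand wildcard patterns against all_refs; keep literal refs as-is."""
    extra_refs = []
    for r in patterns:
        if not bareclone:
            # Non-bare destinations track branches under refs/remotes/origin/
            r = r.replace('refs/heads/', 'refs/remotes/origin/')
        if '*' in r:
            extra_refs.extend(a for a in all_refs if fnmatch.fnmatchcase(a, r))
        else:
            extra_refs.append(r)
    return extra_refs


if __name__ == "__main__":
    sample = "aaa\trefs/heads/master\nbbb\trefs/tags/v1.0\n"
    print(expand_extra_refs(["refs/tags/*"], parse_ls_remote(sample)))
```

Because ls-remote reports the remote's own ref names (refs/heads/...,
refs/tags/...), a branch wildcard that has been rewritten to
refs/remotes/origin/* matches nothing in this sketch. Whether that case
matters for non-bare clones in the real fetcher may be worth checking.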
Comments

patchtest@automation.yoctoproject.org Aug. 8, 2025, 9:10 a.m. UTC | #1
Thank you for your submission. Patchtest identified one
or more issues with the patch. Please see the log below for
more information:

---
Testing patch /home/patchtest/share/mboxes/poky-scarthgap-01-23-bitbake-fetch2-git-Use-git-shallow-fetch-to-implement-clone_shallow_local.patch

FAIL: test target mailing list: Series sent to the wrong mailing list or some patches from the series correspond to different mailing lists (test_mbox.TestMbox.test_target_mailing_list)

PASS: pretest pylint (test_python_pylint.PyLint.pretest_pylint)
PASS: test Signed-off-by presence (test_mbox.TestMbox.test_signed_off_by_presence)
PASS: test author valid (test_mbox.TestMbox.test_author_valid)
PASS: test commit message presence (test_mbox.TestMbox.test_commit_message_presence)
PASS: test commit message user tags (test_mbox.TestMbox.test_commit_message_user_tags)
PASS: test max line length (test_metadata.TestMetadata.test_max_line_length)
PASS: test mbox format (test_mbox.TestMbox.test_mbox_format)
PASS: test non-AUH upgrade (test_mbox.TestMbox.test_non_auh_upgrade)
PASS: test pylint (test_python_pylint.PyLint.test_pylint)
PASS: test shortlog format (test_mbox.TestMbox.test_shortlog_format)
PASS: test shortlog length (test_mbox.TestMbox.test_shortlog_length)

SKIP: pretest src uri left files: No modified recipes, skipping pretest (test_metadata.TestMetadata.pretest_src_uri_left_files)
SKIP: test CVE check ignore: No modified recipes or older target branch, skipping test (test_metadata.TestMetadata.test_cve_check_ignore)
SKIP: test CVE tag format: No new CVE patches introduced (test_patch.TestPatch.test_cve_tag_format)
SKIP: test Signed-off-by presence: No new CVE patches introduced (test_patch.TestPatch.test_signed_off_by_presence)
SKIP: test Upstream-Status presence: No new CVE patches introduced (test_patch.TestPatch.test_upstream_status_presence_format)
SKIP: test bugzilla entry format: No bug ID found (test_mbox.TestMbox.test_bugzilla_entry_format)
SKIP: test lic files chksum modified not mentioned: No modified recipes, skipping test (test_metadata.TestMetadata.test_lic_files_chksum_modified_not_mentioned)
SKIP: test lic files chksum presence: No added recipes, skipping test (test_metadata.TestMetadata.test_lic_files_chksum_presence)
SKIP: test license presence: No added recipes, skipping test (test_metadata.TestMetadata.test_license_presence)
SKIP: test series merge on head: Merge test is disabled for now (test_mbox.TestMbox.test_series_merge_on_head)
SKIP: test src uri left files: No modified recipes, skipping pretest (test_metadata.TestMetadata.test_src_uri_left_files)
SKIP: test summary presence: No added recipes, skipping test (test_metadata.TestMetadata.test_summary_presence)

---

Please address the issues identified and
submit a new revision of the patch, or alternatively, reply to this
email with an explanation of why the patch should be accepted. If you
believe these results are due to an error in patchtest, please submit a
bug at https://bugzilla.yoctoproject.org/ (use the 'Patchtest' category
under 'Yocto Project Subprojects'). For more information on specific
failures, see: https://wiki.yoctoproject.org/wiki/Patchtest. Thank
you!

Patch

diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
index 6029144601..2194c839dd 100644
--- a/bitbake/lib/bb/fetch2/git.py
+++ b/bitbake/lib/bb/fetch2/git.py
@@ -551,18 +551,31 @@  class Git(FetchMethod):
             runfetchcmd("touch %s.done" % ud.fullmirror, d)
 
     def clone_shallow_local(self, ud, dest, d):
-        """Clone the repo and make it shallow.
+        """
+        Shallow fetch from ud.clonedir (${DL_DIR}/git2/<gitrepo> by default):
+        - For BB_GIT_SHALLOW_DEPTH: git fetch --depth <depth> rev
+        - For BB_GIT_SHALLOW_REVS: git fetch --shallow-exclude=<revs> rev
+        """
+
+        bb.utils.mkdirhier(dest)
+        init_cmd = "%s init -q" % ud.basecmd
+        if ud.bareclone:
+            init_cmd += " --bare"
+        runfetchcmd(init_cmd, d, workdir=dest)
+        runfetchcmd("%s remote add origin %s" % (ud.basecmd, ud.clonedir), d, workdir=dest)
 
-        The upstream url of the new clone isn't set at this time, as it'll be
-        set correctly when unpacked."""
-        runfetchcmd("%s clone %s %s %s" % (ud.basecmd, ud.cloneflags, ud.clonedir, dest), d)
+        # Check the histories which should be excluded
+        shallow_exclude = ''
+        for revision in ud.shallow_revs:
+            shallow_exclude += " --shallow-exclude=%s" % revision
 
-        to_parse, shallow_branches = [], []
         for name in ud.names:
             revision = ud.revisions[name]
             depth = ud.shallow_depths[name]
-            if depth:
-                to_parse.append('%s~%d^{}' % (revision, depth - 1))
+
+            # The --depth and --shallow-exclude can't be used together
+            if depth and shallow_exclude:
+                raise bb.fetch2.FetchError("BB_GIT_SHALLOW_REVS is set, but BB_GIT_SHALLOW_DEPTH is not 0.")
 
             # For nobranch, we need a ref, otherwise the commits will be
             # removed, and for non-nobranch, we truncate the branch to our
@@ -575,36 +588,49 @@  class Git(FetchMethod):
             else:
                 ref = "refs/remotes/origin/%s" % branch
 
-            shallow_branches.append(ref)
-            runfetchcmd("%s update-ref %s %s" % (ud.basecmd, ref, revision), d, workdir=dest)
+            fetch_cmd = "%s fetch origin %s" % (ud.basecmd, revision)
+            if depth:
+                fetch_cmd += " --depth %s" % depth
+
+            if shallow_exclude:
+                fetch_cmd += shallow_exclude
 
-        # Map srcrev+depths to revisions
-        parsed_depths = runfetchcmd("%s rev-parse %s" % (ud.basecmd, " ".join(to_parse)), d, workdir=dest)
+            # Advertise the revision for lower version git such as 2.25.1:
+            # error: Server does not allow request for unadvertised object.
+            # The ud.clonedir is a local temporary dir, will be removed when
+            # fetch is done, so we can do anything on it.
+            adv_cmd = 'git branch -f advertise-%s %s' % (revision, revision)
+            runfetchcmd(adv_cmd, d, workdir=ud.clonedir)
 
-        # Resolve specified revisions
-        parsed_revs = runfetchcmd("%s rev-parse %s" % (ud.basecmd, " ".join('"%s^{}"' % r for r in ud.shallow_revs)), d, workdir=dest)
-        shallow_revisions = parsed_depths.splitlines() + parsed_revs.splitlines()
+            runfetchcmd(fetch_cmd, d, workdir=dest)
+            runfetchcmd("%s update-ref %s %s" % (ud.basecmd, ref, revision), d, workdir=dest)
 
         # Apply extra ref wildcards
-        all_refs = runfetchcmd('%s for-each-ref "--format=%%(refname)"' % ud.basecmd,
-                               d, workdir=dest).splitlines()
+        all_refs_remote = runfetchcmd("%s ls-remote origin 'refs/*'" % ud.basecmd, \
+                                        d, workdir=dest).splitlines()
+        all_refs = []
+        for line in all_refs_remote:
+            all_refs.append(line.split()[-1])
+        extra_refs = []
         for r in ud.shallow_extra_refs:
             if not ud.bareclone:
                 r = r.replace('refs/heads/', 'refs/remotes/origin/')
 
             if '*' in r:
                 matches = filter(lambda a: fnmatch.fnmatchcase(a, r), all_refs)
-                shallow_branches.extend(matches)
+                extra_refs.extend(matches)
             else:
-                shallow_branches.append(r)
-
-        # Make the repository shallow
-        shallow_cmd = [self.make_shallow_path, '-s']
-        for b in shallow_branches:
-            shallow_cmd.append('-r')
-            shallow_cmd.append(b)
-        shallow_cmd.extend(shallow_revisions)
-        runfetchcmd(subprocess.list2cmdline(shallow_cmd), d, workdir=dest)
+                extra_refs.append(r)
+
+        for ref in extra_refs:
+            ref_fetch = os.path.basename(ref)
+            runfetchcmd("%s fetch origin --depth 1 %s" % (ud.basecmd, ref_fetch), d, workdir=dest)
+            revision = runfetchcmd("%s rev-parse FETCH_HEAD" % ud.basecmd, d, workdir=dest)
+            runfetchcmd("%s update-ref %s %s" % (ud.basecmd, ref, revision), d, workdir=dest)
+
+        # The url is local ud.clonedir, set it to upstream one
+        repourl = self._get_repo_url(ud)
+        runfetchcmd("%s remote set-url origin %s" % (ud.basecmd, shlex.quote(repourl)), d, workdir=dest)
 
     def unpack(self, ud, destdir, d):
         """ unpack the downloaded src to destdir"""