
bitbake: fetch: Add support for remapping URL's

Message ID 20250505185755.1952-1-falital@hotmail.com
State New

Commit Message

Sawas Etairidis May 5, 2025, 6:57 p.m. UTC
From: Savvas Etairidis <setairidis@gmail.com>

This commit adds support for remapping URLs in the fetcher.
This is useful for cases where the same server is accessed with
different protocols or different URLs. The remapping is done by
adding a remap entry to the global configuration or the recipe file.

Fixed #10699

Signed-off-by: Savvas Etairidis <falital@hotmail.com>
---
 .../bitbake-user-manual-fetching.rst          |  10 ++
 .../bitbake-user-manual-ref-variables.rst     |   9 ++
 bitbake/lib/bb/fetch2/__init__.py             |  68 +++++++++++-
 bitbake/lib/bb/tests/fetch.py                 | 104 +++++++++++++++++-
 4 files changed, 187 insertions(+), 4 deletions(-)
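
The remapping described in the commit message amounts to a prefix substitution on the URL, with the `;`-separated parameters carried over unchanged. A minimal standalone sketch of that idea follows. It assumes entries are whitespace-separated "from to" pairs; the helper names `parse_remap_table` and `remap_url` are illustrative only and are not part of the patch or of BitBake.

```python
def parse_remap_table(table):
    """Parse whitespace-separated 'from to' pairs into a dict (illustrative helper)."""
    tokens = (table or "").split()
    if len(tokens) % 2:
        raise ValueError("remap table needs an even number of URLs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def remap_url(url, remap):
    """Rewrite the first matching prefix, keeping any ';key=value' parameters."""
    base, sep, params = url.partition(";")
    for src, dst in remap.items():
        if base.startswith(src):
            return dst + base[len(src):] + sep + params
    return url  # no rule matched, leave the URL untouched


table = parse_remap_table(
    "git://example.com/ https://example.com/ "
    "http://example.com/ https://example.com/"
)
print(remap_url("git://example.com/repo.git;branch=master", table))
print(remap_url("https://other.org/repo.git", table))
```

Pairing tokens instead of splitting on newlines is a choice made for this sketch, so that a table written on a single line parses as well; the patch itself splits the value on newlines.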

Comments

Richard Purdie May 5, 2025, 10:46 p.m. UTC | #1
On Mon, 2025-05-05 at 20:57 +0200, Savvas Etairidis via lists.openembedded.org wrote:
> From: Savvas Etairidis <setairidis@gmail.com>
> 
> This commit adds support for remapping URL's in the fetcher.
> This is useful for cases where the same server is access with
> different protocols or different URL's. The remapping is done by
> adding a remap entry to the global configuration or the recipe file.
> 
> Fixed #10699

I'm curious why you didn't use the PREMIRRORS and MIRRORS support that already exists?

Cheers,

Richard
Sawas Etairidis May 7, 2025, 7:28 a.m. UTC | #2
Hi Richard,

I think it would be better not to modify how PREMIRRORS and MIRRORS work,
since users already depend on them and expect them to work in a specific way.
With this approach you can add extra rules to rewrite the source URL and then
still use PREMIRRORS and MIRRORS if they exist.
Alternatively, I could provide an example of how to achieve the same result
using PREMIRRORS, if you think this is not worth having as a standalone
feature.

Regards,
Savvas

On Tue, 6 May 2025 at 12:46 a.m., Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> On Mon, 2025-05-05 at 20:57 +0200, Savvas Etairidis via
> lists.openembedded.org wrote:
> > From: Savvas Etairidis <setairidis@gmail.com>
> >
> > This commit adds support for remapping URL's in the fetcher.
> > This is useful for cases where the same server is access with
> > different protocols or different URL's. The remapping is done by
> > adding a remap entry to the global configuration or the recipe file.
> >
> > Fixed #10699
>
> I'm curious why you didn't use the PREMIRRORS and MIRRORS support that
> already exists?
>
> Cheers,
>
> Richard
>
Richard Purdie May 7, 2025, 8:05 a.m. UTC | #3
Hi Savvas,

On Wed, 2025-05-07 at 09:28 +0200, Sawas Etairidis wrote:
> I think it would be better to not modify how PREMIRRORS and MIRRORS
> work since users already depend on them and expect them to work on
> specific way.
> Like this you can add extra rules to rewrite the source code url and
> then use  PREMIRRORS and MIRRORS if they exists.
> I could instead provide an example how someone could have the same
> result using PREMIRRORS if you think it is not worth having this as
> standalone feature.

Looking at the bug in question
(https://bugzilla.yoctoproject.org/show_bug.cgi?id=10699), the issue is
filed against the layer index and the issue in question was more about
allowing layers to list their mirror urls so that we could canonicalise
different urls.

Your patch is against bitbake itself and it is unclear whether we need that
support in bitbake itself to solve that bug, which was targeting a
layer index change, if I understand correctly.

There is already some discussion about changing the way
PREMIRRORS/MIRRORS work for other issues and given that, I'm reluctant
to add new code paths that add complexity and do things in a different
way. I've been hoping rather than adding complexity, we could try and
simplify whilst still maintaining the different use cases we need.

Cheers,

Richard
Sawas Etairidis May 7, 2025, 2:07 p.m. UTC | #4
Hi Richard,

My bad, I never noticed this was in another component. Sorry to waste your
time on this.
I will see whether I can work on this in the layer index project; if not, I
will move the bug back to NEW.
If you think it is of value, I could build a check that detects when the same
URL is accessed with different protocols, to at least inform the developer
that something is wrong.
For example, if the following is configured:

recipe_a.bb
SRC_URI =  "git://github.com/project/repo"

recipe_b.bb
SRC_URI =  "https://github.com/project/repo"

The following warning should be generated:
WARNING:  recipe_a-0.0.0-r0 do_fetch: Url github.com/project is accessed
with different protocols
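
A check like this could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `find_protocol_conflicts` and the plain dict of recipe names to URLs are hypothetical, it is not wired into the fetcher, and it compares URL schemes only, so it ignores a `protocol=` parameter.

```python
from collections import defaultdict
from urllib.parse import urlparse


def find_protocol_conflicts(src_uris):
    """Return {'host/path': [schemes]} for locations reached via more than one scheme."""
    schemes = defaultdict(set)
    for uri in src_uris.values():
        # Drop the ';key=value' parameters before parsing the URL.
        parsed = urlparse(uri.split(";", 1)[0])
        schemes[parsed.netloc + parsed.path].add(parsed.scheme)
    return {loc: sorted(s) for loc, s in schemes.items() if len(s) > 1}


recipes = {
    "recipe_a": "git://github.com/project/repo",
    "recipe_b": "https://github.com/project/repo",
}
for location, used in find_protocol_conflicts(recipes).items():
    print("WARNING: Url %s is accessed with different protocols: %s"
          % (location, ", ".join(used)))
```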

Regards,
Savvas

On Wed, 7 May 2025 at 10:05 a.m., Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> Hi Savvas,
>
> On Wed, 2025-05-07 at 09:28 +0200, Sawas Etairidis wrote:
> > I think it would be better to not modify how PREMIRRORS and MIRRORS
> > work since users already depend on them and expect them to work on
> > specific way.
> > Like this you can add extra rules to rewrite the source code url and
> > then use  PREMIRRORS and MIRRORS if they exists.
> > I could instead provide an example how someone could have the same
> > result using PREMIRRORS if you think it is not worth having this as
> > standalone feature.
>
> Looking at the bug in question
> (https://bugzilla.yoctoproject.org/show_bug.cgi?id=10699), the issue is
> filed against the layer index and the issue in question was more about
> allowing layers to list their mirror urls so that we could canonicalise
> different urls.
>
> Your patch is against bitbake itself and it is unclear if we need that
> support in bitbake itself to solve that bug which was targetting a
> layer index change if I understand correctly.
>
> There is already some discussion about changing the way
> PREMIRRORS/MIRRORS work for other issues and given that, I'm reluctant
> to add new code paths that add complexity and do things in a different
> way. I've been hoping rather than adding complexity, we could try and
> simplify whilst still maintaining the different use cases we need.
>
> Cheers,
>
> Richard
>

Patch

diff --git a/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
index fb4f0a23d7..e932fc12e0 100644
--- a/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
+++ b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-fetching.rst
@@ -148,6 +148,16 @@  If :term:`BB_CHECK_SSL_CERTS` is set to ``0`` then SSL certificate checking will
 be disabled. This variable defaults to ``1`` so SSL certificates are normally
 checked.
 
+If :term:`BB_ALLOW_URL_REMAP` is set to ``1`` then URLs will be remapped
+based on the rules defined in :term:`BB_URL_REMAP`. The variable can be set
+in the global configuration or in the recipe.
+
+Here is an example of how to set the variable::
+
+   BB_URL_REMAP = "\
+       git://example.com/ https://example.com/ \
+       http://example.com/ https://example.com/"
+
 .. _bb-the-unpack:
 
 The Unpack
diff --git a/bitbake/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
index 477443e228..1695a9e980 100644
--- a/bitbake/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
+++ b/bitbake/doc/bitbake-user-manual/bitbake-user-manual-ref-variables.rst
@@ -494,6 +494,15 @@  overview of their function and contents.
       running builds when not connected to the Internet, and when operating
       in certain kinds of firewall environments.
 
+   :term:`BB_ALLOW_URL_REMAP`
+      Allows the use of the :term:`BB_URL_REMAP` variable to remap URLs.
+
+   :term:`BB_URL_REMAP`
+      A table of remappings. This variable is used to remap URLs to
+      different locations. The remapping is done by the fetcher module
+      before the fetcher attempts to access the URL. The remapping is
+      performed by replacing the string with the remapped string.
+
    :term:`BB_NUMBER_PARSE_THREADS`
       Sets the number of threads BitBake uses when parsing. By default, the
       number of threads is equal to the number of cores on the system.
diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py
index 2de4f4f8c0..24918887fc 100644
--- a/bitbake/lib/bb/fetch2/__init__.py
+++ b/bitbake/lib/bb/fetch2/__init__.py
@@ -222,7 +222,7 @@  class URI(object):
 
         # We hijack the URL parameters, since the way bitbake uses
         # them are not quite RFC compliant.
-        uri, param_str = (uri.split(";", 1) + [None])[:2]
+        uri, param_str = self.split_uri_and_parameters(uri)
 
         urlp = urllib.parse.urlparse(uri)
         self.scheme = urlp.scheme
@@ -298,6 +298,13 @@  class URI(object):
     def _param_str_join(self, dict_, elmdelim, kvdelim="="):
         return elmdelim.join([kvdelim.join([k, v]) if v else k for k, v in dict_.items()])
 
+    @staticmethod
+    def split_uri_and_parameters(uri):
+        """
+        Splits the URI into the URI and the parameters.
+        """
+        return (uri.split(";", 1) + [None])[:2]
+
     @property
     def hostport(self):
         if not self.port:
@@ -1319,9 +1326,9 @@  class FetchData(object):
         self.mirrortarballs = []
         self.basename = None
         self.basepath = None
-        (self.type, self.host, self.path, self.user, self.pswd, self.parm) = decodeurl(d.expand(url))
+        self.url = self.remap_url_if_needed(d.expand(url), self.load_url_remap_data(d))
+        (self.type, self.host, self.path, self.user, self.pswd, self.parm) = decodeurl(self.url)
         self.date = self.getSRCDate(d)
-        self.url = url
         if not self.user and "user" in self.parm:
             self.user = self.parm["user"]
         if not self.pswd and "pswd" in self.parm:
@@ -1424,6 +1431,61 @@  class FetchData(object):
 
         return d.getVar("SRCDATE") or d.getVar("DATE")
 
+    def parse_url_remap_entries(self, pkgname, url_remap_entries, entries_as_str):
+        """
+        Parse the URL remap entries from the given string into the dictionary.
+        """
+        if not entries_as_str:
+            return url_remap_entries
+
+        for entry in entries_as_str.split('\n'):
+            try:
+                from_url, to_url = entry.split()
+                url_remap_entries[from_url] = to_url
+            except ValueError:
+                logger.warning("Package %s contains invalid URL remap entry: %s", pkgname, entry)
+                continue
+
+        return url_remap_entries
+
+    def load_url_remap_data(self, d):
+        """
+        Load URL remap data from the package-specific and global variables.
+        """
+        url_remap_entries = {}
+
+        allow_url_remap = d.getVar('BB_ALLOW_URL_REMAP')
+        if not allow_url_remap or not bb.utils.to_boolean(allow_url_remap):
+            logger.debug("URL remapping is disabled")
+            return url_remap_entries
+
+        pkgname = d.getVar('PN')
+        if pkgname:
+            # Check for URL remap entries in the package-specific variable
+            url_remap_entries = self.parse_url_remap_entries(pkgname,
+                                                             url_remap_entries,
+                                                             d.getVarFlag('BB_URL_REMAP', pkgname, False))
+
+        # Check for URL remap entries in the global variable
+        url_remap_entries = self.parse_url_remap_entries(pkgname,
+                                                         url_remap_entries,
+                                                         d.getVar('BB_URL_REMAP'))
+
+        return url_remap_entries
+
+    def remap_url_if_needed(self, base_url, url_remap_entries):
+        """
+        Remap the URL if needed based on the URL remap data.
+        """
+        uri, parameters = URI.split_uri_and_parameters(base_url)
+        for from_url, to_url in url_remap_entries.items():
+            if uri.startswith(from_url):
+                uri = uri.replace(from_url, to_url, 1) + (';' + parameters if parameters else '')
+                logger.debug("Remapped URL %s to %s", base_url, uri)
+                print("Remapped URL %s to %s" % (base_url, uri))
+                return uri
+        return base_url
+
 class FetchMethod(object):
     """Base class for 'fetch'ing data"""
 
diff --git a/bitbake/lib/bb/tests/fetch.py b/bitbake/lib/bb/tests/fetch.py
index f0c628524c..670781ad93 100644
--- a/bitbake/lib/bb/tests/fetch.py
+++ b/bitbake/lib/bb/tests/fetch.py
@@ -602,13 +602,19 @@  class GitDownloadDirectoryNamingTest(FetcherTest):
 
     @skipIfNoNetwork()
     def test_that_directory_is_named_after_recipe_url_when_no_mirroring_is_used(self):
-        self.setup_mirror_rewrite()
         fetcher = bb.fetch.Fetch([self.recipe_url], self.d)
 
         fetcher.download()
 
         dir = os.listdir(self.dldir + "/git2")
         self.assertIn(self.recipe_dir, dir)
+        self.assertIn(self.recipe_dir + '.done', dir)
+        self.assertNotIn(self.mirror_dir, dir)
+        self.assertNotIn(self.mirror_dir + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_dir,
+            self.recipe_dir + '.done'
+        ]))
 
     @skipIfNoNetwork()
     def test_that_directory_exists_for_mirrored_url_and_recipe_url_when_mirroring_is_used(self):
@@ -620,6 +626,12 @@  class GitDownloadDirectoryNamingTest(FetcherTest):
         dir = os.listdir(self.dldir + "/git2")
         self.assertIn(self.mirror_dir, dir)
         self.assertIn(self.recipe_dir, dir)
+        self.assertIn(self.recipe_dir + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_dir,
+            self.recipe_dir + '.done',
+            self.mirror_dir
+        ]))
 
     @skipIfNoNetwork()
     def test_that_recipe_directory_and_mirrored_directory_exists_when_mirroring_is_used_and_the_mirrored_directory_already_exists(self):
@@ -632,7 +644,15 @@  class GitDownloadDirectoryNamingTest(FetcherTest):
 
         dir = os.listdir(self.dldir + "/git2")
         self.assertIn(self.mirror_dir, dir)
+        self.assertIn(self.mirror_dir + '.done', dir)
         self.assertIn(self.recipe_dir, dir)
+        self.assertIn(self.recipe_dir + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_dir,
+            self.recipe_dir + '.done',
+            self.mirror_dir,
+            self.mirror_dir + '.done'
+        ]))
 
 
 class TarballNamingTest(FetcherTest):
@@ -657,6 +677,15 @@  class TarballNamingTest(FetcherTest):
 
         dir = os.listdir(self.dldir)
         self.assertIn(self.recipe_tarball, dir)
+        self.assertIn(self.recipe_tarball + '.done', dir)
+        self.assertIn('git2', dir)
+        self.assertNotIn(self.mirror_tarball, dir)
+        self.assertNotIn(self.mirror_tarball + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_tarball,
+            self.recipe_tarball + '.done',
+            'git2'
+        ]))
 
     @skipIfNoNetwork()
     def test_that_the_mirror_tarball_is_created_when_mirroring_is_used(self):
@@ -666,7 +695,16 @@  class TarballNamingTest(FetcherTest):
         fetcher.download()
 
         dir = os.listdir(self.dldir)
+        self.assertNotIn(self.recipe_tarball, dir)
+        self.assertNotIn(self.recipe_tarball + '.done', dir)
+        self.assertIn('git2', dir)
         self.assertIn(self.mirror_tarball, dir)
+        self.assertIn(self.mirror_tarball + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.mirror_tarball,
+            self.mirror_tarball + '.done',
+            'git2'
+        ]))
 
 
 class GitShallowTarballNamingTest(FetcherTest):
@@ -692,6 +730,15 @@  class GitShallowTarballNamingTest(FetcherTest):
 
         dir = os.listdir(self.dldir)
         self.assertIn(self.recipe_tarball, dir)
+        self.assertIn(self.recipe_tarball + '.done', dir)
+        self.assertIn('git2', dir)
+        self.assertNotIn(self.mirror_tarball, dir)
+        self.assertNotIn(self.mirror_tarball + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_tarball,
+            self.recipe_tarball + '.done',
+            'git2'
+        ]))
 
     @skipIfNoNetwork()
     def test_that_the_mirror_tarball_is_created_when_mirroring_is_used(self):
@@ -701,7 +748,62 @@  class GitShallowTarballNamingTest(FetcherTest):
         fetcher.download()
 
         dir = os.listdir(self.dldir)
+        self.assertNotIn(self.recipe_tarball, dir)
+        self.assertNotIn(self.recipe_tarball + '.done', dir)
         self.assertIn(self.mirror_tarball, dir)
+        self.assertIn(self.mirror_tarball + '.done', dir)
+        self.assertIn('git2', dir)
+        self.assertEqual(len(dir), len([
+            self.mirror_tarball,
+            self.mirror_tarball + '.done',
+            'git2'
+        ]))
+
+
+class GitSameUrl(FetcherTest):
+    def setUp(self):
+        super(GitSameUrl, self).setUp()
+        self.recipe_url = "git://git.openembedded.org/bitbake;branch=master;protocol=https"
+        self.recipe_tarball = "git.openembedded.org.bitbake"
+        self.recipe_url_two = "git://github.com/openembedded/bitbake.git;protocol=https;branch=master"
+        self.recipe_tarball_two = "github.com.openembedded.bitbake.git"
+
+        self.d.setVar('SRCREV', '82ea737a0b42a8b53e11c9cde141e9e9c0bd8c40')
+        self.d.setVar('BB_URL_REMAP', 'git://github.com/openembedded/bitbake.git git://git.openembedded.org/bitbake')
+
+    @skipIfNoNetwork()
+    def test_url_remap_allowed(self):
+        self.d.setVar('BB_ALLOW_URL_REMAP', '1')
+        fetcher = bb.fetch.Fetch([self.recipe_url, self.recipe_url_two], self.d)
+        fetcher.download()
+
+        dir = os.listdir(self.dldir + "/git2")
+        self.assertIn(self.recipe_tarball, dir)
+        self.assertIn(self.recipe_tarball + '.done', dir)
+        self.assertNotIn(self.recipe_tarball_two, dir)
+        self.assertNotIn(self.recipe_tarball_two + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_tarball,
+            self.recipe_tarball + '.done'
+        ]))
+
+    @skipIfNoNetwork()
+    def test_url_remap_not_allowed(self):
+        self.d.setVar('BB_ALLOW_URL_REMAP', '0')
+        fetcher = bb.fetch.Fetch([self.recipe_url, self.recipe_url_two], self.d)
+        fetcher.download()
+
+        dir = os.listdir(self.dldir + "/git2")
+        self.assertIn(self.recipe_tarball, dir)
+        self.assertIn(self.recipe_tarball + '.done', dir)
+        self.assertIn(self.recipe_tarball_two, dir)
+        self.assertIn(self.recipe_tarball_two + '.done', dir)
+        self.assertEqual(len(dir), len([
+            self.recipe_tarball,
+            self.recipe_tarball + '.done',
+            self.recipe_tarball_two,
+            self.recipe_tarball_two + '.done'
+        ]))
 
 
 class CleanTarballTest(FetcherTest):