
[v2] fetch2/crate: support configurable registry and index URLs

Message ID 20260514200526.98596-1-ms98.cho@gmail.com
State New
Series [v2] fetch2/crate: support configurable registry and index URLs

Commit Message

minsung.cho May 14, 2026, 8:05 p.m. UTC
The crate fetcher previously hardcoded the crates.io download URL and
only supported a fixed /crate/versions API shape for custom registries.
This prevented recipes from using private registries or mirrors with
different download paths or Cargo sparse indexes.

Add BB_CRATE_REGISTRY_URL[host] and BB_CRATE_INDEX_URL[host] templates
with {crate}, {version}, and {index_path} placeholders. Keep the
existing crates.io defaults, allow custom API-style version endpoints,
and mark sparse indexes explicitly so latest-version checks parse Cargo
index NDJSON instead of JSON API responses.

Add non-network tests for default crates.io URLs, custom registry
templates, sparse index paths, trailing slash handling, and both
latest-version parser paths. Also make the fetch test cleanup chmod
invocation portable by placing -R before the mode.

[YOCTO #16276]

Signed-off-by: minsung.cho <ms98.cho@gmail.com>
---
 lib/bb/fetch2/crate.py |  39 +++++++++++--
 lib/bb/tests/fetch.py  | 122 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 156 insertions(+), 5 deletions(-)
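
The placeholder expansion described in the commit message can be sketched
roughly as follows. This is an illustration only, not the code in crate.py:
index_path() and expand() are hypothetical helpers, and index_path() follows
the directory layout Cargo documents for registry indexes, which is assumed
to match what _generate_index_path() produces. Note that the patch itself
applies {index_path} only to the index template.

```python
def index_path(name: str) -> str:
    """Hypothetical stand-in for the fetcher's index path helper.

    Follows Cargo's documented index layout: 1/, 2/, 3/<first char>/,
    otherwise <first two>/<next two>/, using the lowercased crate name.
    """
    name = name.lower()
    if len(name) == 1:
        return "1/" + name
    if len(name) == 2:
        return "2/" + name
    if len(name) == 3:
        return "3/%s/%s" % (name[0], name)
    return "%s/%s/%s" % (name[:2], name[2:4], name)


def expand(template: str, name: str, version: str) -> str:
    """Substitute the three placeholders named in the commit message."""
    return (template.replace("{index_path}", index_path(name))
                    .replace("{crate}", name)
                    .replace("{version}", version))


download = expand("https://registry.example.com/crates/{crate}/{version}/download",
                  "aho-corasick", "0.7.20")
index = expand("https://registry.example.com/index/{index_path}",
               "aho-corasick", "0.7.20")
print(download)  # https://registry.example.com/crates/aho-corasick/0.7.20/download
print(index)     # https://registry.example.com/index/ah/o-/aho-corasick
```

The two printed URLs match the values asserted by
test_crate_url_supports_custom_sparse_index_templates in the patch below.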

Comments

Mathieu Dubois-Briand May 15, 2026, 11:12 a.m. UTC | #1
On Thu May 14, 2026 at 10:05 PM CEST, minsung.cho via lists.openembedded.org wrote:
> The crate fetcher previously hardcoded the crates.io download URL and
> only supported a fixed /crate/versions API shape for custom registries.
> This prevented recipes from using private registries or mirrors with
> different download paths or Cargo sparse indexes.
>
> Add BB_CRATE_REGISTRY_URL[host] and BB_CRATE_INDEX_URL[host] templates
> with {crate}, {version}, and {index_path} placeholders. Keep the
> existing crates.io defaults, allow custom API-style version endpoints,
> and mark sparse indexes explicitly so latest-version checks parse Cargo
> index NDJSON instead of JSON API responses.
>
> Add non-network tests for default crates.io URLs, custom registry
> templates, sparse index paths, trailing slash handling, and both
> latest-version parser paths. Also make the fetch test cleanup chmod
> invocation portable by placing -R before the mode.
>
> [YOCTO #16276]
>
> Signed-off-by: minsung.cho <ms98.cho@gmail.com>
> ---

Hi,

Thanks for your patch.

I note it does conflict with some other patches currently in testing,
and once these conflicts are resolved, I get some bitbake-selftest errors:

ERROR: test_crate_fetches_from_local_sparse_registry (bb.tests.fetch.CrateTest.test_crate_fetches_from_local_sparse_registry)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/srv/pokybuild/yocto-worker/oe-selftest-debian/build/layers/bitbake/lib/bb/tests/fetch.py", line 2829, in test_crate_fetches_from_local_sparse_registry
    self.assertEqual(ud.method.latest_versionstring(ud, self.d),
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/oe-selftest-debian/build/layers/bitbake/lib/bb/fetch2/crate.py", line 197, in latest_versionstring
    return self._latest_versionstring_from_api(ud, d)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/srv/pokybuild/yocto-worker/oe-selftest-debian/build/layers/bitbake/lib/bb/fetch2/crate.py", line 204, in _latest_versionstring_from_api
    versions = [(0, i["num"], "") for i in json_data["versions"]]
                                           ~~~~~~~~~^^^^^^^^^^^^
KeyError: 'versions'

https://autobuilder.yoctoproject.org/valkyrie/#/builders/35/builds/3864
https://autobuilder.yoctoproject.org/valkyrie/#/builders/48/builds/3727
https://autobuilder.yoctoproject.org/valkyrie/#/builders/23/builds/3960

Maybe the easiest solution is to wait a few days until said patches are
either merged or rejected, then send a refreshed version.

If you want to have a look at the current situation, here is the branch
that was used for the above builds:
https://git.yoctoproject.org/poky-ci-archive/log/?h=bitbake/autobuilder.yoctoproject.org/valkyrie/a-full-3832

Thanks,
Mathieu
minsung.cho May 15, 2026, 4:03 p.m. UTC | #2
Hi Mathieu,

Thanks for testing this and digging into the failure.

I reproduced it locally with master + Chen Qi's filter_regex series + this
patch. The KeyError comes from the conflict resolution in
latest_versionstring(), not from either patch on its own. The merged tree
kept the old prefix check
(ud.versionsurl.startswith('https://index.crates.io/')), but this patch
intentionally drops that in favour of an explicit
ud.crate_index_format == 'sparse' check, since the whole point is to support
custom sparse registries whose index URL isn't on index.crates.io. The local
sparse registry test serves its index from 127.0.0.1, so the prefix check
does not match and the NDJSON index is handed to the JSON API parser.
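
A minimal standalone sketch of that failure mode, using only the standard
json module rather than the fetcher itself. The body string mirrors the
one-line index file written by the local sparse registry test; the list
comprehension imitates the API parser line shown in the traceback.

```python
import json

# One-line sparse index body, as written by the local registry test.
body = '{"name":"dummycrate","vers":"1.0.0","yanked":false}\n'

# A single NDJSON line is itself valid JSON, so parsing succeeds...
data = json.loads(body)

error = None
try:
    # ...but the API-shaped lookup fails because there is no "versions" key.
    versions = [i["num"] for i in data["versions"]]
except KeyError as exc:
    error = "KeyError: %s" % exc

print(error)  # KeyError: 'versions'
```

With more than one line in the index, json.loads() would instead reject the
body outright, so the exact exception depends on how many versions the
index lists.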

The right merge keeps the format check and threads filter_regex through
both helpers:

    def latest_versionstring(self, ud, d, filter_regex=None):
        if getattr(ud, 'crate_index_format', None) == 'sparse':
            return self._latest_versionstring_from_index(ud, d, filter_regex)
        return self._latest_versionstring_from_api(ud, d, filter_regex)

With that, CrateTest passes here (12 tests, 5 skipped), including
test_crate_fetches_from_local_sparse_registry.
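
For readers without the tree at hand, the idea of dispatching on a declared
format instead of a URL prefix can be sketched in isolation. Everything
below is hypothetical: the function names are not the ones in crate.py, the
version ordering is a simplification that only handles plain dotted
numbers, skipping yanked entries is an assumption about the real index
parser, and filter_regex is left out because its semantics are not shown in
this thread.

```python
import json


def _key(version):
    # Simplified ordering for plain dotted numeric versions only.
    return tuple(int(part) for part in version.split("."))


def latest_from_api(body):
    """JSON API shape: {"versions": [{"num": "..."}, ...]}."""
    return max((v["num"] for v in json.loads(body)["versions"]), key=_key)


def latest_from_sparse(body):
    """Sparse index shape: one JSON object per line; yanked entries skipped."""
    entries = (json.loads(line) for line in body.splitlines() if line.strip())
    return max((e["vers"] for e in entries if not e.get("yanked")), key=_key)


def latest(body, index_format):
    """Dispatch on the declared format rather than on the URL prefix."""
    if index_format == "sparse":
        return latest_from_sparse(body)
    return latest_from_api(body)


sparse_body = "\n".join([
    '{"vers":"0.2.10","yanked":false}',
    '{"vers":"0.2.11","yanked":true}',
    '{"vers":"0.2.12","yanked":false}',
])
api_body = '{"versions":[{"num":"0.2.10"},{"num":"0.2.12"}]}'

print(latest(sparse_body, "sparse"))  # 0.2.12
print(latest(api_body, "api"))        # 0.2.12
```

The two sample bodies are the same ones used by the non-network
latest-version tests in the patch.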

I've got a v3 ready with exactly that resolution, rebased on the
filter_regex series. Agreed it's cleaner to wait until that series lands or
gets dropped, then I'll send the refreshed v3. I'll watch the thread.

Thanks,
Minsung


Patch

diff --git a/lib/bb/fetch2/crate.py b/lib/bb/fetch2/crate.py
index bb12f4e..a6dfc6b 100644
--- a/lib/bb/fetch2/crate.py
+++ b/lib/bb/fetch2/crate.py
@@ -77,12 +77,43 @@  class Crate(Wget):
         # host (this is to allow custom crate registries to be specified
         host = '/'.join(parts[2:-2])
 
-        # If using crates.io use the CDN directly as per https://crates.io/data-access
+        # Allow overriding the registry and index URLs via bitbake variables.
+        # Example:
+        #   BB_CRATE_REGISTRY_URL[my-host] = "https://my-host.com/crates/{crate}/{version}/download"
+        #   BB_CRATE_INDEX_URL[my-host] = "https://my-host.com/index/{index_path}"
+        dl_url = d.getVarFlag('BB_CRATE_REGISTRY_URL', host)
+        index_url = d.getVarFlag('BB_CRATE_INDEX_URL', host)
+
         if host == 'crates.io':
-            ud.url = "https://static.crates.io/crates/%s/%s/download" % (name, version)
-            ud.versionsurl = 'https://index.crates.io/' + self._generate_index_path(name)
+            if not dl_url:
+                dl_url = "https://static.crates.io/crates/{crate}/{version}/download"
+            if not index_url:
+                index_url = "https://index.crates.io/"
+
+        if dl_url:
+            ud.url = dl_url.replace('{crate}', name).replace('{version}', version)
         else:
             ud.url = "https://%s/%s/%s/download" % (host, name, version)
+
+        ud.crate_index_format = 'api'
+        if index_url:
+            index_path = self._generate_index_path(name)
+            # Support {index_path} for sparse index registries. If not present
+            # and it's crates.io, append it for backward compatibility.
+            if '{index_path}' in index_url:
+                ud.versionsurl = index_url.replace('{index_path}', index_path)
+                ud.crate_index_format = 'sparse'
+            elif host == 'crates.io' or index_url.startswith('https://index.crates.io/'):
+                if not index_url.endswith('/'):
+                    index_url += '/'
+                ud.versionsurl = index_url + index_path
+                ud.crate_index_format = 'sparse'
+            else:
+                ud.versionsurl = index_url
+
+            # Apply other placeholders
+            ud.versionsurl = ud.versionsurl.replace('{crate}', name).replace('{version}', version)
+        else:
             ud.versionsurl = "https://%s/%s/versions" % (host, name)
 
         ud.parm['downloadfilename'] = "%s-%s.crate" % (name, version)
@@ -160,7 +191,7 @@  class Crate(Wget):
         Return the latest upstream version, dispatching to the appropriate
         parser based on the versionsurl format.
         """
-        if ud.versionsurl.startswith('https://index.crates.io/'):
+        if getattr(ud, 'crate_index_format', None) == 'sparse':
             return self._latest_versionstring_from_index(ud, d)
         return self._latest_versionstring_from_api(ud, d)
 
diff --git a/lib/bb/tests/fetch.py b/lib/bb/tests/fetch.py
index 86dd929..0dff3fc 100644
--- a/lib/bb/tests/fetch.py
+++ b/lib/bb/tests/fetch.py
@@ -424,7 +424,7 @@  class FetcherTest(unittest.TestCase):
         if os.environ.get("BB_TMPDIR_NOCLEAN") == "yes":
             print("Not cleaning up %s. Please remove manually." % self.tempdir)
         else:
-            bb.process.run('chmod u+rw -R %s' % self.tempdir)
+            bb.process.run('chmod -R u+rw %s' % self.tempdir)
             bb.utils.prunedir(self.tempdir)
 
     def git(self, cmd, cwd=None):
@@ -2590,6 +2590,126 @@  class FetchLocallyMissingTagFromRemote(FetcherTest):
 
 
 class CrateTest(FetcherTest):
+    def test_crate_url_uses_crates_io_defaults(self):
+        ud = bb.fetch2.FetchData("crate://crates.io/glob/0.2.11", self.d)
+
+        self.assertEqual(ud.url,
+                         "https://static.crates.io/crates/glob/0.2.11/download")
+        self.assertEqual(ud.versionsurl, "https://index.crates.io/gl/ob/glob")
+        self.assertEqual(ud.crate_index_format, "sparse")
+        self.assertEqual(ud.parm["downloadfilename"], "glob-0.2.11.crate")
+        self.assertEqual(ud.parm["name"], "glob-0.2.11")
+
+    def test_crate_url_supports_custom_registry_templates(self):
+        self.d.setVarFlag("BB_CRATE_REGISTRY_URL", "registry.example.com",
+                          "https://registry.example.com/api/v1/crates/{crate}/{version}/download")
+        self.d.setVarFlag("BB_CRATE_INDEX_URL", "registry.example.com",
+                          "https://registry.example.com/api/v1/crates/{crate}/versions")
+
+        ud = bb.fetch2.FetchData("crate://registry.example.com/glob/0.2.11", self.d)
+
+        self.assertEqual(ud.url,
+                         "https://registry.example.com/api/v1/crates/glob/0.2.11/download")
+        self.assertEqual(ud.versionsurl,
+                         "https://registry.example.com/api/v1/crates/glob/versions")
+        self.assertEqual(ud.crate_index_format, "api")
+
+    def test_crate_url_supports_custom_sparse_index_templates(self):
+        self.d.setVarFlag("BB_CRATE_REGISTRY_URL", "registry.example.com",
+                          "https://registry.example.com/crates/{crate}/{version}/download")
+        self.d.setVarFlag("BB_CRATE_INDEX_URL", "registry.example.com",
+                          "https://registry.example.com/index/{index_path}")
+
+        ud = bb.fetch2.FetchData("crate://registry.example.com/aho-corasick/0.7.20", self.d)
+
+        self.assertEqual(ud.url,
+                         "https://registry.example.com/crates/aho-corasick/0.7.20/download")
+        self.assertEqual(ud.versionsurl,
+                         "https://registry.example.com/index/ah/o-/aho-corasick")
+        self.assertEqual(ud.crate_index_format, "sparse")
+
+    def test_crate_url_adds_sparse_index_path_with_missing_trailing_slash(self):
+        self.d.setVarFlag("BB_CRATE_INDEX_URL", "crates.io", "https://index.crates.io")
+
+        ud = bb.fetch2.FetchData("crate://crates.io/glob/0.2.11", self.d)
+
+        self.assertEqual(ud.versionsurl, "https://index.crates.io/gl/ob/glob")
+        self.assertEqual(ud.crate_index_format, "sparse")
+
+    def test_crate_latest_versionstring_supports_custom_sparse_index(self):
+        self.d.setVarFlag("BB_CRATE_INDEX_URL", "registry.example.com",
+                          "https://registry.example.com/index/{index_path}")
+        ud = bb.fetch2.FetchData("crate://registry.example.com/glob/0.2.11", self.d)
+        index = '\n'.join([
+            '{"vers":"0.2.10","yanked":false}',
+            '{"vers":"0.2.11","yanked":true}',
+            '{"vers":"0.2.12","yanked":false}',
+        ])
+
+        with unittest.mock.patch("bb.fetch2.crate.Crate._fetch_index", return_value=index):
+            self.assertEqual(ud.method.latest_versionstring(ud, self.d), ("0.2.12", ""))
+
+    def test_crate_latest_versionstring_supports_custom_api_index(self):
+        self.d.setVarFlag("BB_CRATE_INDEX_URL", "registry.example.com",
+                          "https://registry.example.com/api/v1/crates/{crate}/versions")
+        ud = bb.fetch2.FetchData("crate://registry.example.com/glob/0.2.11", self.d)
+        index = '{"versions":[{"num":"0.2.10"},{"num":"0.2.12"}]}'
+
+        with unittest.mock.patch("bb.fetch2.crate.Crate._fetch_index", return_value=index):
+            self.assertEqual(ud.method.latest_versionstring(ud, self.d), ("0.2.12", ""))
+
+    def test_crate_fetches_from_local_sparse_registry(self):
+        registry = os.path.join(self.tempdir, "registry")
+        crate_name = "dummycrate"
+        crate_version = "1.0.0"
+        crate_basename = "%s-%s" % (crate_name, crate_version)
+        crate_path = os.path.join(registry, "crates", crate_name,
+                                  crate_version, "download")
+        index_path = os.path.join(registry, "index", "du", "mm", crate_name)
+        source_dir = os.path.join(self.tempdir, "crate-source", crate_basename)
+        bb.utils.mkdirhier(os.path.join(source_dir, "src"))
+        bb.utils.mkdirhier(os.path.dirname(crate_path))
+        bb.utils.mkdirhier(os.path.dirname(index_path))
+
+        with open(os.path.join(source_dir, "Cargo.toml"), "w") as f:
+            f.write('[package]\nname = "%s"\nversion = "%s"\n' %
+                    (crate_name, crate_version))
+        with open(os.path.join(source_dir, "src", "lib.rs"), "w") as f:
+            f.write("pub fn answer() -> u32 { 42 }\n")
+        with tarfile.open(crate_path, "w:gz") as tar:
+            tar.add(source_dir, arcname=crate_basename)
+        with open(crate_path, "rb") as f:
+            crate_checksum = hashlib.sha256(f.read()).hexdigest()
+        with open(index_path, "w") as f:
+            f.write('{"name":"%s","vers":"%s","yanked":false}\n' %
+                    (crate_name, crate_version))
+
+        server = HTTPService(registry, host="127.0.0.1")
+        server.start()
+        try:
+            host = "127.0.0.1:%s" % server.port
+            self.d.setVarFlag("BB_CRATE_REGISTRY_URL", host,
+                              "http://%s/crates/{crate}/{version}/download" % host)
+            self.d.setVarFlag("BB_CRATE_INDEX_URL", host,
+                              "http://%s/index/{index_path}" % host)
+            self.d.setVarFlag("SRC_URI", "%s.sha256sum" % crate_basename,
+                              crate_checksum)
+            uri = "crate://%s/%s/%s" % (host, crate_name, crate_version)
+
+            fetcher = bb.fetch2.Fetch([uri], self.d)
+            ud = fetcher.ud[fetcher.urls[0]]
+            self.assertEqual(ud.crate_index_format, "sparse")
+            self.assertEqual(ud.method.latest_versionstring(ud, self.d),
+                             (crate_version, ""))
+
+            fetcher.download()
+            fetcher.unpack(self.tempdir)
+            unpacked_file = os.path.join(self.tempdir, "cargo_home", "bitbake",
+                                         crate_basename, "src", "lib.rs")
+            self.assertTrue(os.path.exists(unpacked_file))
+        finally:
+            server.stop()
+
     @skipIfNoNetwork()
     def test_crate_url(self):