| Message ID | 20260602062940.3107241-3-saisneha196@gmail.com |
|---|---|
| State | Changes Requested |
| Headers | show |
| Series | Add missing SRCREV check for SCM URIs | expand |
On Tue, 2026-06-02 at 11:59 +0530, Sai Sneha via lists.openembedded.org wrote: > A recipe with a git:// (or other SCM) URI in SRC_URI must have a > corresponding SRCREV set. Without it, BitBake performs a live query > on the remote repository at every parse, breaking reproducibility > and causing parse failures under BB_NO_NETWORK=1. > > The trivial fix of checking SRCREV in insane.bbclass do_recipe_qa > fails because fetcher_hashes_dummyfunc[vardepvalue] expands SRCREV > at parse time before QA checks run, causing a FetchError that halts > parsing entirely. This was identified in Corentin Guillevic's RFC > series and confirmed by Mathieu Dubois-Briand's autobuilder testing. > > The fix intercepts at the vardepvalue expansion point by introducing > check_srcrev_set() which uses the shared oe.qa.check_uri_srcrev() > helper. The function is called conditionally: > > fetcher_hashes_dummyfunc[vardepvalue] = > "${@bb.fetch.get_hashvalue(d) if check_srcrev_set(d) else ''}" > > Note on design choice: we intentionally use d.getVar(candidate, False) > in oe.qa.check_uri_srcrev() rather than srcrev_internal_helper() > because the latter expands SRCREV, triggering get_autorev() and live > network fetches -- the exact problem this patch fixes. > > Note: parse-time warnings intentionally complement the QA-time warnings > added in insane.bbclass. This is necessary because CI pipelines running > 'bitbake --parse-only' with BB_NO_NETWORK=1 (the original motivation > from Ross Burton) would miss the issue entirely if only QA-time > warnings existed. This pattern is already established in the codebase > (e.g. src-uri-bad fires at both parse and QA time). > > Reported-by: Yoann Congal <yoann.congal@smile.fr> > Fixes: https://bugzilla.yoctoproject.org/show_bug.cgi?id=16051 > AI-Generated: Developed with assistance from Anthropic Claude > Signed-off-by: Sai Sneha <saisneha196@gmail.com> > --- > > Changes in v5: > - No changes to this patch > > Changes in v4: > - Refactored to use shared oe.qa.check_uri_srcrev() helper > - Eliminates duplicated URI parsing logic > - Added docstring explaining True/False return values > > Changes in v3: > - Added AI-Generated disclosure and Reported-by tag > > Changes in v2: > - Initial public submission > meta/classes-global/base.bbclass | 30 ++++++++++++++++++++++++++++-- > 1 file changed, 28 insertions(+), 2 deletions(-) > > diff --git a/meta/classes-global/base.bbclass b/meta/classes-global/base.bbclass > index 62f2814bb7..7de896ca7d 100644 > --- a/meta/classes-global/base.bbclass > +++ b/meta/classes-global/base.bbclass > @@ -164,14 +164,40 @@ def setup_hosttools_dir(dest, toolsvar, d, fatal=True): > > if notfound and fatal: > bb.fatal("The following required tools (as specified by HOSTTOOLS) appear to be unavailable in PATH, please install them in order to proceed:\n %s" % " ".join(notfound)) > + > +def check_srcrev_set(d): > + """ > + Inspect SRC_URI for SCM URIs missing a valid SRCREV. > + > + Returns True if all SCM URIs have a valid SRCREV or AUTOREV set, > + allowing get_hashvalue(d) to proceed normally. > + > + Returns False if any SCM URI is missing SRCREV, suppressing > + get_hashvalue(d) to prevent a FetchError parse crash. > + """ > + import oe.qa > + src_uri = (d.getVar('SRC_URI', False) or '').split() > + pn = d.getVar('PN') > + for uri in src_uri: > + rev = oe.qa.check_uri_srcrev(pn, uri, d) > + if rev is None: > + # SRCREV missing — warn and suppress hash expansion > + if bb.utils.contains('ERROR_QA', 'missing-srcrev', True, False, d): > + bb.error("%s: SRCREV not set for %s" % (pn, uri)) > + elif bb.utils.contains('WARN_QA', 'missing-srcrev', True, False, d): > + bb.warn("%s: SRCREV not set for %s" % (pn, uri)) > + return False > + if rev and '${AUTOREV}' in rev: > + return True > + return True > + > > # We can't use vardepvalue against do_fetch directly since that would overwrite > # the other task dependencies so we use an indirect function. > python fetcher_hashes_dummyfunc() { > return > } > -fetcher_hashes_dummyfunc[vardepvalue] = "${@bb.fetch.get_hashvalue(d)}" > - > +fetcher_hashes_dummyfunc[vardepvalue] = "${@bb.fetch.get_hashvalue(d) if check_srcrev_set(d) else ''}" I think the overall approach here is wrong. This code attempts to mask the error that would occur if network access is unavailable but the SRCREV cannot be resolved without network access. However, it fails to do this properly - what about SRCREV set to a tag instead of a commit hash? In that case check_uri_srcrev() would return a non-empty string, check_srcrev_set() would return True and we would still end up looking up the tag in the remote git repository during parsing. For the cases where one SRC_URI entry specifies a branch or tag, and has no SRCREV set, all SRCREV hashes are removed from the vardeps of the do_fetch function. This means that the fetcher will no longer be re-ran when the branches or tags are changed upstream. That is breakage of expected fetcher behaviour. I recommend discussing the issues further on the bug tracker or mailing list, making sure you have a full understanding of the behaviour and the necessary change. Then come back to the implementation. Best regards,
diff --git a/meta/classes-global/base.bbclass b/meta/classes-global/base.bbclass index 62f2814bb7..7de896ca7d 100644 --- a/meta/classes-global/base.bbclass +++ b/meta/classes-global/base.bbclass @@ -164,14 +164,40 @@ def setup_hosttools_dir(dest, toolsvar, d, fatal=True): if notfound and fatal: bb.fatal("The following required tools (as specified by HOSTTOOLS) appear to be unavailable in PATH, please install them in order to proceed:\n %s" % " ".join(notfound)) + +def check_srcrev_set(d): + """ + Inspect SRC_URI for SCM URIs missing a valid SRCREV. + + Returns True if all SCM URIs have a valid SRCREV or AUTOREV set, + allowing get_hashvalue(d) to proceed normally. + + Returns False if any SCM URI is missing SRCREV, suppressing + get_hashvalue(d) to prevent a FetchError parse crash. + """ + import oe.qa + src_uri = (d.getVar('SRC_URI', False) or '').split() + pn = d.getVar('PN') + for uri in src_uri: + rev = oe.qa.check_uri_srcrev(pn, uri, d) + if rev is None: + # SRCREV missing — warn and suppress hash expansion + if bb.utils.contains('ERROR_QA', 'missing-srcrev', True, False, d): + bb.error("%s: SRCREV not set for %s" % (pn, uri)) + elif bb.utils.contains('WARN_QA', 'missing-srcrev', True, False, d): + bb.warn("%s: SRCREV not set for %s" % (pn, uri)) + return False + if rev and '${AUTOREV}' in rev: + return True + return True + # We can't use vardepvalue against do_fetch directly since that would overwrite # the other task dependencies so we use an indirect function. python fetcher_hashes_dummyfunc() { return } -fetcher_hashes_dummyfunc[vardepvalue] = "${@bb.fetch.get_hashvalue(d)}" - +fetcher_hashes_dummyfunc[vardepvalue] = "${@bb.fetch.get_hashvalue(d) if check_srcrev_set(d) else ''}" addtask fetch do_fetch[dirs] = "${DL_DIR}" do_fetch[file-checksums] = "${@bb.fetch.get_checksum_file_list(d)}"
A recipe with a git:// (or other SCM) URI in SRC_URI must have a corresponding SRCREV set. Without it, BitBake performs a live query on the remote repository at every parse, breaking reproducibility and causing parse failures under BB_NO_NETWORK=1. The trivial fix of checking SRCREV in insane.bbclass do_recipe_qa fails because fetcher_hashes_dummyfunc[vardepvalue] expands SRCREV at parse time before QA checks run, causing a FetchError that halts parsing entirely. This was identified in Corentin Guillevic's RFC series and confirmed by Mathieu Dubois-Briand's autobuilder testing. The fix intercepts at the vardepvalue expansion point by introducing check_srcrev_set() which uses the shared oe.qa.check_uri_srcrev() helper. The function is called conditionally: fetcher_hashes_dummyfunc[vardepvalue] = "${@bb.fetch.get_hashvalue(d) if check_srcrev_set(d) else ''}" Note on design choice: we intentionally use d.getVar(candidate, False) in oe.qa.check_uri_srcrev() rather than srcrev_internal_helper() because the latter expands SRCREV, triggering get_autorev() and live network fetches -- the exact problem this patch fixes. Note: parse-time warnings intentionally complement the QA-time warnings added in insane.bbclass. This is necessary because CI pipelines running 'bitbake --parse-only' with BB_NO_NETWORK=1 (the original motivation from Ross Burton) would miss the issue entirely if only QA-time warnings existed. This pattern is already established in the codebase (e.g. src-uri-bad fires at both parse and QA time). Reported-by: Yoann Congal <yoann.congal@smile.fr> Fixes: https://bugzilla.yoctoproject.org/show_bug.cgi?id=16051 AI-Generated: Developed with assistance from Anthropic Claude Signed-off-by: Sai Sneha <saisneha196@gmail.com> --- Changes in v5: - No changes to this patch Changes in v4: - Refactored to use shared oe.qa.check_uri_srcrev() helper - Eliminates duplicated URI parsing logic - Added docstring explaining True/False return values Changes in v3: - Added AI-Generated disclosure and Reported-by tag Changes in v2: - Initial public submission meta/classes-global/base.bbclass | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-)