[RFC,15/30] classes: add early fetch, unpack and patch support

Message ID 20250211150034.18696-16-stefan.herbrechtsmeier-oss@weidmueller.com
State New
Series Add vendor support for go, npm and rust

Commit Message

Stefan Herbrechtsmeier Feb. 11, 2025, 3 p.m. UTC
From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>

Add support for early fetch, unpack and patch tasks, which run before the
normal patch task. This feature is useful for fetching additional
dependencies based on a patched source before the normal unpack and
patch tasks run. Patches are marked as early via an early=1 parameter. An
example use case is a patch for a package manager lock file (Cargo.lock,
go.sum, package-lock.json).

Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
---

 meta/classes-global/patch.bbclass | 17 +++++----
 meta/classes-recipe/early.bbclass | 61 +++++++++++++++++++++++++++++++
 meta/lib/oe/patch.py              | 10 +++--
 3 files changed, 77 insertions(+), 11 deletions(-)
 create mode 100644 meta/classes-recipe/early.bbclass

Comments

Richard Purdie Feb. 11, 2025, 10:27 p.m. UTC | #1
On Tue, 2025-02-11 at 16:00 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote:
> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> 
> Add support for early fetch, unpack and patches task which run before
> normal patch task. This feature is useful to fetch additional
> dependencies based on a patched source before the normal unpack and
> patch tasks. The patch are marked as early via an early=1 parameter. An
> example use case is a patch for a package manager lock file (Cargo.lock,
> go.sum, package-lock.json).
> 
> Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> ---
> 
>  meta/classes-global/patch.bbclass | 17 +++++----
>  meta/classes-recipe/early.bbclass | 61 +++++++++++++++++++++++++++++++
>  meta/lib/oe/patch.py              | 10 +++--
>  3 files changed, 77 insertions(+), 11 deletions(-)
>  create mode 100644 meta/classes-recipe/early.bbclass

This level of complexity is going to cause massive headaches in future.
Having two fetch, two unpack and two patch tasks will ultimately just
cause a lot of confusion and make things like the SPDX code and
archiver code more complex too.

Rather than patching a cargo lock file, I'd probably prefer the correct
version be added in a later SRC_URI entry and used to overwrite the
earlier one at unpack time.
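
As a rough illustration of the "later entry overwrites the earlier one" idea, here is a minimal sketch in plain Python. It does not use the BitBake fetcher; `unpack_in_order`, the file names and the destination path are hypothetical and only model the ordering behaviour.

```python
import shutil
import tempfile
from pathlib import Path


def unpack_in_order(entries, unpackdir):
    """Copy (source_file, relative_destination) pairs in list order.

    A later entry with the same destination overwrites an earlier one.
    """
    for source, dest in entries:
        target = Path(unpackdir) / dest
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Hypothetical lock file shipped by the upstream project.
    upstream = tmp / "upstream-Cargo.lock"
    upstream.write_text("# lock file from upstream\n")
    # Hypothetical corrected lock file provided by the layer.
    layer = tmp / "layer-Cargo.lock"
    layer.write_text("# lock file from layer\n")

    unpackdir = tmp / "unpack"
    unpack_in_order(
        [(upstream, "src/Cargo.lock"), (layer, "src/Cargo.lock")],
        unpackdir,
    )
    result = (unpackdir / "src/Cargo.lock").read_text()
```

After the call, `result` holds the layer's version, because it was listed last.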

I appreciate the challenge there is then that fetcher isn't going to
know what it is fetching at fetch time though as it would potentially
be fetching with the unpatched lock file.

This is partly why we end up with the .inc files the way we do, then it
is explicit.  I'm not sure the fetcher can know deterministically what
it is fetching in advance without the .inc data though, which is one of
the concerns that have been expressed previously.

Cheers,

Richard
Bruce Ashfield Feb. 11, 2025, 10:32 p.m. UTC | #2
In message: [OE-core] [RFC PATCH 15/30] classes: add early fetch, unpack and patch support
on 11/02/2025 Stefan Herbrechtsmeier via lists.openembedded.org wrote:

> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> 
> Add support for early fetch, unpack and patches task which run before
> normal patch task. This feature is useful to fetch additional
> dependencies based on a patched source before the normal unpack and
> patch tasks. The patch are marked as early via an early=1 parameter. An
> example use case is a patch for a package manager lock file (Cargo.lock,
> go.sum, package-lock.json).

I understand why you need to do the above, but there's now multiple
fetch and patch tasks running throughout the pipeline of the
build and that's just more complexity to maintain.

This is what I was asking about in some of my first replies
to the RFC series. When I've done this in the past, I'd just
suggest something simpler like the recipe provide a complete
"lock file" in the layer/recipe-space and have it overlayed
in the source.

Of course that means it is in the same fetch task as what
will be fetching the vendor bits, but coordinating within
the single task is simpler in my experience (and it can
be checked on the SRC_URI). I haven't fully thought through
how to manage it, but wanted to reply to this while it was
fresh in my mind.

Bruce

> 
> Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> ---
> 
>  meta/classes-global/patch.bbclass | 17 +++++----
>  meta/classes-recipe/early.bbclass | 61 +++++++++++++++++++++++++++++++
>  meta/lib/oe/patch.py              | 10 +++--
>  3 files changed, 77 insertions(+), 11 deletions(-)
>  create mode 100644 meta/classes-recipe/early.bbclass
> 
> diff --git a/meta/classes-global/patch.bbclass b/meta/classes-global/patch.bbclass
> index e5786b1c9a..7fa94c7aa7 100644
> --- a/meta/classes-global/patch.bbclass
> +++ b/meta/classes-global/patch.bbclass
> @@ -82,18 +82,18 @@ python patch_task_postfunc() {
>              oe.patch.GitApplyTree.commitIgnored("Add changes from %s" % func, dir=srcsubdir, files=['.'], d=d)
>  }
>  
> -def src_patches(d, all=False, expand=True):
> +def src_patches(d, all=False, expand=True, early=False):
>      import oe.patch
> -    return oe.patch.src_patches(d, all, expand)
> +    return oe.patch.src_patches(d, all, expand, early)
>  
> -def should_apply(parm, d):
> +def should_apply(parm, d, early=False):
>      """Determine if we should apply the given patch"""
>      import oe.patch
> -    return oe.patch.should_apply(parm, d)
> +    return oe.patch.should_apply(parm, d, early)
>  
>  should_apply[vardepsexclude] = "DATE SRCDATE"
>  
> -python patch_do_patch() {
> +def apply_patches(d, s, early=False):
>      import oe.patch
>  
>      patchsetmap = {
> @@ -113,8 +113,6 @@ python patch_do_patch() {
>  
>      classes = {}
>  
> -    s = d.getVar('S')
> -
>      os.putenv('PATH', d.getVar('PATH'))
>  
>      # We must use one TMPDIR per process so that the "patch" processes
> @@ -124,7 +122,7 @@ python patch_do_patch() {
>      process_tmpdir = tempfile.mkdtemp()
>      os.environ['TMPDIR'] = process_tmpdir
>  
> -    for patch in src_patches(d):
> +    for patch in src_patches(d, early=early):
>          _, _, local, _, _, parm = bb.fetch.decodeurl(patch)
>  
>          if "patchdir" in parm:
> @@ -159,6 +157,9 @@ python patch_do_patch() {
>  
>      bb.utils.remove(process_tmpdir, True)
>      del os.environ['TMPDIR']
> +
> +python patch_do_patch() {
> +    apply_patches(d, d.getVar('S'))
>  }
>  patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
>  
> diff --git a/meta/classes-recipe/early.bbclass b/meta/classes-recipe/early.bbclass
> new file mode 100644
> index 0000000000..e458fd8f7b
> --- /dev/null
> +++ b/meta/classes-recipe/early.bbclass
> @@ -0,0 +1,61 @@
> +# Copyright (C) 2025 Weidmueller Interface GmbH & Co. KG
> +# Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> +#
> +# SPDX-License-Identifier: MIT
> +
> +EARLY_UNPACKDIR = "${WORKDIR}/sources-unpack-early"
> +
> +def get_early_source_dir(d, sourcedir):
> +    unpackdir = d.getVar("UNPACKDIR")
> +    workdir = d.getVar('WORKDIR')
> +    originaldir = unpackdir if sourcedir.startswith(unpackdir) else workdir
> +    early_unpackdir = d.getVar("EARLY_UNPACKDIR")
> +    return sourcedir.replace(originaldir, early_unpackdir)
> +
> +python early_do_fetch_early() {
> +    fetch_src_uris(d, True)
> +}
> +addtask fetch_early
> +do_fetch_early[dirs] = "${DL_DIR}"
> +do_fetch_early[file-checksums] = "${@bb.fetch.get_checksum_file_list(d)}"
> +do_fetch[prefuncs] += "fetcher_hashes_dummyfunc"
> +do_fetch_early[network] = "1"
> +
> +python early_do_unpack_early() {
> +    unpackdir = d.getVar("EARLY_UNPACKDIR")
> +    unpack_src_uris(d, unpackdir, True)
> +}
> +addtask unpack_early after do_fetch_early
> +do_unpack_early[cleandirs] = "${EARLY_UNPACKDIR}"
> +
> +python early_do_patch_early() {
> +    source_dir = d.getVar("S")
> +    source_dir = get_early_source_dir(d, source_dir)
> +    apply_patches(d, source_dir, True)
> +}
> +addtask patch_early after do_unpack_early
> +do_patch_early[dirs] = "${WORKDIR}"
> +do_patch_early[depends] = "${PATCHDEPENDENCY}"
> +do_patch_early[vardepsexclude] = "PATCHRESOLVE"
> +
> +python () {
> +    import bb.fetch
> +    src_uris = get_src_uris(d, True)
> +    for src_uri in src_uris:
> +        uri = bb.fetch.URI(src_uri)
> +        path = uri.params.get("downloadfilename", uri.path)
> +
> +        # HTTP/FTP use the wget fetcher
> +        if uri.scheme in ("http", "https", "ftp"):
> +            d.appendVarFlag('do_fetch_early', 'depends', ' wget-native:do_populate_sysroot')
> +
> +        # Git packages should DEPEND on git-native
> +        elif uri.scheme in ("git", "gitsm"):
> +            d.appendVarFlag('do_fetch_early', 'depends', ' git-native:do_populate_sysroot')
> +
> +        # *.xz should DEPEND on xz-native for unpacking
> +        if path.endswith('.xz') or path.endswith('.txz'):
> +            d.appendVarFlag('do_fetch_early', 'depends', ' xz-native:do_populate_sysroot')
> +}
> +
> +EXPORT_FUNCTIONS do_fetch_early do_unpack_early do_patch_early
> diff --git a/meta/lib/oe/patch.py b/meta/lib/oe/patch.py
> index 58c6e34fe8..7737011e5a 100644
> --- a/meta/lib/oe/patch.py
> +++ b/meta/lib/oe/patch.py
> @@ -904,7 +904,7 @@ def patch_path(url, fetch, unpackdir, expand=True):
>  
>      return local
>  
> -def src_patches(d, all=False, expand=True):
> +def src_patches(d, all=False, expand=True, early=False):
>      unpackdir = d.getVar('UNPACKDIR')
>      fetch = bb.fetch2.Fetch([], d)
>      patches = []
> @@ -921,7 +921,7 @@ def src_patches(d, all=False, expand=True):
>          parm = urldata.parm
>          patchname = parm.get('pname') or os.path.basename(local)
>  
> -        apply, reason = should_apply(parm, d)
> +        apply, reason = should_apply(parm, d, early)
>          if not apply:
>              if reason:
>                  bb.note("Patch %s %s" % (patchname, reason))
> @@ -950,8 +950,12 @@ def src_patches(d, all=False, expand=True):
>      return patches
>  
>  
> -def should_apply(parm, d):
> +def should_apply(parm, d, early=False):
>      import bb.utils
> +
> +    if early and not bb.utils.to_boolean(parm.get('early'), False):
> +        return False, "applies to normal patch task only"
> +
>      if "mindate" in parm or "maxdate" in parm:
>          pn = d.getVar('PN')
>          srcdate = d.getVar('SRCDATE_%s' % pn)
> -- 
> 2.39.5
> 

Alexander Kanavin Feb. 12, 2025, 11:08 a.m. UTC | #3
On Tue, 11 Feb 2025 at 16:01, Stefan Herbrechtsmeier via
lists.openembedded.org
<stefan.herbrechtsmeier-oss=weidmueller.com@lists.openembedded.org>
wrote:
> Add support for early fetch, unpack and patches task which run before
> normal patch task. This feature is useful to fetch additional
> dependencies based on a patched source before the normal unpack and
> patch tasks. The patch are marked as early via an early=1 parameter. An
> example use case is a patch for a package manager lock file (Cargo.lock,
> go.sum, package-lock.json).

Why is there a need for additional tasks running before the regular tasks?
Can't we do regular fetch/unpack/patch for the root item, then run
fetch_vendor_dependencies, unpack_vendor_dependencies,
patch_vendor_dependencies?

I see that others expressed dislike for any additional tasks in
principle, but I don't think that can be avoided? You can't know what
else needs to be fetched until the root item is unpacked and
inspected. I'm ok with that if the additional tasks are clearly named,
and implementations are short, sweet and obvious.

Alex
Stefan Herbrechtsmeier Feb. 12, 2025, 12:21 p.m. UTC | #4
On 11.02.2025 at 23:27, Richard Purdie wrote:
> On Tue, 2025-02-11 at 16:00 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote:
>> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>>
>> Add support for early fetch, unpack and patches task which run before
>> normal patch task. This feature is useful to fetch additional
>> dependencies based on a patched source before the normal unpack and
>> patch tasks. The patch are marked as early via an early=1 parameter. An
>> example use case is a patch for a package manager lock file (Cargo.lock,
>> go.sum, package-lock.json).
>>
>> Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>> ---
>>
>>   meta/classes-global/patch.bbclass | 17 +++++----
>>   meta/classes-recipe/early.bbclass | 61 +++++++++++++++++++++++++++++++
>>   meta/lib/oe/patch.py              | 10 +++--
>>   3 files changed, 77 insertions(+), 11 deletions(-)
>>   create mode 100644 meta/classes-recipe/early.bbclass
> This level of complexity is going to cause massive headaches in future.
> Having two fetch, two unpack and two patch tasks will ultimately just
> cause a lot of confusion and make things like the SPDX code and
> archiver code more complex too.

The tasks don't really increase the complexity. The fetch, unpack and
patch tasks work as before. The archiver and spdx classes are updated.
The main difference is that they use a function rather than the SRC_URI
variable directly. The early tasks are only used by the vendor classes.
The old tasks handle all sources, including the dynamic sources. The
early tasks work like the normal tasks but with a filtered source list.
Alternatively, the steps could be included in the resolve task.
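
To make the "filtered list" concrete for the patch step, here is a minimal sketch in plain Python that mirrors the check added to `should_apply` in this patch: the early pass keeps only entries marked `early=1`, while the normal pass keeps everything. The helpers `parse_params`, `is_true` and `select_patches` and the example URIs are hypothetical stand-ins, not BitBake APIs.

```python
def parse_params(src_uri):
    """Split 'scheme://path;key=value;...' into (url, params)."""
    url, *raw_params = src_uri.split(";")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key] = value
    return url, params


def is_true(value):
    """Rough stand-in for a boolean parser accepting common 'true' spellings."""
    return str(value).strip().lower() in ("1", "y", "yes", "true", "on")


def select_patches(patch_uris, early=False):
    """Normal pass: all patches. Early pass: only patches marked early=1."""
    if not early:
        return list(patch_uris)
    return [
        uri for uri in patch_uris
        if is_true(parse_params(uri)[1].get("early", ""))
    ]


PATCHES = [
    "file://0001-update-lock-file.patch;early=1",
    "file://0002-fix-build.patch",
]

early_patches = select_patches(PATCHES, early=True)
normal_patches = select_patches(PATCHES)
```

Only the lock file patch is selected for the early pass; the normal pass still sees both patches.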

> Rather than patching a cargo lock file, I'd probably prefer the correct
> version be added in a later SRC_URI entry and used to overwrite the
> earlier one at unpack time.

Is this really practical? In many cases you have to update the 
dependencies of the dependency or other meta packages (.mod). It is much 
easier to update the dependency via the package manager and create a 
patch for the lock file. Take a look at patch 26.

> I appreciate the challenge there is then that fetcher isn't going to
> know what it is fetching at fetch time though as it would potentially
> be fetching with the unpatched lock file.

In the worst case you have to patch the lock file so that the package 
manager can create the vendor folder.

> This is partly why we end up with the .inc files the way we do, then it
> is explicit.  I'm not sure the fetcher can know deterministically what
> it is fetching in advance without the .inc data though, which is one of
> the concerns that have been expressed previously.
The .inc file is generated from the lock file, and the lock file contains 
all required information, including checksums.
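
As a sketch of what "generated from the lock file" can mean, the following plain Python reads an npm-style `package-lock.json` and lists the download URL and integrity value of each dependency. It assumes the `packages` layout used by recent lock file versions; the function name `lock_to_src_entries`, the registry URL and the integrity string are made up for illustration, and this is not the generator from the series.

```python
import json


def lock_to_src_entries(lock_text):
    """Return sorted (resolved_url, integrity) pairs for each locked dependency."""
    lock = json.loads(lock_text)
    entries = []
    for path, pkg in sorted(lock.get("packages", {}).items()):
        if not path:
            # The empty key describes the root project, not a dependency.
            continue
        if "resolved" not in pkg:
            # Entries without a download URL (e.g. local links) are skipped.
            continue
        entries.append((pkg["resolved"], pkg.get("integrity", "")))
    return entries


# Hypothetical lock file content; URL and checksum are placeholders.
LOCK_TEXT = """
{
  "name": "demo",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo", "version": "1.0.0"},
    "node_modules/left-pad": {
      "version": "1.3.0",
      "resolved": "https://registry.example.invalid/left-pad/-/left-pad-1.3.0.tgz",
      "integrity": "sha512-PLACEHOLDER"
    }
  }
}
"""

entries = lock_to_src_entries(LOCK_TEXT)
```

Each pair carries both the location and the checksum, which is the information a fetcher needs to download the dependency deterministically.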

I'm unsure if the old process of updating a SRC_URI in a .inc file is 
useful for package manager dependencies. The upstream project uses 
patches to update the dependency. The .inc file is auto-generated, and 
any manual change needs to be documented; otherwise it is unclear how to 
deal with the change during the next update. The changes in the .inc 
file generate a lot of noise and complicate the review of manual updates.

Do we really need the possibility to change a SRC_URI outside of the 
lock file if we integrate the package manager into devtool and allow 
the user to manipulate the lock file via common tools and commands?
Stefan Herbrechtsmeier Feb. 12, 2025, 12:42 p.m. UTC | #5
On 11.02.2025 at 23:32, Bruce Ashfield via lists.openembedded.org wrote:
> In message: [OE-core] [RFC PATCH 15/30] classes: add early fetch, unpack and patch support
> on 11/02/2025 Stefan Herbrechtsmeier via lists.openembedded.org wrote:
>
>> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>>
>> Add support for early fetch, unpack and patches task which run before
>> normal patch task. This feature is useful to fetch additional
>> dependencies based on a patched source before the normal unpack and
>> patch tasks. The patch are marked as early via an early=1 parameter. An
>> example use case is a patch for a package manager lock file (Cargo.lock,
>> go.sum, package-lock.json).
> I understand why you need to do the above, but there's now multiple
> fetch and patch tasks running throughout the pipeline of the
> build and that's just more complexity to maintain.

The tasks call the same functions and the difference is really small. The 
early task simply uses a filtered list of the SRC_URI. At the moment the 
gitsm fetcher does the same, but without the possibility to patch a 
dependency.

> This is what I was asking about in some of my first replies
> to the RFC series. When I've done this in the past, I'd just
> suggest something simpler like the recipe provide a complete
> "lock file" in the layer/recipe-space and have it overlayed
> in the source.

This was already supported by my first patch series. The disadvantage is 
a lot of noise during each update.

> Of course that means it is in the same fetch task as what
> will be fetching the vendor bits, but coordinating within
> the single task is simpler in my experience (and it can
> be checked on the SRC_URI). I haven't fully thought through
> how to manage it, but wanted to reply to this while it was
> fresh in my mind.

It is simpler because you manually fetch the lock file, remove the 
possibility to patch the file and skip the unpack. But you still need to 
manually fetch, unpack and maybe patch the source, and copy the file into 
the meta layer.

It is also possible to integrate the early and resolve tasks into the 
fetch task, but I'm unsure whether this reduces the complexity.
Bruce Ashfield Feb. 12, 2025, 1:55 p.m. UTC | #6
On Wed, Feb 12, 2025 at 7:42 AM Stefan Herbrechtsmeier <stefan.herbrechtsmeier-oss@weidmueller.com> wrote:

>
> On 11.02.2025 at 23:32, Bruce Ashfield via lists.openembedded.org wrote:
>
> In message: [OE-core] [RFC PATCH 15/30] classes: add early fetch, unpack and patch support
> on 11/02/2025 Stefan Herbrechtsmeier via lists.openembedded.org wrote:
>
>
> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>
> Add support for early fetch, unpack and patches task which run before
> normal patch task. This feature is useful to fetch additional
> dependencies based on a patched source before the normal unpack and
> patch tasks. The patch are marked as early via an early=1 parameter. An
> example use case is a patch for a package manager lock file (Cargo.lock,
> go.sum, package-lock.json).
>
> I understand why you need to do the above, but there's now multiple
> fetch and patch tasks running throughout the pipeline of the
> build and that's just more complexity to maintain.
>
> The tasks call the same functions and the difference is really low. The
> early task simple use a filtered list of the SRC_URI. At the moment the
> gitsm fetcher is doing the same but without a possibility to patch a
> dependency.
>
> This is what I was asking about in some of my first replies
> to the RFC series. When I've done this in the past, I'd just
> suggest something simpler like the recipe provide a complete
> "lock file" in the layer/recipe-space and have it overlayed
> in the source.
>
> This was already supported by my first patch series. The disadvantage is a
> lot of noise during each update.
>
> Of course that means it is in the same fetch task as what
> will be fetching the vendor bits, but coordinating within
> the single task is simpler in my experience (and it can
> be checked on the SRC_URI). I haven't fully thought through
> how to manage it, but wanted to reply to this while it was
> fresh in my mind.
>
> It is simpler because you manual fetch the lock file, remove the
> possibility to patch the file and skip the unpack. But you still need a
> manual fetch, unpack and maybe patch the source and copy the file into the
> meta layer.
>

That's a completely different type of complexity. Richard and I are only
talking about the complexity after "bitbake <foo>" has been called.

>
> It is also possibility to integrate the early and resolve task into fetch
> task but I'm unsure if this simplify the complexity.
>

Having direct calls to additional routines via conditional is less complex
than tasks (dealing with timing, scheduling, code maintenance, etc), so that
is preferable from at least my point of view.
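
A minimal sketch of the "direct calls via conditional" shape, in plain Python. All names here (`fetch_sources`, `resolve_dependencies`, the `lock_file` argument and the example URLs) are hypothetical; the point is only that one function conditionally calls the extra routine instead of scheduling separate tasks.

```python
def fetch_sources(uris, log):
    """Stand-in for the normal fetch step; records what would be fetched."""
    for uri in uris:
        log.append("fetch " + uri)


def resolve_dependencies(lock_file, log):
    """Stand-in for reading a lock file and returning extra URIs to fetch."""
    log.append("resolve " + lock_file)
    return ["https://example.invalid/dep-1.0.tar.gz"]


def do_fetch(uris, lock_file=None):
    """Single fetch entry point; the extra work happens only when needed."""
    log = []
    fetch_sources(uris, log)
    if lock_file is not None:
        # Conditional direct call instead of separate early tasks.
        extra = resolve_dependencies(lock_file, log)
        fetch_sources(extra, log)
    return log


plain = do_fetch(["https://example.invalid/project-1.0.tar.gz"])
vendored = do_fetch(["https://example.invalid/project-1.0.tar.gz"], "Cargo.lock")
```

A recipe without a lock file takes the short path, and the ordering inside `do_fetch` is fixed by the code rather than by task scheduling.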

Bruce
Stefan Herbrechtsmeier Feb. 12, 2025, 2:40 p.m. UTC | #7
On 12.02.2025 at 14:55, Bruce Ashfield wrote:
>
>
> On Wed, Feb 12, 2025 at 7:42 AM Stefan Herbrechtsmeier 
> <stefan.herbrechtsmeier-oss@weidmueller.com> wrote:
>
>
>     On 11.02.2025 at 23:32, Bruce Ashfield via
>     lists.openembedded.org wrote:
>>     In message: [OE-core] [RFC PATCH 15/30] classes: add early fetch, unpack and patch support
>>     on 11/02/2025 Stefan Herbrechtsmeier via lists.openembedded.org wrote:
>>
>>>     From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>>>
>>>     Add support for early fetch, unpack and patches task which run before
>>>     normal patch task. This feature is useful to fetch additional
>>>     dependencies based on a patched source before the normal unpack and
>>>     patch tasks. The patch are marked as early via an early=1 parameter. An
>>>     example use case is a patch for a package manager lock file (Cargo.lock,
>>>     go.sum, package-lock.json).
>>     I understand why you need to do the above, but there's now multiple
>>     fetch and patch tasks running throughout the pipeline of the
>>     build and that's just more complexity to maintain.
>
>     The tasks call the same functions and the difference is really
>     low. The early task simple use a filtered list of the SRC_URI. At
>     the moment the gitsm fetcher is doing the same but without a
>     possibility to patch a dependency.
>
>>     This is what I was asking about in some of my first replies
>>     to the RFC series. When I've done this in the past, I'd just
>>     suggest something simpler like the recipe provide a complete
>>     "lock file" in the layer/recipe-space and have it overlayed
>>     in the source.
>
>     This was already supported by my first patch series. The
>     disadvantage is a lot of noise during each update.
>
>>     Of course that means it is in the same fetch task as what
>>     will be fetching the vendor bits, but coordinating within
>>     the single task is simpler in my experience (and it can
>>     be checked on the SRC_URI). I haven't fully thought through
>>     how to manage it, but wanted to reply to this while it was
>>     fresh in my mind.
>
>     It is simpler because you manual fetch the lock file, remove the
>     possibility to patch the file and skip the unpack. But you still
>     need a manual fetch, unpack and maybe patch the source and copy
>     the file into the meta layer.
>
>
> That's a completely different type of complexity. Richard and I are 
> only talking
> about the complexity after "bitbake <foo>" has been called.
>
>
>     It is also possibility to integrate the early and resolve task
>     into fetch task but I'm unsure if this simplify the complexity.
>
>
> Having direct calls to additional routines via conditional is less 
> complex than
> tasks (dealing with timing, scheduling, code maintenance, etc), so that is
> preferable from at least my point of view.
Okay, I thought separate tasks were preferred. What is the alternative to 
EXPORT_FUNCTIONS in this case? I need to call different functions 
depending on the inherited class.
Stefan Herbrechtsmeier Feb. 12, 2025, 4:23 p.m. UTC | #8
On 12.02.2025 at 12:08, Alexander Kanavin wrote:
> On Tue, 11 Feb 2025 at 16:01, Stefan Herbrechtsmeier via
> lists.openembedded.org
> <stefan.herbrechtsmeier-oss=weidmueller.com@lists.openembedded.org>
> wrote:
>> Add support for early fetch, unpack and patches task which run before
>> normal patch task. This feature is useful to fetch additional
>> dependencies based on a patched source before the normal unpack and
>> patch tasks. The patch are marked as early via an early=1 parameter. An
>> example use case is a patch for a package manager lock file (Cargo.lock,
>> go.sum, package-lock.json).
> Why there is a need for additional tasks running before regular tasks?
> Can't we do regular fetch/unpack/patch for the root item, then run
> fetch_vendor_dependencies, unpack_vendor_dependencies,
> patch_vendor_dependencies?

I use the early task approach to keep the old behavior. Other classes, like 
the archiver class, depend on the unpack task and expect that everything 
is unpacked after the task. Additionally, users use the fetch command to 
fetch the dependencies. Therefore I add additional tasks before the 
common tasks to keep the known behavior.

> I see that others expressed dislike for any additional tasks in
> principle, but I don't think that can be avoided? You can't know what
> else needs to be fetched until the root item is unpacked and
> inspected. I'm ok with that if the additional tasks are clearly named,
> and implementations are short, sweet and obvious.

It should be possible to move the early task code into a prefunc of the 
fetch task to avoid additional tasks.
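
The prefunc idea can be pictured with a toy model in plain Python. This is not BitBake's task runner; `run_task`, `early_resolve` and the dictionaries are hypothetical and only show that functions listed under a task's `prefuncs` flag run before the task body, inside the same task.

```python
def run_task(name, functions, flags, log):
    """Run the functions named in the task's 'prefuncs' flag, then the task."""
    for prefunc in flags.get(name, {}).get("prefuncs", "").split():
        functions[prefunc](log)
    functions[name](log)


def early_resolve(log):
    """Stand-in for early unpack, early patch and dependency resolution."""
    log.append("early_resolve: prepare lock file and resolve dependencies")


def do_fetch(log):
    """Stand-in for the regular fetch of all sources."""
    log.append("do_fetch: fetch all sources")


FUNCTIONS = {"early_resolve": early_resolve, "do_fetch": do_fetch}
# Space-separated function names, similar in spirit to a task flag.
FLAGS = {"do_fetch": {"prefuncs": "early_resolve"}}

log = []
run_task("do_fetch", FUNCTIONS, FLAGS, log)
```

With this shape there is still only one fetch task from the scheduler's point of view, and the early work is an implementation detail of it.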
Patch

diff --git a/meta/classes-global/patch.bbclass b/meta/classes-global/patch.bbclass
index e5786b1c9a..7fa94c7aa7 100644
--- a/meta/classes-global/patch.bbclass
+++ b/meta/classes-global/patch.bbclass
@@ -82,18 +82,18 @@  python patch_task_postfunc() {
             oe.patch.GitApplyTree.commitIgnored("Add changes from %s" % func, dir=srcsubdir, files=['.'], d=d)
 }
 
-def src_patches(d, all=False, expand=True):
+def src_patches(d, all=False, expand=True, early=False):
     import oe.patch
-    return oe.patch.src_patches(d, all, expand)
+    return oe.patch.src_patches(d, all, expand, early)
 
-def should_apply(parm, d):
+def should_apply(parm, d, early=False):
     """Determine if we should apply the given patch"""
     import oe.patch
-    return oe.patch.should_apply(parm, d)
+    return oe.patch.should_apply(parm, d, early)
 
 should_apply[vardepsexclude] = "DATE SRCDATE"
 
-python patch_do_patch() {
+def apply_patches(d, s, early=False):
     import oe.patch
 
     patchsetmap = {
@@ -113,8 +113,6 @@  python patch_do_patch() {
 
     classes = {}
 
-    s = d.getVar('S')
-
     os.putenv('PATH', d.getVar('PATH'))
 
     # We must use one TMPDIR per process so that the "patch" processes
@@ -124,7 +122,7 @@  python patch_do_patch() {
     process_tmpdir = tempfile.mkdtemp()
     os.environ['TMPDIR'] = process_tmpdir
 
-    for patch in src_patches(d):
+    for patch in src_patches(d, early=early):
         _, _, local, _, _, parm = bb.fetch.decodeurl(patch)
 
         if "patchdir" in parm:
@@ -159,6 +157,9 @@  python patch_do_patch() {
 
     bb.utils.remove(process_tmpdir, True)
     del os.environ['TMPDIR']
+
+python patch_do_patch() {
+    apply_patches(d, d.getVar('S'))
 }
 patch_do_patch[vardepsexclude] = "PATCHRESOLVE"
 
diff --git a/meta/classes-recipe/early.bbclass b/meta/classes-recipe/early.bbclass
new file mode 100644
index 0000000000..e458fd8f7b
--- /dev/null
+++ b/meta/classes-recipe/early.bbclass
@@ -0,0 +1,61 @@ 
+# Copyright (C) 2025 Weidmueller Interface GmbH & Co. KG
+# Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
+#
+# SPDX-License-Identifier: MIT
+
+EARLY_UNPACKDIR = "${WORKDIR}/sources-unpack-early"
+
+def get_early_source_dir(d, sourcedir):
+    unpackdir = d.getVar("UNPACKDIR")
+    workdir = d.getVar('WORKDIR')
+    originaldir = unpackdir if sourcedir.startswith(unpackdir) else workdir
+    early_unpackdir = d.getVar("EARLY_UNPACKDIR")
+    return sourcedir.replace(originaldir, early_unpackdir)
+
+python early_do_fetch_early() {
+    fetch_src_uris(d, True)
+}
+addtask fetch_early
+do_fetch_early[dirs] = "${DL_DIR}"
+do_fetch_early[file-checksums] = "${@bb.fetch.get_checksum_file_list(d)}"
+do_fetch[prefuncs] += "fetcher_hashes_dummyfunc"
+do_fetch_early[network] = "1"
+
+python early_do_unpack_early() {
+    unpackdir = d.getVar("EARLY_UNPACKDIR")
+    unpack_src_uris(d, unpackdir, True)
+}
+addtask unpack_early after do_fetch_early
+do_unpack_early[cleandirs] = "${EARLY_UNPACKDIR}"
+
+python early_do_patch_early() {
+    source_dir = d.getVar("S")
+    source_dir = get_early_source_dir(d, source_dir)
+    apply_patches(d, source_dir, True)
+}
+addtask patch_early after do_unpack_early
+do_patch_early[dirs] = "${WORKDIR}"
+do_patch_early[depends] = "${PATCHDEPENDENCY}"
+do_patch_early[vardepsexclude] = "PATCHRESOLVE"
+
+python () {
+    import bb.fetch
+    src_uris = get_src_uris(d, True)
+    for src_uri in src_uris:
+        uri = bb.fetch.URI(src_uri)
+        path = uri.params.get("downloadfilename", uri.path)
+
+        # HTTP/FTP use the wget fetcher
+        if uri.scheme in ("http", "https", "ftp"):
+            d.appendVarFlag('do_fetch_early', 'depends', ' wget-native:do_populate_sysroot')
+
+        # Git packages should DEPEND on git-native
+        elif uri.scheme in ("git", "gitsm"):
+            d.appendVarFlag('do_fetch_early', 'depends', ' git-native:do_populate_sysroot')
+
+        # *.xz should DEPEND on xz-native for unpacking
+        if path.endswith('.xz') or path.endswith('.txz'):
+            d.appendVarFlag('do_fetch_early', 'depends', ' xz-native:do_populate_sysroot')
+}
+
+EXPORT_FUNCTIONS do_fetch_early do_unpack_early do_patch_early
diff --git a/meta/lib/oe/patch.py b/meta/lib/oe/patch.py
index 58c6e34fe8..7737011e5a 100644
--- a/meta/lib/oe/patch.py
+++ b/meta/lib/oe/patch.py
@@ -904,7 +904,7 @@  def patch_path(url, fetch, unpackdir, expand=True):
 
     return local
 
-def src_patches(d, all=False, expand=True):
+def src_patches(d, all=False, expand=True, early=False):
     unpackdir = d.getVar('UNPACKDIR')
     fetch = bb.fetch2.Fetch([], d)
     patches = []
@@ -921,7 +921,7 @@  def src_patches(d, all=False, expand=True):
         parm = urldata.parm
         patchname = parm.get('pname') or os.path.basename(local)
 
-        apply, reason = should_apply(parm, d)
+        apply, reason = should_apply(parm, d, early)
         if not apply:
             if reason:
                 bb.note("Patch %s %s" % (patchname, reason))
@@ -950,8 +950,12 @@  def src_patches(d, all=False, expand=True):
     return patches
 
 
-def should_apply(parm, d):
+def should_apply(parm, d, early=False):
     import bb.utils
+
+    if early and not bb.utils.to_boolean(parm.get('early'), False):
+        return False, "applies to normal patch task only"
+
     if "mindate" in parm or "maxdate" in parm:
         pn = d.getVar('PN')
         srcdate = d.getVar('SRCDATE_%s' % pn)