diff mbox series

package_manager : Add configuration option to select filterbydependencies

Message ID 20241025161627.3352-1-sreejith.ravi087@gmail.com
State New
Headers show
Series package_manager : Add configuration option to select filterbydependencies | expand

Commit Message

Sreejith Ravi Oct. 25, 2024, 4:16 p.m. UTC
Currently, filterbydependencies is always set with True, which results in
the creation of `oe-rootfs-repo`. If filterbydependencies is not set,
it creates `oe-rootfs-repo` as a symlink to the "DEPLOY_DIR_<IPK/RPM/DEB>"
directory. With this patch, users can configure this behavior according
to their use cases using the variable `DISABLE_FILTER_BY_DEPENDENCY`.

Signed-off-by: Sreejith Ravi <sreejith.ravi087@gmail.com>
---
 meta/conf/bitbake.conf                    | 3 +++
 meta/lib/oe/package_manager/__init__.py   | 1 +
 meta/lib/oe/package_manager/deb/rootfs.py | 4 +++-
 meta/lib/oe/package_manager/ipk/rootfs.py | 7 +++++--
 meta/lib/oe/package_manager/rpm/rootfs.py | 4 +++-
 5 files changed, 15 insertions(+), 4 deletions(-)

Comments

Alexander Kanavin Oct. 25, 2024, 4:26 p.m. UTC | #1
On Fri, 25 Oct 2024 at 18:17, Sreejith Ravi via lists.openembedded.org
<sreejith.ravi087=gmail.com@lists.openembedded.org> wrote:
> Currently, filterbydependencies is always set with True, which results in
> the creation of `oe-rootfs-repo`. If filterbydependencies is not set,
> it creates `oe-rootfs-repo` as a symlink to the "DEPLOY_DIR_<IPK/RPM/DEB>"
> directory. With this patch, users can configure this behavior according
> to their use cases using the variable `DISABLE_FILTER_BY_DEPENDENCY`.

You do need to explain why. We've had this before image-specific
rootfs repositories were implemented, and it was causing issues when
building images from deploy directories that have unrelated packages
and/or other bitbake tasks writing into them.

Alex
Sreejith Ravi Oct. 25, 2024, 6:19 p.m. UTC | #2
On Fri, Oct 25, 2024 at 5:26 PM Alexander Kanavin <alex.kanavin@gmail.com>
wrote:

> On Fri, 25 Oct 2024 at 18:17, Sreejith Ravi via lists.openembedded.org
> <sreejith.ravi087=gmail.com@lists.openembedded.org> wrote:
> > Currently, filterbydependencies is always set with True, which results in
> > the creation of `oe-rootfs-repo`. If filterbydependencies is not set,
> > it creates `oe-rootfs-repo` as a symlink to the
> "DEPLOY_DIR_<IPK/RPM/DEB>"
> > directory. With this patch, users can configure this behavior according
> > to their use cases using the variable `DISABLE_FILTER_BY_DEPENDENCY`.
>
> You do need to explain why. We've had this before image-specific
> rootfs repositories were implemented, and it was causing issues when
> building images from deploy directories that have unrelated packages
> and/or other bitbake tasks writing into them.
>
> Alex
>

We aim to use released IPK files from a previous release build instead of
building from source. During package processing, we skip all build tasks
for the package and copy it to the deploy directory, adjusting the feeds
accordingly. With a customized opkg configuration, we can set the feeds
either from a remote server or the deploy directory. Since we are using the
released IPK files, there is no need to filter by dependencies, as this has
already been handled during the release of the IPK.

This patch provides an additional option to use the deploy directory as an
IPK repository. By default, this option is disabled but can be enabled if
needed.

Can you please share the details of the issues encountered when building
images from deploy directories?

Cheers
Sreejith
Alexander Kanavin Oct. 28, 2024, 6:52 p.m. UTC | #3
On Fri, 25 Oct 2024 at 20:19, Sreejith Ravi <sreejith.ravi087@gmail.com> wrote:
> We aim to use released IPK files from a previous release build instead of building from source. During package processing, we skip all build tasks for the package and copy it to the deploy directory, adjusting the feeds accordingly. With a customized opkg configuration, we can set the feeds either from a remote server or the deploy directory. Since we are using the released IPK files, there is no need to filter by dependencies, as this has already been handled during the release of the IPK.
>
> This patch provides an additional option to use the deploy directory as an IPK repository. By default, this option is disabled but can be enabled if needed.

Unfortunately I am strongly against supporting this use case in
openembedded core, and I think many will agree. OE's main feature is
end to end builds from source that are also reproducible. Sstate is
the way to rebuild only what's needed, and you need to further explain
why it isn't being used instead. All code in rootfs creation assumes
that packages used to compose an image are produced through the
regular build process or come from sstate, and not imported from
elsewhere.

If you must do things differently then you need to implement your own
set of classes (perhaps subclassing what's there), and set
PACKAGES_CLASSES = "package_prebuilt_ipk" or similar.

Alex
Sreejith Ravi Nov. 11, 2024, 9:53 a.m. UTC | #4
Hi

We are not changing the default behavior of Yocto but are instead using our
own custom class to handle IPK.

Since create_packages_dir is a function in rootfs.py, it would be
beneficial to add an option allowing users to select either the deploy
directory or oe-rootfs-repo. This would also save the copy time to
oe-rootfs when it is not required for their projects.

Currently, in create_packages_dir, the option to check whether to create
"oe-rootfs-repo" or use a symlink from "DEPLOY_DIR_IPK" to "oe-rootfs-repo"
is ineffective with filterbydependencies, as it is always set to True. Our
request is to add a configuration option for this behavior; otherwise, this
check serves no purpose when used with filterbydependencies.

    if nodeps or not filterbydependencies:
        for arch in d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").split() +
d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").replace("-", "_").split():
            target = os.path.join(deploydir + "/" + arch)
            if os.path.exists(target):
                oe.path.symlink(target, subrepo_dir + "/" + arch, True)
        return


This patch does not change the existing functionality. By default, it is
set to True (consistent with default Yocto behavior), but users can adjust
it as needed.

Thanks
Sreejith

On Mon, Oct 28, 2024 at 6:52 PM Alexander Kanavin <alex.kanavin@gmail.com>
wrote:

> On Fri, 25 Oct 2024 at 20:19, Sreejith Ravi <sreejith.ravi087@gmail.com>
> wrote:
> > We aim to use released IPK files from a previous release build instead
> of building from source. During package processing, we skip all build tasks
> for the package and copy it to the deploy directory, adjusting the feeds
> accordingly. With a customized opkg configuration, we can set the feeds
> either from a remote server or the deploy directory. Since we are using the
> released IPK files, there is no need to filter by dependencies, as this has
> already been handled during the release of the IPK.
> >
> > This patch provides an additional option to use the deploy directory as
> an IPK repository. By default, this option is disabled but can be enabled
> if needed.
>
> Unfortunately I am strongly against supporting this use case in
> openembedded core, and I think many will agree. OE's main feature is
> end to end builds from source that are also reproducible. Sstate is
> the way to rebuild only what's needed, and you need to further explain
> why it isn't being used instead. All code in rootfs creation assumes
> that packages used to compose an image are produced through the
> regular build process or come from sstate, and not imported from
> elsewhere.
>
> If you must do things differently then you need to implement your own
> set of classes (perhaps subclassing what's there), and set
> PACKAGES_CLASSES = "package_prebuilt_ipk" or similar.
>
> Alex
>
Richard Purdie Nov. 11, 2024, 1:08 p.m. UTC | #5
On Mon, 2024-11-11 at 09:53 +0000, Sreejith Ravi wrote:
> We are not changing the default behavior of Yocto but are instead
> using our own custom class to handle IPK.
> 
> Since create_packages_dir is a function in rootfs.py, it would be
> beneficial to add an option allowing users to select either the
> deploy directory or oe-rootfs-repo. This would also save the copy
> time to oe-rootfs when it is not required for their projects.

For reference, the copy time here is minimal since the copy is created
using hardlinks to the files.

The reason this is done is to filter the packages to the ones directly
in the recipe's dependency chain. This is because files can be written
to tmp/deploy/ while an image build is running and this can cause build
failures.

For example consider core-image-sato-sdk and core-image-minimal being
built in parallel. The minimal rootfs will happen before the sato-sdk
one but if they all used the same directory, the minimal image could
see pieces of the SDK image. The files may be seen partially through
the write too.

To avoid this and more complex forms of the race (where things are
removed as they're rebuilt due to changes), we create the hardlinked
copy of only the packages a given image recipe should "see".

> Currently, in create_packages_dir, the option to check whether to
> create "oe-rootfs-repo" or use a symlink from "DEPLOY_DIR_IPK" to
> "oe-rootfs-repo" is ineffective with filterbydependencies, as it is
> always set to True. Our request is to add a configuration option for
> this behavior; otherwise, this check serves no purpose when used with
> filterbydependencies.
> 
>     if nodeps or not filterbydependencies:
>         for arch in d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").split() +
> d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").replace("-", "_").split():
>             target = os.path.join(deploydir + "/" + arch)
>             if os.path.exists(target):
>                 oe.path.symlink(target, subrepo_dir + "/" + arch,
> True)
>         return
> 
> 
> This patch does not change the existing functionality. By default, it
> is set to True (consistent with default Yocto behavior), but users
> can adjust it as needed.

I'm not saying yes or no to the patch but you have to understand that
adding an extra parameter means we have to support that code path.
Should we include it in our test matrix? If we do, there is the cost of
testing it but it should work. If we don't, there is a significant risk
it might stop working at some point. There is therefore a cost/risk to
both directions if we do add it.

For me, the bigger concern is that we may then see people using this
without understanding the race concerns above too.

Cheers,

Richard
Sreejith Ravi Nov. 12, 2024, 10:26 a.m. UTC | #6
Thanks for the detailed explanation. I understand the hesitancy around
adding an extra parameter, especially given the cost and maintenance
implications for both testing and long-term support. If we decide to
incorporate it, I agree that it would be essential to include it in our
test matrix to ensure it works reliably and doesn’t degrade over time.

One possible race condition could occur when there’s a parallel build for
different images with distinct sets of required packages. For example, if
one process is running an opkg update while another is creating an index
with an opkg make index at the same time, this could cause the root
filesystem image installation to fail.

Regarding your concern about potential misuse and the risk of race
conditions like above, I share that reservation. Would it be helpful if we
documented these concerns prominently or added clear guidance on how to use
this parameter safely?

Thanks again for the details, which help me better understand Yocto
internals.

Sreejith

On Mon, Nov 11, 2024 at 1:08 PM Richard Purdie <
richard.purdie@linuxfoundation.org> wrote:

> On Mon, 2024-11-11 at 09:53 +0000, Sreejith Ravi wrote:
> > We are not changing the default behavior of Yocto but are instead
> > using our own custom class to handle IPK.
> >
> > Since create_packages_dir is a function in rootfs.py, it would be
> > beneficial to add an option allowing users to select either the
> > deploy directory or oe-rootfs-repo. This would also save the copy
> > time to oe-rootfs when it is not required for their projects.
>
> For reference, the copy time here is minimal since the copy is created
> using hardlinks to the files.
>
> The reason this is done is to filter the packages to the ones directly
> in the recipe's dependency chain. This is because files can be written
> to tmp/deploy/ while an image build is running and this can cause build
> failures.
>
> For example consider core-image-sato-sdk and core-image-minimal being
> built in parallel. The minimal rootfs will happen before the sato-sdk
> one but if they all used the same directory, the minimal image could
> see pieces of the SDK image. The files may be seen partially through
> the write too.
>
> To avoid this and more complex forms of the race (where things are
> removed as they're rebuilt due to changes), we create the hardlinked
> copy of only the packages a given image recipe should "see".
>
> > Currently, in create_packages_dir, the option to check whether to
> > create "oe-rootfs-repo" or use a symlink from "DEPLOY_DIR_IPK" to
> > "oe-rootfs-repo" is ineffective with filterbydependencies, as it is
> > always set to True. Our request is to add a configuration option for
> > this behavior; otherwise, this check serves no purpose when used with
> > filterbydependencies.
> >
> >     if nodeps or not filterbydependencies:
> >         for arch in d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").split() +
> > d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").replace("-", "_").split():
> >             target = os.path.join(deploydir + "/" + arch)
> >             if os.path.exists(target):
> >                 oe.path.symlink(target, subrepo_dir + "/" + arch,
> > True)
> >         return
> >
> >
> > This patch does not change the existing functionality. By default, it
> > is set to True (consistent with default Yocto behavior), but users
> > can adjust it as needed.
>
> I'm not saying yes or no to the patch but you have to understand that
> adding an extra parameter means we have to support that code path.
> Should we include it in our test matrix? If we do, there is the cost of
> testing it but it should work. If we don't, there is a significant risk
> it might stop working at some point. There is therefore a cost/risk to
> both directions if we do add it.
>
> For me, the bigger concern is that we may then see people using this
> without understanding the race concerns above too.
>
> Cheers,
>
> Richard
>
>
>
Alexander Kanavin Nov. 12, 2024, 10:31 a.m. UTC | #7
On Tue, 12 Nov 2024 at 11:26, Sreejith Ravi <sreejith.ravi087@gmail.com> wrote:
> Regarding your concern about potential misuse and the risk of race conditions like above, I share that reservation. Would it be helpful if we documented these concerns prominently or added clear guidance on how to use this parameter safely?

I would still suggest that you need to implement your own packaging
class, and not try to reuse classes from oe-core. If that class needs
tweaks to the base implementations, we can consider them.

Alex
diff mbox series

Patch

diff --git a/meta/conf/bitbake.conf b/meta/conf/bitbake.conf
index eda505c861..1b3ef40782 100644
--- a/meta/conf/bitbake.conf
+++ b/meta/conf/bitbake.conf
@@ -992,3 +992,6 @@  oe.path.format_display[vardepsexclude] = "TOPDIR"
 oe.utils.get_bb_number_threads[vardepsexclude] = "BB_NUMBER_THREADS"
 oe.packagedata.emit_pkgdata[vardepsexclude] = "BB_NUMBER_THREADS"
 oe.packagedata.read_subpkgdata_extended[vardepsexclude] = "BB_NUMBER_THREADS"
+
+# To configure the filterbydependencies feature for rootfs generation to set the feeds.
+DISABLE_FILTER_BY_DEPENDENCY ?= "0"
diff --git a/meta/lib/oe/package_manager/__init__.py b/meta/lib/oe/package_manager/__init__.py
index 2100a97c12..806bebfae4 100644
--- a/meta/lib/oe/package_manager/__init__.py
+++ b/meta/lib/oe/package_manager/__init__.py
@@ -467,6 +467,7 @@  def create_packages_dir(d, subrepo_dir, deploydir, taskname, filterbydependencie
 
     # Detect bitbake -b usage
     nodeps = d.getVar("BB_LIMITEDDEPS") or False
+    bb.note("filterbydependencies %s" %filterbydependencies)
     if nodeps or not filterbydependencies:
         for arch in d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").split() + d.getVar("ALL_MULTILIB_PACKAGE_ARCHS").replace("-", "_").split():
             target = os.path.join(deploydir + "/" + arch)
diff --git a/meta/lib/oe/package_manager/deb/rootfs.py b/meta/lib/oe/package_manager/deb/rootfs.py
index 1e25b64ed9..04fd55c485 100644
--- a/meta/lib/oe/package_manager/deb/rootfs.py
+++ b/meta/lib/oe/package_manager/deb/rootfs.py
@@ -134,9 +134,11 @@  class PkgRootfs(DpkgOpkgRootfs):
         bb.utils.remove(self.image_rootfs, True)
         bb.utils.remove(self.d.getVar('MULTILIB_TEMP_ROOTFS'), True)
         self.manifest = PkgManifest(d, manifest_dir)
+        filter_deps = False if d.getVar("DISABLE_FILTER_BY_DEPENDENCY") == "1" else True
         self.pm = DpkgPM(d, d.getVar('IMAGE_ROOTFS'),
                          d.getVar('PACKAGE_ARCHS'),
-                         d.getVar('DPKG_ARCH'))
+                         d.getVar('DPKG_ARCH'),
+                         filterbydependencies=filter_deps)
 
 
     def _create(self):
diff --git a/meta/lib/oe/package_manager/ipk/rootfs.py b/meta/lib/oe/package_manager/ipk/rootfs.py
index ba93eb62ea..52d0785ef9 100644
--- a/meta/lib/oe/package_manager/ipk/rootfs.py
+++ b/meta/lib/oe/package_manager/ipk/rootfs.py
@@ -131,6 +131,7 @@  class PkgRootfs(DpkgOpkgRootfs):
         self.manifest = PkgManifest(d, manifest_dir)
         self.opkg_conf = self.d.getVar("IPKGCONF_TARGET")
         self.pkg_archs = self.d.getVar("ALL_MULTILIB_PACKAGE_ARCHS")
+        filter_deps = False if d.getVar("DISABLE_FILTER_BY_DEPENDENCY") == "1" else True
 
         self.inc_opkg_image_gen = self.d.getVar('INC_IPK_IMAGE_GEN') or ""
         if self._remove_old_rootfs():
@@ -138,12 +139,14 @@  class PkgRootfs(DpkgOpkgRootfs):
             self.pm = OpkgPM(d,
                              self.image_rootfs,
                              self.opkg_conf,
-                             self.pkg_archs)
+                             self.pkg_archs,
+                             filterbydependencies=filter_deps)
         else:
             self.pm = OpkgPM(d,
                              self.image_rootfs,
                              self.opkg_conf,
-                             self.pkg_archs)
+                             self.pkg_archs,
+                             filterbydependencies=filter_deps)
             self.pm.recover_packaging_data()
 
         bb.utils.remove(self.d.getVar('MULTILIB_TEMP_ROOTFS'), True)
diff --git a/meta/lib/oe/package_manager/rpm/rootfs.py b/meta/lib/oe/package_manager/rpm/rootfs.py
index 3ba5396320..361a77ca7f 100644
--- a/meta/lib/oe/package_manager/rpm/rootfs.py
+++ b/meta/lib/oe/package_manager/rpm/rootfs.py
@@ -16,12 +16,14 @@  class PkgRootfs(Rootfs):
         self.log_check_regex = r'(unpacking of archive failed|Cannot find package'\
                                r'|exit 1|ERROR: |Error: |Error |ERROR '\
                                r'|Failed |Failed: |Failed$|Failed\(\d+\):)'
+        filter_deps = False if d.getVar("DISABLE_FILTER_BY_DEPENDENCY") == "1" else True
 
         self.manifest = PkgManifest(d, manifest_dir)
 
         self.pm = RpmPM(d,
                         d.getVar('IMAGE_ROOTFS'),
-                        self.d.getVar('TARGET_VENDOR')
+                        self.d.getVar('TARGET_VENDOR'),
+                        filterbydependencies=filter_deps
                         )
 
         self.inc_rpm_image_gen = self.d.getVar('INC_RPM_IMAGE_GEN')