Message ID | 20230906165633.2382629-16-alex@linutronix.de |
---|---|
State | New |
Headers | show |
Series | [01/17] openssl: build and install manpages only if they are enabled | expand |
On Wed, 2023-09-06 at 18:56 +0200, Alexander Kanavin wrote: > Target task is using executables populated by the native task > and as they run in parallel, races can occur. > > This was triggered by shadow recipe update which added depedendent libraries, > and where half-populated native sysroot (dependent libraries missing) > was triggering useradd failures. > > Presence or absence of useradd itself is a soft failure, and so was previously unnoticed. > > Signed-off-by: Alexander Kanavin <alex@linutronix.de> > --- > meta/recipes-core/meta/build-sysroots.bb | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/meta/recipes-core/meta/build-sysroots.bb b/meta/recipes-core/meta/build-sysroots.bb > index 1a3b692a1b1..ac74dda22c4 100644 > --- a/meta/recipes-core/meta/build-sysroots.bb > +++ b/meta/recipes-core/meta/build-sysroots.bb > @@ -42,6 +42,6 @@ python do_build_target_sysroot () { > } > do_build_target_sysroot[cleandirs] = "${STANDALONE_SYSROOT}" > do_build_target_sysroot[nostamp] = "1" > -addtask do_build_target_sysroot before do_build > +addtask do_build_target_sysroot before do_build after do_build_native_sysroot > > do_clean[cleandirs] += "${STANDALONE_SYSROOT} ${STANDALONE_SYSROOT_NATIVE}" This doesn't just force ordering but means target will always now trigger the native sysroot as a dependency. We don't want to do that... Cheers, Richard
On Wed, 6 Sept 2023 at 21:23, Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > > do_clean[cleandirs] += "${STANDALONE_SYSROOT} ${STANDALONE_SYSROOT_NATIVE}" > > This doesn't just force ordering but means target will always now > trigger the native sysroot as a dependency. We don't want to do that... But I think we do. Target is using executables from native sysroot such as user management utilities from shadow-native, and they won't be otherwise present there. Alex
On Wed, 6 Sept 2023 at 21:28, Alexander Kanavin via lists.openembedded.org <alex.kanavin=gmail.com@lists.openembedded.org> wrote: > > This doesn't just force ordering but means target will always now > > trigger the native sysroot as a dependency. We don't want to do that... > > But I think we do. Target is using executables from native sysroot > such as user management utilities from shadow-native, and they won't > be otherwise present there. By the way extend_recipe_sysroot() is already doing: staging_populate_sysroot_dir(recipesysroot, recipesysrootnative, True, d) staging_populate_sysroot_dir(recipesysroot, recipesysrootnative, False, d) so just calling one after the other in a single task. Why build-sysroots recipe has those as separate tasks I'm not sure, but they should either be folded into one task, or be strictly ordered and required. Unless I am missing something. Alex
On Wed, 2023-09-06 at 21:28 +0200, Alexander Kanavin wrote: > On Wed, 6 Sept 2023 at 21:23, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > > > do_clean[cleandirs] += "${STANDALONE_SYSROOT} ${STANDALONE_SYSROOT_NATIVE}" > > > > This doesn't just force ordering but means target will always now > > trigger the native sysroot as a dependency. We don't want to do that... > > But I think we do. Target is using executables from native sysroot > such as user management utilities from shadow-native, and they won't > be otherwise present there. Firstly, that recipe was a bit of a hack. In bygone times, we had a shared sysroot and this was effectively creating it for the places we hadn't converted to use recipe specific sysroots. There should only be a small number of places it is used where we haven't found a better way. When you use it, the calling code first has to ensure the things it wants in the sysroots have been built. The caller is therefore expected to build the right set of dependencies. Putting ordering constraints into the recipe is going to make people think it all happens by magic. It doesn't and the caller has to be careful. I'm reluctant to add any dependencies when it should be clear the caller is entirely responsible for this. Cheers, Richard
On Wed, 6 Sept 2023 at 22:13, Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > Firstly, that recipe was a bit of a hack. In bygone times, we had a > shared sysroot and this was effectively creating it for the places we > hadn't converted to use recipe specific sysroots. There should only be > a small number of places it is used where we haven't found a better > way. > > When you use it, the calling code first has to ensure the things it > wants in the sysroots have been built. The caller is therefore expected > to build the right set of dependencies. > > Putting ordering constraints into the recipe is going to make people > think it all happens by magic. It doesn't and the caller has to be > careful. > > I'm reluctant to add any dependencies when it should be clear the > caller is entirely responsible for this. This commit was prompted by seeing these intermittent races with the upgraded shadow: https://autobuilder.yoctoproject.org/typhoon/#/builders/146/builds/468/steps/12/logs/stdio https://autobuilder.yoctoproject.org/typhoon/#/builders/147/builds/467/steps/12/logs/stdio What happens here is: - target sysroot population relies on being able to run useradd without errors. It will also skip the useradd execution if useradd is not present. - useradd comes from the native sysroot - native sysroot is being populated at the same time as target sysroot is because the build-sysroots recipe doesn't constrain that - this can result in races where useradd executable is already there, but the libraries it needs (libbsd, libattr, libmd) are not How would the caller ensure this doesn't happen? I think it's reasonable to assume 'bitbake build-sysroots' should complete without intermittent failures. Alex
On Wed, 2023-09-06 at 22:28 +0200, Alexander Kanavin wrote: > On Wed, 6 Sept 2023 at 22:13, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > Firstly, that recipe was a bit of a hack. In bygone times, we had a > > shared sysroot and this was effectively creating it for the places we > > hadn't converted to use recipe specific sysroots. There should only be > > a small number of places it is used where we haven't found a better > > way. > > > > When you use it, the calling code first has to ensure the things it > > wants in the sysroots have been built. The caller is therefore expected > > to build the right set of dependencies. > > > > Putting ordering constraints into the recipe is going to make people > > think it all happens by magic. It doesn't and the caller has to be > > careful. > > > > I'm reluctant to add any dependencies when it should be clear the > > caller is entirely responsible for this. > > This commit was prompted by seeing these intermittent races with the > upgraded shadow: > > https://autobuilder.yoctoproject.org/typhoon/#/builders/146/builds/468/steps/12/logs/stdio > https://autobuilder.yoctoproject.org/typhoon/#/builders/147/builds/467/steps/12/logs/stdio > > What happens here is: > - target sysroot population relies on being able to run useradd > without errors. It will also skip the useradd execution if useradd is > not present. > - useradd comes from the native sysroot > - native sysroot is being populated at the same time as target sysroot > is because the build-sysroots recipe doesn't constrain that > - this can result in races where useradd executable is already there, > but the libraries it needs (libbsd, libattr, libmd) are not > > How would the caller ensure this doesn't happen? I think it's > reasonable to assume 'bitbake build-sysroots' should complete without > intermittent failures. That test/calls were fairly recently added: https://git.yoctoproject.org/poky/commit/meta/lib/oeqa/selftest/cases/meta_ide.py?id=9b3fcb0d91648ae3b53ec8ffcb31fb6eac9209dd That test should probably call: bitbake("build-sysroots -c build_native_sysroot") bitbake("build-sysroots -c build_target_sysroot") om the setup case and then just call the target piece again in the specific test. We could drop the "before do_build" in the build-sysroots recipe. Note that the tasks are nostamp so they will always rerun. It does make sense to have a way to regenerate the target sysroot only but if you change it as you suggest, that becomes impossible. Cheers, Richard
On Wed, 6 Sept 2023 at 22:53, Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > That test/calls were fairly recently added: > > https://git.yoctoproject.org/poky/commit/meta/lib/oeqa/selftest/cases/meta_ide.py?id=9b3fcb0d91648ae3b53ec8ffcb31fb6eac9209dd > > That test should probably call: > > bitbake("build-sysroots -c build_native_sysroot") > bitbake("build-sysroots -c build_target_sysroot") > > om the setup case and then just call the target piece again in the > specific test. > > We could drop the "before do_build" in the build-sysroots recipe. > > Note that the tasks are nostamp so they will always rerun. It does make > sense to have a way to regenerate the target sysroot only but if you > change it as you suggest, that becomes impossible. Setting up the 'direct esdk' would become somewhat more awkward as it does currently rely on being able to run 'bitbake build-sysroots' directly as officially published: (yes the doc formatting needs to be fixed): https://docs.yoctoproject.org/sdk-manual/extensible.html#setting-up-the-extensible-sdk-environment-directly-in-a-yocto-build https://docs.yoctoproject.org/sdk-manual/extensible.html#when-using-the-extensible-sdk-directly-in-a-yocto-build I can fix both the test and the documentation to run first native, then target task explicitly, but I would really prefer to make 'bitbake build-sysroots' just work without chance of failures. Alex
On Wed, 2023-09-06 at 23:27 +0200, Alexander Kanavin wrote: > On Wed, 6 Sept 2023 at 22:53, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > That test/calls were fairly recently added: > > > > https://git.yoctoproject.org/poky/commit/meta/lib/oeqa/selftest/cases/meta_ide.py?id=9b3fcb0d91648ae3b53ec8ffcb31fb6eac9209dd > > > > That test should probably call: > > > > bitbake("build-sysroots -c build_native_sysroot") > > bitbake("build-sysroots -c build_target_sysroot") > > > > om the setup case and then just call the target piece again in the > > specific test. > > > > We could drop the "before do_build" in the build-sysroots recipe. > > > > Note that the tasks are nostamp so they will always rerun. It does make > > sense to have a way to regenerate the target sysroot only but if you > > change it as you suggest, that becomes impossible. > > Setting up the 'direct esdk' would become somewhat more awkward as it > does currently rely on being able to run 'bitbake build-sysroots' > directly as officially published: > (yes the doc formatting needs to be fixed): > > https://docs.yoctoproject.org/sdk-manual/extensible.html#setting-up-the-extensible-sdk-environment-directly-in-a-yocto-build > https://docs.yoctoproject.org/sdk-manual/extensible.html#when-using-the-extensible-sdk-directly-in-a-yocto-build > > I can fix both the test and the documentation to run first native, > then target task explicitly, but I would really prefer to make > 'bitbake build-sysroots' just work without chance of failures. Taking a step back, is user information actually useful in the context of these sysroots? Really, you shouldn't need the native sysroot for the target one. We only have postinsts for sysroots where they were absolutely unavoidable: * useradd * xmlcatalog * ldso/qemu issue * pixbuf Basically, they're on used for "index" creation issues. Ideally we'd not have these things at all, they're horrible to have to hack in. In the context of external SDKs, useradd doesn't make much sense. Even for "in-tree" use, given the significant dependency creep, I'm starting to think we should drop the useradd calls from the postinst script and code something else to create the right passwd/group entries which is all we care about (to keep pseudo working ok for packaging). The reason the dependency creep worries me is I know what the code internal to bitbake does when it hits these dependencies. It is really suboptimal :(. I know it is really tempting just to add dependencies and ignore the deeper issues but some of this really doesn't make sense when you step back and think about it. Cheers, Richard
On Wed, 6 Sept 2023 at 23:44, Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > I know it is really tempting just to add dependencies and ignore the > deeper issues but some of this really doesn't make sense when you step > back and think about it. I sent a different patch that decouples native/target sysroot population tasks from do_build in the build-sysroots recipe and tweaks usage to run them explicitly. I agree that user/group handling at build time needs a redesign: there's just too many corner cases and special handling all over the place currently. I would really just rather place a file into /etc/passwd.d/ and that would amount to 'making' a user. Indexing or other special processing, if needed, can happen at do_rootfs time. Alex
On Thu, 2023-09-07 at 13:05 +0200, Alexander Kanavin wrote: > On Wed, 6 Sept 2023 at 23:44, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > > I know it is really tempting just to add dependencies and ignore the > > deeper issues but some of this really doesn't make sense when you step > > back and think about it. > > I sent a different patch that decouples native/target sysroot > population tasks from do_build in the build-sysroots recipe and tweaks > usage to run them explicitly. > > I agree that user/group handling at build time needs a redesign: > there's just too many corner cases and special handling all over the > place currently. I would really just rather place a file into > /etc/passwd.d/ and that would amount to 'making' a user. Indexing or > other special processing, if needed, can happen at do_rootfs time. It isn't just do_rootfs, it is also package creation as the users/groups are encoded into the packages. Crazy idea - can we statically link the shadow-native binaries? Cheers, Richard
On Thu, 7 Sept 2023 at 13:18, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
> Crazy idea - can we statically link the shadow-native binaries?
Not impossible, but needs a bit of work - libbsd, libattr and libmd
all need to be made available in static versions (via resetting
DISABLE_STATIC), and shadow convinced to use them by injecting -static
into the linker flags at appropriate places. There's no precedent for
this in core that I've seen.
Alex
On Thu, 2023-09-07 at 13:45 +0200, Alexander Kanavin wrote: > On Thu, 7 Sept 2023 at 13:18, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > Crazy idea - can we statically link the shadow-native binaries? > > Not impossible, but needs a bit of work - libbsd, libattr and libmd > all need to be made available in static versions (via resetting > DISABLE_STATIC), and shadow convinced to use them by injecting -static > into the linker flags at appropriate places. There's no precedent for > this in core that I've seen. See meta/conf/distro/include/no-static-libs.inc which does things like: meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-openssl-native = "" meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-nativesdk-openssl = "" I wasn't sure how easy the shadow patch would be, that is hopefully the main tricky part. sqlite-native used to be built allowing static for pseudo iirc but I don't think we need to do that there any more. Cheers, Richard
On Thu, 7 Sept 2023 at 14:04, Richard Purdie <richard.purdie@linuxfoundation.org> wrote: > See meta/conf/distro/include/no-static-libs.inc which does things like: > > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-openssl-native = "" > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-nativesdk-openssl = "" > > I wasn't sure how easy the shadow patch would be, that is hopefully the > main tricky part. sqlite-native used to be built allowing static for > pseudo iirc but I don't think we need to do that there any more. I poked around at passing options from the shadow recipe, but couldn't arrive at a working combination. libtool and automake insert too many abstraction layers on the way to the linker invocation. So there has to be a custom, most likely non-upstreamable patch for Makefile.am files all over the place, which I think is worse than manually unrolling setscene dependencies. Alex
On Mon, 2023-09-11 at 17:19 +0200, Alexander Kanavin wrote: > On Thu, 7 Sept 2023 at 14:04, Richard Purdie > <richard.purdie@linuxfoundation.org> wrote: > > See meta/conf/distro/include/no-static-libs.inc which does things like: > > > > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-openssl-native = "" > > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-nativesdk-openssl = "" > > > > I wasn't sure how easy the shadow patch would be, that is hopefully the > > main tricky part. sqlite-native used to be built allowing static for > > pseudo iirc but I don't think we need to do that there any more. > > I poked around at passing options from the shadow recipe, but couldn't > arrive at a working combination. libtool and automake insert too many > abstraction layers on the way to the linker invocation. > > So there has to be a custom, most likely non-upstreamable patch for > Makefile.am files all over the place, which I think is worse than > manually unrolling setscene dependencies. I was curious to see if I could make anything work, the best I could come up with was this: diff --git a/meta/conf/distro/include/no-static-libs.inc b/meta/conf/distro/include/no-static-libs.inc index 75359928a14..312629f394c 100644 --- a/meta/conf/distro/include/no-static-libs.inc +++ b/meta/conf/distro/include/no-static-libs.inc @@ -30,3 +30,8 @@ EXTRA_OECMAKE:append:pn-libjpeg-turbo-native = " -DENABLE_STATIC=False" EXCONFIG_ARGS:append:pn-ncurses = " --without-normal" EXCONFIG_ARGS:append:pn-ncurses-native = " --without-normal" EXCONFIG_ARGS:append:pn-nativesdk-ncurses = " --without-normal" + +# Needed so we can statically link shadow-native tools +DISABLE_STATIC:pn-attr-native = "" +DISABLE_STATIC:pn-libbsd-native = "" +DISABLE_STATIC:pn-libmd-native = "" diff --git a/meta/recipes-extended/shadow/shadow_4.14.0.bb b/meta/recipes-extended/shadow/shadow_4.14.0.bb index 4e554463125..8cd916d01bd 100644 --- a/meta/recipes-extended/shadow/shadow_4.14.0.bb +++ b/meta/recipes-extended/shadow/shadow_4.14.0.bb @@ -4,6 +4,14 @@ require shadow.inc # libcrypt. This breaks chsh. BUILD_LDFLAGS:append:class-target = " ${@bb.utils.contains('DISTRO_FEATURES', 'pam', '-lcrypt', '', d)}" +# Force static linking of utilities so we can use from the sysroot/sstate for useradd +# without worrying about the dependency libraries being available +do_compile:prepend:class-native () { + sed -i -e 's#\(mode=link.*CCLD.\s*\)#\1-all-static #g' \ + -e 's#\(LIBS.*-lbsd\)#\1 -lmd#g' \ + -e 's#\(LIBBSD.*-lbsd\)#\1 -lmd#g' ${B}/*/Makefile +} + BBCLASSEXTEND = "native nativesdk" # https://bugzilla.redhat.com/show_bug.cgi?id=884658 which isn't nice but probably isn't too bad to maintain and does give us the static tools. Cheers, Richard
On Wed, 2023-09-13 at 15:00 +0100, Richard Purdie via lists.openembedded.org wrote: > On Mon, 2023-09-11 at 17:19 +0200, Alexander Kanavin wrote: > > On Thu, 7 Sept 2023 at 14:04, Richard Purdie > > <richard.purdie@linuxfoundation.org> wrote: > > > See meta/conf/distro/include/no-static-libs.inc which does things like: > > > > > > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-openssl-native = "" > > > meta/conf/distro/include/no-static-libs.inc:DISABLE_STATIC:pn-nativesdk-openssl = "" > > > > > > I wasn't sure how easy the shadow patch would be, that is hopefully the > > > main tricky part. sqlite-native used to be built allowing static for > > > pseudo iirc but I don't think we need to do that there any more. > > > > I poked around at passing options from the shadow recipe, but couldn't > > arrive at a working combination. libtool and automake insert too many > > abstraction layers on the way to the linker invocation. > > > > So there has to be a custom, most likely non-upstreamable patch for > > Makefile.am files all over the place, which I think is worse than > > manually unrolling setscene dependencies. > > I was curious to see if I could make anything work, the best I could > come up with was this: > > diff --git a/meta/conf/distro/include/no-static-libs.inc b/meta/conf/distro/include/no-static-libs.inc > index 75359928a14..312629f394c 100644 > --- a/meta/conf/distro/include/no-static-libs.inc > +++ b/meta/conf/distro/include/no-static-libs.inc > @@ -30,3 +30,8 @@ EXTRA_OECMAKE:append:pn-libjpeg-turbo-native = " -DENABLE_STATIC=False" > EXCONFIG_ARGS:append:pn-ncurses = " --without-normal" > EXCONFIG_ARGS:append:pn-ncurses-native = " --without-normal" > EXCONFIG_ARGS:append:pn-nativesdk-ncurses = " --without-normal" > + > +# Needed so we can statically link shadow-native tools > +DISABLE_STATIC:pn-attr-native = "" > +DISABLE_STATIC:pn-libbsd-native = "" > +DISABLE_STATIC:pn-libmd-native = "" > diff --git a/meta/recipes-extended/shadow/shadow_4.14.0.bb b/meta/recipes-extended/shadow/shadow_4.14.0.bb > index 4e554463125..8cd916d01bd 100644 > --- a/meta/recipes-extended/shadow/shadow_4.14.0.bb > +++ b/meta/recipes-extended/shadow/shadow_4.14.0.bb > @@ -4,6 +4,14 @@ require shadow.inc > # libcrypt. This breaks chsh. > BUILD_LDFLAGS:append:class-target = " ${@bb.utils.contains('DISTRO_FEATURES', 'pam', '-lcrypt', '', d)}" > > +# Force static linking of utilities so we can use from the sysroot/sstate for useradd > +# without worrying about the dependency libraries being available > +do_compile:prepend:class-native () { > + sed -i -e 's#\(mode=link.*CCLD.\s*\)#\1-all-static #g' \ > + -e 's#\(LIBS.*-lbsd\)#\1 -lmd#g' \ > + -e 's#\(LIBBSD.*-lbsd\)#\1 -lmd#g' ${B}/*/Makefile > +} > + > BBCLASSEXTEND = "native nativesdk" > > # https://bugzilla.redhat.com/show_bug.cgi?id=884658 > > which isn't nice but probably isn't too bad to maintain and does give > us the static tools. Of course since this removes libc, pseudo no longer works and the chroot calls inside shadow then fail. Which then brings us to: + sed -i -e 's#\(LIBS.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' \ + -e 's#\(LIBBSD.*\)-lbsd#\1 ${STAGING_LIBDIR}/libbsd.a ${STAGING_LIBDIR}/libmd.a#g' ${B}/*/Makefile instead... Cheers, Richard
diff --git a/meta/recipes-core/meta/build-sysroots.bb b/meta/recipes-core/meta/build-sysroots.bb index 1a3b692a1b1..ac74dda22c4 100644 --- a/meta/recipes-core/meta/build-sysroots.bb +++ b/meta/recipes-core/meta/build-sysroots.bb @@ -42,6 +42,6 @@ python do_build_target_sysroot () { } do_build_target_sysroot[cleandirs] = "${STANDALONE_SYSROOT}" do_build_target_sysroot[nostamp] = "1" -addtask do_build_target_sysroot before do_build +addtask do_build_target_sysroot before do_build after do_build_native_sysroot do_clean[cleandirs] += "${STANDALONE_SYSROOT} ${STANDALONE_SYSROOT_NATIVE}"
Target task is using executables populated by the native task and as they run in parallel, races can occur. This was triggered by shadow recipe update which added depedendent libraries, and where half-populated native sysroot (dependent libraries missing) was triggering useradd failures. Presence or absence of useradd itself is a soft failure, and so was previously unnoticed. Signed-off-by: Alexander Kanavin <alex@linutronix.de> --- meta/recipes-core/meta/build-sysroots.bb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)