[meta-oe] nodejs: inherit qemu class conditionally

Message ID 20250627140633.354460-1-skandigraun@gmail.com
State New

Commit Message

Gyorgy Sarvari June 27, 2025, 2:06 p.m. UTC
The recipe unconditionally inherits the qemu class, because it executes
some target binaries when it is cross-compiled and the bit-width of the
build host and the target host are different.

Since the inherit is unconditional, the class is also inherited for native
and nativesdk builds. The qemu class uses some qemu options that are
always derived from the target machine's configuration, even when the
recipe is built for class-native. This means that some of the variables
used by the recipe change (e.g. QEMU_OPTIONS), and the shared state cache
is invalidated when the target machine changes, even when nodejs-native is
being built, which triggers an unnecessary full rebuild of nodejs-native.

To avoid this, inherit the qemu class conditionally, only when it is
used (when the target and build architectures have different bit-widths).

Also, depend on qemu-native based on the same condition, and move the
qemu-dependent code around a bit, so that it is executed only when the
qemu class is inherited.

Signed-off-by: Gyorgy Sarvari <skandigraun@gmail.com>
---
 meta-oe/recipes-devtools/nodejs/nodejs_22.16.0.bb | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
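
For illustration only, the selection logic described in the commit message can be modeled in plain Python outside of BitBake. This is a sketch, not recipe code: the helper names same_width_flag and classes_to_inherit are hypothetical, while the condition itself and the class names are taken from the patch further down.

```python
# Toy model of the condition used by the patch; this is not BitBake code.
from typing import List

BASE_CLASSES = ["pkgconfig", "python3native", "ptest", "siteinfo"]


def same_width_flag(siteinfo_bits: str, build_arch: str) -> str:
    """Mirror the recipe's check: "0" means a 32-bit target on a 64-bit build host."""
    if siteinfo_bits == "32" and "64" in build_arch:
        return "0"
    return "1"


def classes_to_inherit(siteinfo_bits: str, build_arch: str) -> List[str]:
    """Return the class list, adding qemu only when the bit-widths differ."""
    classes = list(BASE_CLASSES)
    if same_width_flag(siteinfo_bits, build_arch) == "0":
        classes.append("qemu")
    return classes


print(classes_to_inherit("32", "x86_64"))
print(classes_to_inherit("64", "x86_64"))
```

In this model only the first call includes qemu in the list; the second call, with matching widths, leaves it out.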

Comments

Alexander Kanavin June 27, 2025, 2:36 p.m. UTC | #1
On Fri, 27 Jun 2025 at 16:06, Gyorgy Sarvari via
lists.openembedded.org <skandigraun=gmail.com@lists.openembedded.org>
wrote:
> The recipe unconditionally inherits the qemu class, because it executes
> some target binaries when it is cross-compiled and the bit-width of the
> build host and the target host are different.
>
> Since it is unconditional, it also means that it is inherited for native
> and nativesdk builds also. The qemu class uses some qemu options that are
> always derived from the target machine's configuration, even when the
> recipe is built for class-native. This means that some of the variables
> used by the recipe changes (e.g. QEMU_OPTIONS), and the shared state cache
> is invalidated when the target machine changes, even when nodejs-native is
> being built - and it triggers a full rebuild of nodejs-native unnecessarily.
>
> To avoid this, inherit the qemu class conditionally, only in case it is
> used (when the target and build arch's bit-widths are different).
>
> Also, inherit qemu-native based on the same condition, and move around the
> qemu-dependent code a bit, so it will be only executed when the qemu class
> is inherited.

This doesn't quite look right. Plenty of other recipes inherit qemu
unconditionally, including some pretty foundational ones like python3,
and they do not need this fix, or do they? I think something else is
going on here, and that issue needs to be properly investigated.

Can you describe the steps needed to observe the issue? Can you take
those steps for e.g. python3-native?

Alex
Gyorgy Sarvari June 30, 2025, 9:46 a.m. UTC | #2
On 6/27/25 16:36, Alexander Kanavin wrote:
> On Fri, 27 Jun 2025 at 16:06, Gyorgy Sarvari via
> lists.openembedded.org <skandigraun=gmail.com@lists.openembedded.org>
> wrote:
>> The recipe unconditionally inherits the qemu class, because it executes
>> some target binaries when it is cross-compiled and the bit-width of the
>> build host and the target host are different.
>>
>> Since it is unconditional, it also means that it is inherited for native
>> and nativesdk builds also. The qemu class uses some qemu options that are
>> always derived from the target machine's configuration, even when the
>> recipe is built for class-native. This means that some of the variables
>> used by the recipe changes (e.g. QEMU_OPTIONS), and the shared state cache
>> is invalidated when the target machine changes, even when nodejs-native is
>> being built - and it triggers a full rebuild of nodejs-native unnecessarily.
>>
>> To avoid this, inherit the qemu class conditionally, only in case it is
>> used (when the target and build arch's bit-widths are different).
>>
>> Also, inherit qemu-native based on the same condition, and move around the
>> qemu-dependent code a bit, so it will be only executed when the qemu class
>> is inherited.
> This doesn't quite look right. Plenty of other recipes inherit qemu
> unconditionally, including some pretty foundational ones like python3,
> and they do not need this fix, or do they? I think something else is
> going on here, and that issue needs to be properly investigated.
>
> Can you describe the steps needed to observe the issue? Can you take
> those steps for e.g. python3-native?

Though I thought I did, apparently I don't have a step-by-step
reproduction (yet?), unfortunately. I keep looking at diffsigs and then
bitbake -e, hoping to catch it (these usually point to QEMU_OPTIONS &
co., but yeah, there seems to be more to this).
Until now I thought that it happens simply by building
nodejs-native for different targets, but now that I tried to observe it
in a targeted way, that doesn't seem to be the case. Keep building images
that have nodejs-native as their build dependency :S

2 anecdata:
1. Just now I manually set up a vanilla[1] oe-core+meta-oe env from
master, and added nodejs-native to psplash's DEPENDS (psplash has no
relevance here; I just added it to a random recipe so that nodejs gets built).
  - When I built nodejs-native directly with qemuarm and qemuarm64,
sstate cache worked.
  - When I also built core-image-minimal and core-image-sato that way
for the same targets, they worked too.
  - Then I added PR = "r01" to the openssl recipe, to simulate an
update of the nodejs dependencies, and rebuilt core-image-sato for the
same targets. As expected, nodejs-native was rebuilt for the first
build. And then it was rebuilt from scratch for the second MACHINE too.
So now my suspicion is that this is related to updated dependencies when
some sstate cache already exists, but I won't be able to look much at
this in the first half of the week.
  
2. In one of my own projects (where nodejs-native is a real dependency),
nodejs was rebuilt twice over the weekend, in the same build environment.
The diff between the configs of the two rebuilds is below[2]. All I
did was build one image after another. I used kas for this,
with pinned layer revisions, which haven't changed between runs. The
images were also built in the same docker image, but in separate,
sequential docker sessions.

[1]: Vanilla = only bblayers.conf is modified to add the meta-oe layer,
and INHERIT += "rm_work" is added to local.conf, nothing else. Only the
MACHINE variable changes between builds, through an environment variable.
The env was previously unused: sstate cache and everything else were
generated from scratch during the first build, and then reused during
the subsequent ones.

[2]: the conf diff:

[meeee@desktop yocto]$ diff -Naur nodejs-rebuild-walnascar-musl-riscv.local.conf nodejs-rebuild-walnascar-musl-x86.local.conf
--- nodejs-rebuild-walnascar-musl-riscv.local.conf      2025-06-28 17:02:03.282055974 +0200
+++ nodejs-rebuild-walnascar-musl-x86.local.conf        2025-06-29 17:51:14.298700768 +0200
@@ -1,5 +1,5 @@
 # image_name
-IMAGE_NAME = "musl-walnascar-esr-riscv"
+IMAGE_NAME = "musl-walnascar-esr-x86_64"
 BBMASK += "meta-browser/meta-firefox/recipes-devtools/rust"
 
 # meta-firefox-common
@@ -137,6 +137,6 @@
 REQUIRED_VERSION_rust-llvm = "1.84.1"
 REQUIRED_VERSION_rust-llvm-native = "1.84.1"
 
-MACHINE ??= "qemuriscv64"
+MACHINE ??= "qemux86-64"
 DISTRO ??= "poky"
 BBMULTICONFIG ?= ""
[meeee@desktop yocto]$ 

>
> Alex
Alexander Kanavin June 30, 2025, 9:49 a.m. UTC | #3
On Mon, 30 Jun 2025 at 11:46, Gyorgy Sarvari <skandigraun@gmail.com> wrote:
> Though I thought I do, apparently I don't have a step-by-step
> reproduction (yet?) unfortunately. I keep looking at diffsigs and then
> bitbake -e, and hope to catch it (which usually point to QEMU_OPTIONS &
> co, but yeah. there seems to be more to this).
> Until now I thought that it happens just by simply building
> nodejs-native for different targets, but now that I tried to observe it
> in a targeted way, it doesn't seem to be the case. Keep building images
> that have nodejs-native as their build dependency :S

If you're unable to reproduce on demand, you can't confirm that this
patch is indeed fixing the issue either :-/

I can try to help you, but only if the starting point is specific
instructions on how to see the problem, otherwise I'd only get
frustrated at wasting time trying to get there first.

Alex
Alexander Kanavin June 30, 2025, 9:52 a.m. UTC | #4
On Mon, 30 Jun 2025 at 11:49, Alexander Kanavin via
lists.openembedded.org <alex.kanavin=gmail.com@lists.openembedded.org>
wrote:
> If you're unable to reproduce on demand, you can't confirm that this
> patch is indeed fixing the issue either :-/
>
> I can try to help you, but only if the starting point is specific
> instructions on how to see the problem, otherwise I'd only get
> frustrated at wasting time trying to get there first.

I see that Khem has merged it. I still do not think it's correct, and
I believe we need to revert and look at this properly.

Alex
Gyorgy Sarvari July 1, 2025, 8:14 p.m. UTC | #5
On 6/30/25 11:49, Alexander Kanavin wrote:
> On Mon, 30 Jun 2025 at 11:46, Gyorgy Sarvari <skandigraun@gmail.com> wrote:
>> Though I thought I do, apparently I don't have a step-by-step
>> reproduction (yet?) unfortunately. I keep looking at diffsigs and then
>> bitbake -e, and hope to catch it (which usually point to QEMU_OPTIONS &
>> co, but yeah. there seems to be more to this).
>> Until now I thought that it happens just by simply building
>> nodejs-native for different targets, but now that I tried to observe it
>> in a targeted way, it doesn't seem to be the case. Keep building images
>> that have nodejs-native as their build dependency :S
> If you're unable to reproduce on demand, you can't confirm that this
> patch is indeed fixing the issue either :-/
>
> I can try to help you, but only if the starting point is specific
> instructions on how to see the problem, otherwise I'd only get
> frustrated at wasting time trying to get there first.
>
> Alex

Apparently my clean setup somehow broke yesterday, and that caused my
surprise that the issue didn't happen as I described in the commit
message. It was building garbage. After deleting it and starting anew,
the issue does happen again, 100% of the time: just run "MACHINE=foo1
bitbake nodejs-native". Each new target machine triggers a new rebuild.
Inheriting rm_work globally makes it very visible. Even after deleting
the env and starting from scratch, the behavior is consistent.
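
A minimal sketch of that reproduction, assuming an already initialized build environment in which MACHINE is picked up from the shell environment (as in the vanilla setup described earlier in this thread). The machine names are the ones used earlier in the thread; build_plan and run_plan are hypothetical helper names, and nothing is executed unless run_plan is called explicitly.

```python
import os
import subprocess


def build_plan(machines, target="nodejs-native"):
    """Return one (env_overrides, argv) pair per machine, in order."""
    return [({"MACHINE": machine}, ["bitbake", target]) for machine in machines]


def run_plan(plan):
    """Run each build in sequence; stops at the first failing build."""
    for overrides, argv in plan:
        env = dict(os.environ, **overrides)
        subprocess.run(argv, env=env, check=True)


# Nothing is built here; call run_plan(plan) from a configured build shell.
plan = build_plan(["qemuarm", "qemuarm64"])
for overrides, argv in plan:
    print("MACHINE=" + overrides["MACHINE"], " ".join(argv))
```

If the second build compiles nodejs-native from scratch instead of reusing the sstate cache, that matches the behavior described above.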

The Python recipe doesn't use qemu by default, and doesn't use it at all for
the native build (it even errors out due to a circular dependency if one
tries to force it with the pgo packageconfig). When I modify the recipe to
use it without pgo (comment out 'write_pgo_wrapper:class-native =
":"'), then it has the same issue: when the target machine changes,
TUNE_PKGARCH also changes, and it kicks off a new python-native rebuild.

Now the issue seems to happen if the qemu_wrapper_cmdline function from
the qemu class is actually used by the recipe. Inheriting the qemu class
in any recipe, and adding e.g. "echo ${@qemu_wrapper_cmdline(d, 'foo',
['bar'])}" to any task that runs during the native build, will trigger
the issue.
I don't know if it's some optimization to only include variables that are
actually used, or something else, but today I can't spend time on this,
and at the end of the day it looks like I'm the only one building images
for multiple targets with a nodejs dependency, so whatever.
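
As a rough illustration of why that could be the case, here is a toy model in which a task's hash covers only the variables that its body references, directly or indirectly. This is a simplified assumption about how signature dependency tracking behaves, not BitBake's actual implementation; toy_task_hash, EXAMPLE_TUNE, the variable values and the task bodies are all made up for the example.

```python
import hashlib
import re

# Matches ${NAME} style references inside a task body or variable value.
VAR_REF = re.compile(r"\$\{(\w+)\}")


def toy_task_hash(body: str, variables: dict) -> str:
    """Hash the task body plus only the variables it references (recursively)."""
    pending = set(VAR_REF.findall(body))
    seen = set()
    while pending:
        name = pending.pop()
        if name in seen or name not in variables:
            continue
        seen.add(name)
        pending.update(VAR_REF.findall(variables[name]))
    digest = hashlib.sha256(body.encode())
    for name in sorted(seen):
        digest.update(f"{name}={variables[name]}".encode())
    return digest.hexdigest()


# Made-up values standing in for two different target machines.
machine_a = {"QEMU_OPTIONS": "-cpu ${EXAMPLE_TUNE}", "EXAMPLE_TUNE": "cpu-a"}
machine_b = {"QEMU_OPTIONS": "-cpu ${EXAMPLE_TUNE}", "EXAMPLE_TUNE": "cpu-b"}

plain_task = "echo building"
qemu_task = "echo ${QEMU_OPTIONS}"

print(toy_task_hash(plain_task, machine_a) == toy_task_hash(plain_task, machine_b))
print(toy_task_hash(qemu_task, machine_a) == toy_task_hash(qemu_task, machine_b))
```

In this model the task that never references QEMU_OPTIONS hashes the same for both machines, while the task that references it does not, which is consistent with the rebuild appearing only once the wrapper function is actually used.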

As a closing thought, I want to reject all kinds of suggestions and
insinuations that I have asked for your help, that I am frustrating you,
or that I am wasting your time.
If you ignore or spend time on any information I share anywhere, that's
only your choice. I have never asked for any help from you - if you do it
on your own, that's ok, but if not, that's just as ok. If you feel
that I am pressuring you to do anything, ever, call me out on that.
Otherwise, if you are getting frustrated, that's your own doing, and
don't put the blame on me or on anyone else.

The first email was a response to a question, not a demand or request for
help or any other action. This email is a courtesy follow-up, in the hope
that it will be of interest to some - but it is not a request or
suggestion for you (or anyone else) to do anything whatsoever.

Patch

diff --git a/meta-oe/recipes-devtools/nodejs/nodejs_22.16.0.bb b/meta-oe/recipes-devtools/nodejs/nodejs_22.16.0.bb
index 4bc829f140..162c9607d2 100644
--- a/meta-oe/recipes-devtools/nodejs/nodejs_22.16.0.bb
+++ b/meta-oe/recipes-devtools/nodejs/nodejs_22.16.0.bb
@@ -9,7 +9,9 @@  DEPENDS = "openssl openssl-native file-replacement-native python3-packaging-nati
 DEPENDS:append:class-target = " qemu-native"
 DEPENDS:append:class-native = " c-ares-native"
 
-inherit pkgconfig python3native qemu ptest siteinfo
+inherit pkgconfig python3native ptest siteinfo
+inherit_defer ${@bb.utils.contains('HOST_AND_TARGET_SAME_WIDTH', '0', 'qemu', '', d)}
+
 
 COMPATIBLE_MACHINE:armv4 = "(!.*armv4).*"
 COMPATIBLE_MACHINE:armv5 = "(!.*armv5).*"
@@ -108,11 +110,11 @@  python do_create_v8_qemu_wrapper () {
     on the host."""
     qemu_libdirs = [d.expand('${STAGING_DIR_HOST}${libdir}'),
                     d.expand('${STAGING_DIR_HOST}${base_libdir}')]
-    qemu_cmd = qemu_wrapper_cmdline(d, d.getVar('STAGING_DIR_HOST'),
-                                    qemu_libdirs)
+    qemu_cmd = ""
 
-    if d.getVar("HOST_AND_TARGET_SAME_WIDTH") == "1":
-        qemu_cmd = ""
+    if d.getVar("HOST_AND_TARGET_SAME_WIDTH") == "0":
+        qemu_cmd = qemu_wrapper_cmdline(d, d.getVar('STAGING_DIR_HOST'),
+                                        qemu_libdirs)
 
     wrapper_path = d.expand('${B}/v8-qemu-wrapper.sh')
     with open(wrapper_path, 'w') as wrapper_file:
@@ -209,6 +211,7 @@  python __anonymous () {
     # 32 bit target and 64 bit host (x86-64 or aarch64) have different bit width
     if d.getVar("SITEINFO_BITS") == "32" and "64" in d.getVar("BUILD_ARCH"):
         d.setVar("HOST_AND_TARGET_SAME_WIDTH", "0")
+        d.appendVar("DEPENDS:class-target", " qemu-native")
     else:
         d.setVar("HOST_AND_TARGET_SAME_WIDTH", "1")
 }