[0/3] set qemuppc64 default nfs rootfs mount r/w size to 524288

Message ID 20230117020504.697181-1-xiangyu.chen@eng.windriver.com

Message

Xiangyu Chen Jan. 17, 2023, 2:05 a.m. UTC
From: Xiangyu Chen <xiangyu.chen@windriver.com>

On oe-core master, when a qemuppc64 image is built with systemd as the
default init and booted over NFS, the kernel might panic because a symbol
version is not found in a dynamic library, as shown below:

  hid-generic 0003:0627:0001.0003: input: USB HID v0.01 Mouse [QEMU QEMU USB Tablet] on usb-0000:00:01.0-3/input0
  /sbin/init: /lib64/libm.so.6: version `XZ_5.0' not found (required by /usr/lib64/libkmod.so.2)
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x00007f00
  CPU: 0 PID: 1 Comm: init Not tainted 5.15.78-yocto-standard #1
  Call Trace:
  [c000000007443ba0] [c0000000009538d0] dump_stack_lvl+0x74/0xa8 (unreliable)
  [c000000007443be0] [c000000000103524] panic+0x170/0x3cc
  [c000000007443c80] [c00000000010cf64] do_exit+0xb44/0xb50
  [c000000007443d50] [c00000000010d040] do_group_exit+0x60/0xd0
  [c000000007443d90] [c00000000010d0d4] sys_exit_group+0x24/0x30
  [c000000007443db0] [c00000000002cfd4] system_call_exception+0x194/0x2f0
  [c000000007443e10] [c00000000000c2cc] system_call_common+0xec/0x250
  --- interrupt: c00 at 0x7fff9ed9e840
  NIP:  00007fff9ed9e840 LR: 00007fff9ed7da20 CTR: 0000000000000000
  REGS: c000000007443e80 TRAP: 0c00   Not tainted (5.15.78-yocto-standard)
  MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 24022442  XER: 00000000

One or more of the libraries systemd depends on failed to load due to
unresolved symbols/functions.  This was intermittent - with a failure
rate estimated between 5% and 30%.

After investigation, this issue happens with gcc 12; kirkstone, which uses
gcc 11, works well, with both using the exact same v5.15.84 kernel commit.

There is an upstream kernel fix [1] that changes the rsize / wsize to a
multiple of PAGE_SIZE. When we applied this patch, qemuppc64's default
r/wsize went from 4096 to 524288. But qemuppc64 doesn't have its own
linux-yocto kernel branch, so applying this change might cause regressions
on other platforms which share a branch with qemuppc64.
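
As a rough illustration of what "a multiple of PAGE_SIZE" means for an
rsize / wsize value, the arithmetic can be sketched in Python as below.
The helper name and the two page sizes (4 KiB and 64 KiB) are assumptions
chosen for illustration; this is not the kernel code from [1].

```python
def round_down_to_page_multiple(io_size: int, page_size: int) -> int:
    """Round io_size down to the nearest multiple of page_size.

    Hypothetical helper for illustration only; not the kernel's code.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (io_size // page_size) * page_size


if __name__ == "__main__":
    # Page sizes below are assumed values, not measured on qemuppc64.
    for page_size in (4096, 65536):
        for requested in (524288, 100000):
            aligned = round_down_to_page_multiple(requested, page_size)
            print(f"page={page_size} requested={requested} aligned={aligned}")
```

524288 is already a whole number of pages for both assumed page sizes, so
it passes through unchanged, while a value such as 100000 is rounded down.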

So we added an extra option for the nfs rootfs, and set the qemuppc64
default r/w size to 524288 to line up with the kernel fix [1].

Yocto did a similar thing in the distant past [2], before boot-arg
adjustments existed, by allowing a Kconfig option to set the defaults for
nfsboot in order to work around hardware limitations.

There are 3 patches for this fix:
- patch 1 adds QB_NFSROOTFS_EXTRA_OPT to qemuboot.bbclass, with a default
  value and some notes in a comment.
- patch 2 processes this value in the runqemu script.
- patch 3 sets the r/wsize value in the qemuppc64 machine configuration
  file.
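
The way such an extra option could end up on the kernel command line can
be sketched as follows. This is a minimal sketch and not the code in the
patches: the function name, the server address, the export path and the
base options are all hypothetical. Only the nfsroot=<server>:<dir>,<options>
form and the rsize / wsize mount options are standard.

```python
def build_nfsroot_arg(server: str, export_path: str,
                      base_opts: str = "", extra_opt: str = "") -> str:
    """Compose a root-over-NFS kernel argument string.

    Hypothetical helper: extra_opt stands in for a value such as the one
    carried by QB_NFSROOTFS_EXTRA_OPT and is appended only when non-empty.
    """
    # Collect the non-empty option groups and join them with commas.
    opts = ",".join(part for part in (base_opts, extra_opt) if part)
    target = f"{server}:{export_path}"
    if opts:
        target = f"{target},{opts}"
    return f"root=/dev/nfs nfsroot={target}"


if __name__ == "__main__":
    # All values below are assumed for illustration.
    print(build_nfsroot_arg("192.168.7.1", "/srv/rootfs", "tcp",
                            "rsize=524288,wsize=524288"))
```

With an empty extra option the command line is left as it was, so machines
that do not set the variable would be unaffected in this sketch.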

References:
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=940261a195080cf
[2] https://git.yoctoproject.org/linux-yocto-4.1/commit/?h=standard/base&id=a96cfd98add95

Xiangyu Chen (3):
  qemuboot.bbclass: add QB_NFSROOTFS_EXTRA_OPT for nfs rootfs extra
    option
  runqemu: add process of option QB_NFSROOTFS_EXTRA_OPT
  qemuppc64: set the qemuppc64 nfs r/wsize mount options to 524288

 meta/classes-recipe/qemuboot.bbclass | 3 +++
 meta/conf/machine/qemuppc64.conf     | 1 +
 scripts/runqemu                      | 6 +++++-
 3 files changed, 9 insertions(+), 1 deletion(-)

Comments

Richard Purdie Jan. 25, 2023, 4:38 p.m. UTC | #1

I've tried to understand this a couple of times and failed. I think
there is some missing information in the commits.

What is the default page size on qemuppc64? What is the default nfs
r/wsize? Is the issue the default page size on ppc64?

I also need to highlight that qemuppc64 isn't officially supported by
Yocto Project and isn't part of our test matrix on the autobuilder. If
anyone is relying on this support they really need to commit to
maintaining that architecture.

Cheers,

Richard