[1/2] qemux86/conf: TEST ONLY. Drop smp boot

Message ID 20230608193901.363242-1-bruce.ashfield@gmail.com
State New

Commit Message

Bruce Ashfield June 8, 2023, 7:39 p.m. UTC
From: Bruce Ashfield <bruce.ashfield@gmail.com>

We are seeing intermittent boot hangs with x86. One of the last
messages in failing logs is secondary cpu startup.

Drop down to a single CPU (no -smp option) for testing, to see if the
hangs go away with smp removed.

Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
---
 meta/conf/machine/include/x86/qemuboot-x86.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Richard Purdie June 9, 2023, 6:05 a.m. UTC | #1
On Thu, 2023-06-08 at 19:39 +0000, bruce.ashfield@gmail.com wrote:
> From: Bruce Ashfield <bruce.ashfield@gmail.com>
> 
> We are seeing intermittent boot hangs with x86. One of the last
> messages in failing logs is secondary cpu startup.
> 
> Drop down to a single CPU (no -smp option) for testing, to see if the
> hangs go away with smp removed.
> 
> Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
> ---
>  meta/conf/machine/include/x86/qemuboot-x86.inc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/meta/conf/machine/include/x86/qemuboot-x86.inc b/meta/conf/machine/include/x86/qemuboot-x86.inc
> index 6ae03633ae..54a12119a7 100644
> --- a/meta/conf/machine/include/x86/qemuboot-x86.inc
> +++ b/meta/conf/machine/include/x86/qemuboot-x86.inc
> @@ -1,6 +1,6 @@
>  # For runqemu
>  IMAGE_CLASSES += "qemuboot"
> -QB_SMP ?= "-smp 4"
> +QB_SMP ?= ""
>  QB_CPU:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"
>  QB_CPU_KVM:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"
> 
> 

FWIW, this patch didn't change much. With it applied we still see:

https://autobuilder.yoctoproject.org/typhoon/#/builders/81/builds/5168/steps/12/logs/stdio

[    0.634735] Freeing SMP alternatives memory: 52K
[   12.275510] smpboot: CPU0: Intel Xeon E3-12xx v2 (Ivy Bridge) (family: 0x6, model: 0x3a, stepping: 0x9)
[274491.955544] cblist_init_generic: Setting adjustable number of callback queues.
[274492.008510] cblist_init_generic: Setting shift to 0 and lim to 1.
[279820.791509] cblist_init_generic: Setting shift to 0 and lim to 1.
[282104.735510] cblist_init_generic: Setting shift to 0 and lim to 1.
[386831.154510] Performance Events: unsupported p6 CPU model 58 no PMU driver, software events only.
[386850.632510] rcu: Hierarchical SRCU implementation.
[386850.668510] rcu: 	Max phase no-delay instances is 400.
[386852.898509] smp: Bringing up secondary CPUs ...
[386852.946510] smp: Brought up 1 node, 1 CPU
[386852.992510] smpboot: Max logical packages: 1
[386853.018509] smpboot: Total of 1 processors activated (5199.99 BogoMIPS)
[386855.257509] devtmpfs: initialized
[386857.982509] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[386858.041510] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
[386858.691509] pinctrl core: initialized pinctrl subsystem
[386862.836510] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[386864.720510] thermal_sys: Registered thermal governor 'step_wise'
[386864.725509] thermal_sys: Registered thermal governor 'user_space'
[386864.864510] cpuidle: using governor menu
[386865.662509] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xb0000000-0xbfffffff] (base 0xb0000000)
[386865.693510] PCI: MMCONFIG at [mem 0xb0000000-0xbfffffff] reserved in E820
[386865.900509] PCI: Using configuration type 1 for base access
[386878.550509] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.

Cheers,

Richard
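
In the log excerpt above, the kernel timestamps jump from about 12 seconds to roughly 274491 seconds between consecutive lines, and jump again several times after that. The following is a minimal sketch, not part of the original thread, for flagging such jumps in a saved boot log. The function name `find_timestamp_jumps` and the 60-second threshold are illustrative assumptions, and nothing here establishes whether the jumps are related to the hang.

```python
import re

# Matches the "[   12.275510]" style prefix on kernel log lines
# (seconds since boot, with a fractional part).
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def find_timestamp_jumps(lines, threshold=60.0):
    """Return (previous, current, line) for each forward jump above threshold.

    threshold is in seconds; 60.0 is an arbitrary illustrative default.
    """
    jumps = []
    previous = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if match is None:
            continue  # skip lines that carry no kernel timestamp
        current = float(match.group(1))
        if previous is not None and current - previous > threshold:
            jumps.append((previous, current, line.rstrip()))
        previous = current
    return jumps


# First lines of the excerpt quoted in this thread.
sample = [
    "[    0.634735] Freeing SMP alternatives memory: 52K",
    "[   12.275510] smpboot: CPU0: Intel Xeon E3-12xx v2 (Ivy Bridge) (family: 0x6, model: 0x3a, stepping: 0x9)",
    "[274491.955544] cblist_init_generic: Setting adjustable number of callback queues.",
    "[274492.008510] cblist_init_generic: Setting shift to 0 and lim to 1.",
    "[279820.791509] cblist_init_generic: Setting shift to 0 and lim to 1.",
    "[282104.735510] cblist_init_generic: Setting shift to 0 and lim to 1.",
    "[386831.154510] Performance Events: unsupported p6 CPU model 58 no PMU driver, software events only.",
]

for prev, cur, text in find_timestamp_jumps(sample):
    print(f"{cur - prev:12.3f}s jump before: {text}")
```

To run it against a full autobuilder log, the log would first need to be saved to a file and passed in line by line, for example `find_timestamp_jumps(open(path))` with `path` pointing at the saved copy.
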
Bruce Ashfield June 9, 2023, 2:08 p.m. UTC | #2
On Fri, Jun 9, 2023 at 2:06 AM Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
>
> On Thu, 2023-06-08 at 19:39 +0000, bruce.ashfield@gmail.com wrote:
> > From: Bruce Ashfield <bruce.ashfield@gmail.com>
> >
> > We are seeing intermittent boot hangs with x86. One of the last
> > messages in failing logs is secondary cpu startup.
> >
> > Drop down to a single CPU (no -smp option) for testing, to see if the
> > hangs go away with smp removed.
> >
> > Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
> > ---
> >  meta/conf/machine/include/x86/qemuboot-x86.inc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/meta/conf/machine/include/x86/qemuboot-x86.inc b/meta/conf/machine/include/x86/qemuboot-x86.inc
> > index 6ae03633ae..54a12119a7 100644
> > --- a/meta/conf/machine/include/x86/qemuboot-x86.inc
> > +++ b/meta/conf/machine/include/x86/qemuboot-x86.inc
> > @@ -1,6 +1,6 @@
> >  # For runqemu
> >  IMAGE_CLASSES += "qemuboot"
> > -QB_SMP ?= "-smp 4"
> > +QB_SMP ?= ""
> >  QB_CPU:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"
> >  QB_CPU_KVM:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"
> >
> >
>
> FWIW, this patch didn't change much. With it applied we still see:


ok. so at least we can rule out it being specific to secondary cpu startup.

Bruce


>
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/81/builds/5168/steps/12/logs/stdio
>
> [    0.634735] Freeing SMP alternatives memory: 52K
> [   12.275510] smpboot: CPU0: Intel Xeon E3-12xx v2 (Ivy Bridge) (family: 0x6, model: 0x3a, stepping: 0x9)
> [274491.955544] cblist_init_generic: Setting adjustable number of callback queues.
> [274492.008510] cblist_init_generic: Setting shift to 0 and lim to 1.
> [279820.791509] cblist_init_generic: Setting shift to 0 and lim to 1.
> [282104.735510] cblist_init_generic: Setting shift to 0 and lim to 1.
> [386831.154510] Performance Events: unsupported p6 CPU model 58 no PMU driver, software events only.
> [386850.632510] rcu: Hierarchical SRCU implementation.
> [386850.668510] rcu:    Max phase no-delay instances is 400.
> [386852.898509] smp: Bringing up secondary CPUs ...
> [386852.946510] smp: Brought up 1 node, 1 CPU
> [386852.992510] smpboot: Max logical packages: 1
> [386853.018509] smpboot: Total of 1 processors activated (5199.99 BogoMIPS)
> [386855.257509] devtmpfs: initialized
> [386857.982509] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
> [386858.041510] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
> [386858.691509] pinctrl core: initialized pinctrl subsystem
> [386862.836510] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> [386864.720510] thermal_sys: Registered thermal governor 'step_wise'
> [386864.725509] thermal_sys: Registered thermal governor 'user_space'
> [386864.864510] cpuidle: using governor menu
> [386865.662509] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xb0000000-0xbfffffff] (base 0xb0000000)
> [386865.693510] PCI: MMCONFIG at [mem 0xb0000000-0xbfffffff] reserved in E820
> [386865.900509] PCI: Using configuration type 1 for base access
> [386878.550509] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
>
> Cheers,
>
> Richard
>
Paul Gortmaker June 11, 2023, 2:08 a.m. UTC | #3
[Re: [PATCH 1/2] qemux86/conf: TEST ONLY. Drop smp boot] On 09/06/2023 (Fri 10:08) Bruce Ashfield wrote:

> On Fri, Jun 9, 2023 at 2:06 AM Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> >

[...]

> > FWIW, this patch didn't change much. With it applied we still see:
> 
> 
> ok. so at least we can rule out it being specific to secondary cpu startup.

I'll own some of that.  I was looking forward in the dmesg where
secondary CPU init was, and ignoring looking backwards.

I have a new thought.  I still could be completely off base.  I updated
the case with my theory.

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15138

Might totally be another wild goose chase.  Who knows...

Paul.
--
Richard Purdie June 12, 2023, 10:18 a.m. UTC | #4
On Sat, 2023-06-10 at 22:08 -0400, Paul Gortmaker wrote:
> [Re: [PATCH 1/2] qemux86/conf: TEST ONLY. Drop smp boot] On 09/06/2023 (Fri 10:08) Bruce Ashfield wrote:
> 
> > On Fri, Jun 9, 2023 at 2:06 AM Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> > > 
> 
> [...]
> 
> > > FWIW, this patch didn't change much. With it applied we still see:
> > 
> > 
> > ok. so at least we can rule out it being specific to secondary cpu startup.
> 
> I'll own some of that.  I was looking forward in the dmesg where
> secondary CPU init was, and ignoring looking backwards.
> 
> I have a new thought.  I still could be completely off base.  I updated
> the case with my theory.
> 
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=15138
> 
> Might totally be another wild goose chase.  Who knows...

I was leaning toward leaving SMP on, as it tests things the way most
users would end up using them. Whether this is the right thing to do,
I'm not sure.

As an extra data point, we have this failure with 6.1.28:

https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5262/steps/14/logs/stdio

which means the issue is probably between 6.1.26 and 6.1.28. I'm
testing with 6.1.27 and haven't seen it there yet.

Cheers,

Richard
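
Because the hang is intermittent, "haven't seen it there yet" depends on how many boots have been tried. The following is a minimal sketch, not from the thread, of counting how many runs of a boot command exceed a timeout. The function name `count_hangs` is hypothetical, and the command shown is a harmless placeholder; the real boot invocation and a sensible timeout are left to whoever runs the test.

```python
import subprocess
import sys


def count_hangs(command, runs, timeout_seconds):
    """Run command `runs` times and count the runs that hit the timeout."""
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                command,
                timeout=timeout_seconds,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,  # a non-zero exit is not treated as a hang
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising this.
            hangs += 1
    return hangs


# Placeholder command (an assumption for illustration only): replace it
# with the actual boot-and-shutdown invocation being tested.
placeholder = [sys.executable, "-c", "pass"]
print(count_hangs(placeholder, runs=3, timeout_seconds=30))
```

A timeout only shows that a run did not finish in time, so the boot log of each counted run still needs checking to confirm it stopped at the same point.
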

Patch

diff --git a/meta/conf/machine/include/x86/qemuboot-x86.inc b/meta/conf/machine/include/x86/qemuboot-x86.inc
index 6ae03633ae..54a12119a7 100644
--- a/meta/conf/machine/include/x86/qemuboot-x86.inc
+++ b/meta/conf/machine/include/x86/qemuboot-x86.inc
@@ -1,6 +1,6 @@ 
 # For runqemu
 IMAGE_CLASSES += "qemuboot"
-QB_SMP ?= "-smp 4"
+QB_SMP ?= ""
 QB_CPU:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"
 QB_CPU_KVM:x86 ?= "-cpu IvyBridge -machine q35,i8042=off"