diff mbox series

[v2,2/2] oeqa runtime parselogs: ignore KV260 USB error on genericarm64

Message ID 20260306093316.7018-3-mikko.rapeli@linaro.org (mailing list archive)
State New
Series oeqa runtime parselogs: BeaglePlay and KV260 genericarm64 workarounds

Commit Message

Mikko Rapeli March 6, 2026, 9:33 a.m. UTC
On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
continues and USB stack works so ignore the error:

usb 1-1: device descriptor read/64, error -71
hub 1-1:1.0: USB hub found
hub 1-1:1.0: 5 ports detected
usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
hub 2-1:1.0: USB hub found
hub 2-1:1.0: 4 ports detected

https://lists.yoctoproject.org/g/automated-testing/message/1525

Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
---
 .../lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt   | 2 ++
 1 file changed, 2 insertions(+)

v2: no changes

v1: https://lists.yoctoproject.org/g/poky/message/13836

Comments

Richard Purdie March 6, 2026, 12:17 p.m. UTC | #1
On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> continues and USB stack works so ignore the error:
> 
> usb 1-1: device descriptor read/64, error -71
> hub 1-1:1.0: USB hub found
> hub 1-1:1.0: 5 ports detected
> usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> hub 2-1:1.0: USB hub found
> hub 2-1:1.0: 4 ports detected
> 
> https://lists.yoctoproject.org/g/automated-testing/message/1525
> 
> Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>

I'm sure people will tell me I should just merge these and we move on.

Before I do that, I want to be really clear that this approach really
isn't great. It amounts to "oh, we have errors we don't even understand
but we'll just ignore them and hope for the best". Should we even
bother with the test if we're just going to ignore anything it shows
up?

I appreciate Mikko can't go and fight the world on issues like this and
this isn't directed at him, it is just a general frustration with the
"quality" of where software seems to be going.

Where do we draw the line if we're just going to ignore any error?

Cheers,

Richard
Mikko Rapeli March 6, 2026, 1:52 p.m. UTC | #2
Hi,

On Fri, Mar 06, 2026 at 12:17:32PM +0000, Richard Purdie wrote:
> On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> > On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> > continues and USB stack works so ignore the error:
> > 
> > usb 1-1: device descriptor read/64, error -71
> > hub 1-1:1.0: USB hub found
> > hub 1-1:1.0: 5 ports detected
> > usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> > hub 2-1:1.0: USB hub found
> > hub 2-1:1.0: 4 ports detected
> > 
> > https://lists.yoctoproject.org/g/automated-testing/message/1525
> > 
> > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> 
> I'm sure people will tell me I should just merge these and we move on.
> 
> Before I do that, I want to be really clear that this approach really
> isn't great. It amounts to "oh, we have errors we don't even understand
> but we'll just ignore them and hope for the best". Should we even
> bother with the test if we're just going to ignore anything it shows
> up?
> 
> I appreciate Mikko can't go and fight the world on issues like this and
> this isn't directed at him, it is just a general frustration with the
> "quality" of where software seems to be going.
> 
> Where do we draw the line if we're just going to ignore any error?

I think the maintainers need to decide.

I maintain yocto genericarm64 testing on real target HW like the AMD KV260 and thus
I've decided that I accept these workarounds. These changes do not impact
other SW or maintainers in meta-yocto git repo so I think applying the commit
is safe.

The parselogs test is really good, too good in fact, in finding issues coming from DUT HW,
lab setup with DUT, firmware, kernel and rest of the high level yocto
based OS image. Since my maintenance time is limited I need to
decide which issues to fix and which to just accept. It is ok to challenge my
decisions and especially help with fixing the actual root causes. Any HW vendors,
please step in to help!

I have been able to fix DUT HW setup issues for example by adding dummy HDMI
adapters so that Xorg startup completes, fix firmware and kernel issues for
some of the devices like Xorg 16bpp graphics output, and fix higher level
oddities like systemd udev breakages, so I do some analysis and try to fix issues
before adding these workarounds.

Another aspect is the reuse of oeqa runtime tests in other environments and
genericarm64 showing an example how to do that in the imperfect world. I
would like to see more of this since they are actually pretty good basis for
doing CI and on target testing. But that testing environment is complex and
target HW and firmware are by definition unstable since they are under development,
even after vendors ship the HW and provide the BSP SW adaptations. Thus
these layer specific ignore lists are likely needed in custom
environments to get a basic set of tests working and to stop new issues from popping
up, which is actually really important. It is important to get stable testing results
and it is important to find bugs and fixes for them. The parselogs findings are
sometimes rare oddities but sadly common with real HW, firmware and yocto based OSes.
That world is never perfect black and white, the gray zone is quite thick. That
is where parselogs ignore lists fit, into the gray zone when someone needs to make
something imperfect but usable to produce reliable pass or fail results in testing.

Cheers,

-Mikko
Richard Purdie March 7, 2026, 10:23 p.m. UTC | #3
On Fri, 2026-03-06 at 15:52 +0200, Mikko Rapeli wrote:
> Hi,
> 
> On Fri, Mar 06, 2026 at 12:17:32PM +0000, Richard Purdie wrote:
> > On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> > > On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> > > continues and USB stack works so ignore the error:
> > > 
> > > usb 1-1: device descriptor read/64, error -71
> > > hub 1-1:1.0: USB hub found
> > > hub 1-1:1.0: 5 ports detected
> > > usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> > > hub 2-1:1.0: USB hub found
> > > hub 2-1:1.0: 4 ports detected
> > > 
> > > https://lists.yoctoproject.org/g/automated-testing/message/1525
> > > 
> > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> > 
> > I'm sure people will tell me I should just merge these and we move on.
> > 
> > Before I do that, I want to be really clear that this approach really
> > isn't great. It amounts to "oh, we have errors we don't even understand
> > but we'll just ignore them and hope for the best". Should we even
> > bother with the test if we're just going to ignore anything it shows
> > up?
> > 
> > I appreciate Mikko can't go and fight the world on issues like this and
> > this isn't directed at him, it is just a general frustration with the
> > "quality" of where software seems to be going.
> > 
> > Where do we draw the line if we're just going to ignore any error?
> 
> I think the maintainers need to decide.
> 
> I maintain yocto genericarm64 testing on real target HW like the AMD KV260 and thus
> I've decided that I accept these workarounds. These changes do not impact
> other SW or maintainers in meta-yocto git repo so I think applying the commit
> is safe.
> 
> The parselogs test is really good, too good in fact, in finding issues coming from DUT HW,
> lab setup with DUT, firmware, kernel and rest of the high level yocto
> based OS image. Since my maintenance time is limited I need to
> decide which issues to fix and which to just accept. It is ok to challenge my
> decisions and especially help with fixing the actual root causes. Any HW vendors,
> please step in to help!
> 
> I have been able to fix DUT HW setup issues for example by adding dummy HDMI
> adapters so that Xorg startup completes, fix firmware and kernel issues for
> some of the devices like Xorg 16bpp graphics output, and fix higher level
> oddities like systemd udev breakages, so I do some analysis and try to fix issues
> before adding these workarounds.

That is good to know, I knew you'd fixed some things but I am glad it
doesn't have a totally bad reputation!

> Another aspect is the reuse of oeqa runtime tests in other environments and
> genericarm64 showing an example how to do that in the imperfect world. I
> would like to see more of this since they are actually pretty good basis for
> doing CI and on target testing. But that testing environment is complex and
> target HW and firmware by definition unstable since they are under development,
> even after vendors ship the HW and provide the BSP SW adaptations. Thus
> these layer specific ignore lists are likely needed in custom
> environments to get basic set of tests working and to stop new issues from popping
> up, which is actually really important. It is important to get stable testing results
> and it is important to find bugs and fixes for them. The parselogs findings are
> sometimes rare oddities but sadly common with real HW, firmware and yocto based OSes.
> That world is never perfect black and white, the gray zone is quite thick. That
> is where parselogs ignore lists fit, into the gray zone when someone needs to make
> something imperfect but usable to produce reliable pass or fail results in testing.

The higher level software tests struggle with many of the same issues.
We're slowly winning there but it has taken a lot to get to where we
are. Definitely worthwhile in the long run though.

Cheers,

Richard

Patch

diff --git a/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt b/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
index 2bff23b1bcb6..623af74b0857 100644
--- a/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
+++ b/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
@@ -10,6 +10,8 @@  Error applying setting, reverse things back
 Can't lookup blockdev
 # X11 fails to start without connected display
 modeset(0): failed to set mode: Invalid argument
+# minor USB setup issue
+device descriptor read/64, error
 
 # Rockpi4b with u-boot 2025.10rc2 based firmware
 memory-controller: probe with driver rk3399-dmc-freq failed with error
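
For readers unfamiliar with these ignore lists, the effect of the two added lines can be sketched roughly as below. This is only an illustration and not the actual oeqa parselogs code: it assumes plain substring matching, treats "#" lines as comments, and flags any line containing "error". The helper names load_ignores and unexpected_errors, the keyword list, and the "example-driver" log line are all made up for the example.

```python
# Sketch only: assumes substring matching, which may differ from
# what the real parselogs test does with its ignore files.

def load_ignores(text):
    """Return the non-empty, non-comment lines of an ignore-list file body."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        patterns.append(line)
    return patterns


def unexpected_errors(log_lines, ignores, keywords=("error",)):
    """Return log lines that contain a keyword and match no ignore entry."""
    found = []
    for line in log_lines:
        lowered = line.lower()
        if not any(keyword in lowered for keyword in keywords):
            continue  # not an error-looking line at all
        if any(pattern in line for pattern in ignores):
            continue  # known issue, explicitly ignored
        found.append(line)
    return found


# Excerpt of the ignore file after this patch is applied.
IGNORE_FILE = """\
# X11 fails to start without connected display
modeset(0): failed to set mode: Invalid argument
# minor USB setup issue
device descriptor read/64, error
"""

# First two lines are from the commit message; the last one is hypothetical.
LOG = [
    "usb 1-1: device descriptor read/64, error -71",
    "hub 1-1:1.0: USB hub found",
    "example-driver: probe failed with error -22",
]

ignores = load_ignores(IGNORE_FILE)
for line in unexpected_errors(LOG, ignores):
    print("unexpected:", line)
```

Under these assumptions the new entry matches the message on any USB port and with any error code, not only "usb 1-1" and -71, which is part of why such entries trade coverage for stable results.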