| Message ID | 20260306093316.7018-3-mikko.rapeli@linaro.org (mailing list archive) |
|---|---|
| State | New |
| Series | oeqa runtime parselogs: BeaglePlay and KV260 genericarm64 workarounds |
On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> continues and USB stack works so ignore the error:
>
> usb 1-1: device descriptor read/64, error -71
> hub 1-1:1.0: USB hub found
> hub 1-1:1.0: 5 ports detected
> usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> hub 2-1:1.0: USB hub found
> hub 2-1:1.0: 4 ports detected
>
> https://lists.yoctoproject.org/g/automated-testing/message/1525
>
> Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>

I'm sure people will tell me I should just merge these and we move on.

Before I do that, I want to be really clear that this approach really
isn't great. It amounts to "oh, we have errors we don't even understand
but we'll just ignore them and hope for the best". Should we even
bother with the test if we're just going to ignore anything it shows
up?

I appreciate Mikko can't go and fight the world on issues like this and
this isn't directed at him, it is just a general frustration with the
"quality" of where software seems to be going.

Where do we draw the line if we're just going to ignore any error?

Cheers,

Richard
Hi,

On Fri, Mar 06, 2026 at 12:17:32PM +0000, Richard Purdie wrote:
> On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> > On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> > continues and USB stack works so ignore the error:
> >
> > usb 1-1: device descriptor read/64, error -71
> > hub 1-1:1.0: USB hub found
> > hub 1-1:1.0: 5 ports detected
> > usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> > hub 2-1:1.0: USB hub found
> > hub 2-1:1.0: 4 ports detected
> >
> > https://lists.yoctoproject.org/g/automated-testing/message/1525
> >
> > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
>
> I'm sure people will tell me I should just merge these and we move on.
>
> Before I do that, I want to be really clear that this approach really
> isn't great. It amounts to "oh, we have errors we don't even understand
> but we'll just ignore them and hope for the best". Should we even
> bother with the test if we're just going to ignore anything it shows
> up?
>
> I appreciate Mikko can't go and fight the world on issues like this and
> this isn't directed at him, it is just a general frustration with the
> "quality" of where software seems to be going.
>
> Where do we draw the line if we're just going to ignore any error?

I think the maintainers need to decide.

I maintain yocto genericarm64 testing on real target HW like the AMD KV260 and thus
I've decided that I accept these workarounds. These changes do not impact
other SW or maintainers in meta-yocto git repo so I think applying the commit
is safe.

The parselogs test is really good, too good in fact, in finding issues coming from DUT HW,
lab setup with DUT, firmware, kernel and rest of the high level yocto
based OS image. Since my maintenance time is limited I need to
decide which issues to fix and which to just accept. It is ok to challenge my
decisions and especially help with fixing the actual root causes. Any HW vendors,
please step in to help!

I have been able to fix DUT HW setup issues for example by adding dummy HDMI
adapters so that Xorg startup completes, fix firmware and kernel issues for
some of the devices like Xorg 16bpp graphics output, and fix higher level
oddities like systemd udev breakages so I do some analysis and try to fix issues
before adding these workarounds.

Another aspect is the reuse of oeqa runtime tests in other environments and
genericarm64 showing an example how to do that in the imperfect world. I
would like to see more of this since they are actually pretty good basis for
doing CI and on target testing. But that testing environment is complex and
target HW and firmware by definition unstable since they are under development,
even after vendors ship the HW and provide the BSP SW adaptations. Thus
these layer specific ignore lists are likely needed in custom
environments to get basic set of tests working and to stop new issues from popping
up, which is actually really important. It is important to get stable testing results
and it is important to find bugs and fixes for them. The parselogs findings are
sometimes rare oddities but sadly common with real HW, firmware and yocto based OSes.
That world is never perfect black and white, the gray zone is quite thick. That
is where parselogs ignore lists fit, into the gray zone when someone needs to make
something imperfect but usable to produce reliable pass or fail results in testing.

Cheers,

-Mikko
On Fri, 2026-03-06 at 15:52 +0200, Mikko Rapeli wrote:
> Hi,
>
> On Fri, Mar 06, 2026 at 12:17:32PM +0000, Richard Purdie wrote:
> > On Fri, 2026-03-06 at 11:33 +0200, Mikko Rapeli via lists.yoctoproject.org wrote:
> > > On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
> > > continues and USB stack works so ignore the error:
> > >
> > > usb 1-1: device descriptor read/64, error -71
> > > hub 1-1:1.0: USB hub found
> > > hub 1-1:1.0: 5 ports detected
> > > usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> > > hub 2-1:1.0: USB hub found
> > > hub 2-1:1.0: 4 ports detected
> > >
> > > https://lists.yoctoproject.org/g/automated-testing/message/1525
> > >
> > > Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
> >
> > I'm sure people will tell me I should just merge these and we move on.
> >
> > Before I do that, I want to be really clear that this approach really
> > isn't great. It amounts to "oh, we have errors we don't even understand
> > but we'll just ignore them and hope for the best". Should we even
> > bother with the test if we're just going to ignore anything it shows
> > up?
> >
> > I appreciate Mikko can't go and fight the world on issues like this and
> > this isn't directed at him, it is just a general frustration with the
> > "quality" of where software seems to be going.
> >
> > Where do we draw the line if we're just going to ignore any error?
>
> I think the maintainers need to decide.
>
> I maintain yocto genericarm64 testing on real target HW like the AMD KV260 and thus
> I've decided that I accept these workarounds. These changes do not impact
> other SW or maintainers in meta-yocto git repo so I think applying the commit
> is safe.
>
> The parselogs test is really good, too good in fact, in finding issues coming from DUT HW,
> lab setup with DUT, firmware, kernel and rest of the high level yocto
> based OS image. Since my maintenance time is limited I need to
> decide which issues to fix and which to just accept. It is ok to challenge my
> decisions and especially help with fixing the actual root causes. Any HW vendors,
> please step in to help!
>
> I have been able to fix DUT HW setup issues for example by adding dummy HDMI
> adapters so that Xorg startup completes, fix firmware and kernel issues for
> some of the devices like Xorg 16bpp graphics output, and fix higher level
> oddities like systemd udev breakages so I do some analysis and try to fix issues
> before adding these workarounds.

That is good to know, I knew you'd fixed some things but I am glad it
doesn't have a totally bad reputation!

> Another aspect is the reuse of oeqa runtime tests in other environments and
> genericarm64 showing an example how to do that in the imperfect world. I
> would like to see more of this since they are actually pretty good basis for
> doing CI and on target testing. But that testing environment is complex and
> target HW and firmware by definition unstable since they are under development,
> even after vendors ship the HW and provide the BSP SW adaptations. Thus
> these layer specific ignore lists are likely needed in custom
> environments to get basic set of tests working and to stop new issues from popping
> up, which is actually really important. It is important to get stable testing results
> and it is important to find bugs and fixes for them. The parselogs findings are
> sometimes rare oddities but sadly common with real HW, firmware and yocto based OSes.
> That world is never perfect black and white, the gray zone is quite thick. That
> is where parselogs ignore lists fit, into the gray zone when someone needs to make
> something imperfect but usable to produce reliable pass or fail results in testing.

The higher level software tests struggle with many of the same issues.
We're slowly winning there but it has taken a lot to get to where we
are. Definitely worthwhile in the long run though.

Cheers,

Richard
diff --git a/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt b/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
index 2bff23b1bcb6..623af74b0857 100644
--- a/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
+++ b/meta-yocto-bsp/lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt
@@ -10,6 +10,8 @@ Error applying setting, reverse things back
 Can't lookup blockdev
 # X11 fails to start without connected display
 modeset(0): failed to set mode: Invalid argument
+# minor USB setup issue
+device descriptor read/64, error
 # Rockpi4b with u-boot 2025.10rc2 based firmware
 memory-controller: probe with driver rk3399-dmc-freq failed with error
On AMD KV260 the USB 1-1 driver sometimes throws this new warning at boot but
continues and USB stack works so ignore the error:

usb 1-1: device descriptor read/64, error -71
hub 1-1:1.0: USB hub found
hub 1-1:1.0: 5 ports detected
usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
hub 2-1:1.0: USB hub found
hub 2-1:1.0: 4 ports detected

https://lists.yoctoproject.org/g/automated-testing/message/1525

Signed-off-by: Mikko Rapeli <mikko.rapeli@linaro.org>
---
 .../lib/oeqa/runtime/cases/parselogs-ignores-genericarm64.txt | 2 ++
 1 file changed, 2 insertions(+)

v2: no changes

v1: https://lists.yoctoproject.org/g/poky/message/13836
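
For readers unfamiliar with ignore lists of this kind, here is a minimal sketch of how an entry like the one added by this patch could filter a boot log. It is not the oeqa parselogs implementation. It assumes that entries are matched as literal substrings, that lines starting with `#` are comments, and that a line is flagged when it contains "error" or "fail"; the real test may differ on all three points. The names `ERROR_KEYWORDS`, `load_ignores` and `unexpected_errors` are hypothetical, and the last log line is made up to show a message that is not covered by the ignore list.

```python
import re

# Keywords that make a log line suspicious. This set is an assumption for
# illustration, not the keyword list used by the real parselogs test.
ERROR_KEYWORDS = re.compile(r"error|fail", re.IGNORECASE)


def load_ignores(text):
    """Return ignore entries, skipping blank lines and '#' comment lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line)
    return entries


def unexpected_errors(log_lines, ignores):
    """Return flagged log lines that no ignore entry matches as a substring."""
    flagged = [line for line in log_lines if ERROR_KEYWORDS.search(line)]
    return [line for line in flagged
            if not any(entry in line for entry in ignores)]


# Excerpt of the ignore file after the patch (entries taken from the diff).
IGNORE_FILE = """\
# X11 fails to start without connected display
modeset(0): failed to set mode: Invalid argument
# minor USB setup issue
device descriptor read/64, error
"""

BOOT_LOG = [
    "usb 1-1: device descriptor read/64, error -71",
    "hub 1-1:1.0: USB hub found",
    "hub 1-1:1.0: 5 ports detected",
    "usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd",
    # Hypothetical line, not from the reported log: no entry covers it.
    "usb 3-1: device not accepting address 4, error -71",
]

ignores = load_ignores(IGNORE_FILE)
remaining = unexpected_errors(BOOT_LOG, ignores)
for line in remaining:
    print(line)
```

Under these assumptions only the hypothetical last line is reported. Because the new entry names neither the bus ID nor the error code, it would also hide the same message on any other USB port or with any other error value.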