Message ID | 20250623171552.28824-1-gavrosc@yahoo.com |
---|---|
State | New |
Headers | show |
Series | [v2] host_dmesg_logger: dmesg diff on BuildStarted and BuildCompleted | expand |
On Mon, 2025-06-23 at 19:15 +0200, Christos Gavros via lists.openembedded.org wrote: > This class is used to capture the host dmesg output > when BuildStarted and BuildCompleted. Then computes > a diff and prints the result. > Fixes [YOCTO #15557] > > CC: Yoann Congal <yoann.congal@smile.fr> > CC: Randy MacLeod <randy.macleod@windriver.com> > CC: Alexandre Belloni <alexandre.belloni@bootlin.com> > Signed-off-by: Christos Gavros <gavrosc@yahoo.com> > --- > v1->v2 > * add in the description class's usage and drawbacks > * add in the description how to enable the class > * add spaces around "=" > * run dmesg with '-T' option > --- > meta/classes-global/host_dmesg_logger.bbclass | 92 +++++++++++++++++++ > 1 file changed, 92 insertions(+) > create mode 100644 meta/classes-global/host_dmesg_logger.bbclass > > diff --git a/meta/classes-global/host_dmesg_logger.bbclass b/meta/classes-global/host_dmesg_logger.bbclass > new file mode 100644 > index 0000000000..fd4261c53d > --- /dev/null > +++ b/meta/classes-global/host_dmesg_logger.bbclass > @@ -0,0 +1,92 @@ > +# > +# Copyright OpenEmbedded Contributors > +# > +# SPDX-License-Identifier: MIT > +# > + > +# This class captures the host's dmesg output at the start and completion of a build. > +# It stores these outputs in a variable and upon build completion, computes a diff between > +# the two logs to detect any changes that occurred during the build process. It prints the diff > +# using bb.warn(). > +# This class is intended to expose failures (e.g. segmentation faults) that > +# are hidden or ignored by the host build system. However, it may also capture > +# irrelevant or unimportant messages. > +# The user needs to have privileges to run dmesg otherwise the operation is not allowed > +# and a warning is generated. > +# To enable dmesg logging, add the following to your local.conf: > +# INHERIT += "host_dmesg_logger" > + > +def append_log(msg, d): > + """ > + Appends a log message to variable '_dmesg_log' > + """ > + current_log = d.getVar('_dmesg_log') or '' > + current_log += msg + '\n' > + d.setVar('_dmesg_log', current_log) > + > +def capture_dmesg(): > + """ > + Returns the current output of dmesg command > + """ > + import subprocess > + > + try: > + result = subprocess.run(['dmesg', '-T'], capture_output=True, text=True, check=False) > + if result.returncode != 0: > + return f"Error running dmesg: {result.stderr.strip()}" > + return result.stdout > + except Exception as e: > + return f"Exception running dmesg: {str(e)}" > + > +addhandler hostdmesglogger_eventhandler > +python hostdmesglogger_eventhandler() { > + import difflib > + > + start_marker = "=== BuildStarted dmesg ===" > + end_marker = "=== BuildCompleted dmesg ===" > + diff_marker = "=== dmesg diff ===" > + > + # execute dmesg when BuildStarted event is fired > + if isinstance(e, bb.event.BuildStarted): > + dmesg_output = capture_dmesg() > + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): > + bb.warn(dmesg_output) > + else: > + append_log(start_marker, d) > + append_log(dmesg_output, d) > + > + # execute dmesg when BuildCompleted event is fired > + if isinstance(e, bb.event.BuildCompleted): > + dmesg_output = capture_dmesg() > + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): > + bb.warn(dmesg_output) > + else: > + append_log(end_marker, d) > + append_log(dmesg_output, d) > + > + content = d.getVar('_dmesg_log') or '' > + if start_marker in content and end_marker in content: > + start_dmesg = content.split(start_marker)[1].split(end_marker)[0] > + end_dmesg = content.split(end_marker)[1] > + > + start_lines = start_dmesg.strip().splitlines() > + end_lines = end_dmesg.strip().splitlines() > + > + # generating diff between BuildStarted and BuildCompleted dmesg outputs > + diff = list(difflib.unified_diff( > + start_lines, end_lines, > + fromfile = 'dmesg_start', > + tofile = 'dmesg_end', > + lineterm = '' > + )) > + > + append_log(diff_marker, d) > + if diff: > + for line in diff: > + bb.warn(line) > + else: > + bb.warn("No differences in dmesg output.") Thanks for this. I decided to test on the autobuilder by using this patch: https://git.yoctoproject.org/poky/commit/?h=master-next&id=56c251c67582835bd131114e903401c169f4b7c1 diff --git a/meta/conf/fragments/yocto-autobuilder/autobuilder.conf b/meta/conf/fragments/yocto-autobuilder/autobuilder.conf index e3bac34339..d698c877dc 100644 --- a/meta/conf/fragments/yocto-autobuilder/autobuilder.conf +++ b/meta/conf/fragments/yocto-autobuilder/autobuilder.conf @@ -3,6 +3,7 @@ BB_CONF_FRAGMENT_DESCRIPTION = "Default variables used on the Yocto Projects aut QEMU_USE_KVM ?= 'True' INHERIT += 'report-error' +INHERIT += "host_dmesg_logger" PREMIRRORS = '' BB_GENERATE_MIRROR_TARBALLS = '1' IMAGE_CLASSES += 'testimage' I started a build and you can see 1 warning so far: https://autobuilder.yoctoproject.org/valkyrie/#/builders/29/builds/1867 If no differences are detected, do we really want to show a warning? These are going to pollute the CI logs significantly and detract from real issues. They are useful to see that the code is running but we probably don't want to do that in general. Cheers, Richard
Hi I will remove the warning for no differences and send a v3. Br Christos
On Mon, 2025-06-23 at 22:18 +0100, Richard Purdie via lists.openembedded.org wrote: > On Mon, 2025-06-23 at 19:15 +0200, Christos Gavros via lists.openembedded.org wrote: > > This class is used to capture the host dmesg output > > when BuildStarted and BuildCompleted. Then computes > > a diff and prints the result. > > Fixes [YOCTO #15557] > > > > CC: Yoann Congal <yoann.congal@smile.fr> > > CC: Randy MacLeod <randy.macleod@windriver.com> > > CC: Alexandre Belloni <alexandre.belloni@bootlin.com> > > Signed-off-by: Christos Gavros <gavrosc@yahoo.com> > > --- > > v1->v2 > > * add in the description class's usage and drawbacks > > * add in the description how to enable the class > > * add spaces around "=" > > * run dmesg with '-T' option > > --- > > meta/classes-global/host_dmesg_logger.bbclass | 92 +++++++++++++++++++ > > 1 file changed, 92 insertions(+) > > create mode 100644 meta/classes-global/host_dmesg_logger.bbclass > > > > diff --git a/meta/classes-global/host_dmesg_logger.bbclass b/meta/classes-global/host_dmesg_logger.bbclass > > new file mode 100644 > > index 0000000000..fd4261c53d > > --- /dev/null > > +++ b/meta/classes-global/host_dmesg_logger.bbclass > > @@ -0,0 +1,92 @@ > > +# > > +# Copyright OpenEmbedded Contributors > > +# > > +# SPDX-License-Identifier: MIT > > +# > > + > > +# This class captures the host's dmesg output at the start and completion of a build. > > +# It stores these outputs in a variable and upon build completion, computes a diff between > > +# the two logs to detect any changes that occurred during the build process. It prints the diff > > +# using bb.warn(). > > +# This class is intended to expose failures (e.g. segmentation faults) that > > +# are hidden or ignored by the host build system. However, it may also capture > > +# irrelevant or unimportant messages. > > +# The user needs to have privileges to run dmesg otherwise the operation is not allowed > > +# and a warning is generated. > > +# To enable dmesg logging, add the following to your local.conf: > > +# INHERIT += "host_dmesg_logger" > > + > > +def append_log(msg, d): > > + """ > > + Appends a log message to variable '_dmesg_log' > > + """ > > + current_log = d.getVar('_dmesg_log') or '' > > + current_log += msg + '\n' > > + d.setVar('_dmesg_log', current_log) > > + > > +def capture_dmesg(): > > + """ > > + Returns the current output of dmesg command > > + """ > > + import subprocess > > + > > + try: > > + result = subprocess.run(['dmesg', '-T'], capture_output=True, text=True, check=False) > > + if result.returncode != 0: > > + return f"Error running dmesg: {result.stderr.strip()}" > > + return result.stdout > > + except Exception as e: > > + return f"Exception running dmesg: {str(e)}" > > + > > +addhandler hostdmesglogger_eventhandler > > +python hostdmesglogger_eventhandler() { > > + import difflib > > + > > + start_marker = "=== BuildStarted dmesg ===" > > + end_marker = "=== BuildCompleted dmesg ===" > > + diff_marker = "=== dmesg diff ===" > > + > > + # execute dmesg when BuildStarted event is fired > > + if isinstance(e, bb.event.BuildStarted): > > + dmesg_output = capture_dmesg() > > + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): > > + bb.warn(dmesg_output) > > + else: > > + append_log(start_marker, d) > > + append_log(dmesg_output, d) > > + > > + # execute dmesg when BuildCompleted event is fired > > + if isinstance(e, bb.event.BuildCompleted): > > + dmesg_output = capture_dmesg() > > + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): > > + bb.warn(dmesg_output) > > + else: > > + append_log(end_marker, d) > > + append_log(dmesg_output, d) > > + > > + content = d.getVar('_dmesg_log') or '' > > + if start_marker in content and end_marker in content: > > + start_dmesg = content.split(start_marker)[1].split(end_marker)[0] > > + end_dmesg = content.split(end_marker)[1] > > + > > + start_lines = start_dmesg.strip().splitlines() > > + end_lines = end_dmesg.strip().splitlines() > > + > > + # generating diff between BuildStarted and BuildCompleted dmesg outputs > > + diff = list(difflib.unified_diff( > > + start_lines, end_lines, > > + fromfile = 'dmesg_start', > > + tofile = 'dmesg_end', > > + lineterm = '' > > + )) > > + > > + append_log(diff_marker, d) > > + if diff: > > + for line in diff: > > + bb.warn(line) > > + else: > > + bb.warn("No differences in dmesg output.") > > Thanks for this. I decided to test on the autobuilder by using this > patch: > > https://git.yoctoproject.org/poky/commit/?h=master-next&id=56c251c67582835bd131114e903401c169f4b7c1 > > diff --git a/meta/conf/fragments/yocto-autobuilder/autobuilder.conf b/meta/conf/fragments/yocto-autobuilder/autobuilder.conf > index e3bac34339..d698c877dc 100644 > --- a/meta/conf/fragments/yocto-autobuilder/autobuilder.conf > +++ b/meta/conf/fragments/yocto-autobuilder/autobuilder.conf > @@ -3,6 +3,7 @@ BB_CONF_FRAGMENT_DESCRIPTION = "Default variables used on the Yocto Projects aut > > QEMU_USE_KVM ?= 'True' > INHERIT += 'report-error' > +INHERIT += "host_dmesg_logger" > PREMIRRORS = '' > BB_GENERATE_MIRROR_TARBALLS = '1' > IMAGE_CLASSES += 'testimage' > > I started a build and you can see 1 warning so far: > > https://autobuilder.yoctoproject.org/valkyrie/#/builders/29/builds/1867 > > If no differences are detected, do we really want to show a warning? > These are going to pollute the CI logs significantly and detract from > real issues. They are useful to see that the code is running but we > probably don't want to do that in general. Since the builds were too noisy with the warning about no differences, I ran a second build with that line commented out. https://autobuilder.yoctoproject.org/valkyrie/#/builders/29/builds/1868 As you can see, there were still a lot of warnings, enough that it is hard to spot the real issues compared to the ones that aren't important. For example, this one: https://autobuilder.yoctoproject.org/valkyrie/#/builders/26/builds/1886/steps/23/logs/warnings "stdio: WARNING: +[Mon Jun 23 23:21:37 2025] hrtimer: interrupt took 7320 ns" isn't really anything we can do anything about. Similiarly: https://autobuilder.yoctoproject.org/valkyrie/#/builders/16/builds/1924/steps/14/logs/warnings "stdio: WARNING: +[Mon Jun 23 23:54:18 2025] workqueue: inode_switch_wbs_work_fn hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND" https://autobuilder.yoctoproject.org/valkyrie/#/builders/2/builds/1906 stdio: WARNING: +[Mon Jun 23 23:16:39 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap1: link becomes ready stdio: WARNING: +[Mon Jun 23 23:16:39 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap2: link becomes ready stdio: WARNING: +[Mon Jun 23 23:16:40 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap0: link becomes ready which we see any time qemu starts/stops using a device. Similiarly, this one is quite long but nothing we can do much about: https://autobuilder.yoctoproject.org/valkyrie/#/builders/78/builds/1930 stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap2: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap4: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap5: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap7: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap10: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap8: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap11: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap14: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:54 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap15: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:55 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap12: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:55 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap13: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:55 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap9: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:55 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap1: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:55 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap3: link becomes ready stdio: WARNING: +[Mon Jun 23 23:12:56 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap6: link becomes ready stdio: WARNING: +[Mon Jun 23 23:13:32 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap17: link becomes ready stdio: WARNING: +[Mon Jun 23 23:13:32 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap18: link becomes ready stdio: WARNING: +[Mon Jun 23 23:13:35 2025] IPv6: ADDRCONF(NETDEV_CHANGE): tap16: link becomes ready stdio: WARNING: +[Mon Jun 23 23:16:06 2025] hrtimer: interrupt took 1480 ns However there is an issue here: https://autobuilder.yoctoproject.org/valkyrie/#/builders/43/builds/1900 stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2793145]: segfault at 7d85a57fef60 ip 00007d85ace75881 sp 00007d85a57fef60 error 6 in librustc_driver-2c391d96288416bb.so[7d85a72c0000+6246000] likely on CPU 23 (core 53, socket 0) stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: b5 13 03 48 89 df e8 be af 44 fa 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 41 57 41 56 41 55 41 54 53 48 81 ec b8 00 00 00 <48> 89 34 24 49 89 ff 4c 8b 2f 49 83 fd 02 0f 84 dc 00 00 00 49 83 stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2793415]: segfault at 761e429fee20 ip 0000761e49d83872 sp 0000761e429fee10 error 6 in librustc_driver-2c391d96288416bb.so[761e444c0000+6246000] likely on CPU 65 (core 21, socket 0) stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: 84 00 00 00 00 00 66 90 55 41 57 41 56 41 55 41 54 53 48 81 ec b8 06 00 00 48 8b 9e b8 00 00 00 80 4e 64 10 49 89 d4 49 89 fe <48> 89 74 24 10 0f b6 86 a8 00 00 00 83 f8 22 74 1e 83 f8 23 0f 85 stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2792864]: segfault at 7336b89fef50 ip 00007336bf403241 sp 00007336b89fef40 error 6 in librustc_driver-2c391d96288416bb.so[7336ba4c0000+6246000] likely on CPU 91 (core 41, socket 0) stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: 48 8d 7c 24 40 e8 8f 9d ff ff 48 89 df e8 f7 d5 0b fb ff 15 49 98 d9 03 90 55 41 57 41 56 41 55 41 54 53 48 81 ec 18 02 00 00 <4c> 89 44 24 10 48 89 cb 89 94 24 b0 00 00 00 49 89 f7 48 89 7c 24 stdio: WARNING: +[Mon Jun 23 23:24:21 2025] rustc[2792945]: segfault at 7b4c847fefe0 ip 00007b4c8b003241 sp 00007b4c847fefd0 error 6 in librustc_driver-2c391d96288416bb.so[7b4c860c0000+6246000] likely on CPU 46 (core 44, socket 0) stdio: WARNING: +[Mon Jun 23 23:24:21 2025] Code: 48 8d 7c 24 40 e8 8f 9d ff ff 48 89 df e8 f7 d5 0b fb ff 15 49 98 d9 03 90 55 41 57 41 56 41 55 41 54 53 48 81 ec 18 02 00 00 <4c> 89 44 24 10 48 89 cb 89 94 24 b0 00 00 00 49 89 f7 48 89 7c 24 stdio: WARNING: +[Mon Jun 23 23:24:24 2025] rustc[2792678]: segfault at 7312ff7fff80 ip 000073130ad372c9 sp 00007312ff7fff80 error 6 in librustc_driver-2c391d96288416bb.so[7313058c0000+6246000] likely on CPU 64 (core 20, socket 0) stdio: WARNING: +[Mon Jun 23 23:24:24 2025] Code: 00 00 48 89 b4 24 c8 00 00 00 48 39 ca 0f 85 97 06 00 00 49 89 d6 49 89 f7 48 89 fb 0f b6 86 f5 00 00 00 0f b6 8e f7 00 00 00 <48> 89 14 24 88 4c 24 08 88 44 24 09 48 89 e6 4c 89 ff e8 e0 d1 fe The question is therefore going to be how to filter this down to something useful, and/or how to save the data so we can work on reducing these issues over time? The data is useful to have, and over time we should look for and fix these segfaults and so on but we can't have this level of warnings on the builds. Cheers, Richard
On 24/06/2025 08:59:18+0100, Richard Purdie wrote: > However there is an issue here: > > https://autobuilder.yoctoproject.org/valkyrie/#/builders/43/builds/1900 > > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2793145]: segfault at 7d85a57fef60 ip 00007d85ace75881 sp 00007d85a57fef60 error 6 in librustc_driver-2c391d96288416bb.so[7d85a72c0000+6246000] likely on CPU 23 (core 53, socket 0) > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: b5 13 03 48 89 df e8 be af 44 fa 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 41 57 41 56 41 55 41 54 53 48 81 ec b8 00 00 00 <48> 89 34 24 49 89 ff 4c 8b 2f 49 83 fd 02 0f 84 dc 00 00 00 49 83 > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2793415]: segfault at 761e429fee20 ip 0000761e49d83872 sp 0000761e429fee10 error 6 in librustc_driver-2c391d96288416bb.so[761e444c0000+6246000] likely on CPU 65 (core 21, socket 0) > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: 84 00 00 00 00 00 66 90 55 41 57 41 56 41 55 41 54 53 48 81 ec b8 06 00 00 48 8b 9e b8 00 00 00 80 4e 64 10 49 89 d4 49 89 fe <48> 89 74 24 10 0f b6 86 a8 00 00 00 83 f8 22 74 1e 83 f8 23 0f 85 > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] rustc[2792864]: segfault at 7336b89fef50 ip 00007336bf403241 sp 00007336b89fef40 error 6 in librustc_driver-2c391d96288416bb.so[7336ba4c0000+6246000] likely on CPU 91 (core 41, socket 0) > stdio: WARNING: +[Mon Jun 23 23:24:20 2025] Code: 48 8d 7c 24 40 e8 8f 9d ff ff 48 89 df e8 f7 d5 0b fb ff 15 49 98 d9 03 90 55 41 57 41 56 41 55 41 54 53 48 81 ec 18 02 00 00 <4c> 89 44 24 10 48 89 cb 89 94 24 b0 00 00 00 49 89 f7 48 89 7c 24 > stdio: WARNING: +[Mon Jun 23 23:24:21 2025] rustc[2792945]: segfault at 7b4c847fefe0 ip 00007b4c8b003241 sp 00007b4c847fefd0 error 6 in librustc_driver-2c391d96288416bb.so[7b4c860c0000+6246000] likely on CPU 46 (core 44, socket 0) > stdio: WARNING: +[Mon Jun 23 23:24:21 2025] Code: 48 8d 7c 24 40 e8 8f 9d ff ff 48 89 df e8 f7 d5 0b fb ff 15 49 98 d9 03 90 55 41 57 41 56 41 55 41 54 53 48 81 ec 18 02 00 00 <4c> 89 44 24 10 48 89 cb 89 94 24 b0 00 00 00 49 89 f7 48 89 7c 24 > stdio: WARNING: +[Mon Jun 23 23:24:24 2025] rustc[2792678]: segfault at 7312ff7fff80 ip 000073130ad372c9 sp 00007312ff7fff80 error 6 in librustc_driver-2c391d96288416bb.so[7313058c0000+6246000] likely on CPU 64 (core 20, socket 0) > stdio: WARNING: +[Mon Jun 23 23:24:24 2025] Code: 00 00 48 89 b4 24 c8 00 00 00 48 39 ca 0f 85 97 06 00 00 49 89 d6 49 89 f7 48 89 fb 0f b6 86 f5 00 00 00 0f b6 8e f7 00 00 00 <48> 89 14 24 88 4c 24 08 88 44 24 09 48 89 e6 4c 89 ff e8 e0 d1 fe > > The question is therefore going to be how to filter this down to > something useful, and/or how to save the data so we can work on > reducing these issues over time? > > The data is useful to have, and over time we should look for and fix > these segfaults and so on but we can't have this level of warnings on > the builds. Seeing the actual output, I don't think having a human readable date is a good idea as all the other dates we use are in UNIX timestamp format. Maybe we could use -l to restrict the levels, something like dmesg --level=err+ or dmesg --level=warn+ I'm pretty sure this is going to filter the tap interface message. The hrtimer message is a pr_warn_once so it would appear only after reboot, I guess the build was juste after maintenance. Or we could use --facility and filter out all the kernel builderssages as we are probably more interested in what is happening in user space. maybe using -x would be a good idea and we could then do this filtering and make it configurable using a variable in the class. Having a way to more precisely filter messages would be great but will also require maintaining the list of filters.
I think it would be easier to decide how to filter if we had more data! Is there any way I can help more with that? I have not used the autobuilder yet ...
On Tue, 2025-06-24 at 13:16 -0700, Christos Gavros via lists.openembedded.org wrote: > I think it would be easier to decide how to filter if we had more > data! > Is there any way I can help more with that? > I have not used the autobuilder yet ... > I gave links to a couple of the builds, one with all warnings, one only with warnings with differences. I also gave some links to samples of the data but the full builds are there with all the warnings they generated if you want to dig through and collect more. What kind of data would you like to have? Cheers, Richard
I have read the comment with the links but I didn't dig through them. I was thinking that maybe we need data from more builds in order to decide how to filter out. Then, if this data is sufficient, shall we filter using "dmesg --level=err,crit,alert,emerg" ?
On Tue, 2025-06-24 at 14:14 -0700, Christos Gavros via lists.openembedded.org wrote: > I have read the comment with the links but I didn't dig through them. > I was thinking that maybe we need data from more builds in order to > decide how to filter out. > Then, if this data is sufficient, shall we filter using "dmesg -- > level=err,crit,alert,emerg" ? I don't know what level the error messages in question were so it is hard for me to say. I'm not convinced running more builds generating those warnings is going to help unless someone actually goes through them. It seems you don't want to (I can't blame you) and I have done my best to try and at least help others through them and put what time I can in to try and help. Personally, rather than log levels, I think it may be better to filter out the known things which we consider "noise", then run another build and see what messages we're getting that we do care about. We could then work out their log levels if we still think that kind of filtering would help. Cheers, Richard
hi I am sorry! I didn't understand that the way forward is to dig to the builds you already ran. I don't have problem doing that. I can dig through them and try to come up with a filtering suggestion. I can spend time on that, no problem :)
diff --git a/meta/classes-global/host_dmesg_logger.bbclass b/meta/classes-global/host_dmesg_logger.bbclass new file mode 100644 index 0000000000..fd4261c53d --- /dev/null +++ b/meta/classes-global/host_dmesg_logger.bbclass @@ -0,0 +1,92 @@ +# +# Copyright OpenEmbedded Contributors +# +# SPDX-License-Identifier: MIT +# + +# This class captures the host's dmesg output at the start and completion of a build. +# It stores these outputs in a variable and upon build completion, computes a diff between +# the two logs to detect any changes that occurred during the build process. It prints the diff +# using bb.warn(). +# This class is intended to expose failures (e.g. segmentation faults) that +# are hidden or ignored by the host build system. However, it may also capture +# irrelevant or unimportant messages. +# The user needs to have privileges to run dmesg otherwise the operation is not allowed +# and a warning is generated. +# To enable dmesg logging, add the following to your local.conf: +# INHERIT += "host_dmesg_logger" + +def append_log(msg, d): + """ + Appends a log message to variable '_dmesg_log' + """ + current_log = d.getVar('_dmesg_log') or '' + current_log += msg + '\n' + d.setVar('_dmesg_log', current_log) + +def capture_dmesg(): + """ + Returns the current output of dmesg command + """ + import subprocess + + try: + result = subprocess.run(['dmesg', '-T'], capture_output=True, text=True, check=False) + if result.returncode != 0: + return f"Error running dmesg: {result.stderr.strip()}" + return result.stdout + except Exception as e: + return f"Exception running dmesg: {str(e)}" + +addhandler hostdmesglogger_eventhandler +python hostdmesglogger_eventhandler() { + import difflib + + start_marker = "=== BuildStarted dmesg ===" + end_marker = "=== BuildCompleted dmesg ===" + diff_marker = "=== dmesg diff ===" + + # execute dmesg when BuildStarted event is fired + if isinstance(e, bb.event.BuildStarted): + dmesg_output = capture_dmesg() + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): + bb.warn(dmesg_output) + else: + append_log(start_marker, d) + append_log(dmesg_output, d) + + # execute dmesg when BuildCompleted event is fired + if isinstance(e, bb.event.BuildCompleted): + dmesg_output = capture_dmesg() + if dmesg_output.startswith("Error running dmesg:") or dmesg_output.startswith("Exception running dmesg:"): + bb.warn(dmesg_output) + else: + append_log(end_marker, d) + append_log(dmesg_output, d) + + content = d.getVar('_dmesg_log') or '' + if start_marker in content and end_marker in content: + start_dmesg = content.split(start_marker)[1].split(end_marker)[0] + end_dmesg = content.split(end_marker)[1] + + start_lines = start_dmesg.strip().splitlines() + end_lines = end_dmesg.strip().splitlines() + + # generating diff between BuildStarted and BuildCompleted dmesg outputs + diff = list(difflib.unified_diff( + start_lines, end_lines, + fromfile = 'dmesg_start', + tofile = 'dmesg_end', + lineterm = '' + )) + + append_log(diff_marker, d) + if diff: + for line in diff: + bb.warn(line) + else: + bb.warn("No differences in dmesg output.") + else: + bb.warn("Could not find both dmesg sections for diff.") +} +hostdmesglogger_eventhandler[eventmask] = "bb.event.BuildStarted bb.event.BuildCompleted"
This class is used to capture the host dmesg output when BuildStarted and BuildCompleted. Then computes a diff and prints the result. Fixes [YOCTO #15557] CC: Yoann Congal <yoann.congal@smile.fr> CC: Randy MacLeod <randy.macleod@windriver.com> CC: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Christos Gavros <gavrosc@yahoo.com> --- v1->v2 * add in the description class's usage and drawbacks * add in the description how to enable the class * add spaces around "=" * run dmesg with '-T' option --- meta/classes-global/host_dmesg_logger.bbclass | 92 +++++++++++++++++++ 1 file changed, 92 insertions(+) create mode 100644 meta/classes-global/host_dmesg_logger.bbclass