From patchwork Fri Jan 20 14:44:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2C90C38159 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75990.1674225899533585748 for ; Fri, 20 Jan 2023 06:45:00 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdU-006KUP-QN; Fri, 20 Jan 2023 16:44:57 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdU-00CELz-H9; Fri, 20 Jan 2023 16:44:56 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 01/14] oeqa ssh.py: move output prints to new line Date: Fri, 20 Jan 2023 16:44:37 +0200 Message-Id: <20230120144450.2913935-2-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-2-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176191 The output from is garbled otherwise and it's not easy to remove debug output form real command output on target. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/core/target/ssh.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py index 7a10ba9763..13fd5b2a49 100644 --- a/meta/lib/oeqa/core/target/ssh.py +++ b/meta/lib/oeqa/core/target/ssh.py @@ -240,7 +240,7 @@ def SSHCall(command, logger, timeout=None, **opts): eof = True else: output += data - logger.debug('Partial data from SSH call: %s' % data) + logger.debug('Partial data from SSH call:\n%s' % data) endtime = time.time() + timeout except InterruptedError: continue @@ -256,12 +256,12 @@ def SSHCall(command, logger, timeout=None, **opts): endtime = time.time() - starttime lastline = ("\nProcess killed - no output for %d seconds. Total" " running time: %d seconds." % (timeout, endtime)) - logger.debug('Received data from SSH call %s ' % lastline) + logger.debug('Received data from SSH call:\n%s ' % lastline) output += lastline else: output = process.communicate()[0].decode('utf-8', errors='ignore') - logger.debug('Data from SSH call: %s' % output.rstrip()) + logger.debug('Data from SSH call:\n%s' % output.rstrip()) options = { "stdout": subprocess.PIPE, From patchwork Fri Jan 20 14:44:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id C19D5C38A23 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75991.1674225900228690739 for ; Fri, 20 Jan 2023 06:45:00 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdV-006KUi-Tw; Fri, 20 Jan 2023 16:44:58 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdV-00CEPa-KC; Fri, 20 Jan 2023 16:44:57 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 02/14] oeqa ssh.py: fix hangs in run() and add debug prints Date: Fri, 20 Jan 2023 16:44:38 +0200 Message-Id: <20230120144450.2913935-3-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-3-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176193 When qemu machine hangs, the ssh commands done by tests are not timing out. do_testimage() task has last logs like this: DEBUG: time: 1673531086.3155053, endtime: 1673531686.315502 The test process is stuck for hours, or for ever if the executing command or test case did not set a timeout correctly. The default 300 second timeout is not working when target hangs. Note that timeout is really a "inactive timeout" since data returned by the process will reset the timeout. Added debug prints help to identify which calls are hanging and not timing out. Make the process stdout non-blocking so read() will always return right away using os.set_blocking() available in python 3.5 and later. Then change from python codec reader to plain read() and make the ssh subprocess stdout non-blocking. Even with select() making sure the file had input to be read, the codec reader was trying to find more stuff and blocking for ever when process hangs. While at it, add a small timeout to read data in larger chunks if possible. This avoids reading data one or few characters at a time and makes the debug logs more readable. close() the stdout file in all cases after read loop is complete. Then make sure to wait or kill the ssh subprocess in all cases. Just reading the output stream and receiving EOF there does not mean that the process exited, and wait() needs a timeout if the process is hanging. In the end kill the process and return the return value and captured output utf-8 encoded, just like before these changes. This fixes ssh run() related deadlocks when a qemu target hangs completely. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/core/target/ssh.py | 42 ++++++++++++++++++++++++++------ 1 file changed, 35 insertions(+), 7 deletions(-) diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py index 13fd5b2a49..11f4ec6de0 100644 --- a/meta/lib/oeqa/core/target/ssh.py +++ b/meta/lib/oeqa/core/target/ssh.py @@ -229,29 +229,41 @@ def SSHCall(command, logger, timeout=None, **opts): if timeout: endtime = starttime + timeout eof = False + os.set_blocking(process.stdout.fileno(), False) while time.time() < endtime and not eof: - logger.debug('time: %s, endtime: %s' % (time.time(), endtime)) try: + logger.debug('Waiting for process output: time: %s, endtime: %s' % (time.time(), endtime)) if select.select([process.stdout], [], [], 5)[0] != []: - reader = codecs.getreader('utf-8')(process.stdout, 'ignore') - data = reader.read(1024, 4096) + # wait a bit for more data, tries to avoid reading single characters + time.sleep(0.2) + logger.debug('Reading from process stdout') + data = process.stdout.read() if not data: - process.stdout.close() + logger.debug('End of file from process stdout') eof = True else: - output += data - logger.debug('Partial data from SSH call:\n%s' % data) + # ignore errors to capture as much as possible + string = data.decode('utf-8', errors='ignore') + output += string + logger.debug('Partial data from SSH call:\n%s' % string) endtime = time.time() + timeout except InterruptedError: + logger.debug('InterruptedError') continue + logger.debug('Closing process stdout') + process.stdout.close() + # process hasn't returned yet if not eof: + logger.debug('Terminating process') process.terminate() time.sleep(5) try: + logger.debug('Killing process') process.kill() except OSError: + logger.debug('OSError when killing process') pass endtime = time.time() - starttime lastline = ("\nProcess killed - no output for %d seconds. Total" @@ -263,6 +275,19 @@ def SSHCall(command, logger, timeout=None, **opts): output = process.communicate()[0].decode('utf-8', errors='ignore') logger.debug('Data from SSH call:\n%s' % output.rstrip()) + # timout or not, make sure process exits and is not hanging + if process.returncode == None: + try: + logger.debug('Process still running, waiting for it to end') + process.wait(timeout=5) + except TimeoutExpired: + try: + logger.debug('Timeout, killing process') + process.kill() + except OSError: + logger.debug('OSError') + pass + options = { "stdout": subprocess.PIPE, "stderr": subprocess.STDOUT, @@ -282,12 +307,15 @@ def SSHCall(command, logger, timeout=None, **opts): options['env'] = env try: + logger.debug('run') run() except: + logger.debug('except from run()') # Need to guard against a SystemExit or other exception ocurring # whilst running and ensure we don't leave a process behind. if process.poll() is None: process.kill() logger.debug('Something went wrong, killing SSH process') raise - return (process.wait(), output.rstrip()) + + return (process.returncode, output.rstrip()) From patchwork Fri Jan 20 14:44:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D14ABC52D11 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75891.1674225900330537079 for ; Fri, 20 Jan 2023 06:45:00 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdW-006KUp-HP; Fri, 20 Jan 2023 16:44:58 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdW-00CEQz-7t; Fri, 20 Jan 2023 16:44:58 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 03/14] oeqa ssh.py: clarify timeout API and add more debug prints to run() Date: Fri, 20 Jan 2023 16:44:39 +0200 Message-Id: <20230120144450.2913935-4-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-4-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176194 Clarify timeout handling. Timeout calculation is from start of command OR last output from ssh command. When the ssh command output is captured, that resets the timeout timer. Timeout is effetively an "inactive timeout". Print process output to a new line to keep the output more readable. Then log the calls to dumper functions which also seem to be deadlocking for ever when a qemu machine hangs. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/core/target/ssh.py | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py index 11f4ec6de0..2aaa15ec0e 100644 --- a/meta/lib/oeqa/core/target/ssh.py +++ b/meta/lib/oeqa/core/target/ssh.py @@ -65,7 +65,7 @@ class OESSHTarget(OETarget): """ Runs command in target using SSHProcess. """ - self.logger.debug("[Running]$ %s" % " ".join(command)) + self.logger.debug("[Running]$ '%s', timeout=%s, ignore_status=%s" % (" ".join(command), timeout, ignore_status)) starttime = time.time() status, output = SSHCall(command, self.logger, timeout) @@ -83,9 +83,10 @@ class OESSHTarget(OETarget): Runs command in target. command: Command to run on target. - timeout: : Kill command after seconds. - None: Kill command default value seconds. - 0: No timeout, runs until return. + timeout: : Kill command after seconds when there is no output. + None: Kill command with default value seconds when there is no output. + 0: No activity timeout, runs until return. Can still fail with timeout + if ssh client returns failures or TCP socket times out. """ targetCmd = 'export PATH=/usr/sbin:/sbin:/usr/bin:/bin; %s' % command sshCmd = self.ssh + [self.ip, targetCmd] @@ -98,15 +99,19 @@ class OESSHTarget(OETarget): processTimeout = self.timeout status, output = self._run(sshCmd, processTimeout, ignore_status) - self.logger.debug('Command: %s\nStatus: %d Output: %s\n' % (command, status, output)) + # ssh reports errors with return value 255 but also commands can return that value if (status == 255) and (('No route to host') in output): if self.monitor_dumper: + self.logger.debug('Command failed, capturing QMP data: dump_monitor()') self.monitor_dumper.dump_monitor() if status == 255: if self.target_dumper: + self.logger.debug('Command failed, capturing data from serial console: dump_target()') self.target_dumper.dump_target() if self.monitor_dumper: + self.logger.debug('Command failed, capturing QMP data: dump_monitor()') self.monitor_dumper.dump_monitor() + self.logger.debug('returning from run()') return (status, output) def copyTo(self, localSrc, remoteDst): From patchwork Fri Jan 20 14:44:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D16F9C5320B for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75992.1674225900796206407 for ; Fri, 20 Jan 2023 06:45:01 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdX-006KUx-1S; Fri, 20 Jan 2023 16:44:59 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdW-00CESJ-P9; Fri, 20 Jan 2023 16:44:58 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 04/14] oeqa ssh.py: add connection keep alive options to ssh client Date: Fri, 20 Jan 2023 16:44:40 +0200 Message-Id: <20230120144450.2913935-5-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-5-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176195 Configure ssh client to test that connection with server is up. If the server does not respond within a minute then the connection, target machine or sshd daemon are stuck and it's better to exit the command execution with errors. Some tests can execute a long time without returning stdout/stderror data and it's difficult to adjust timers for those cases if connection to target machine or the target machine itself hangs and output is not expected in minutes or even hours. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/core/target/ssh.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/meta/lib/oeqa/core/target/ssh.py b/meta/lib/oeqa/core/target/ssh.py index 2aaa15ec0e..d42cd79c4f 100644 --- a/meta/lib/oeqa/core/target/ssh.py +++ b/meta/lib/oeqa/core/target/ssh.py @@ -34,6 +34,8 @@ class OESSHTarget(OETarget): self.timeout = timeout self.user = user ssh_options = [ + '-o', 'ServerAliveCountMax=2', + '-o', 'ServerAliveInterval=30', '-o', 'UserKnownHostsFile=/dev/null', '-o', 'StrictHostKeyChecking=no', '-o', 'LogLevel=ERROR' From patchwork Fri Jan 20 14:44:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id E15CFC54EAA for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75993.1674225901377692856 for ; Fri, 20 Jan 2023 06:45:01 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdX-006KV5-Jm; Fri, 20 Jan 2023 16:44:59 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdX-00CETX-Aj; Fri, 20 Jan 2023 16:44:59 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 05/14] testimage.bbclass: add debug prints for major state changes Date: Fri, 20 Jan 2023 16:44:41 +0200 Message-Id: <20230120144450.2913935-6-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: BAYES_HAM(-3.00) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: RCVD_TLS_LAST(0.00) Symbol: R_DKIM_NA(0.00) Symbol: ASN(0.00) Symbol: MIME_TRACE(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-6-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176196 When a test target hangs completely, it is important to know what happened before that in do_testimage() task logs. Signed-off-by: Mikko Rapeli --- meta/classes-recipe/testimage.bbclass | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/meta/classes-recipe/testimage.bbclass b/meta/classes-recipe/testimage.bbclass index 5cc408b0c4..79ac28a327 100644 --- a/meta/classes-recipe/testimage.bbclass +++ b/meta/classes-recipe/testimage.bbclass @@ -211,6 +211,7 @@ def testimage_main(d): """ Catch SIGTERM from worker in order to stop qemu. """ + bb.note('SIGTERM received, killing qemu with SIGINT') os.kill(os.getpid(), signal.SIGINT) def handle_test_timeout(timeout): @@ -373,14 +374,18 @@ def testimage_main(d): complete = False orig_sigterm_handler = signal.signal(signal.SIGTERM, sigterm_exception) try: + bb.debug(1, 'Starting qemu') # We need to check if runqemu ends unexpectedly # or if the worker send us a SIGTERM tc.target.start(params=d.getVar("TEST_QEMUPARAMS"), runqemuparams=d.getVar("TEST_RUNQEMUPARAMS")) import threading try: + bb.debug(1, 'Setting timer for TEST_OVERALL_TIMEOUT') threading.Timer(int(d.getVar("TEST_OVERALL_TIMEOUT")), handle_test_timeout, (int(d.getVar("TEST_OVERALL_TIMEOUT")),)).start() except ValueError: + bb.debug(1, 'Timer exception: ValueError') pass + bb.debug(1, 'Running tests') results = tc.runTests() complete = True except (KeyboardInterrupt, BlockingIOError) as err: @@ -389,10 +394,13 @@ def testimage_main(d): else: bb.error('runqemu failed, shutting down...') if results: + bb.debug(1, 'Stopping') results.stop() results = tc.results finally: + bb.debug(1, 'Resetting SIGTERM handler') signal.signal(signal.SIGTERM, orig_sigterm_handler) + bb.debug(1, 'Stopping qemu target') tc.target.stop() # Show results (if we have them) From patchwork Fri Jan 20 14:44:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18354 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id E15B4C54794 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75994.1674225901931524682 for ; Fri, 20 Jan 2023 06:45:02 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdY-006KVC-2i; Fri, 20 Jan 2023 16:45:00 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdX-00CEVN-Q8; Fri, 20 Jan 2023 16:44:59 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 06/14] oeqa dump.py: add error counter and stop after 5 failures Date: Fri, 20 Jan 2023 16:44:42 +0200 Message-Id: <20230120144450.2913935-7-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-7-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176197 If test target qemu machine hangs completely, dump_target() calls over serial console are taking a long time to time out, possibly for every failing ssh command execution and a lot of test cases, and same with dump_monitor(). Instead of trying for ever, count errors and after 5 stop trying to dump_target() and dump_monitor() completely. These help to end testing earlier when a test target is completely deadlocked and all ssh, serial and QMP communication with it are failing. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/dump.py | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/meta/lib/oeqa/utils/dump.py b/meta/lib/oeqa/utils/dump.py index bcee03b576..d420b497f9 100644 --- a/meta/lib/oeqa/utils/dump.py +++ b/meta/lib/oeqa/utils/dump.py @@ -93,37 +93,55 @@ class HostDumper(BaseDumper): self._write_dump(cmd.split()[0], result.output) class TargetDumper(BaseDumper): - """ Class to get dumps from target, it only works with QemuRunner """ + """ Class to get dumps from target, it only works with QemuRunner. + Will give up permanently after 5 errors from running commands over + serial console. This helps to end testing when target is really dead, hanging + or unresponsive. + """ def __init__(self, cmds, parent_dir, runner): super(TargetDumper, self).__init__(cmds, parent_dir) self.runner = runner + self.errors = 0 def dump_target(self, dump_dir=""): + if self.errors >= 5: + print("Too many errors when dumping data from target, assuming it is dead! Will not dump data anymore!") + return if dump_dir: self.dump_dir = dump_dir for cmd in self.cmds: # We can continue with the testing if serial commands fail try: (status, output) = self.runner.run_serial(cmd) + if status == 0: + self.errors = self.errors + 1 self._write_dump(cmd.split()[0], output) except: + self.errors = self.errors + 1 print("Tried to dump info from target but " "serial console failed") print("Failed CMD: %s" % (cmd)) class MonitorDumper(BaseDumper): - """ Class to get dumps via the Qemu Monitor, it only works with QemuRunner """ + """ Class to get dumps via the Qemu Monitor, it only works with QemuRunner + Will stop completely if there are more than 5 errors when dumping monitor data. + This helps to end testing when target is really dead, hanging or unresponsive. + """ def __init__(self, cmds, parent_dir, runner): super(MonitorDumper, self).__init__(cmds, parent_dir) self.runner = runner + self.errors = 0 def dump_monitor(self, dump_dir=""): if self.runner is None: return if dump_dir: self.dump_dir = dump_dir + if self.errors >= 5: + print("Too many errors when dumping data from qemu monitor, assuming it is dead! Will not dump data anymore!") + return for cmd in self.cmds: cmd_name = cmd.split()[0] try: @@ -137,4 +155,5 @@ class MonitorDumper(BaseDumper): output = self.runner.run_monitor(cmd_name) self._write_dump(cmd_name, output) except Exception as e: + self.errors = self.errors + 1 print("Failed to dump QMP CMD: %s with\nException: %s" % (cmd_name, e)) From patchwork Fri Jan 20 14:44:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18358 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0C72C52D77 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75996.1674225902381684696 for ; Fri, 20 Jan 2023 06:45:02 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdY-006KVJ-Hj; Fri, 20 Jan 2023 16:45:00 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdY-00CEWn-8Y; Fri, 20 Jan 2023 16:45:00 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 07/14] oeqa qemurunner.py: use os.set_blocking() instead of fcntl() Date: Fri, 20 Jan 2023 16:44:43 +0200 Message-Id: <20230120144450.2913935-8-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-8-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176198 It's now available since python 3.5. Then add debug prints to leave traces if something still hangs or fails. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index b5fed6c9fe..49ec3858c7 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -123,12 +123,12 @@ class QemuRunner: f.write("%s" % msg) def getOutput(self, o): - import fcntl - fl = fcntl.fcntl(o, fcntl.F_GETFL) - fcntl.fcntl(o, fcntl.F_SETFL, fl | os.O_NONBLOCK) + os.set_blocking(o.fileno(), False) try: + self.logger.debug('Calling read()') return os.read(o.fileno(), 1000000).decode("utf-8") except BlockingIOError: + self.logger.debug('BlockingIOError but should not have happened!') return "" From patchwork Fri Jan 20 14:44:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF2AAC54EB4 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75894.1674225902981640905 for ; Fri, 20 Jan 2023 06:45:03 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdZ-006KVS-0G; Fri, 20 Jan 2023 16:45:01 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdY-00CEYI-N1; Fri, 20 Jan 2023 16:45:00 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 08/14] oeqa qemurunner: read more data at a time from serial and log all of it Date: Fri, 20 Jan 2023 16:44:44 +0200 Message-Id: <20230120144450.2913935-9-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-9-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176199 Use a short sleep to bundle serial console reads so that we are not reading one character at a time which makes logs more readable. Then write full serial log to debug output so that boot issues can be investigated from do_testimage task logs. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index 49ec3858c7..97c049e224 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -202,7 +202,7 @@ class QemuRunner: qmp_file = "." + next(tempfile._get_candidate_names()) qmp_param = ' -S -qmp unix:./%s,server,wait' % (qmp_file) qmp_port = self.tmpdir + "/" + qmp_file - # Create a second socket connection for debugging use, + # Create a second socket connection for debugging use, # note this will NOT cause qemu to block waiting for the connection qmp_file2 = "." + next(tempfile._get_candidate_names()) qmp_param += ' -qmp unix:./%s,server,nowait' % (qmp_file2) @@ -468,8 +468,11 @@ class QemuRunner: socklist.remove(self.server_socket) self.logger.debug("Connection from %s:%s" % addr) else: + # try to avoid reading single character at a time, makes logs readable + time.sleep(0.1) data = data + sock.recv(1024) if data: + self.logger.debug(r'%s' % (data.decode('utf-8', errors='ignore'))) bootlog += data if self.serial_ports < 2: # this socket has mixed console/kernel data, log it to logfile From patchwork Fri Jan 20 14:44:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF328C54EB6 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75997.1674225903343252381 for ; Fri, 20 Jan 2023 06:45:03 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdZ-006KVc-Fh; Fri, 20 Jan 2023 16:45:01 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdZ-00CEZl-5V; Fri, 20 Jan 2023 16:45:01 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 09/14] oeqa qemurunner.py: increase boot failure warning log from 25 to 100 lines Date: Fri, 20 Jan 2023 16:44:45 +0200 Message-Id: <20230120144450.2913935-10-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-10-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176200 More details are needed to debug issues. Warning is printed to bitbake output console. Task execution logs will contain all debug output. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index 97c049e224..eb6d3e8c57 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -500,11 +500,11 @@ class QemuRunner: if time.time() >= endtime: self.logger.warning("Target didn't reach login banner in %d seconds (%s)" % (self.boottime, time.strftime("%D %H:%M:%S"))) - tail = lambda l: "\n".join(l.splitlines()[-25:]) + tail = lambda l: "\n".join(l.splitlines()[-100:]) bootlog = self.decode_qemulog(bootlog) # in case bootlog is empty, use tail qemu log store at self.msg lines = tail(bootlog if bootlog else self.msg) - self.logger.warning("Last 25 lines of text (%d):\n%s" % (len(bootlog), lines)) + self.logger.warning("Last 100 lines of text (%d):\n%s" % (len(bootlog), lines)) self.logger.warning("Check full boot log: %s" % self.logfile) self._dump_host() self.stop() From patchwork Fri Jan 20 14:44:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18356 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEA7DC54E94 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web11.75998.1674225903886822165 for ; Fri, 20 Jan 2023 06:45:04 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsda-006KWC-1l; Fri, 20 Jan 2023 16:45:02 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdZ-00CEbC-MU; Fri, 20 Jan 2023 16:45:01 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 10/14] oeqa qemurunner.py: add debug prints to stop() Date: Fri, 20 Jan 2023 16:44:46 +0200 Message-Id: <20230120144450.2913935-11-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: BAYES_HAM(-3.00) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: RCVD_TLS_LAST(0.00) Symbol: R_DKIM_NA(0.00) Symbol: ASN(0.00) Symbol: MIME_TRACE(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-11-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176201 These help to investigate if any of the functions hang when test target hangs. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index eb6d3e8c57..ed44036758 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -533,12 +533,16 @@ class QemuRunner: return True def stop(self): + self.logger.debug("Stopping QemuRunner") if hasattr(self, "origchldhandler"): signal.signal(signal.SIGCHLD, self.origchldhandler) + self.logger.debug("Stopping thread") self.stop_thread() + self.logger.debug("Stopping qemu system") self.stop_qemu_system() if self.runqemu: if hasattr(self, "monitorpid"): + self.logger.debug("Sending SIGKILL to monitorpid") os.kill(self.monitorpid, signal.SIGKILL) self.logger.debug("Sending SIGTERM to runqemu") try: @@ -559,12 +563,15 @@ class QemuRunner: self.runqemu_exited = True if hasattr(self, 'qmp') and self.qmp: + self.logger.debug("Closing QMP") self.qmp.close() self.qmp = None if hasattr(self, 'server_socket') and self.server_socket: + self.logger.debug("Closing server socket") self.server_socket.close() self.server_socket = None if hasattr(self, 'threadsock') and self.threadsock: + self.logger.debug("Closing threadsock") self.threadsock.close() self.threadsock = None self.qemupid = None @@ -576,6 +583,7 @@ class QemuRunner: # We raced, ignore pass if self.monitorpipe: + self.logger.debug("Closing monitorpipe") self.monitorpipe.close() def stop_qemu_system(self): From patchwork Fri Jan 20 14:44:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18362 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C519C54ED7 for ; Fri, 20 Jan 2023 14:45:09 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75895.1674225904346143894 for ; Fri, 20 Jan 2023 06:45:04 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsda-006KWS-Fd; Fri, 20 Jan 2023 16:45:02 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsda-00CEeA-5M; Fri, 20 Jan 2023 16:45:02 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 11/14] oeqa qemurunner.py: add timeout to QMP calls Date: Fri, 20 Jan 2023 16:44:47 +0200 Message-Id: <20230120144450.2913935-12-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-12-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:09 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176202 When a qemu machine hangs, the QMP calls can hang for ever too, and when this happens any failing test commands from ssh runner may be followed by dump_monitor() calls which then also hang. Hangs followed by hangs. Use runqemutime at setup and run_monitor() specific timeout for later calls. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index ed44036758..7f5d87fe70 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -350,6 +350,9 @@ class QemuRunner: return False try: + # set timeout value for all QMP calls + self.logger.info("QMP setting timeout to %d" % (self.runqemutime)) + self.qmp.settimeout(self.runqemutime) self.qmp.connect() connect_time = time.time() self.logger.info("QMP connected to QEMU at %s and took %s seconds" % @@ -633,7 +636,9 @@ class QemuRunner: return False def run_monitor(self, command, args=None, timeout=60): + self.logger.debug("run_monitor() called: command = %s, args = %s, timeout = %d" % (command, args, timeout)) if hasattr(self, 'qmp') and self.qmp: + self.qmp.settimeout(timeout) if args is not None: return self.qmp.cmd(command, args) else: From patchwork Fri Jan 20 14:44:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18352 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3B9EC54737 for ; Fri, 20 Jan 2023 14:45:08 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75897.1674225904759126094 for ; Fri, 20 Jan 2023 06:45:05 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsda-006KWg-Sx; Fri, 20 Jan 2023 16:45:02 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsda-00CEhw-Ig; Fri, 20 Jan 2023 16:45:02 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 12/14] oeqa qemurunner.py: kill qemu if it hangs Date: Fri, 20 Jan 2023 16:44:48 +0200 Message-Id: <20230120144450.2913935-13-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-13-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:08 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176203 qemu doesn't always behave well and can hang too. Log calls to stop qemu and finally kill it with force if was still alive. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index 7f5d87fe70..494b4e03da 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -590,12 +590,21 @@ class QemuRunner: self.monitorpipe.close() def stop_qemu_system(self): + self.logger.debug('stop_qemu_system: self.qemupid = %s' % (self.qemupid)) if self.qemupid: try: # qemu-system behaves well and a SIGTERM is enough + self.logger.debug('sending SIGTERM to %s' % (self.qemupid)) os.kill(self.qemupid, signal.SIGTERM) except ProcessLookupError as e: self.logger.warning('qemu-system ended unexpectedly') + time.sleep(5) + try: + # qemu-system did not behave well + self.logger.debug('sending SIGKILL to %s' % (self.qemupid)) + os.kill(self.qemupid, signal.SIGKILL) + except ProcessLookupError as e: + self.logger.debug('qemu-system already dead') def stop_thread(self): if self.thread and self.thread.is_alive(): From patchwork Fri Jan 20 14:44:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18360 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0AFBEC54ED2 for ; Fri, 20 Jan 2023 14:45:09 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75898.1674225905463364773 for ; Fri, 20 Jan 2023 06:45:05 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdb-006KX1-Ce; Fri, 20 Jan 2023 16:45:03 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdb-00CEk6-2D; Fri, 20 Jan 2023 16:45:03 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 13/14] oeqa qemurunner.py: add debug output to run_serial() Date: Fri, 20 Jan 2023 16:44:49 +0200 Message-Id: <20230120144450.2913935-14-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-14-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:09 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176204 Add debug output which is useful if qemu system hangs and doesn't respond in time. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 1 + 1 file changed, 1 insertion(+) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index 494b4e03da..0e720ef5f0 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -657,6 +657,7 @@ class QemuRunner: # Returns (status, output) where status is 1 on success and 0 on error # We assume target system have echo to get command status + self.logger.debug("run_serial() called: command = %s, raw = %s, timeout = %d" % (command, raw, timeout)) if not raw: command = "%s; echo $?\n" % command From patchwork Fri Jan 20 14:44:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikko Rapeli X-Patchwork-Id: 18361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09F26C05027 for ; Fri, 20 Jan 2023 14:45:09 +0000 (UTC) Received: from mail.kapsi.fi (mail.kapsi.fi [91.232.154.25]) by mx.groups.io with SMTP id smtpd.web10.75899.1674225906191776692 for ; Fri, 20 Jan 2023 06:45:06 -0800 Authentication-Results: mx.groups.io; dkim=missing; spf=none, err=permanent DNS error (domain: lakka.kapsi.fi, ip: 91.232.154.25, mailfrom: mcfrisk@lakka.kapsi.fi) Received: from kapsi.fi ([2001:67c:1be8::11] helo=lakka.kapsi.fi) by mail.kapsi.fi with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1pIsdc-006KXS-7M; Fri, 20 Jan 2023 16:45:04 +0200 Received: from mcfrisk by lakka.kapsi.fi with local (Exim 4.94.2) (envelope-from ) id 1pIsdb-00CEn0-MA; Fri, 20 Jan 2023 16:45:03 +0200 From: Mikko Rapeli To: openembedded-core@lists.openembedded.org Cc: Mikko Rapeli Subject: [PATCH 14/14] oeqa qemurunner.py: improve logging in run_serial() Date: Fri, 20 Jan 2023 16:44:50 +0200 Message-Id: <20230120144450.2913935-15-mikko.rapeli@linaro.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230120144450.2913935-1-mikko.rapeli@linaro.org> References: <20230120144450.2913935-1-mikko.rapeli@linaro.org> MIME-Version: 1.0 X-Rspam-Score: -1.2 (-) X-Rspam-Report: Action: no action Symbol: RCVD_TLS_LAST(0.00) Symbol: ARC_NA(0.00) Symbol: DMARC_POLICY_SOFTFAIL(0.10) Symbol: FROM_HAS_DN(0.00) Symbol: TO_DN_SOME(0.00) Symbol: R_MISSING_CHARSET(0.50) Symbol: TO_MATCH_ENVRCPT_ALL(0.00) Symbol: MIME_GOOD(-0.10) Symbol: RCPT_COUNT_TWO(0.00) Symbol: MID_CONTAINS_FROM(1.00) Symbol: NEURAL_HAM(-0.00) Symbol: R_SPF_NA(0.00) Symbol: FORGED_SENDER(0.30) Symbol: R_DKIM_NA(0.00) Symbol: MIME_TRACE(0.00) Symbol: ASN(0.00) Symbol: FROM_NEQ_ENVFROM(0.00) Symbol: BAYES_HAM(-3.00) Symbol: RCVD_COUNT_TWO(0.00) Message-ID: 20230120144450.2913935-15-mikko.rapeli@linaro.org X-SA-Exim-Connect-IP: 2001:67c:1be8::11 X-SA-Exim-Mail-From: mcfrisk@lakka.kapsi.fi X-SA-Exim-Scanned: No (on mail.kapsi.fi); SAEximRunCond expanded to false List-Id: X-Webhook-Received: from li982-79.members.linode.com [45.33.32.79] by aws-us-west-2-korg-lkml-1.web.codeaurora.org with HTTPS for ; Fri, 20 Jan 2023 14:45:09 -0000 X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/176205 When serial communication times outs, write note also to bitbake task output. Then add a small sleep to collect more output to avoid reading data one character at a time. This makes logs much more readable. Signed-off-by: Mikko Rapeli --- meta/lib/oeqa/utils/qemurunner.py | 3 +++ 1 file changed, 3 insertions(+) diff --git a/meta/lib/oeqa/utils/qemurunner.py b/meta/lib/oeqa/utils/qemurunner.py index 0e720ef5f0..b1c6ab8ce7 100644 --- a/meta/lib/oeqa/utils/qemurunner.py +++ b/meta/lib/oeqa/utils/qemurunner.py @@ -670,12 +670,15 @@ class QemuRunner: now = time.time() if now >= end: data += "<<< run_serial(): command timed out after %d seconds without output >>>\r\n\r\n" % timeout + self.logger.debug("run_serial(): command timed out after %d seconds without output" % timeout) break try: sread, _, _ = select.select([self.server_socket],[],[], end - now) except InterruptedError: continue if sread: + # try to avoid reading single character at a time, makes logs readable + time.sleep(0.1) answer = self.server_socket.recv(1024) if answer: data += answer.decode('utf-8')