linux-yocto-6.5: Add serial port suspend/boot workaround

Message ID 20231015084412.379771-1-richard.purdie@linuxfoundation.org
State New

Commit Message

Richard Purdie Oct. 15, 2023, 8:44 a.m. UTC
After upstream PM changes, serial ports on the 6.5 kernel do not always
start up correctly in early boot. Add a hack so that data is not left
stuck in the xmit buffer. The issue has caused intermittent failures on
the autobuilder.

[Patch being sent for wider attention and not for merging as Bruce would
handle adding it to the kernels]

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
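Note for anyone trying to reproduce this: the following is a minimal
sketch, not part of the patch, for reading the tx counter that the patch
description below relies on. The helper name tx_count is made up for
illustration. It assumes the /proc/tty/driver/serial line layout shown in
the debug output below and a POSIX sh with awk in the image.

```sh
#!/bin/sh
# Print the tx byte counter for one serial line.
# tx_count is a hypothetical helper, not an existing tool.
# usage: tx_count LINE [FILE]   (FILE defaults to /proc/tty/driver/serial,
#                                "-" reads from standard input)
tx_count() {
    awk -v line="$1" '
        # Match the row that starts with "<LINE>:", e.g. "1:" for ttyS1.
        $1 == line ":" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^tx:/) {
                    sub(/^tx:/, "", $i)   # strip the "tx:" label
                    print $i
                }
        }' "${2:-/proc/tty/driver/serial}"
}
```

Sampling it before and after a write, e.g. `before=$(tx_count 1); echo
helloA > /dev/ttyS1; after=$(tx_count 1)`, lets the difference be compared
with the number of bytes written. In the failing case described below the
counter moved from 249 to 281, i.e. by 32 for an 8 byte echo.
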
 .../linux/files/suspendhack.patch             | 79 +++++++++++++++++++
 meta/recipes-kernel/linux/linux-yocto_6.5.bb  |  3 +-
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 meta/recipes-kernel/linux/files/suspendhack.patch

Patch

diff --git a/meta/recipes-kernel/linux/files/suspendhack.patch b/meta/recipes-kernel/linux/files/suspendhack.patch
new file mode 100644
index 00000000000..03cfc084fef
--- /dev/null
+++ b/meta/recipes-kernel/linux/files/suspendhack.patch
@@ -0,0 +1,79 @@ 
+We're seeing an issue on x86_64 with 6.5.X where data appears to be
+left in the transmission buffer and not sent to the port on the second
+serial port (ttyS1) until we trigger it with intervention such as an echo
+to the same serial port from within the image.
+
+Paul Gortmaker bisected the issue down to this commit:
+
+serial: core: Start managing serial controllers to enable runtime PM
+https://lore.kernel.org/linux-serial/1431f5b4-fb39-483b-9314-ed2b7c118c11@gmail.com/T/#t
+
+We run very simple plain images under qemu with two serial ports
+configured. These are x86_64 images with two 16550A ports with the
+standard x86 port addresses which appear as ttyS0 and ttyS1, nothing
+special. We run a console and getty on ttyS0 and a getty on ttyS1.
+After boot, we wait for a getty to appear on ttyS1, log in to it and run
+commands. The serial connections are made over sockets from qemu.
+
+
+With 6.5.5 (and the latest 6.5.X head as of yesterday), we see an issue
+roughly 1% of the time where the getty never appears on ttyS1, even after
+our timeout of 1000s.
+
+For when this happens, we've added code to log in to the ttyS0 getty and
+run debug commands. We've been able to confirm the getty is running and
+the init system doesn't matter (happens with sysvinit and systemd). The
+most interesting debug output I've seen is this:
+
+root@qemux86-64:~# cat /proc/tty/driver/serial
+serinfo:1.0 driver revision:
+0: uart:16550A port:000003F8 irq:4 tx:418 rx:43 RTS|CTS|DTR|DSR|CD
+1: uart:16550A port:000002F8 irq:3 tx:249 rx:0 RTS|CTS|DTR|DSR|CD
+2: uart:unknown port:000003E8 irq:4
+3: uart:unknown port:000002E8 irq:3
+root@qemux86-64:~# echo helloA > /dev/ttyS1
+root@qemux86-64:~# echo helloB > /dev/ttyS0
+helloB
+root@qemux86-64:~# cat /proc/tty/driver/serial
+serinfo:1.0 driver revision:
+0: uart:16550A port:000003F8 irq:4 tx:803 rx:121 RTS|CTS|DTR|DSR|CD
+1: uart:16550A port:000002F8 irq:3 tx:281 rx:0 RTS|CTS|DTR|DSR|CD
+2: uart:unknown port:000003E8 irq:4
+3: uart:unknown port:000002E8 irq:3
+
+This is being run after the getty didn't appear for 60s on ttyS1 so
+we've logged into ttyS0 and run these commands. We've seen that if it
+doesn't appear after 60s, it won't appear after 1000s either.
+
+The tx:249 is interesting as it should be tx:273, 273 being the number
+of bytes in our successful serial getty prompt. Once we echo something
+to the port (8 bytes), tx: jumps to 281 (273 + 8), so the 24 missing
+bytes of the login prompt were sent along with the echo. This is
+confirmed by the data appearing on the port after the echo.
+
+I did try disabling the autosuspend code in the commit above but it made
+no difference. What does seem to help is bypassing the pm_runtime_active()
+check that commit adds around start_tx(), restoring the original conditions.
+This is relatively harmless as it will just stop_tx() again if the xmit
+buffer is empty, and this is a one-off operation at probe time.
+The small overhead is much preferred to randomly failing tests.
+
+Discussions with upstream are being attempted:
+https://lore.kernel.org/linux-serial/c85ab969826989c27402711155ec086fd81574fb.camel@linuxfoundation.org/T/#t
+
+Upstream-Status: Pending
+
+diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
+index 831d033611e6..cb2434b733c2 100644
+--- a/drivers/tty/serial/serial_core.c
++++ b/drivers/tty/serial/serial_core.c
+@@ -157,7 +157,7 @@ static void __uart_start(struct tty_struct *tty)
+ 	 * enabled, serial_port_runtime_resume() calls start_tx() again
+ 	 * after enabling the device.
+ 	 */
+-	if (pm_runtime_active(&port_dev->dev))
++	if (1)
+ 		port->ops->start_tx(port);
+ 	pm_runtime_mark_last_busy(&port_dev->dev);
+ 	pm_runtime_put_autosuspend(&port_dev->dev);
+
diff --git a/meta/recipes-kernel/linux/linux-yocto_6.5.bb b/meta/recipes-kernel/linux/linux-yocto_6.5.bb
index 392e6b3d819..51ee963c1ce 100644
--- a/meta/recipes-kernel/linux/linux-yocto_6.5.bb
+++ b/meta/recipes-kernel/linux/linux-yocto_6.5.bb
@@ -41,7 +41,8 @@  PN:class-devupstream = "linux-yocto-upstream"
 KBRANCH:class-devupstream = "v6.5/base"
 
 SRC_URI = "git://git.yoctoproject.org/linux-yocto.git;name=machine;branch=${KBRANCH};protocol=https \
-           git://git.yoctoproject.org/yocto-kernel-cache;type=kmeta;name=meta;branch=yocto-6.5;destsuffix=${KMETA};protocol=https"
+           git://git.yoctoproject.org/yocto-kernel-cache;type=kmeta;name=meta;branch=yocto-6.5;destsuffix=${KMETA};protocol=https \
+           file://suspendhack.patch"
 
 LIC_FILES_CHKSUM = "file://COPYING;md5=6bc538ed5bd9a7fc9398086aedcd7e46"
 LINUX_VERSION ?= "6.5.7"