Message ID | 20250206172451.3843991-1-ross.burton@arm.com (mailing list archive) |
---|---|
State | New |
Series | [meta-yocto-bsp] linux-yocto: revert omap8250 power management changes on genericarm64 |
On Thu, Feb 6, 2025 at 12:25 PM Ross Burton via lists.yoctoproject.org <ross.burton=arm.com@lists.yoctoproject.org> wrote:

> The genericarm64 machine sets SERIAL_CONSOLES to a number of potential
> devices:
>
>   SERIAL_CONSOLES ?= "115200;ttyAMA0 115200;hvc0 115200;ttyS0 115200;ttyS1 115200;ttyS2"
>
> With sysvinit this turns into getty lines in inittab, and with systemd
> the systemd-serialgetty recipe creates explicit units to spawn gettys.
>
> This worked fine with 6.6, but since "serial: 8250_omap: Drop
> pm_runtime_irq_safe()"[1] in 6.7 onwards we see kernel hangs:
>
>   BUG: scheduling while atomic: getty/957/0x00000002
>   Call trace:
>   dump_stack+0x1c/0x30
>   __schedule_bug+0x60/0x90
>   __schedule+0x83c/0xcf8
>   schedule+0x40/0x158
>   schedule_timeout+0xb0/0x1b0
>   wait_for_completion_timeout+0x84/0x188
>   ti_sci_set_device_state+0x134/0x220
>   ti_sci_cmd_get_device_exclusive+0x24/0x40
>   ti_sci_pd_power_on+0x34/0x68 [ti_sci_pm_domains]
>   _genpd_power_on+0xa4/0x178
>   genpd_power_on+0xb4/0x190
>   genpd_runtime_resume+0xc8/0x260
>   __rpm_callback+0x54/0x200
>   rpm_callback+0x78/0x90
>   rpm_resume+0x420/0x690
>   __pm_runtime_resume+0x5c/0xb0
>   omap8250_set_mctrl+0x38/0xe0 [8250_omap]
>   serial8250_set_mctrl+0x2c/0x60
>   uart_update_mctrl+0x98/0x120
>   uart_shutdown+0x124/0x180
>   uart_hangup+0x7c/0x180
>   __tty_hangup.part.0+0x408/0x440
>   tty_vhangup_session+0x24/0x40
>   disassociate_ctty.part.0+0x48/0x1b0
>   disassociate_ctty+0x30/0x48
>   (full backtrace elided)
>
> With many thanks to TI, my understanding is that it was determined that
> the problem here is that we have a getty connected to ttyS1 which is
> actually the expansion port uart and on the BeaglePlay wired up to the
> wifi controller's debug port. The getty receives noise it doesn't know
> what to do with, and at some point the power management code does a
> suspend/resume cycle of the device. The serial drivers assume that
> child nodes use the serdev driver and they manage runtime_pm, but the
> getty opening the tty breaks a series of bad assumptions in the drivers.
>
> So, there are two bugs:
> 1) The kernel shouldn't crash if this tty is opened
> 2) The only serial port for a console on the BeaglePlay is ttyS2,
>    despite others existing.
>
> TI are looking at (1) and other patches to follow will deal with (2).
> Until one of these is resolved entirely, reverting this change to power
> management stops the crashes.

It isn't any extra work to just revert it in linux-yocto directly, versus doing this.

Is there any particular reason we'd patch this, versus having the revert in the tree?

Bruce

> [ YOCTO #15704 ]
> [1] linux 8700a7ea5519fb0b3bad2362adfeac358c2119ce
>
> Signed-off-by: Ross Burton <ross.burton@arm.com>
On 6 Feb 2025, at 17:31, Bruce Ashfield <bruce.ashfield@gmail.com> wrote:

> It isn't any extra work to just revert it in linux-yocto directly, versus
> doing this.
>
> Is there any particular reason we'd patch this, versus having the
> revert in the tree?

Honestly, mainly because this is more ugly so it’s more obvious that we need to remove it asap. :)

I’m happy either way.

Ross
On Thu, Feb 6, 2025 at 12:38 PM Ross Burton <Ross.Burton@arm.com> wrote:

> On 6 Feb 2025, at 17:31, Bruce Ashfield <bruce.ashfield@gmail.com> wrote:
> > It isn't any extra work to just revert it in linux-yocto directly, versus
> > doing this.
> >
> > Is there any particular reason we'd patch this, versus having the
> > revert in the tree?
>
> Honestly, mainly because this is more ugly so it’s more obvious that we
> need to remove it asap. :)

Let's roll with this then!

I'm about to send a kernel pull request, so getting the fix via SRCREV would also imply the -stable update. This will let it go into effect now, and when I do my series.

Bruce

> I’m happy either way.
>
> Ross
Hi,

On Thu, Feb 06, 2025 at 05:24:51PM +0000, Ross Burton via lists.yoctoproject.org wrote:
> The genericarm64 machine sets SERIAL_CONSOLES to a number of potential
> devices:
>
>   SERIAL_CONSOLES ?= "115200;ttyAMA0 115200;hvc0 115200;ttyS0 115200;ttyS1 115200;ttyS2"
>
> With sysvinit this turns into getty lines in inittab, and with systemd
> the systemd-serialgetty recipe creates explicit units to spawn gettys.

So for systemd poky-altcfg genericarm64 images my patch

"systemd: use serial-getty-generator on genericarm64"
https://lists.openembedded.org/g/openembedded-core/message/210938

should also work around this, since the build-time SERIAL_CONSOLES is ignored and getty is started by udev on the correct serial devices.

Cheers,

-Mikko
On 7 Feb 2025, at 07:48, Mikko Rapeli <mikko.rapeli@linaro.org> wrote:
>
> Hi,
>
> On Thu, Feb 06, 2025 at 05:24:51PM +0000, Ross Burton via lists.yoctoproject.org wrote:
>> The genericarm64 machine sets SERIAL_CONSOLES to a number of potential
>> devices:
>>
>>   SERIAL_CONSOLES ?= "115200;ttyAMA0 115200;hvc0 115200;ttyS0 115200;ttyS1 115200;ttyS2"
>>
>> With sysvinit this turns into getty lines in inittab, and with systemd
>> the systemd-serialgetty recipe creates explicit units to spawn gettys.
>
> So for systemd poky-altcfg genericarm64 images my patch
> "systemd: use serial-getty-generator on genericarm64"
> https://lists.openembedded.org/g/openembedded-core/message/210938
> should also work around this since build time SERIAL_CONSOLES is ignored
> and getty started by udev to the correct serial devices.

Same problem exists for sysv though.

Ross
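To make the SERIAL_CONSOLES point concrete, here is a minimal sketch in C of how a value in that format breaks down into one getty target per listed device. The names `console_entry`, `parse_serial_consoles` and `print_getty_targets` are hypothetical, and the printed text is purely illustrative: the real inittab lines and systemd units come from the sysvinit and systemd-serialgetty recipes, and their exact output is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One "baud;tty" item from a SERIAL_CONSOLES-style string (hypothetical type). */
struct console_entry {
	char baud[16];
	char tty[32];
};

/*
 * Split a whitespace-separated list of "baud;tty" items into entries.
 * Returns the number of entries stored in out (at most max).
 * Items without a ';' or with empty/over-long fields are skipped.
 */
int parse_serial_consoles(const char *value, struct console_entry *out, int max)
{
	const char *p = value;
	int n = 0;

	assert(value != NULL && out != NULL);

	while (*p && n < max) {
		p += strspn(p, " \t\n");          /* skip separators */
		if (!*p)
			break;

		size_t len = strcspn(p, " \t\n"); /* length of this item */
		const char *semi = memchr(p, ';', len);

		if (semi) {
			size_t blen = (size_t)(semi - p);
			size_t tlen = len - blen - 1;

			if (blen > 0 && blen < sizeof(out[n].baud) &&
			    tlen > 0 && tlen < sizeof(out[n].tty)) {
				memcpy(out[n].baud, p, blen);
				out[n].baud[blen] = '\0';
				memcpy(out[n].tty, semi + 1, tlen);
				out[n].tty[tlen] = '\0';
				n++;
			}
		}
		p += len;
	}
	return n;
}

/* Print one illustrative line per device that would get a getty. */
void print_getty_targets(const struct console_entry *e, int n)
{
	for (int i = 0; i < n; i++)
		printf("%s: getty at %s baud\n", e[i].tty, e[i].baud);
}
```

Fed the genericarm64 value quoted above, the parser yields five entries, ttyS1 among them. Any scheme that expands the build-time list unconditionally therefore puts a getty on the BeaglePlay expansion-port UART, whichever init system is used.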
diff --git a/meta-yocto-bsp/recipes-kernel/linux/files/0001-Revert-serial-8250_omap-Drop-pm_runtime_irq_safe.patch b/meta-yocto-bsp/recipes-kernel/linux/files/0001-Revert-serial-8250_omap-Drop-pm_runtime_irq_safe.patch
new file mode 100644
index 00000000000..8837dd23464
--- /dev/null
+++ b/meta-yocto-bsp/recipes-kernel/linux/files/0001-Revert-serial-8250_omap-Drop-pm_runtime_irq_safe.patch
@@ -0,0 +1,130 @@
+From cc255f5132cf39e9154340cf58780f8c763c6481 Mon Sep 17 00:00:00 2001
+From: Ross Burton <ross.burton@arm.com>
+Date: Thu, 23 Jan 2025 17:06:08 +0000
+Subject: [PATCH] Revert "serial: 8250_omap: Drop pm_runtime_irq_safe()"
+
+This reverts commit 8700a7ea5519fb0b3bad2362adfeac358c2119ce.
+
+Upstream-Status: Inappropriate
+Signed-off-by: Ross Burton <ross.burton@arm.com>
+---
+ drivers/tty/serial/8250/8250_omap.c | 29 ++++++++---------------------
+ 1 file changed, 8 insertions(+), 21 deletions(-)
+
+diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
+index 0dd68bdbfbcf7..db24d7d1dcb67 100644
+--- a/drivers/tty/serial/8250/8250_omap.c
++++ b/drivers/tty/serial/8250/8250_omap.c
+@@ -8,7 +8,6 @@
+  *
+  */
+ 
+-#include <linux/atomic.h>
+ #include <linux/clk.h>
+ #include <linux/device.h>
+ #include <linux/io.h>
+@@ -134,7 +133,6 @@ struct omap8250_priv {
+ 
+ 	u8 tx_trigger;
+ 	u8 rx_trigger;
+-	atomic_t active;
+ 	bool is_suspending;
+ 	int wakeirq;
+ 	u32 latency;
+@@ -636,23 +634,14 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
+ 	unsigned int iir, lsr;
+ 	int ret;
+ 
+-	pm_runtime_get_noresume(port->dev);
+-
+-	/* Shallow idle state wake-up to an IO interrupt? */
+-	if (atomic_add_unless(&priv->active, 1, 1)) {
+-		priv->latency = priv->calc_latency;
+-		schedule_work(&priv->qos_work);
+-	}
+-
+ #ifdef CONFIG_SERIAL_8250_DMA
+ 	if (up->dma) {
+ 		ret = omap_8250_dma_handle_irq(port);
+-		pm_runtime_mark_last_busy(port->dev);
+-		pm_runtime_put(port->dev);
+ 		return IRQ_RETVAL(ret);
+ 	}
+ #endif
+ 
++	serial8250_rpm_get(up);
+ 	lsr = serial_port_in(port, UART_LSR);
+ 	iir = serial_port_in(port, UART_IIR);
+ 	ret = serial8250_handle_irq(port, iir);
+@@ -701,8 +690,7 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
+ 		schedule_delayed_work(&up->overrun_backoff, delay);
+ 	}
+ 
+-	pm_runtime_mark_last_busy(port->dev);
+-	pm_runtime_put(port->dev);
++	serial8250_rpm_put(up);
+ 
+ 	return IRQ_RETVAL(ret);
+ }
+@@ -1314,8 +1302,11 @@ static int omap_8250_dma_handle_irq(struct uart_port *port)
+ 	u16 status;
+ 	u8 iir;
+ 
++	serial8250_rpm_get(up);
++
+ 	iir = serial_port_in(port, UART_IIR);
+ 	if (iir & UART_IIR_NO_INT) {
++		serial8250_rpm_put(up);
+ 		return IRQ_HANDLED;
+ 	}
+ 
+@@ -1348,6 +1339,7 @@ static int omap_8250_dma_handle_irq(struct uart_port *port)
+ 
+ 	uart_unlock_and_check_sysrq(port);
+ 
++	serial8250_rpm_put(up);
+ 	return 1;
+ }
+ 
+@@ -1539,6 +1531,8 @@ static int omap8250_probe(struct platform_device *pdev)
+ 	if (!of_get_available_child_count(pdev->dev.of_node))
+ 		pm_runtime_set_autosuspend_delay(&pdev->dev, -1);
+ 
++	pm_runtime_irq_safe(&pdev->dev);
++
+ 	pm_runtime_get_sync(&pdev->dev);
+ 
+ 	omap_serial_fill_features_erratas(&up, priv);
+@@ -1776,7 +1770,6 @@ static int omap8250_runtime_suspend(struct device *dev)
+ 
+ 	priv->latency = PM_QOS_CPU_LATENCY_DEFAULT_VALUE;
+ 	schedule_work(&priv->qos_work);
+-	atomic_set(&priv->active, 0);
+ 
+ 	return 0;
+ }
+@@ -1786,10 +1779,6 @@ static int omap8250_runtime_resume(struct device *dev)
+ 	struct omap8250_priv *priv = dev_get_drvdata(dev);
+ 	struct uart_8250_port *up = NULL;
+ 
+-	/* Did the hardware wake to a device IO interrupt before a wakeirq? */
+-	if (atomic_read(&priv->active))
+-		return 0;
+-
+ 	if (priv->line >= 0)
+ 		up = serial8250_get_port(priv->line);
+ 
+@@ -1805,10 +1794,8 @@ static int omap8250_runtime_resume(struct device *dev)
+ 		uart_port_unlock_irq(&up->port);
+ 	}
+ 
+-	atomic_set(&priv->active, 1);
+ 	priv->latency = priv->calc_latency;
+ 	schedule_work(&priv->qos_work);
+-
+ 	return 0;
+ }
+ 
+-- 
+2.43.0
+
diff --git a/meta-yocto-bsp/recipes-kernel/linux/linux-yocto_6.12.bbappend b/meta-yocto-bsp/recipes-kernel/linux/linux-yocto_6.12.bbappend
index 1ffd2194d96..79bf2157460 100644
--- a/meta-yocto-bsp/recipes-kernel/linux/linux-yocto_6.12.bbappend
+++ b/meta-yocto-bsp/recipes-kernel/linux/linux-yocto_6.12.bbappend
@@ -7,3 +7,5 @@ KMACHINE:beaglebone-yocto ?= "beaglebone"
 KMACHINE:genericx86 ?= "common-pc"
 KMACHINE:genericx86-64 ?= "common-pc-64"
 
+FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
+SRC_URI:append:genericarm64 = " file://0001-Revert-serial-8250_omap-Drop-pm_runtime_irq_safe.patch"
The genericarm64 machine sets SERIAL_CONSOLES to a number of potential
devices:

  SERIAL_CONSOLES ?= "115200;ttyAMA0 115200;hvc0 115200;ttyS0 115200;ttyS1 115200;ttyS2"

With sysvinit this turns into getty lines in inittab, and with systemd
the systemd-serialgetty recipe creates explicit units to spawn gettys.

This worked fine with 6.6, but since "serial: 8250_omap: Drop
pm_runtime_irq_safe()"[1] in 6.7 onwards we see kernel hangs:

  BUG: scheduling while atomic: getty/957/0x00000002
  Call trace:
  dump_stack+0x1c/0x30
  __schedule_bug+0x60/0x90
  __schedule+0x83c/0xcf8
  schedule+0x40/0x158
  schedule_timeout+0xb0/0x1b0
  wait_for_completion_timeout+0x84/0x188
  ti_sci_set_device_state+0x134/0x220
  ti_sci_cmd_get_device_exclusive+0x24/0x40
  ti_sci_pd_power_on+0x34/0x68 [ti_sci_pm_domains]
  _genpd_power_on+0xa4/0x178
  genpd_power_on+0xb4/0x190
  genpd_runtime_resume+0xc8/0x260
  __rpm_callback+0x54/0x200
  rpm_callback+0x78/0x90
  rpm_resume+0x420/0x690
  __pm_runtime_resume+0x5c/0xb0
  omap8250_set_mctrl+0x38/0xe0 [8250_omap]
  serial8250_set_mctrl+0x2c/0x60
  uart_update_mctrl+0x98/0x120
  uart_shutdown+0x124/0x180
  uart_hangup+0x7c/0x180
  __tty_hangup.part.0+0x408/0x440
  tty_vhangup_session+0x24/0x40
  disassociate_ctty.part.0+0x48/0x1b0
  disassociate_ctty+0x30/0x48
  (full backtrace elided)

With many thanks to TI, my understanding is that it was determined that
the problem here is that we have a getty connected to ttyS1 which is
actually the expansion port uart and on the BeaglePlay wired up to the
wifi controller's debug port. The getty receives noise it doesn't know
what to do with, and at some point the power management code does a
suspend/resume cycle of the device. The serial drivers assume that
child nodes use the serdev driver and they manage runtime_pm, but the
getty opening the tty breaks a series of bad assumptions in the drivers.

So, there are two bugs:
1) The kernel shouldn't crash if this tty is opened
2) The only serial port for a console on the BeaglePlay is ttyS2,
   despite others existing.

TI are looking at (1) and other patches to follow will deal with (2).
Until one of these is resolved entirely, reverting this change to power
management stops the crashes.

[ YOCTO #15704 ]
[1] linux 8700a7ea5519fb0b3bad2362adfeac358c2119ce

Signed-off-by: Ross Burton <ross.burton@arm.com>
---
 ...l-8250_omap-Drop-pm_runtime_irq_safe.patch | 130 ++++++++++++++++++
 .../linux/linux-yocto_6.12.bbappend           |   2 +
 2 files changed, 132 insertions(+)
 create mode 100644 meta-yocto-bsp/recipes-kernel/linux/files/0001-Revert-serial-8250_omap-Drop-pm_runtime_irq_safe.patch