Message ID | 20250422190053.3244331-1-ross.burton@arm.com |
---|---|
State | Deferred |
Delegated to: | Steve Sakoman |
Series | [RFC,walnascar,1/3] systemd: enable getty generator by default |
With this patchset, the effect of the serial-getty-generator PACKAGECONFIG
is to control whether systemd-getty-generator is removed or not. But is
there any case where we want to remove this generator? If not, I think we
should just remove this PACKAGECONFIG.

Regards,
Qi

On 4/23/25 03:00, Ross Burton via lists.openembedded.org wrote:
> Until recently, even when the getty generator was disabled in the
> systemd recipe it was actually still active. This was because the old
> behaviour was to delete the serial-getty template unit if the generator
> was disabled, but the systemd-serialgetty package then shipped
> the same files so the generator continued to run. This was a bug in the
> original commit[1] so this behaviour has been present since 2016.
>
> My recent fixes[2] changed this: if the getty generator was disabled
> then the generator itself is deleted. This makes the actual behaviour
> match the intention, but the consequence was to demonstrate that some
> modern platforms were relying on this unexpected behaviour: specifically
> the genericarm64 BSP which intends to support a number of virtual and
> physical boards with a number of serial console ports that are not
> really suitable to be hardcoded into SERIAL_CONSOLES:
>
> - ttyS0
> - ttyAMA0 (AMBA PL011 uart)
> - ttyS2 (BeagleBone Play, S0 and S1 are internal)
> - hvc0 (KVM)
> - ttyPS1 (AMD KV260)
> - And most likely more
>
> Restore the existing behaviour by explicitly enabling the serial getty
> generator: this means that systemd will automatically bring up a getty
> on the first serial console it finds.
>
> In the future we should extend some level of dynamic console-finding to
> sysvinit-based systems by searching for a console device in inittab, but
> for now this reverts the unintentional regression.
>
> [1] oe-core 2a8d0df47c9 ("systemd: make systemd-serialgetty optional")
> [2] oe-core 2beb3170af6 ("systemd: if getty generator is disabled remove
>     the generator, not the units")
>
> Signed-off-by: Ross Burton <ross.burton@arm.com>
> ---
>  meta/recipes-core/systemd/systemd_257.4.bb | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/meta/recipes-core/systemd/systemd_257.4.bb b/meta/recipes-core/systemd/systemd_257.4.bb
> index 64fb8fe69ac..f90308f0db0 100644
> --- a/meta/recipes-core/systemd/systemd_257.4.bb
> +++ b/meta/recipes-core/systemd/systemd_257.4.bb
> @@ -92,6 +92,7 @@ PACKAGECONFIG ??= " \
>      quotacheck \
>      randomseed \
>      resolved \
> +    serial-getty-generator \
>      set-time-epoch \
>      sysusers \
>      timedated \
Hi Ross,

On Tue, Apr 22, 2025 at 08:00:51PM +0100, Ross Burton wrote:
> Until recently, even when the getty generator was disabled in the
> systemd recipe it was actually still active. This was because the old
> behaviour was to delete the serial-getty template unit if the generator
> was disabled, but the systemd-serialgetty package then shipped
> the same files so the generator continued to run. This was a bug in the
> original commit[1] so this behaviour has been present since 2016.
>
> My recent fixes[2] changed this: if the getty generator was disabled
> then the generator itself is deleted. This makes the actual behaviour
> match the intention, but the consequence was to demonstrate that some
> modern platforms were relying on this unexpected behaviour: specifically
> the genericarm64 BSP which intends to support a number of virtual and
> physical boards with a number of serial console ports that are not
> really suitable to be hardcoded into SERIAL_CONSOLES:
>
> - ttyS0
> - ttyAMA0 (AMBA PL011 uart)
> - ttyS2 (BeagleBone Play, S0 and S1 are internal)
> - hvc0 (KVM)
> - ttyPS1 (AMD KV260)
> - And most likely more
>
> Restore the existing behaviour by explicitly enabling the serial getty
> generator: this means that systemd will automatically bring up a getty
> on the first serial console it finds.

Is there a need to be backwards compatible this much? Many things may break
or get refactored on master branch between releases and this is just one of
them.

With these patches, the bug/issue seen in the previous state is still
there too: systemd tries to start agetty on all SERIAL_CONSOLES defined
at build time, which slows down boot to the "running" or "degraded" state
by more than one minute. This also happens on qemu with core-image-base
(I've added systemd-analyze with dependencies there to measure these):

root@genericarm64:~# journalctl -b -a|grep tty | tail -15
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS1.device: Job dev-ttyS1.device/start timed out.
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS1.
Apr 23 09:39:31 genericarm64 systemd[1]: Dependency failed for Serial Getty on ttyS1.
Apr 23 09:39:31 genericarm64 systemd[1]: serial-getty@ttyS1.service: Job serial-getty@ttyS1.service/start failed with result 'dependency'.
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS1.device: Job dev-ttyS1.device/start failed with result 'timeout'.
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS0.device: Job dev-ttyS0.device/start timed out.
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS0.
Apr 23 09:39:31 genericarm64 systemd[1]: Dependency failed for Serial Getty on ttyS0.
Apr 23 09:39:31 genericarm64 systemd[1]: serial-getty@ttyS0.service: Job serial-getty@ttyS0.service/start failed with result 'dependency'.
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS0.device: Job dev-ttyS0.device/start failed with result 'timeout'.
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS2.device: Job dev-ttyS2.device/start timed out.
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS2.
Apr 23 09:39:31 genericarm64 systemd[1]: Dependency failed for Serial Getty on ttyS2.
Apr 23 09:39:31 genericarm64 systemd[1]: serial-getty@ttyS2.service: Job serial-getty@ttyS2.service/start failed with result 'dependency'.
Apr 23 09:39:31 genericarm64 systemd[1]: dev-ttyS2.device: Job dev-ttyS2.device/start failed with result 'timeout'.

root@genericarm64:~# systemd-analyze
Startup finished in 2.405s (firmware) + 5.109s (loader) + 10.315s (kernel) + 1min 34.041s (userspace) = 1min 51.872s
multi-user.target reached after 1min 34.035s in userspace.

Thus I think this breaking change is fine for master branch and the upcoming
release, and anyone expecting different, more robust behavior should just use
the systemd agetty generator.

For genericarm64 I think https://lists.yoctoproject.org/g/poky/message/13592
is the way to go to fix both the completely missing agetty
https://bugzilla.yoctoproject.org/show_bug.cgi?id=15830
and the slow boot time. Applying both this series and
https://lists.yoctoproject.org/g/poky/message/13592 does not
seem to work, since SERIAL_CONSOLES is now used all the time, so the delayed
boot issue remains unfixed.

The boot time is important because testing on target HW should not start
before the system is in a stable state, which for systemd is "running" when
all services succeed or "degraded" when something went wrong. Spending one
minute or more, doubling the boot time, due to tty detection timeouts is not
nice when udev has already scanned all buses and loaded the needed drivers
(multiple times, once in initrd and once in main rootfs).

To detect this manually without systemd-analyze, log in to the target over
the serial console once the prompt appears and see that "systemctl status"
shows "starting" status for a long time after everything has already calmed
down, and "journalctl -b -a" eventually shows the tty timeout messages.
The issue happens on qemu and on real target HW like AMD KV260.

Cheers,

-Mikko
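The manual check described above can also be scripted. The following is a minimal sketch, assuming the journal text has already been captured (for example from "journalctl -b -a") and that the timeout messages have the same wording as in the log excerpt in this message. The function name `timed_out_devices` and the `SAMPLE` constant are hypothetical and only serve the illustration.

```python
import re

# Matches the "Timed out waiting for device /dev/<name>." journal message
# as it appears in the log excerpt quoted in this thread.
TIMEOUT_RE = re.compile(r"Timed out waiting for device (/dev/\S+?)\.?$")


def timed_out_devices(journal_text):
    """Return the device paths systemd gave up waiting for, in log order."""
    devices = []
    for line in journal_text.splitlines():
        match = TIMEOUT_RE.search(line.strip())
        if match and match.group(1) not in devices:
            devices.append(match.group(1))
    return devices


# Three lines taken from the journal output shown in this thread.
SAMPLE = """\
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS1.
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS0.
Apr 23 09:39:31 genericarm64 systemd[1]: Timed out waiting for device /dev/ttyS2.
"""

print(timed_out_devices(SAMPLE))
```

A non-empty result on a booted target would suggest that some entries in SERIAL_CONSOLES do not exist on that board and are costing boot time.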
On 23 Apr 2025, at 11:18, Mikko Rapeli <mikko.rapeli@linaro.org> wrote:
>> Restore the existing behaviour by explicitly enabling the serial getty
>> generator: this means that systemd will automatically bring up a getty
>> on the first serial console it finds.
>
> Is there a need to be backwards compatible this much? Many things may break
> or get refactored on master branch between releases and this is just one of
> them.

Yes, because this is the behaviour that both fixes the KV260 issue and also
doesn’t break all of the selftests due to how testimage+qemu interact.

> With these patches, the bug/issue seen in the previous state is still
> there too: systemd tries to start agetty on all SERIAL_CONSOLES defined
> at build time, which slows down boot to the "running" or "degraded" state
> by more than one minute. This also happens on qemu with core-image-base
> (I've added systemd-analyze with dependencies there to measure these):

Yes. As I said, the proper solution here is to actually probe the setup.
Once we’ve got walnascar out of the way I’ll finish my branch that does the
same system inspection for sysvinit so we don’t need to hardcode
SERIAL_CONSOLES; instead it will bring up a console on the first tty it can
find. That is likely to cause problems with testimage (as it hooks up a
second tty for failures) but will be isolated to genericarm64 and not every
qemu machine.

> Thus I think this breaking change is fine for master branch and the upcoming
> release, and anyone expecting different, more robust behavior should just use
> the systemd agetty generator.

Absolutely.

> For genericarm64 I think https://lists.yoctoproject.org/g/poky/message/13592
> is the way to go to fix both the completely missing agetty
> https://bugzilla.yoctoproject.org/show_bug.cgi?id=15830
> and the slow boot time. Applying both this series and
> https://lists.yoctoproject.org/g/poky/message/13592 does not
> seem to work, since SERIAL_CONSOLES is now used all the time, so the delayed
> boot issue remains unfixed.

Making systemd machine-specific would be disastrous for rebuilds as it is
the provider of libudev and libsystemd, so it would mean that _every_ recipe
that uses those is also now machine-specific.

As was discussed in the Tuesday tech call last week, I suspect a better
solution would be to have a machine-specific configuration recipe that can
drop in configuration files, mask or enable units, etc. This way specific
machines can configure systemd in specific ways, without actually touching
the core recipe.

Alternatively, splitting systemd into ‘userspace’ and ‘system’ recipes,
where the former is tune-specific and contains the userspace-facing
libraries and the latter is machine-specific and contains everything else,
could make sense.

Ross
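To make the "mask or enable units" idea concrete, here is a minimal sketch of what masking a unit in an image root amounts to: a symlink to /dev/null under etc/systemd/system, which is what "systemctl mask" creates. The helper name `mask_unit`, the temporary root directory and the unit name are assumptions for illustration only; a real machine-specific configuration recipe would more likely do the equivalent in its install step rather than in Python.

```python
import os
import tempfile


def mask_unit(root, unit):
    """Mask `unit` inside the image root `root` by linking it to /dev/null."""
    unit_dir = os.path.join(root, "etc", "systemd", "system")
    os.makedirs(unit_dir, exist_ok=True)
    link = os.path.join(unit_dir, unit)
    # Replace any existing unit file or link so the mask takes precedence.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink("/dev/null", link)
    return link


# Hypothetical usage: mask a getty on a port that a given board does not have.
with tempfile.TemporaryDirectory() as rootfs:
    masked = mask_unit(rootfs, "serial-getty@ttyS1.service")
    print(os.path.relpath(masked, rootfs), "->", os.readlink(masked))
```

Because the mask lives in a separate, machine-specific package, the systemd recipe itself stays untouched and does not become machine-specific.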
Hi,

On Wed, Apr 23, 2025 at 10:40:27AM +0000, Ross Burton wrote:
> On 23 Apr 2025, at 11:18, Mikko Rapeli <mikko.rapeli@linaro.org> wrote:
> >> Restore the existing behaviour by explicitly enabling the serial getty
> >> generator: this means that systemd will automatically bring up a getty
> >> on the first serial console it finds.
> >
> > Is there a need to be backwards compatible this much? Many things may break
> > or get refactored on master branch between releases and this is just one of
> > them.
>
> Yes, because this is the behaviour that both fixes the KV260 issue and also
> doesn’t break all of the selftests due to how testimage+qemu interact.
>
> > With these patches, the bug/issue seen in the previous state is still
> > there too: systemd tries to start agetty on all SERIAL_CONSOLES defined
> > at build time, which slows down boot to the "running" or "degraded" state
> > by more than one minute. This also happens on qemu with core-image-base
> > (I've added systemd-analyze with dependencies there to measure these):
>
> Yes. As I said, the proper solution here is to actually probe the setup.
> Once we’ve got walnascar out of the way I’ll finish my branch that does the
> same system inspection for sysvinit so we don’t need to hardcode
> SERIAL_CONSOLES; instead it will bring up a console on the first tty it can
> find. That is likely to cause problems with testimage (as it hooks up a
> second tty for failures) but will be isolated to genericarm64 and not every
> qemu machine.

Is there some bugzilla ticket or other location for details about the
qemu and testimage issue with serial consoles?

I've been running some selftests on genericarm64 with my proposed patches
applied and have not run into any serial console related problems.
I've only seen a lot of pseudo aborts on an aarch64 build machine, e.g. wic
touching images created by bitbake.

> > Thus I think this breaking change is fine for master branch and the upcoming
> > release, and anyone expecting different, more robust behavior should just use
> > the systemd agetty generator.
>
> Absolutely.
>
> > For genericarm64 I think https://lists.yoctoproject.org/g/poky/message/13592
> > is the way to go to fix both the completely missing agetty
> > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15830
> > and the slow boot time. Applying both this series and
> > https://lists.yoctoproject.org/g/poky/message/13592 does not
> > seem to work, since SERIAL_CONSOLES is now used all the time, so the delayed
> > boot issue remains unfixed.
>
> Making systemd machine-specific would be disastrous for rebuilds as it is
> the provider of libudev and libsystemd, so it would mean that _every_ recipe
> that uses those is also now machine-specific.
>
> As was discussed in the Tuesday tech call last week, I suspect a better
> solution would be to have a machine-specific configuration recipe that can
> drop in configuration files, mask or enable units, etc. This way specific
> machines can configure systemd in specific ways, without actually touching
> the core recipe.
>
> Alternatively, splitting systemd into ‘userspace’ and ‘system’ recipes,
> where the former is tune-specific and contains the userspace-facing
> libraries and the latter is machine-specific and contains everything else,
> could make sense.

Ok, so machine/BSP config should not change the systemd setup then, got it.

Since the "poky-altcfg" distro uses systemd by default, could it not also use
systemd to manage agetty startup and ignore SERIAL_CONSOLES from build time?

This sounds like the "efi" support issue I have with systemd. Patches out,
decision pending...

Cheers,

-Mikko
diff --git a/meta/recipes-core/systemd/systemd_257.4.bb b/meta/recipes-core/systemd/systemd_257.4.bb
index 64fb8fe69ac..f90308f0db0 100644
--- a/meta/recipes-core/systemd/systemd_257.4.bb
+++ b/meta/recipes-core/systemd/systemd_257.4.bb
@@ -92,6 +92,7 @@ PACKAGECONFIG ??= " \
     quotacheck \
     randomseed \
     resolved \
+    serial-getty-generator \
     set-time-epoch \
     sysusers \
     timedated \
Until recently, even when the getty generator was disabled in the
systemd recipe it was actually still active. This was because the old
behaviour was to delete the serial-getty template unit if the generator
was disabled, but the systemd-serialgetty package then shipped
the same files so the generator continued to run. This was a bug in the
original commit[1] so this behaviour has been present since 2016.

My recent fixes[2] changed this: if the getty generator was disabled
then the generator itself is deleted. This makes the actual behaviour
match the intention, but the consequence was to demonstrate that some
modern platforms were relying on this unexpected behaviour: specifically
the genericarm64 BSP which intends to support a number of virtual and
physical boards with a number of serial console ports that are not
really suitable to be hardcoded into SERIAL_CONSOLES:

- ttyS0
- ttyAMA0 (AMBA PL011 uart)
- ttyS2 (BeagleBone Play, S0 and S1 are internal)
- hvc0 (KVM)
- ttyPS1 (AMD KV260)
- And most likely more

Restore the existing behaviour by explicitly enabling the serial getty
generator: this means that systemd will automatically bring up a getty
on the first serial console it finds.

In the future we should extend some level of dynamic console-finding to
sysvinit-based systems by searching for a console device in inittab, but
for now this reverts the unintentional regression.

[1] oe-core 2a8d0df47c9 ("systemd: make systemd-serialgetty optional")
[2] oe-core 2beb3170af6 ("systemd: if getty generator is disabled remove
    the generator, not the units")

Signed-off-by: Ross Burton <ross.burton@arm.com>
---
 meta/recipes-core/systemd/systemd_257.4.bb | 1 +
 1 file changed, 1 insertion(+)
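As an illustration of "bring up a getty on the first serial console it finds", the sketch below picks a serial console at runtime from a space-separated list of active consoles, in the form the kernel exposes in /sys/class/tty/console/active. This is a simplified assumption-laden sketch, not the systemd generator's actual implementation: the function names are hypothetical, and the only filtering done is to skip virtual consoles named "tty" followed by digits.

```python
import re

# Virtual consoles are named "tty" followed only by digits (tty0, tty1, ...).
# Serial and hypervisor consoles such as ttyS0, ttyAMA0 or hvc0 do not match.
VC_RE = re.compile(r"^tty\d+$")


def serial_consoles(active):
    """Return the non-virtual consoles from a space-separated console list."""
    return [name for name in active.split() if not VC_RE.match(name)]


def first_serial_console(active):
    """Return the first serial console in `active`, or None if there is none."""
    consoles = serial_consoles(active)
    return consoles[0] if consoles else None


# Example inputs in the format of /sys/class/tty/console/active.
print(first_serial_console("tty0 ttyAMA0"))  # ttyAMA0
print(first_serial_console("hvc0"))          # hvc0
print(first_serial_console("tty0"))          # None
```

Probing at boot like this is what lets a single image cover boards whose consoles are ttyS0, ttyAMA0, ttyS2, hvc0 or ttyPS1 without listing all of them in SERIAL_CONSOLES.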