wic/engine: warn about old host debugfs for standalone directory copy

Message ID 20251218101524.703244-1-daniel.dragomir@windriver.com
State New

Commit Message

Daniel Dragomir Dec. 18, 2025, 10:15 a.m. UTC
When wic is used in standalone mode, it relies on host tools such as
debugfs. For directory host->image copies into ext* partitions, wic
uses scripted debugfs "-f" input with multiple mkdir/write commands.
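
As a rough illustration of the scripted mode described above, the sketch
below walks a host directory and builds the kind of mkdir/write command
list that would be fed to debugfs with "-f". The helper name
build_debugfs_script and the destination layout are assumptions made for
this example; this is not the code in engine.py, and paths containing
spaces would need quoting that is not handled here.

```python
import os


def build_debugfs_script(src_dir, dest_dir="/"):
    """Return debugfs commands that recreate src_dir under dest_dir.

    Hypothetical helper: it only illustrates the mkdir/write sequence.
    """
    base = os.path.abspath(src_dir)
    root = os.path.join(dest_dir, os.path.basename(base))
    cmds = ["mkdir %s" % root]
    for dirpath, dirnames, filenames in os.walk(base):
        dirnames.sort()  # deterministic traversal order
        rel = os.path.relpath(dirpath, base)
        target = root if rel == "." else os.path.join(root, rel)
        # Create child directories before os.walk descends into them,
        # so every later "write" lands in a directory that exists.
        for name in dirnames:
            cmds.append("mkdir %s" % os.path.join(target, name))
        for name in sorted(filenames):
            cmds.append("write %s %s" % (os.path.join(dirpath, name),
                                         os.path.join(target, name)))
    return cmds
```

The resulting lines would be written to a file and passed to debugfs
together with "-w" (read-write) and "-f <file>". A single invocation then
runs the whole list, which is why a failing command in the middle can go
unnoticed.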

Older host debugfs versions (< 1.47) may behave unreliably in this
mode and can silently miss files. This does not affect builds using
debugfs from OE where the version is known to be sufficiently new.

Add a debugfs version check and emit a warning when an older host
debugfs is detected. The warning is shown once per run and does not
alter execution.

Signed-off-by: Daniel Dragomir <daniel.dragomir@windriver.com>
---
 scripts/lib/wic/engine.py | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

Comments

Paul Barker Dec. 18, 2025, 11:34 a.m. UTC | #1
On Thu, 2025-12-18 at 12:15 +0200, Dragomir, Daniel via
lists.openembedded.org wrote:
> When wic is used in standalone mode, it relies on host tools such as
> debugfs. For directory host->image copies into ext* partitions, wic
> uses scripted debugfs "-f" input with multiple mkdir/write commands.
> 
> Older host debugfs versions (< 1.47) may behave unreliably in this
> mode and can silently miss files. This does not affect builds using
> debugfs from OE where the version is known to be sufficiently new.
> 
> Add a debugfs version check and emit a warning when an older host
> debugfs is detected. The warning is shown once per run and does not
> alter execution.

If the risk here is silently missing files, resulting in a corrupted
rootfs or worse, I think this should be a hard error.
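
One way to express that policy, as a minimal sketch: have the check raise
instead of log when the detected version is below the minimum.
ToolVersionError and enforce_min_version are hypothetical names used only
here; wic itself would presumably use an existing error type.

```python
import logging

logger = logging.getLogger("wic-sketch")


class ToolVersionError(Exception):
    """Hypothetical error type for an unsupported host tool version."""


def enforce_min_version(tool, found, minimum, strict=True):
    """Fail (strict) or warn (non-strict) when found < minimum.

    Versions are tuples of ints such as (1, 45, 5). None means the
    version could not be detected; this sketch treats that as
    unsupported, which is a design assumption.
    """
    if found is not None and found >= minimum:
        return True
    shown = ".".join(map(str, found)) if found else "unknown"
    needed = ".".join(map(str, minimum))
    msg = "%s version %s does not meet the required minimum %s" % (
        tool, shown, needed)
    if strict:
        raise ToolVersionError(msg)
    logger.warning(msg)
    return False
```

Note the difference from the posted patch, which stays silent when no
version can be parsed. A hard error probably needs an explicit decision
for that case.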

Consider the case where someone relies on a device having a firewall
enabled, but /etc/nftables.conf is silently missed during construction
of the rootfs ext4 image. That could result in all ports being open.

On the kirkstone branch we have e2fsprogs 1.46.5, does the same debugfs
issue apply there or has it been patched?

Best regards,
Paul Barker Dec. 18, 2025, 11:46 a.m. UTC | #2
On Thu, 2025-12-18 at 11:34 +0000, Paul Barker wrote:
> On Thu, 2025-12-18 at 12:15 +0200, Dragomir, Daniel via
> lists.openembedded.org wrote:
> > When wic is used in standalone mode, it relies on host tools such as
> > debugfs. For directory host->image copies into ext* partitions, wic
> > uses scripted debugfs "-f" input with multiple mkdir/write commands.
> > 
> > Older host debugfs versions (< 1.47) may behave unreliably in this
> > mode and can silently miss files. This does not affect builds using
> > debugfs from OE where the version is known to be sufficiently new.
> > 
> > Add a debugfs version check and emit a warning when an older host
> > debugfs is detected. The warning is shown once per run and does not
> > alter execution.
> 
> If the risk here is silently missing files, resulting in a corrupted
> rootfs or worse, I think this should be a hard error.
> 
> Consider the case where someone relies on a device having a firewall
> enabled, but /etc/nftables.conf is silently missed during construction
> of the rootfs ext4 image. That could result in all ports being open.
> 
> On the kirkstone branch we have e2fsprogs 1.46.5, does the same debugfs
> issue apply there or has it been patched?

Also, do you have any links to upstream bug reports or the commit(s)
that fixed this? I can't find anything relevant in the recent e2fsprogs
release notes.

Best regards,
Daniel Dragomir Dec. 18, 2025, 1:48 p.m. UTC | #3
On 12/18/25 13:46, Paul Barker wrote:
> On Thu, 2025-12-18 at 11:34 +0000, Paul Barker wrote:
>> On Thu, 2025-12-18 at 12:15 +0200, Dragomir, Daniel via
>> lists.openembedded.org wrote:
>>> When wic is used in standalone mode, it relies on host tools such as
>>> debugfs. For directory host->image copies into ext* partitions, wic
>>> uses scripted debugfs "-f" input with multiple mkdir/write commands.
>>>
>>> Older host debugfs versions (< 1.47) may behave unreliably in this
>>> mode and can silently miss files. This does not affect builds using
>>> debugfs from OE where the version is known to be sufficiently new.
>>>
>>> Add a debugfs version check and emit a warning when an older host
>>> debugfs is detected. The warning is shown once per run and does not
>>> alter execution.
>>
>> If the risk here is silently missing files, resulting in a corrupted
>> rootfs or worse, I think this should be a hard error.
>>
>> Consider the case where someone relies on a device having a firewall
>> enabled, but /etc/nftables.conf is silently missed during construction
>> of the rootfs ext4 image. That could result in all ports being open.

This failure can only occur when copying directories into an already
created WIC image with an ext* partition. It does not affect image or
rootfs construction.

My understanding is that `wic cp` is a post-creation image manipulation
tool and is not used during rootfs generation or WIC image creation.
It is intended solely for modifying an existing WIC image without
mounting it.

>>
>> On the kirkstone branch we have e2fsprogs 1.46.5, does the same debugfs
>> issue apply there or has it been patched?

I built a kirkstone SDK and tested this again: debugfs 1.46.5 works 
fine, and all ~1500 files were copied successfully. This issue appears 
to affect only version 1.45.

I can send a v2 of the patch to adjust the minimum required version.
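
For reference, a small sketch of how the versions compare once parsed
into integer tuples. The sample strings are assumed to resemble what
debugfs reports, and the (1, 46, 0) threshold is purely illustrative; it
is not the value v2 will use. Unlike the three-component pattern in the
patch, this one also accepts a bare X.Y version.

```python
import re


def parse_version(text):
    """Extract the first X.Y or X.Y.Z version from text as an int tuple.

    A missing patch component is padded with 0, so "1.45" -> (1, 45, 0).
    Returns None when no version is found.
    """
    m = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text or "")
    if not m:
        return None
    major, minor, patch = m.groups()
    return (int(major), int(minor), int(patch or 0))


ILLUSTRATIVE_MIN = (1, 46, 0)  # assumption for this example only

for sample in ("debugfs 1.45.5", "debugfs 1.46.5"):
    ver = parse_version(sample)
    # Tuples compare element by element, left to right.
    print(sample, "->", ver, "ok" if ver >= ILLUSTRATIVE_MIN else "too old")
```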

> 
> Also, do you have any links to upstream bug reports or the commit(s)
> that fixed this? I can't find anything relevant in the recent e2fsprogs
> release notes.

No, I have not yet investigated why older debugfs versions fail to copy
complex directory structures in scripted (-f) mode.

Adding a warning was discussed in the thread linked below, in response
to a patch I sent to fix copying directories into a WIC image with an
ext* partition. There I also explained how I discovered this behavior on
different servers while testing the fix.

I only tested standalone usage of wic, using host tools from e2fsprogs.

https://lists.openembedded.org/g/openembedded-core/topic/115585333#msg228042

Regards,
Daniel

Patch

diff --git a/scripts/lib/wic/engine.py b/scripts/lib/wic/engine.py
index 565a0db38a..28a973f6c4 100644
--- a/scripts/lib/wic/engine.py
+++ b/scripts/lib/wic/engine.py
@@ -220,6 +220,42 @@  def wic_list(args, scripts_path):
 
     return False
 
+_DEBUGFS_VERSION = None
+_DEBUGFS_WARNED = False
+
+def debugfs_version_check(debugfs_path, min_ver=(1, 47, 0)):
+    global _DEBUGFS_VERSION, _DEBUGFS_WARNED
+
+    if _DEBUGFS_WARNED:
+        return
+
+    # Detect version once
+    if _DEBUGFS_VERSION is None:
+        out = ""
+        for flag in ("-V", "-v"):
+            try:
+                out = exec_cmd(f"{debugfs_path} {flag}")
+                break
+            except Exception:
+                continue
+
+        import re
+        m = re.search(r"(\d+)\.(\d+)\.(\d+)", out or "")
+        _DEBUGFS_VERSION = tuple(map(int, m.groups())) if m else None
+
+    ver = _DEBUGFS_VERSION
+
+    # Compare and warn once
+    if ver is not None and ver < min_ver:
+        logger.warning(
+            "debugfs %s detected (< 1.47): Directory copies into ext* partitions "
+            "via scripted debugfs (-f) may be unreliable. Consider using a newer "
+            "debugfs, for example via a native sysroot from a newer SDK.",
+            ".".join(map(str, ver))
+        )
+
+    _DEBUGFS_WARNED = True
+
 
 class Disk:
     def __init__(self, imagepath, native_sysroot, fstypes=('fat', 'ext')):
@@ -334,6 +370,7 @@  class Disk:
         if self.partitions[pnum].fstype.startswith('ext'):
             if isinstance(src, str): # host to image case
                 if os.path.isdir(src):
+                    debugfs_version_check(self.debugfs)
                     base = os.path.abspath(src)
                     base_parent = os.path.dirname(base)
                     cmds = []