diff mbox series

[v2] wic: filemap: use separate fd for SEEK_HOLE probes

Message ID 20260316141750.205474-1-twoerner@gmail.com
State Under Review
Headers show
Series [v2] wic: filemap: use separate fd for SEEK_HOLE probes | expand

Commit Message

Trevor Woerner March 16, 2026, 2:17 p.m. UTC
While working on splitting-out wic from oe-core, on my openSUSE Leap
16.0 machine, the moment I split wic out, 2 oe-selftests always failed
with 100% reproducibility:
	- wic.ModifyTests.test_wic_cp_ext
	- wic.Wic2.test_expand_mbr_image

In both cases the symptom is the same: the filesystem has inode tables
that are completely zeroed out. Both issues are linked together to the
same underlying fault.

FilemapSeek._get_ranges() is a generator. Due to the nature of finding
each hole/data extent one at a time using the lseek() system call,
it calls os.lseek() on a raw file descriptor, then yields, then the
caller, sparse_copy(), calls file.seek() + file.read() on a Python
BufferedReader wrapping that same fd — then the generator resumes and
calls os.lseek() again. This interleaving of raw os.lseek() and buffered
I/O on the same fd is undefined behaviour from Python's perspective.
The BufferedReader tracks its own idea of the fd's position and buffer
contents; os.lseek() changes the position behind its back. This can
corrupt its internal state and cause read() to return stale/zero data.

This code, however, has existed in wic since it was written, so why
was it not noticed before? It turns out this bug was being masked by a
number of implementation details that changed, especially when wic was
split out for oe-core. These changes conspired together to cause the bug
to be triggered.

One of the root causes of this bug is that Python 3.14 increased the
default buffer size from 8KB to 128KB[1]. With 8 KB buffers, read()s
either go through the direct-read path leaving the buffer empty, or
if it fills in 8KB chunks the buffer is fully drained. Either way,
with a small buffer, read()s do a real raw seek. No fast path. No
corruption. With a 128KB buffer, however, a much larger window exists
where BufferedReader.seek() can take the fast-path after the raw file
descriptor has already been repositioned by os.lseek() in the generator.
With the smaller buffer, this window was too narrow to hit in practice.

This is fixed by opening a second file object in FilemapSeek.__init__()
dedicated to SEEK_DATA/SEEK_HOLE probes, leaving the data-reading handle
(self._f_image) untouched.

This explains why the corruption is deterministic and tied to specific
block boundaries, why it only manifests with the split-out version using
Python 3.14 (on systems that are using Python versions less than 3.14 on
the host), and why using a separate file descriptor for reading bypasses
the issue entirely.

This is not an intermittent bug. For a more detailed explanation
including log files, in-depth analysis, and a standalone Python
reproducer, please see the linked bugzilla entry.

Fixes: [YOCTO #16197]

[1] https://github.com/python/cpython/commit/b1b4f9625c5f2a6b2c32bc5ee91c9fef3894b5e6
b1b4f9625c5f ("gh-117151: IO performance improvement, increase io.DEFAULT_BUFFER_SIZE to 128k (GH-118144)")

AI-Generated: codex/claude-opus-4.6 (xhigh)
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
---
changes in v2:
- updated the in-code comment to remove references to this being
  kernel-related (it is not, but at one point i thought it was)
- updated the in-code comment to remove the suggestion that this bug is
  intermittent -- the conditions might be intermittent but the bug is not
  and will always show up under the right conditions
---
 scripts/lib/wic/filemap.py | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/scripts/lib/wic/filemap.py b/scripts/lib/wic/filemap.py
index 85b39d5d743e..2554e8312ccc 100644
--- a/scripts/lib/wic/filemap.py
+++ b/scripts/lib/wic/filemap.py
@@ -201,6 +201,13 @@  class FilemapSeek(_FilemapBase):
         _FilemapBase.__init__(self, image, log)
         self._log.debug("FilemapSeek: initializing")
 
+        # Open a separate file handle for SEEK_DATA/SEEK_HOLE probes so
+        # that the lseek() calls do not disturb the BufferedReader state
+        # of self._f_image, which sparse_copy() uses for data reading.
+        # Sharing a single fd between os.lseek() and buffered read()
+        # has the potential to cause data corruption.
+        self._f_seek = open(self._image_path, 'rb')
+
         self._probe_seek_hole()
 
     def _probe_seek_hole(self):
@@ -244,7 +251,7 @@  class FilemapSeek(_FilemapBase):
 
     def block_is_mapped(self, block):
         """Refer the '_FilemapBase' class for the documentation."""
-        offs = _lseek(self._f_image, block * self.block_size, _SEEK_DATA)
+        offs = _lseek(self._f_seek, block * self.block_size, _SEEK_DATA)
         if offs == -1:
             result = False
         else:
@@ -265,11 +272,11 @@  class FilemapSeek(_FilemapBase):
         limit = end + count * self.block_size
 
         while True:
-            start = _lseek(self._f_image, end, whence1)
+            start = _lseek(self._f_seek, end, whence1)
             if start == -1 or start >= limit or start == self.image_size:
                 break
 
-            end = _lseek(self._f_image, start, whence2)
+            end = _lseek(self._f_seek, start, whence2)
             if end == -1 or end == self.image_size:
                 end = self.blocks_cnt * self.block_size
             if end > limit: