diff mbox series

selftest/reproducible: Limit memory used by diffoscope

Message ID 20250523192429.545443-1-yoann.congal@smile.fr
State New
Series selftest/reproducible: Limit memory used by diffoscope

Commit Message

Yoann Congal May 23, 2025, 7:24 p.m. UTC
From: Yoann Congal <yoann.congal@smile.fr>

When working on large diffs (e.g. in meta-oe's repro test), diffoscope may
use a huge amount of memory and trigger OOM kills on parallel builds.

Use the max_diff_block_lines_saved option to limit to 1024 the number of
diff lines saved in a block. Also, limit the number of lines in the
report so that a report is still generated even when the limit is reached.

The chosen default of 1024 comes from diffoscope's default for a diff block.

For a random 10MB binary (packaged in ipk, deb and rpm), this
decreases the "Maximum resident set size" of diffoscope from 1.3GB to
400MB.

As an added bonus, this also makes diffoscope bail out earlier: on the
same example, execution time goes from 30 minutes down to 7.

Fixes [YOCTO #15876]

Signed-off-by: Yoann Congal <yoann.congal@smile.fr>
---
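Note on reproducing the memory figures: "Maximum resident set size" is the
label printed by GNU time -v; the commit message does not say how the numbers
were gathered. A minimal sketch of one way to get a comparable figure from
Python follows. The helper name peak_rss_of is hypothetical and not part of
this patch, and ru_maxrss is reported in kilobytes on Linux (bytes on macOS).

```python
import resource
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd to completion and return (returncode, peak child RSS).

    Hypothetical helper, not part of the patch. RUSAGE_CHILDREN covers every
    child this process has waited for so far, so the value is the largest
    peak among them, not necessarily the peak of cmd alone.
    """
    proc = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Stand-in command; replace with a real diffoscope invocation to compare
    # runs with and without the new limits.
    rc, rss = peak_rss_of([sys.executable, "-c", "pass"])
    print("exit status:", rc, "peak RSS (platform units):", rss)
```

Running it once per configuration in a fresh interpreter avoids earlier
children skewing the maximum.
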
 meta/lib/oeqa/selftest/cases/reproducible.py | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
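
For review convenience, the argument list built by the patched
run_diffoscope() can be sketched in isolation as below. The helper name
diffoscope_argv is hypothetical: the real function hands the list to runCmd,
which is left out here so the snippet runs without the OE selftest
environment. All flags are copied from the diff.

```python
def diffoscope_argv(a_dir, b_dir, html_dir, max_report_size=0,
                    max_diff_block_lines=1024, max_diff_block_lines_saved=0):
    """Return the diffoscope command line as a list of strings.

    Hypothetical stand-in for run_diffoscope(); the defaults mirror the
    patched signature.
    """
    return ['diffoscope', '--no-default-limits',
            '--max-report-size', str(max_report_size),
            '--max-diff-block-lines-saved', str(max_diff_block_lines_saved),
            '--max-diff-block-lines', str(max_diff_block_lines),
            '--exclude-directory-metadata', 'yes',
            '--html-dir', html_dir, a_dir, b_dir]


if __name__ == "__main__":
    # Values the test class passes: both block limits set to 1024 lines.
    argv = diffoscope_argv('reproducibleA', 'reproducibleB-extended', 'html',
                           max_report_size=250 * 1024 * 1024,
                           max_diff_block_lines=1024,
                           max_diff_block_lines_saved=1024)
    print(' '.join(argv))
```

Note that the function defaults differ from what the test class passes:
ReproducibleTests sets both block limits to 1024.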

diff --git a/meta/lib/oeqa/selftest/cases/reproducible.py b/meta/lib/oeqa/selftest/cases/reproducible.py
index 1e094892e9..f06027cb03 100644
--- a/meta/lib/oeqa/selftest/cases/reproducible.py
+++ b/meta/lib/oeqa/selftest/cases/reproducible.py
@@ -97,8 +97,10 @@  def compare_file(reference, test, diffutils_sysroot):
     result.status = SAME
     return result
 
-def run_diffoscope(a_dir, b_dir, html_dir, max_report_size=0, **kwargs):
+def run_diffoscope(a_dir, b_dir, html_dir, max_report_size=0, max_diff_block_lines=1024, max_diff_block_lines_saved=0, **kwargs):
     return runCmd(['diffoscope', '--no-default-limits', '--max-report-size', str(max_report_size),
+                   '--max-diff-block-lines-saved', str(max_diff_block_lines_saved),
+                   '--max-diff-block-lines', str(max_diff_block_lines),
                    '--exclude-directory-metadata', 'yes', '--html-dir', html_dir, a_dir, b_dir],
                 **kwargs)
 
@@ -132,6 +134,11 @@  class ReproducibleTests(OESelftestTestCase):
     # Maximum report size, in bytes
     max_report_size = 250 * 1024 * 1024
 
+    # Maximum diff blocks size, in lines
+    max_diff_block_lines = 1024
+    # Maximum diff blocks size (saved in memory), in lines
+    max_diff_block_lines_saved = max_diff_block_lines
+
     # targets are the things we want to test the reproducibility of
     # Have to add the virtual targets manually for now as builds may or may not include them as they're exclude from world
     targets = ['core-image-minimal', 'core-image-sato', 'core-image-full-cmdline', 'core-image-weston', 'world', 'virtual/librpc', 'virtual/libsdl2', 'virtual/crypt']
@@ -391,6 +398,8 @@  class ReproducibleTests(OESelftestTestCase):
                 self.copy_file(os.path.join(jquery_sysroot, 'usr/share/javascript/jquery/jquery.min.js'), os.path.join(package_html_dir, 'jquery.js'))
 
                 run_diffoscope('reproducibleA', 'reproducibleB-extended', package_html_dir, max_report_size=self.max_report_size,
+                        max_diff_block_lines_saved=self.max_diff_block_lines_saved,
+                        max_diff_block_lines=self.max_diff_block_lines,
                         native_sysroot=diffoscope_sysroot, ignore_status=True, cwd=package_dir)
 
         if fails: