From patchwork Fri Aug 9 12:39:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: aszh07
X-Patchwork-Id: 47589
X-Patchwork-Delegate: steve@sakoman.com
From: aszh07
To: openembedded-core@lists.openembedded.org, bindu.bhabu@kpit.com
Cc: Bindu Bhabu
Subject: [OE-core][kirkstone][PATCH] expat: fix CVE-2023-52425
Date: Fri, 9 Aug 2024 18:09:02 +0530
Message-Id: <20240809123902.17701-1-mail2szahir@gmail.com>
X-Mailer: git-send-email 2.17.1
X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/203172

libexpat through 2.5.0 allows a denial of service (resource consumption)
because many full reparsings are required in the case of a large token
for which multiple buffer fills are needed.

References:
https://security-tracker.debian.org/tracker/CVE-2023-52425
https://ubuntu.com/security/CVE-2023-52425
https://packages.ubuntu.com/mantic/expat
https://github.com/libexpat/libexpat/pull/789

Signed-off-by: Bindu Bhabu
---
 .../expat/expat/CVE-2023-52425.patch | 581 ++++++++++++++++++
 meta/recipes-core/expat/expat_2.5.0.bb | 1 +
 2 files changed, 582 insertions(+)
 create mode 100644 meta/recipes-core/expat/expat/CVE-2023-52425.patch

diff --git a/meta/recipes-core/expat/expat/CVE-2023-52425.patch b/meta/recipes-core/expat/expat/CVE-2023-52425.patch
new file mode 100644
index 0000000000..00bc464173
--- /dev/null
+++ b/meta/recipes-core/expat/expat/CVE-2023-52425.patch
@@ -0,0 +1,581 @@
+Backport of https://github.com/libexpat/libexpat/pull/789
+
+From 9cdf9b8d77d5c2c2a27d15fb68dd3f83cafb45a1 Mon Sep 17 00:00:00 2001
+From: Snild Dolkow
+Date: Thu, 17 Aug 2023 16:25:26 +0200
+Subject: [PATCH] Skip parsing after repeated partials on the same token
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+When the parse buffer contains the starting bytes of a token but not
+all of them, we cannot parse the token to completion. We call this a
+partial token. When this happens, the parse position is reset to the
+start of the token, and the parse() call returns. The client is then
+expected to provide more data and call parse() again.
+
+In extreme cases, this means that the bytes of a token may be parsed
+many times: once for every buffer refill required before the full token
+is present in the buffer.
+
+Math:
+ Assume there's a token of T bytes
+ Assume the client fills the buffer in chunks of X bytes
+ We'll try to parse X, 2X, 3X, 4X ... until mX == T (technically >=)
+ That's (m²+m)X/2 = (T²/X+T)/2 bytes parsed (arithmetic progression)
+ While it is alleviated by larger refills, this amounts to O(T²)
+
+Expat grows its internal buffer by doubling it when necessary, but has
+no way to inform the client about how much space is available. Instead,
+we add a heuristic that skips parsing when we've repeatedly stopped on
+an incomplete token. Specifically:
+
+ * Only try to parse if we have a certain amount of data buffered
+ * Every time we stop on an incomplete token, double the threshold
+ * As soon as any token completes, the threshold is reset
+
+This means that when we get stuck on an incomplete token, the threshold
+grows exponentially, effectively making the client perform larger buffer
+fills, limiting how many times we can end up re-parsing the same bytes.
+
+Math:
+ Assume there's a token of T bytes
+ Assume the client fills the buffer in chunks of X bytes
+ We'll try to parse X, 2X, 4X, 8X ... until (2^k)X == T (or larger)
+ That's (2^(k+1)-1)X bytes parsed -- e.g. 15X if T = 8X
+ This is equal to 2T-X, which amounts to O(T)
+
+We could've chosen a faster growth rate, e.g. 4 or 8. Those seem to
+increase performance further, at the cost of further increasing the
+risk of growing the buffer more than necessary. This can easily be
+adjusted in the future, if desired.
+
+This is all completely transparent to the client, except for:
+1. possible delay of some callbacks (when our heuristic overshoots)
+2. apps that never do isFinal=XML_TRUE could miss data at the end
+
+For the affected testdata, this change shows a 100-400x speedup.
+The recset.xml benchmark shows no clear change either way.
+
+Before:
+benchmark -n ../testdata/largefiles/recset.xml 65535 3
+ 3 loops, with buffer size 65535. Average time per loop: 0.270223
+benchmark -n ../testdata/largefiles/aaaaaa_attr.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 15.033048
+benchmark -n ../testdata/largefiles/aaaaaa_cdata.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.018027
+benchmark -n ../testdata/largefiles/aaaaaa_comment.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 11.775362
+benchmark -n ../testdata/largefiles/aaaaaa_tag.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 11.711414
+benchmark -n ../testdata/largefiles/aaaaaa_text.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.019362
+
+After:
+./run.sh benchmark -n ../testdata/largefiles/recset.xml 65535 3
+ 3 loops, with buffer size 65535. Average time per loop: 0.269030
+./run.sh benchmark -n ../testdata/largefiles/aaaaaa_attr.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.044794
+./run.sh benchmark -n ../testdata/largefiles/aaaaaa_cdata.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.016377
+./run.sh benchmark -n ../testdata/largefiles/aaaaaa_comment.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.027022
+./run.sh benchmark -n ../testdata/largefiles/aaaaaa_tag.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.099360
+./run.sh benchmark -n ../testdata/largefiles/aaaaaa_text.xml 4096 3
+ 3 loops, with buffer size 4096. Average time per loop: 0.017956
+
+CVE: CVE-2023-52425
+Upstream-Status: Backport [http://archive.ubuntu.com/ubuntu/pool/main/e/expat/expat_2.5.0-2ubuntu0.1.debian.tar.xz]
+Comments: Hunks refreshed.
+ Patch difference/summary has also been modified.
+Signed-off-by: Bhabu Bindu
+
+---
+ lib/expat.h | 5 +++++
+ lib/libexpat.def.cmake | 2 ++
+ lib/xmlparse.c | 228 ++++++++++++++++++++++++++++++++++++++++++---------------------------------
+ xmlwf/xmlwf.c | 20 ++++++++++++++++++++
+ xmlwf/xmlwf_helpgen.py | 4 ++++
+ 5 files changed, 156 insertions(+), 103 deletions(-)
+
+--- a/lib/expat.h
++++ b/lib/expat.h
+@@ -16,6 +16,7 @@
+ Copyright (c) 2016 Thomas Beutlich
+ Copyright (c) 2017 Rhodri James
+ Copyright (c) 2022 Thijs Schreijer
++ Copyright (c) 2023 Sony Corporation / Snild Dolkow
+ Licensed under the MIT license:
+
+ Permission is hereby granted, free of charge, to any person obtaining
+@@ -1050,6 +1051,10 @@ XML_SetBillionLaughsAttackProtectionActi
+ XML_Parser parser, unsigned long long activationThresholdBytes);
+ #endif
+
++/* Added in Expat 2.6.0. */
++XMLPARSEAPI(XML_Bool)
++XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled);
++
+ /* Expat follows the semantic versioning convention.
+ See http://semver.org.
+ */
+--- a/lib/libexpat.def.cmake
++++ b/lib/libexpat.def.cmake
+@@ -77,3 +77,5 @@ EXPORTS
+ ; added with version 2.4.0
+ @_EXPAT_COMMENT_DTD_OR_GE@ XML_SetBillionLaughsAttackProtectionActivationThreshold @69
+ @_EXPAT_COMMENT_DTD_OR_GE@ XML_SetBillionLaughsAttackProtectionMaximumAmplification @70
++; added with version 2.6.0
++ XML_SetReparseDeferralEnabled @71
+--- a/lib/xmlparse.c
++++ b/lib/xmlparse.c
+@@ -36,6 +36,7 @@
+ Copyright (c) 2022 Samanta Navarro
+ Copyright (c) 2022 Jeffrey Walton
+ Copyright (c) 2022 Jann Horn
++ Copyright (c) 2023 Sony Corporation / Snild Dolkow
+ Licensed under the MIT license:
+
+ Permission is hereby granted, free of charge, to any person obtaining
+@@ -73,6 +74,7 @@
+ # endif
+ #endif
+
++#include <stdbool.h>
+ #include <stddef.h>
+ #include <string.h> /* memset(), memcpy() */
+ #include <assert.h>
+@@ -196,6 +198,8 @@ typedef char ICHAR;
+ /* Do safe (NULL-aware) pointer arithmetic */
+ #define EXPAT_SAFE_PTR_DIFF(p, q) (((p) && (q)) ?
((p) - (q)) : 0) + ++#define EXPAT_MIN(a, b) (((a) < (b)) ? (a) : (b)) ++ + #include "internal.h" + #include "xmltok.h" + #include "xmlrole.h" +@@ -617,6 +621,9 @@ struct XML_ParserStruct { + const char *m_bufferLim; + XML_Index m_parseEndByteIndex; + const char *m_parseEndPtr; ++ size_t m_partialTokenBytesBefore; /* used in heuristic to avoid O(n^2) */ ++ XML_Bool m_reparseDeferralEnabled; ++ int m_lastBufferRequestSize; + XML_Char *m_dataBuf; + XML_Char *m_dataBufEnd; + XML_StartElementHandler m_startElementHandler; +@@ -948,6 +955,46 @@ get_hash_secret_salt(XML_Parser parser) + return parser->m_hash_secret_salt; + } + ++static enum XML_Error ++callProcessor(XML_Parser parser, const char *start, const char *end, ++ const char **endPtr) { ++ const size_t have_now = EXPAT_SAFE_PTR_DIFF(end, start); ++ ++ if (parser->m_reparseDeferralEnabled ++ && ! parser->m_parsingStatus.finalBuffer) { ++ // Heuristic: don't try to parse a partial token again until the amount of ++ // available data has increased significantly. ++ const size_t had_before = parser->m_partialTokenBytesBefore; ++ // ...but *do* try anyway if we're close to causing a reallocation. ++ size_t available_buffer ++ = EXPAT_SAFE_PTR_DIFF(parser->m_bufferPtr, parser->m_buffer); ++#if XML_CONTEXT_BYTES > 0 ++ available_buffer -= EXPAT_MIN(available_buffer, XML_CONTEXT_BYTES); ++#endif ++ available_buffer ++ += EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_bufferEnd); ++ // m_lastBufferRequestSize is never assigned a value < 0, so the cast is ok ++ const bool enough ++ = (have_now >= 2 * had_before) ++ || ((size_t)parser->m_lastBufferRequestSize > available_buffer); ++ ++ if (! enough) { ++ *endPtr = start; // callers may expect this to be set ++ return XML_ERROR_NONE; ++ } ++ } ++ const enum XML_Error ret = parser->m_processor(parser, start, end, endPtr); ++ if (ret == XML_ERROR_NONE) { ++ // if we consumed nothing, remember what we had on this parse attempt. 
++ if (*endPtr == start) { ++ parser->m_partialTokenBytesBefore = have_now; ++ } else { ++ parser->m_partialTokenBytesBefore = 0; ++ } ++ } ++ return ret; ++} ++ + static XML_Bool /* only valid for root parser */ + startParsing(XML_Parser parser) { + /* hash functions must be initialized before setContext() is called */ +@@ -1129,6 +1176,9 @@ parserInit(XML_Parser parser, const XML_ + parser->m_bufferEnd = parser->m_buffer; + parser->m_parseEndByteIndex = 0; + parser->m_parseEndPtr = NULL; ++ parser->m_partialTokenBytesBefore = 0; ++ parser->m_reparseDeferralEnabled = XML_TRUE; ++ parser->m_lastBufferRequestSize = 0; + parser->m_declElementType = NULL; + parser->m_declAttributeId = NULL; + parser->m_declEntity = NULL; +@@ -1298,6 +1348,7 @@ XML_ExternalEntityParserCreate(XML_Parse + to worry which hash secrets each table has. + */ + unsigned long oldhash_secret_salt; ++ XML_Bool oldReparseDeferralEnabled; + + /* Validate the oldParser parameter before we pull everything out of it */ + if (oldParser == NULL) +@@ -1342,6 +1393,7 @@ XML_ExternalEntityParserCreate(XML_Parse + to worry which hash secrets each table has. + */ + oldhash_secret_salt = parser->m_hash_secret_salt; ++ oldReparseDeferralEnabled = parser->m_reparseDeferralEnabled; + + #ifdef XML_DTD + if (! context) +@@ -1394,6 +1446,7 @@ XML_ExternalEntityParserCreate(XML_Parse + parser->m_defaultExpandInternalEntities = oldDefaultExpandInternalEntities; + parser->m_ns_triplets = oldns_triplets; + parser->m_hash_secret_salt = oldhash_secret_salt; ++ parser->m_reparseDeferralEnabled = oldReparseDeferralEnabled; + parser->m_parentParser = oldParser; + #ifdef XML_DTD + parser->m_paramEntityParsing = oldParamEntityParsing; +@@ -1848,55 +1901,8 @@ XML_Parse(XML_Parser parser, const char + parser->m_parsingStatus.parsing = XML_PARSING; + } + +- if (len == 0) { +- parser->m_parsingStatus.finalBuffer = (XML_Bool)isFinal; +- if (! 
isFinal) +- return XML_STATUS_OK; +- parser->m_positionPtr = parser->m_bufferPtr; +- parser->m_parseEndPtr = parser->m_bufferEnd; +- +- /* If data are left over from last buffer, and we now know that these +- data are the final chunk of input, then we have to check them again +- to detect errors based on that fact. +- */ +- parser->m_errorCode +- = parser->m_processor(parser, parser->m_bufferPtr, +- parser->m_parseEndPtr, &parser->m_bufferPtr); +- +- if (parser->m_errorCode == XML_ERROR_NONE) { +- switch (parser->m_parsingStatus.parsing) { +- case XML_SUSPENDED: +- /* It is hard to be certain, but it seems that this case +- * cannot occur. This code is cleaning up a previous parse +- * with no new data (since len == 0). Changing the parsing +- * state requires getting to execute a handler function, and +- * there doesn't seem to be an opportunity for that while in +- * this circumstance. +- * +- * Given the uncertainty, we retain the code but exclude it +- * from coverage tests. +- * +- * LCOV_EXCL_START +- */ +- XmlUpdatePosition(parser->m_encoding, parser->m_positionPtr, +- parser->m_bufferPtr, &parser->m_position); +- parser->m_positionPtr = parser->m_bufferPtr; +- return XML_STATUS_SUSPENDED; +- /* LCOV_EXCL_STOP */ +- case XML_INITIALIZED: +- case XML_PARSING: +- parser->m_parsingStatus.parsing = XML_FINISHED; +- /* fall through */ +- default: +- return XML_STATUS_OK; +- } +- } +- parser->m_eventEndPtr = parser->m_eventPtr; +- parser->m_processor = errorProcessor; +- return XML_STATUS_ERROR; +- } +-#ifndef XML_CONTEXT_BYTES +- else if (parser->m_bufferPtr == parser->m_bufferEnd) { ++#if XML_CONTEXT_BYTES == 0 ++ if (parser->m_bufferPtr == parser->m_bufferEnd) { + const char *end; + int nLeftOver; + enum XML_Status result; +@@ -1907,12 +1913,15 @@ XML_Parse(XML_Parser parser, const char + parser->m_processor = errorProcessor; + return XML_STATUS_ERROR; + } ++ // though this isn't a buffer request, we assume that `len` is the app's ++ // preferred buffer fill 
size, and therefore save it here. ++ parser->m_lastBufferRequestSize = len; + parser->m_parseEndByteIndex += len; + parser->m_positionPtr = s; + parser->m_parsingStatus.finalBuffer = (XML_Bool)isFinal; + + parser->m_errorCode +- = parser->m_processor(parser, s, parser->m_parseEndPtr = s + len, &end); ++ = callProcessor(parser, s, parser->m_parseEndPtr = s + len, &end); + + if (parser->m_errorCode != XML_ERROR_NONE) { + parser->m_eventEndPtr = parser->m_eventPtr; +@@ -1939,23 +1948,25 @@ XML_Parse(XML_Parser parser, const char + &parser->m_position); + nLeftOver = s + len - end; + if (nLeftOver) { +- if (parser->m_buffer == NULL +- || nLeftOver > parser->m_bufferLim - parser->m_buffer) { +- /* avoid _signed_ integer overflow */ +- char *temp = NULL; +- const int bytesToAllocate = (int)((unsigned)len * 2U); +- if (bytesToAllocate > 0) { +- temp = (char *)REALLOC(parser, parser->m_buffer, bytesToAllocate); +- } +- if (temp == NULL) { +- parser->m_errorCode = XML_ERROR_NO_MEMORY; +- parser->m_eventPtr = parser->m_eventEndPtr = NULL; +- parser->m_processor = errorProcessor; +- return XML_STATUS_ERROR; +- } +- parser->m_buffer = temp; +- parser->m_bufferLim = parser->m_buffer + bytesToAllocate; ++ // Back up and restore the parsing status to avoid XML_ERROR_SUSPENDED ++ // (and XML_ERROR_FINISHED) from XML_GetBuffer. ++ const enum XML_Parsing originalStatus = parser->m_parsingStatus.parsing; ++ parser->m_parsingStatus.parsing = XML_PARSING; ++ void *const temp = XML_GetBuffer(parser, nLeftOver); ++ parser->m_parsingStatus.parsing = originalStatus; ++ // GetBuffer may have overwritten this, but we want to remember what the ++ // app requested, not how many bytes were left over after parsing. ++ parser->m_lastBufferRequestSize = len; ++ if (temp == NULL) { ++ // NOTE: parser->m_errorCode has already been set by XML_GetBuffer(). 
++ parser->m_eventPtr = parser->m_eventEndPtr = NULL; ++ parser->m_processor = errorProcessor; ++ return XML_STATUS_ERROR; + } ++ // Since we know that the buffer was empty and XML_CONTEXT_BYTES is 0, we ++ // don't have any data to preserve, and can copy straight into the start ++ // of the buffer rather than the GetBuffer return pointer (which may be ++ // pointing further into the allocated buffer). + memcpy(parser->m_buffer, end, nLeftOver); + } + parser->m_bufferPtr = parser->m_buffer; +@@ -1966,16 +1977,15 @@ XML_Parse(XML_Parser parser, const char + parser->m_eventEndPtr = parser->m_bufferPtr; + return result; + } +-#endif /* not defined XML_CONTEXT_BYTES */ +- else { +- void *buff = XML_GetBuffer(parser, len); +- if (buff == NULL) +- return XML_STATUS_ERROR; +- else { +- memcpy(buff, s, len); +- return XML_ParseBuffer(parser, len, isFinal); +- } ++#endif /* XML_CONTEXT_BYTES == 0 */ ++ void *buff = XML_GetBuffer(parser, len); ++ if (buff == NULL) ++ return XML_STATUS_ERROR; ++ if (len > 0) { ++ assert(s != NULL); // make sure s==NULL && len!=0 was rejected above ++ memcpy(buff, s, len); + } ++ return XML_ParseBuffer(parser, len, isFinal); + } + + enum XML_Status XMLCALL +@@ -2015,8 +2025,8 @@ XML_ParseBuffer(XML_Parser parser, int l + parser->m_parseEndByteIndex += len; + parser->m_parsingStatus.finalBuffer = (XML_Bool)isFinal; + +- parser->m_errorCode = parser->m_processor( +- parser, start, parser->m_parseEndPtr, &parser->m_bufferPtr); ++ parser->m_errorCode = callProcessor(parser, start, parser->m_parseEndPtr, ++ &parser->m_bufferPtr); + + if (parser->m_errorCode != XML_ERROR_NONE) { + parser->m_eventEndPtr = parser->m_eventPtr; +@@ -2061,8 +2071,12 @@ XML_GetBuffer(XML_Parser parser, int len + default:; + } + +- if (len > EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_bufferEnd)) { +-#ifdef XML_CONTEXT_BYTES ++ // whether or not the request succeeds, `len` seems to be the app's preferred ++ // buffer fill size; remember it. 
++ parser->m_lastBufferRequestSize = len; ++ if (len > EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_bufferEnd) ++ || parser->m_buffer == NULL) { ++#if XML_CONTEXT_BYTES > 0 + int keep; + #endif /* defined XML_CONTEXT_BYTES */ + /* Do not invoke signed arithmetic overflow: */ +@@ -2083,10 +2097,11 @@ XML_GetBuffer(XML_Parser parser, int len + return NULL; + } + neededSize += keep; +-#endif /* defined XML_CONTEXT_BYTES */ +- if (neededSize +- <= EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_buffer)) { +-#ifdef XML_CONTEXT_BYTES ++#endif /* XML_CONTEXT_BYTES > 0 */ ++ if (parser->m_buffer && parser->m_bufferPtr ++ && neededSize ++ <= EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_buffer)) { ++#if XML_CONTEXT_BYTES > 0 + if (keep < EXPAT_SAFE_PTR_DIFF(parser->m_bufferPtr, parser->m_buffer)) { + int offset + = (int)EXPAT_SAFE_PTR_DIFF(parser->m_bufferPtr, parser->m_buffer) +@@ -2099,19 +2114,17 @@ XML_GetBuffer(XML_Parser parser, int len + parser->m_bufferPtr -= offset; + } + #else +- if (parser->m_buffer && parser->m_bufferPtr) { +- memmove(parser->m_buffer, parser->m_bufferPtr, +- EXPAT_SAFE_PTR_DIFF(parser->m_bufferEnd, parser->m_bufferPtr)); +- parser->m_bufferEnd +- = parser->m_buffer +- + EXPAT_SAFE_PTR_DIFF(parser->m_bufferEnd, parser->m_bufferPtr); +- parser->m_bufferPtr = parser->m_buffer; +- } +-#endif /* not defined XML_CONTEXT_BYTES */ ++ memmove(parser->m_buffer, parser->m_bufferPtr, ++ EXPAT_SAFE_PTR_DIFF(parser->m_bufferEnd, parser->m_bufferPtr)); ++ parser->m_bufferEnd ++ = parser->m_buffer ++ + EXPAT_SAFE_PTR_DIFF(parser->m_bufferEnd, parser->m_bufferPtr); ++ parser->m_bufferPtr = parser->m_buffer; ++#endif /* XML_CONTEXT_BYTES > 0 */ + } else { + char *newBuf; + int bufferSize +- = (int)EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_bufferPtr); ++ = (int)EXPAT_SAFE_PTR_DIFF(parser->m_bufferLim, parser->m_buffer); + if (bufferSize == 0) + bufferSize = INIT_BUFFER_SIZE; + do { +@@ -2208,7 +2221,7 @@ XML_ResumeParser(XML_Parser parser) 
{ + } + parser->m_parsingStatus.parsing = XML_PARSING; + +- parser->m_errorCode = parser->m_processor( ++ parser->m_errorCode = callProcessor( + parser, parser->m_bufferPtr, parser->m_parseEndPtr, &parser->m_bufferPtr); + + if (parser->m_errorCode != XML_ERROR_NONE) { +@@ -2576,6 +2576,15 @@ XML_SetBillionLaughsAttackProtectionActi + } + #endif /* XML_GE == 1 */ + ++XML_Bool XMLCALL ++XML_SetReparseDeferralEnabled(XML_Parser parser, XML_Bool enabled) { ++ if (parser != NULL && (enabled == XML_TRUE || enabled == XML_FALSE)) { ++ parser->m_reparseDeferralEnabled = enabled; ++ return XML_TRUE; ++ } ++ return XML_FALSE; ++} ++ + /* Initially tag->rawName always points into the parse buffer; + for those TAG instances opened while the current parse buffer was + processed, and not yet closed, we need to store tag->rawName in a more +@@ -4497,15 +4506,15 @@ entityValueInitProcessor(XML_Parser pars + parser->m_processor = entityValueProcessor; + return entityValueProcessor(parser, next, end, nextPtr); + } +- /* If we are at the end of the buffer, this would cause XmlPrologTok to +- return XML_TOK_NONE on the next call, which would then cause the +- function to exit with *nextPtr set to s - that is what we want for other +- tokens, but not for the BOM - we would rather like to skip it; +- then, when this routine is entered the next time, XmlPrologTok will +- return XML_TOK_INVALID, since the BOM is still in the buffer ++ /* XmlPrologTok has now set the encoding based on the BOM it found, and we ++ must move s and nextPtr forward to consume the BOM. ++ ++ If we didn't, and got XML_TOK_NONE from the next XmlPrologTok call, we ++ would leave the BOM in the buffer and return. On the next call to this ++ function, our XmlPrologTok call would return XML_TOK_INVALID, since it ++ is not valid to have multiple BOMs. + */ +- else if (tok == XML_TOK_BOM && next == end +- && ! parser->m_parsingStatus.finalBuffer) { ++ else if (tok == XML_TOK_BOM) { + # if XML_GE == 1 + if (! 
accountingDiffTolerated(parser, tok, s, next, __LINE__, + XML_ACCOUNT_DIRECT)) { +@@ -4500,7 +4522,7 @@ entityValueInitProcessor(XML_Parser pars + # endif + + *nextPtr = next; +- return XML_ERROR_NONE; ++ s = next; + } + /* If we get this token, we have the start of what might be a + normal tag, but not a declaration (i.e. it doesn't begin with +--- a/xmlwf/xmlwf.c ++++ b/xmlwf/xmlwf.c +@@ -914,6 +914,9 @@ usage(const XML_Char *prog, int rc) { + T(" -a FACTOR set maximum tolerated [a]mplification factor (default: 100.0)\n") + T(" -b BYTES set number of output [b]ytes needed to activate (default: 8 MiB)\n") + T("\n") ++ T("reparse deferral:\n") ++ T(" -q disable reparse deferral, and allow [q]uadratic parse runtime with large tokens\n") ++ T("\n") + T("info arguments:\n") + T(" -h show this [h]elp message and exit\n") + T(" -v show program's [v]ersion number and exit\n") +@@ -967,6 +970,8 @@ tmain(int argc, XML_Char **argv) { + unsigned long long attackThresholdBytes; + XML_Bool attackThresholdGiven = XML_FALSE; + ++ XML_Bool disableDeferral = XML_FALSE; ++ + int exitCode = XMLWF_EXIT_SUCCESS; + enum XML_ParamEntityParsing paramEntityParsing + = XML_PARAM_ENTITY_PARSING_NEVER; +@@ -1089,6 +1094,11 @@ tmain(int argc, XML_Char **argv) { + #endif + break; + } ++ case T('q'): { ++ disableDeferral = XML_TRUE; ++ j++; ++ break; ++ } + case T('\0'): + if (j > 1) { + i++; +@@ -1134,6 +1144,16 @@ tmain(int argc, XML_Char **argv) { + #endif + } + ++ if (disableDeferral) { ++ const XML_Bool success = XML_SetReparseDeferralEnabled(parser, XML_FALSE); ++ if (! success) { ++ // This prevents tperror(..) 
from reporting misleading "[..]: Success" ++ errno = EINVAL; ++ tperror(T("Failed to disable reparse deferral")); ++ exit(XMLWF_EXIT_INTERNAL_ERROR); ++ } ++ } ++ + if (requireStandalone) + XML_SetNotStandaloneHandler(parser, notStandalone); + XML_SetParamEntityParsing(parser, paramEntityParsing); +--- a/xmlwf/xmlwf_helpgen.py ++++ b/xmlwf/xmlwf_helpgen.py +@@ -81,6 +81,10 @@ billion_laughs.add_argument('-a', metava + help='set maximum tolerated [a]mplification factor (default: 100.0)') + billion_laughs.add_argument('-b', metavar='BYTES', help='set number of output [b]ytes needed to activate (default: 8 MiB)') + ++reparse_deferral = parser.add_argument_group('reparse deferral') ++reparse_deferral.add_argument('-q', metavar='FACTOR', ++ help='disable reparse deferral, and allow [q]uadratic parse runtime with large tokens') ++ + parser.add_argument('files', metavar='FILE', nargs='*', help='file to process (default: STDIN)') + + info = parser.add_argument_group('info arguments') diff --git a/meta/recipes-core/expat/expat_2.5.0.bb b/meta/recipes-core/expat/expat_2.5.0.bb index 31e989cfe2..09bc7a0a0b 100644 --- a/meta/recipes-core/expat/expat_2.5.0.bb +++ b/meta/recipes-core/expat/expat_2.5.0.bb @@ -22,6 +22,7 @@ SRC_URI = "https://github.com/libexpat/libexpat/releases/download/R_${VERSION_TA file://CVE-2023-52426-009.patch \ file://CVE-2023-52426-010.patch \ file://CVE-2023-52426-011.patch \ + file://CVE-2023-52425.patch \ " UPSTREAM_CHECK_URI = "https://github.com/libexpat/libexpat/releases/"
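
Review note: the two "Math:" estimates in the embedded commit message can be checked with a small stand-alone model. This is only a sketch, not code from the patch. The function names are made up for illustration, the token is assumed to start at the beginning of the buffer, and the second clause of the heuristic in callProcessor() (parse anyway when the next buffer request would force a reallocation) is deliberately left out.

```c
/* Hypothetical model of re-parse cost for one token of `token` bytes that
   arrives in refills of `chunk` bytes. Not part of libexpat. */

/* Pre-patch behaviour: every refill triggers a parse attempt that starts
   again from the beginning of the still-incomplete token. */
unsigned long long
bytes_scanned_without_deferral(unsigned long long token,
                               unsigned long long chunk) {
  unsigned long long buffered = 0, scanned = 0;
  if (token == 0 || chunk == 0)
    return 0; /* nothing to parse, or no progress possible */
  while (buffered < token) {
    buffered += chunk;
    scanned += buffered; /* X, 2X, 3X, ... */
  }
  return scanned;
}

/* Post-patch behaviour, first clause only: a parse attempt is made only
   when the buffered amount is at least twice what was available at the
   last attempt that consumed nothing (have_now >= 2 * had_before). */
unsigned long long
bytes_scanned_with_deferral(unsigned long long token,
                            unsigned long long chunk) {
  unsigned long long buffered = 0, scanned = 0, had_before = 0;
  if (token == 0 || chunk == 0)
    return 0;
  for (;;) {
    buffered += chunk;
    if (buffered >= 2 * had_before) {
      scanned += buffered; /* X, 2X, 4X, 8X, ... */
      if (buffered >= token)
        return scanned;      /* token finally complete */
      had_before = buffered; /* attempt consumed nothing; remember size */
    }
  }
}
```

With T = 8 and X = 1 the model gives 36 bytes scanned without deferral and 15 with it, which agrees with the (T²/X+T)/2 and 2T-X figures quoted in the commit message. With a 1 MiB token and 4096-byte refills it gives 134742016 versus 2093056 bytes.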