diff mbox series

[scarthgap] cve-update-nvd2-native: add workaround for json5 style list

Message ID 20250410164455.9027-1-peter.marko@siemens.com
State Under Review
Delegated to: Steve Sakoman

Commit Message

Marko, Peter April 10, 2025, 4:44 p.m. UTC
From: Peter Marko <peter.marko@siemens.com>

NVD responses changed to invalid JSON between:
* April 5, 2025 at 3:03:44 AM GMT+2
* April 5, 2025 at 4:19:48 AM GMT+2

Since then, the last response has been in this format:
{
  "resultsPerPage": 625,
  "startIndex": 288000,
  "totalResults": 288625,
  "format": "NVD_CVE",
  "version": "2.0",
  "timestamp": "2025-04-07T07:17:17.534",
  "vulnerabilities": [
    {...},
    ...
    {...},
  ]
}

JSON does not allow a trailing "," at the end of a list; that is JSON5 syntax.
So the cve-update-nvd2-native do_fetch task fails with a log backtrace ending:

...
File: '/builds/ccp/meta-siemens/projects/ccp/../../poky/meta/recipes-core/meta/cve-update-nvd2-native.bb', lineno: 234, function: update_db_file
     0230:            if raw_data is None:
     0231:                # We haven't managed to download data
     0232:                return False
     0233:
 *** 0234:            data = json.loads(raw_data)
     0235:
     0236:            index = data["startIndex"]
     0237:            total = data["totalResults"]
     0238:            per_page = data["resultsPerPage"]
...
File: '/usr/lib/python3.11/json/decoder.py', lineno: 355, function: raw_decode
     0351:        """
     0352:        try:
     0353:            obj, end = self.scan_once(s, idx)
     0354:        except StopIteration as err:
 *** 0355:            raise JSONDecodeError("Expecting value", s, err.value) from None
     0356:        return obj, end
Exception: json.decoder.JSONDecodeError: Expecting value: line 1 column 1442633 (char 1442632)
...
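
For reference, the failure can be reproduced without NVD at all. This is a minimal sketch using made-up sample strings (not real NVD data); the helper name `parses` is only for illustration:

```python
import json

# Made-up miniature responses; only the trailing comma differs.
valid = '{"vulnerabilities": [{"id": 1}, {"id": 2}]}'
trailing = '{"vulnerabilities": [{"id": 1}, {"id": 2},]}'


def parses(text: str) -> bool:
    """Return True if the stdlib json module accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(parses(valid))
print(parses(trailing))
```

The exact error message depends on the Python version, but the exception type is json.JSONDecodeError in both cases I'm aware of.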

NVD made no announcement about a change to the JSON format of API v2.0.
Also, this happens only if the whole database is queried (a database update
is fine, even when multiple pages are queried).
And lastly, it's only the CVE list; all other lists inside are fine.
So this looks like a bug in NVD API 2.0 introduced with some update.

Patch this with a simple character deletion for now, and let's monitor the
situation and possibly switch to json5 in the future.
Note that there is no native json5 support in Python; we'd have to use
one of the external libraries for it.
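
The character deletion can be tried in isolation like this. It is only a sketch of the same check as a standalone function; the name `strip_json5_trailing_comma` and the sample string are assumptions for illustration, and it assumes the response is a str ending exactly in ',]}' with no whitespace in between:

```python
import json


def strip_json5_trailing_comma(raw_data: str) -> str:
    """Drop the ',' in a response that ends with ',]}' (hypothetical helper)."""
    if raw_data.endswith(",]}"):
        # Keep everything before the comma and re-add the closing brackets.
        return raw_data[:-3] + "]}"
    return raw_data


# Made-up sample, not a real NVD response.
sample = '{"startIndex": 0, "vulnerabilities": [{"id": 1},]}'
data = json.loads(strip_json5_trailing_comma(sample))
print(data["vulnerabilities"])
```

Valid responses pass through unchanged, so the check should be harmless if NVD fixes the output again.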

(From OE-Core rev: 6e526327f5c9e739ac7981e4a43a4ce53a908945)

Signed-off-by: Peter Marko <peter.marko@siemens.com>
Signed-off-by: Mathieu Dubois-Briand <mathieu.dubois-briand@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 meta/recipes-core/meta/cve-update-nvd2-native.bb | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Steve Sakoman April 11, 2025, 3:31 p.m. UTC | #1
I'm not sure if it is related to this patch, but I've gotten cve-check
errors in my test builds with this patch:

The stack trace of python calls that resulted in this exception/failure was:
File: 'exec_func_python() autogenerated', lineno: 2, function: <module>
     0001:
 *** 0002:do_cve_check(d)
     0003:
File: '/srv/pokybuild/yocto-worker/oe-selftest-debian/build/meta/classes/cve-check.bbclass', lineno: 192, function: do_cve_check
     0188:            try:
     0189:                patched_cves = get_patched_cves(d)
     0190:            except FileNotFoundError:
     0191:                bb.fatal("Failure in searching patches")
 *** 0192:            ignored, patched, unpatched, status = check_cves(d, patched_cves)
     0193:            if patched or unpatched or (d.getVar("CVE_CHECK_COVERAGE") == "1" and status):
     0194:                cve_data = get_cve_info(d, patched + unpatched + ignored)
     0195:                cve_write_data(d, patched, unpatched, ignored, cve_data, status)
     0196:        else:
File: '/srv/pokybuild/yocto-worker/oe-selftest-debian/build/meta/classes/cve-check.bbclass', lineno: 346, function: check_cves
     0342:        else:
     0343:            vendor = "%"
     0344:
     0345:        # Find all relevant CVE IDs.
 *** 0346:        cve_cursor = conn.execute("SELECT DISTINCT ID FROM PRODUCTS WHERE PRODUCT IS ? AND VENDOR LIKE ?", (product, vendor))
     0347:        for cverow in cve_cursor:
     0348:            cve = cverow[0]
     0349:
     0350:            if cve in cve_ignore:
Exception: sqlite3.DatabaseError: database disk image is malformed

I've seen this twice now:

https://autobuilder.yoctoproject.org/valkyrie/?#/builders/23/builds/1421
https://autobuilder.yoctoproject.org/valkyrie/?#/builders/35/builds/1343

Steve

Marko, Peter April 11, 2025, 3:45 p.m. UTC | #2
I think it's a known issue that the database sometimes gets corrupted on the autobuilder.
I guess you'll see this until the corrupted DB is deleted.

Note that my change does not touch DB code, so it should not be related.

Peter
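
To tell a corrupted DB apart from other failures, a check along these lines might help. This is only a sketch and not part of the patch or of cve-check.bbclass; the function name `db_is_healthy` is an assumption, and the caller has to supply the path to the database file:

```python
import sqlite3


def db_is_healthy(path: str) -> bool:
    """Return True if SQLite's own integrity check reports 'ok' (hypothetical helper)."""
    try:
        conn = sqlite3.connect(path)
        try:
            # A healthy database yields a single row containing 'ok'.
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Raised, for example, when the file is not a database or is malformed.
        return False
    return rows == [("ok",)]
```

A file that fails this check would presumably have to be deleted so that the next update run recreates it.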

Patch

diff --git a/meta/recipes-core/meta/cve-update-nvd2-native.bb b/meta/recipes-core/meta/cve-update-nvd2-native.bb
index 99acead18d..74c780493d 100644
--- a/meta/recipes-core/meta/cve-update-nvd2-native.bb
+++ b/meta/recipes-core/meta/cve-update-nvd2-native.bb
@@ -231,6 +231,11 @@  def update_db_file(db_tmp_file, d, database_time):
                 # We haven't managed to download data
                 return False
 
+            # hack for json5 style responses
+            if raw_data[-3:] == ',]}':
+                bb.note("Removing trailing ',' from nvd response")
+                raw_data = raw_data[:-3] + ']}'
+
             data = json.loads(raw_data)
 
             index = data["startIndex"]