
[2/2] spdx3: support SBOM compression based on SPDX_SBOM_EXT

Message ID 20260512-sbom-zstd-support-v1-2-93273381d548@bootlin.com
State Changes Requested
Series spdx3: support SBOM compression with Zstd

Commit Message

Jérémie Dautheribes May 12, 2026, 5:01 p.m. UTC
Add support for optional zstd compression for all types of SBOMs,
including:
  - image SBOM
  - recipe SBOM
  - SDK SBOM

Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".

Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
---
 meta/classes/create-spdx-3.0.bbclass |  3 ++-
 meta/lib/oe/sbom30.py                | 11 +++++++++--
 2 files changed, 11 insertions(+), 3 deletions(-)

Comments

Richard Purdie May 12, 2026, 7:54 p.m. UTC | #1
On Tue, 2026-05-12 at 19:01 +0200, Jérémie Dautheribes via lists.openembedded.org wrote:
> 
>      objset.objects.add(objset.doc)
> -    with dest.open("wb") as f:
> -        serializer.write(objset, f, force_at_graph=True)
> +
> +    if dest.name.endswith(".zst"):
> +        num_threads = int(d.getVar("BB_NUMBER_THREADS"))
> +        with bb.compress.zstd.open(dest, "w", num_threads=num_threads) as f:

This should be derived from PARALLEL_MAKE, not BB_NUMBER_THREADS.

The latter is how many tasks bitbake runs in parallel, the former is
how many threads the task should have.

There is a function somewhere which extracts the number of jobs from
PARALLEL_MAKE...
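
To make the distinction concrete, here is a minimal standalone sketch of pulling the job count out of a PARALLEL_MAKE-style string such as "-j 8". The helper name `jobs_from_parallel_make` and its `default` parameter are hypothetical and used only for illustration; this is not the function from oe.utils, which should be used in the actual patch.

```python
def jobs_from_parallel_make(value, default=1):
    """Return the integer that follows -j in a PARALLEL_MAKE-style string.

    Hypothetical helper for illustration only; not the oe.utils code.
    Accepts both "-j 8" and "-j8" and ignores other options such as "-l".
    """
    tokens = (value or "").split()
    while tokens:
        opt = tokens.pop(0)
        if opt == "-j" and tokens:
            # Form "-j 8": the count is the next token.
            return int(tokens.pop(0))
        if opt.startswith("-j") and len(opt) > 2:
            # Form "-j8": the count is glued to the option.
            return int(opt[2:])
    # No -j option found: fall back to the caller's default.
    return default
```

The result could then be passed as `num_threads` instead of the value of BB_NUMBER_THREADS.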

Cheers,

Richard
Joshua Watt May 12, 2026, 10:27 p.m. UTC | #2
On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
lists.openembedded.org
<jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
>
> Add support for optional zstd compression for all types of SBOMs,
> including:
>   - image SBOM
>   - recipe SBOM
>   - SDK SBOM
>
> Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
>
> Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
> Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
> ---
>  meta/classes/create-spdx-3.0.bbclass |  3 ++-
>  meta/lib/oe/sbom30.py                | 11 +++++++++--
>  2 files changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
> index 785edb9865..6cf8fa4688 100644
> --- a/meta/classes/create-spdx-3.0.bbclass
> +++ b/meta/classes/create-spdx-3.0.bbclass
> @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
>              SPDX 3 spec. Optional but recommended"
>
>  SPDX_SBOM_EXT ??= ".spdx.json"
> -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
> +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
> +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
>
>  # Agents
>  #   Bitbake variables can be used to describe an SPDX Agent that may be used
> diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
> index 0f1f9281ad..2184c1a07f 100644
> --- a/meta/lib/oe/sbom30.py
> +++ b/meta/lib/oe/sbom30.py
> @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
>          serializer = oe.spdx30.JSONLDInlineSerializer()
>
>      objset.objects.add(objset.doc)
> -    with dest.open("wb") as f:
> -        serializer.write(objset, f, force_at_graph=True)
> +
> +    if dest.name.endswith(".zst"):

I'm not sure I like this detection mechanism; I think we usually do
something more explicit for compression rather than relying on the
suffix in other places?

> +        num_threads = int(d.getVar("BB_NUMBER_THREADS"))

The API is oe.utils.parallel_make_argument()

> +        with bb.compress.zstd.open(dest, "w", num_threads=num_threads) as f:
> +            serializer.write(objset, f, force_at_graph=True)
> +    else:
> +        with dest.open("wb") as f:
> +            serializer.write(objset, f, force_at_graph=True)
> +
>      objset.objects.remove(objset.doc)
>
>
>
> --
> 2.54.0
>
>
>
Joshua Watt May 12, 2026, 10:29 p.m. UTC | #3
On Tue, May 12, 2026 at 4:27 PM Joshua Watt <jpewhacker@gmail.com> wrote:
>
> On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
> lists.openembedded.org
> <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
> >
> > Add support for optional zstd compression for all types of SBOMs,
> > including:
> >   - image SBOM
> >   - recipe SBOM
> >   - SDK SBOM

We should perhaps also implement decompression when reading in
documents, so that the intermediate documents are compressed as well;
if we are allowing the final documents to be compressed, I don't see a
compelling reason why we wouldn't just compress all of them.
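
One possible way to read documents that may or may not be compressed, without relying on the file suffix, is to check for the Zstandard frame magic number (0xFD2FB528, stored little-endian) at the start of the file. The sketch below is only an illustration of that check; the name `is_zstd_frame` is hypothetical and nothing like it is part of the patch.

```python
# Magic number of a standard Zstandard frame, as it appears on disk.
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"


def is_zstd_frame(head):
    """Return True if the given leading bytes start a Zstandard frame.

    Hypothetical helper: 'head' is the first few bytes read from a file
    opened in binary mode. Skippable frames are not considered here.
    """
    return head[:4] == ZSTD_MAGIC
```

A reader could call this on the first four bytes and then choose between the plain and the decompressing open path.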

> >
> > Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
> >
> > Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
> > Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
> > ---
> >  meta/classes/create-spdx-3.0.bbclass |  3 ++-
> >  meta/lib/oe/sbom30.py                | 11 +++++++++--
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
> > index 785edb9865..6cf8fa4688 100644
> > --- a/meta/classes/create-spdx-3.0.bbclass
> > +++ b/meta/classes/create-spdx-3.0.bbclass
> > @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
> >              SPDX 3 spec. Optional but recommended"
> >
> >  SPDX_SBOM_EXT ??= ".spdx.json"
> > -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
> > +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
> > +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
> >
> >  # Agents
> >  #   Bitbake variables can be used to describe an SPDX Agent that may be used
> > diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
> > index 0f1f9281ad..2184c1a07f 100644
> > --- a/meta/lib/oe/sbom30.py
> > +++ b/meta/lib/oe/sbom30.py
> > @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
> >          serializer = oe.spdx30.JSONLDInlineSerializer()
> >
> >      objset.objects.add(objset.doc)
> > -    with dest.open("wb") as f:
> > -        serializer.write(objset, f, force_at_graph=True)
> > +
> > +    if dest.name.endswith(".zst"):
>
> I'm not sure I like this detection mechanism; I think we usually do
> something more explicit for compression rather than relying on the
> suffix in other places?
>
> > +        num_threads = int(d.getVar("BB_NUMBER_THREADS"))
>
> The API is oe.utils.parallel_make_argument()
>
> > +        with bb.compress.zstd.open(dest, "w", num_threads=num_threads) as f:
> > +            serializer.write(objset, f, force_at_graph=True)
> > +    else:
> > +        with dest.open("wb") as f:
> > +            serializer.write(objset, f, force_at_graph=True)
> > +
> >      objset.objects.remove(objset.doc)
> >
> >
> >
> > --
> > 2.54.0
> >
> >
> >
Benjamin Robin May 13, 2026, 7:07 a.m. UTC | #4
Hello Joshua,

On Wednesday, May 13, 2026 at 12:29 AM, Joshua Watt wrote:
> On Tue, May 12, 2026 at 4:27 PM Joshua Watt <jpewhacker@gmail.com> wrote:
> >
> > On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
> > lists.openembedded.org
> > <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
> > >
> > > Add support for optional zstd compression for all types of SBOMs,
> > > including:
> > >   - image SBOM
> > >   - recipe SBOM
> > >   - SDK SBOM
> 
> We should perhaps also implement decompression when reading in
> documents, so that the intermediate documents are compressed as well;
> if we are allowing the final documents to be compressed, I don't see a
> compelling reason why we wouldn't just compress all of them.

I am not sure this is a good idea performance-wise, mainly because
Yocto is currently relying on an external program to compress and
decompress. We need to wait for Python 3.14 to be the minimum required
Python version to be able to use the native implementation of zstd.
Besides, intermediate documents are pretty "small".

Also with SPDX2, intermediate documents were not compressed.

The goal is not to reduce the size of the build directory, but only
the size of deployed artifacts.
Benjamin Robin May 13, 2026, 7:18 a.m. UTC | #5
On Wednesday, May 13, 2026 at 12:27 AM, Joshua Watt wrote:
> On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
> lists.openembedded.org
> <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
> >
> > Add support for optional zstd compression for all types of SBOMs,
> > including:
> >   - image SBOM
> >   - recipe SBOM
> >   - SDK SBOM
> >
> > Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
> >
> > Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
> > Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
> > ---
> >  meta/classes/create-spdx-3.0.bbclass |  3 ++-
> >  meta/lib/oe/sbom30.py                | 11 +++++++++--
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
> > index 785edb9865..6cf8fa4688 100644
> > --- a/meta/classes/create-spdx-3.0.bbclass
> > +++ b/meta/classes/create-spdx-3.0.bbclass
> > @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
> >              SPDX 3 spec. Optional but recommended"
> >
> >  SPDX_SBOM_EXT ??= ".spdx.json"
> > -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
> > +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
> > +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
> >
> >  # Agents
> >  #   Bitbake variables can be used to describe an SPDX Agent that may be used
> > diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
> > index 0f1f9281ad..2184c1a07f 100644
> > --- a/meta/lib/oe/sbom30.py
> > +++ b/meta/lib/oe/sbom30.py
> > @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
> >          serializer = oe.spdx30.JSONLDInlineSerializer()
> >
> >      objset.objects.add(objset.doc)
> > -    with dest.open("wb") as f:
> > -        serializer.write(objset, f, force_at_graph=True)
> > +
> > +    if dest.name.endswith(".zst"):
> 
> I'm not sure I like this detection mechanism; I think we usually do
> something more explicit for compression rather than relying on the
> suffix in other places?

Do you have an example somewhere in the code base?
I am not opposed to using a variable like `SPDX_COMPRESSED_SBOM`
and having the following code "duplicated" (or creating a function for it):

sbom_file_extension = ".spdx.json.zst" if compressed_sbom else ".spdx.json"

The goal was to simplify the code and to give the user flexibility.
The user could choose any other extension (even if it violates the
ISO standard extension for SPDX documents).

> 
> > +        num_threads = int(d.getVar("BB_NUMBER_THREADS"))
> 
> The API is oe.utils.parallel_make_argument()

Thanks, but for information, every existing caller of
`bb.compress.zstd.open` uses the BB_NUMBER_THREADS variable :)

So maybe this should be fixed by another patch?

> 
> > +        with bb.compress.zstd.open(dest, "w", num_threads=num_threads) as f:
> > +            serializer.write(objset, f, force_at_graph=True)
> > +    else:
> > +        with dest.open("wb") as f:
> > +            serializer.write(objset, f, force_at_graph=True)
> > +
> >      objset.objects.remove(objset.doc)
> >
> >
> >
> > --
> > 2.54.0
> >
> >
> >
>
Jérémie Dautheribes May 13, 2026, 7:35 a.m. UTC | #6
Hello Joshua, Benjamin,

On 13/05/2026 09:07, Benjamin Robin wrote:
> Hello Joshua,
> 
> On Wednesday, May 13, 2026 at 12:29 AM, Joshua Watt wrote:
>> On Tue, May 12, 2026 at 4:27 PM Joshua Watt <jpewhacker@gmail.com> wrote:
>>>
>>> On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
>>> lists.openembedded.org
>>> <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
>>>>
>>>> Add support for optional zstd compression for all types of SBOMs,
>>>> including:
>>>>    - image SBOM
>>>>    - recipe SBOM
>>>>    - SDK SBOM
>>
>> We should perhaps also implement decompression when reading in
>> documents, so that the intermediate documents are compressed as well;
>> if we are allowing the final documents to be compressed, I don't see a
>> compelling reason why we wouldn't just compress all of them.
> 
> I am not sure this is a good idea performance-wise, mainly because
> Yocto is currently relying on an external program to compress and
> decompress. We need to wait for Python 3.14 to be the minimum required
> Python version to be able to use the native implementation of zstd.
> Indeed intermediate documents are pretty "small".
> 
> Also with SPDX2, intermediate documents were not compressed.
> 
> The goal is not to reduce the size of the build directory, but only
> the size of deployed artifacts.

In addition to what Benjamin already explained, our typical use case is
storing the deployed SBOMs in an external location (typically a cloud
provider), and we encountered some cases where the uncompressed image
SBOM size is ~180 MB.

We could compress them outside of Yocto of course, but we thought it 
would be great to have this feature directly in Yocto, especially since 
it was already supported in the SPDX 2.2 implementation.

Best regards,
Jérémie Dautheribes May 13, 2026, 7:47 a.m. UTC | #7
Hello Joshua,

On 13/05/2026 00:27, Joshua Watt wrote:
> On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
> lists.openembedded.org
> <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
>>
>> Add support for optional zstd compression for all types of SBOMs,
>> including:
>>    - image SBOM
>>    - recipe SBOM
>>    - SDK SBOM
>>
>> Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
>>
>> Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
>> Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
>> ---
>>   meta/classes/create-spdx-3.0.bbclass |  3 ++-
>>   meta/lib/oe/sbom30.py                | 11 +++++++++--
>>   2 files changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
>> index 785edb9865..6cf8fa4688 100644
>> --- a/meta/classes/create-spdx-3.0.bbclass
>> +++ b/meta/classes/create-spdx-3.0.bbclass
>> @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
>>               SPDX 3 spec. Optional but recommended"
>>
>>   SPDX_SBOM_EXT ??= ".spdx.json"
>> -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
>> +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
>> +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
>>
>>   # Agents
>>   #   Bitbake variables can be used to describe an SPDX Agent that may be used
>> diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
>> index 0f1f9281ad..2184c1a07f 100644
>> --- a/meta/lib/oe/sbom30.py
>> +++ b/meta/lib/oe/sbom30.py
>> @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
>>           serializer = oe.spdx30.JSONLDInlineSerializer()
>>
>>       objset.objects.add(objset.doc)
>> -    with dest.open("wb") as f:
>> -        serializer.write(objset, f, force_at_graph=True)
>> +
>> +    if dest.name.endswith(".zst"):
> 
> I'm not sure I like this detection mechanism; I think we usually do
> something more explicit for compression rather than relying on the
> suffix in other places?

Maybe we should then introduce an SPDX_COMPRESSED_SBOM boolean variable,
which would be used by SPDX_SBOM_EXT_SUFFIX to determine whether ".zst"
is appended to the SBOM file name or not. Then, we could check in the
`write_jsonld_doc` function whether compression is enabled based on this
SPDX_COMPRESSED_SBOM variable.

What do you think? Do you have any other suggestions?

Best regards,
Peter Kjellerstedt May 13, 2026, 8:02 a.m. UTC | #8
> -----Original Message-----
> From: openembedded-core@lists.openembedded.org <openembedded-core@lists.openembedded.org> On Behalf Of Jérémie Dautheribes via lists.openembedded.org
> Sent: den 13 maj 2026 09:47
> To: Joshua Watt <jpewhacker@gmail.com>
> Cc: openembedded-core@lists.openembedded.org; miquel.raynal@bootlin.com; thomas.petazzoni@bootlin.com; benjamin.robin@bootlin.com
> Subject: Re: [OE-core][PATCH 2/2] spdx3: support SBOM compression based on SPDX_SBOM_EXT
> 
> Hello Joshua,
> 
> On 13/05/2026 00:27, Joshua Watt wrote:
> > On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via lists.openembedded.org <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
> >>
> >> Add support for optional zstd compression for all types of SBOMs,
> >> including:
> >>    - image SBOM
> >>    - recipe SBOM
> >>    - SDK SBOM
> >>
> >> Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
> >>
> >> Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
> >> Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
> >> ---
> >>   meta/classes/create-spdx-3.0.bbclass |  3 ++-
> >>   meta/lib/oe/sbom30.py                | 11 +++++++++--
> >>   2 files changed, 11 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
> >> index 785edb9865..6cf8fa4688 100644
> >> --- a/meta/classes/create-spdx-3.0.bbclass
> >> +++ b/meta/classes/create-spdx-3.0.bbclass
> >> @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
> >>               SPDX 3 spec. Optional but recommended"
> >>
> >>   SPDX_SBOM_EXT ??= ".spdx.json"
> >> -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
> >> +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
> >> +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
> >>
> >>   # Agents
> >>   #   Bitbake variables can be used to describe an SPDX Agent that may be used
> >> diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
> >> index 0f1f9281ad..2184c1a07f 100644
> >> --- a/meta/lib/oe/sbom30.py
> >> +++ b/meta/lib/oe/sbom30.py
> >> @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
> >>           serializer = oe.spdx30.JSONLDInlineSerializer()
> >>
> >>       objset.objects.add(objset.doc)
> >> -    with dest.open("wb") as f:
> >> -        serializer.write(objset, f, force_at_graph=True)
> >> +
> >> +    if dest.name.endswith(".zst"):
> >
> > I'm not sure I like this detection mechanism; I think we usually do
> > something more explicit for compression rather than relying on the
> > suffix in other places?
> 
> Maybe we should then introduce a SPDX_COMPRESSED_SBOM boolean variable,
> which would be used by SPDX_SBOM_EXT_SUFFIX to determine whether ".zst"
> is appended to the SBOM file name or not. Then, we could check in the
> `write_jsonld_doc` function whether compression is enabled based on this
> SPDX_COMPRESSED_SBOM variable.
> 
> What do you think? Do you have any other suggestions?

If you use something like:

SPDX_COMPRESSION = "zstd"

then you make it more future proof if someone wants to add support for 
some other compression format.
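
A rough sketch of how such a named setting could drive the file name, assuming a lookup table from compression name to suffix. `SUFFIX_BY_COMPRESSION` and `sbom_file_name` are hypothetical names for illustration only; the writer would then decide whether to compress from the setting itself rather than from the suffix.

```python
# Hypothetical mapping from an SPDX_COMPRESSION-style value to the suffix
# appended to the SBOM file name. Supporting another format later would
# mean adding one more entry here.
SUFFIX_BY_COMPRESSION = {
    "": "",          # no compression
    "zstd": ".zst",
}


def sbom_file_name(stem, compression="", ext=".spdx.json"):
    """Build the SBOM file name for the requested compression name."""
    try:
        suffix = SUFFIX_BY_COMPRESSION[compression]
    except KeyError:
        raise ValueError("unsupported compression: %r" % compression)
    return stem + ext + suffix
```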

> 
> Best regards,
> --
> Jérémie Dautheribes, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

//Peter
Jérémie Dautheribes May 13, 2026, 8:03 a.m. UTC | #9
On 13/05/2026 09:47, Jérémie Dautheribes via lists.openembedded.org wrote:
> Hello Joshua,
> 
> On 13/05/2026 00:27, Joshua Watt wrote:
>> On Tue, May 12, 2026 at 11:02 AM Jérémie Dautheribes via
>> lists.openembedded.org
>> <jeremie.dautheribes=bootlin.com@lists.openembedded.org> wrote:
>>>
>>> Add support for optional zstd compression for all types of SBOMs,
>>> including:
>>>    - image SBOM
>>>    - recipe SBOM
>>>    - SDK SBOM
>>>
>>> Zstd compression is applied if SPDX_SBOM_EXT ends with ".zst".
>>>
>>> Co-authored-by: Benjamin Robin (Schneider Electric) <benjamin.robin@bootlin.com>
>>> Signed-off-by: Jérémie Dautheribes (Schneider Electric) <jeremie.dautheribes@bootlin.com>
>>> ---
>>>   meta/classes/create-spdx-3.0.bbclass |  3 ++-
>>>   meta/lib/oe/sbom30.py                | 11 +++++++++--
>>>   2 files changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
>>> index 785edb9865..6cf8fa4688 100644
>>> --- a/meta/classes/create-spdx-3.0.bbclass
>>> +++ b/meta/classes/create-spdx-3.0.bbclass
>>> @@ -75,7 +75,8 @@ SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
>>>               SPDX 3 spec. Optional but recommended"
>>>
>>>   SPDX_SBOM_EXT ??= ".spdx.json"
>>> -SPDX_SBOM_EXT[doc] = "SBOM file extension name."
>>> +SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
>>> +    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
>>>
>>>   # Agents
>>>   #   Bitbake variables can be used to describe an SPDX Agent that may be used
>>> diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
>>> index 0f1f9281ad..2184c1a07f 100644
>>> --- a/meta/lib/oe/sbom30.py
>>> +++ b/meta/lib/oe/sbom30.py
>>> @@ -1036,8 +1036,15 @@ def write_jsonld_doc(d, objset, dest):
>>>           serializer = oe.spdx30.JSONLDInlineSerializer()
>>>
>>>       objset.objects.add(objset.doc)
>>> -    with dest.open("wb") as f:
>>> -        serializer.write(objset, f, force_at_graph=True)
>>> +
>>> +    if dest.name.endswith(".zst"):
>>
>> I'm not sure I like this detection mechanism; I think we usually do
>> something more explicit for compression rather than relying on the
>> suffix in other places?
> 
> Maybe we should then introduce a SPDX_COMPRESSED_SBOM boolean variable,
> which would be used by SPDX_SBOM_EXT_SUFFIX to determine whether ".zst"
> is appended to the SBOM file name or not. Then, we could check in the
> `write_jsonld_doc` function whether compression is enabled based on this
> SPDX_COMPRESSED_SBOM variable.
> 

After further thought, that solution would not work well since
`write_jsonld_doc` is not only used for SBOM generation.

Patch

diff --git a/meta/classes/create-spdx-3.0.bbclass b/meta/classes/create-spdx-3.0.bbclass
index 785edb9865..6cf8fa4688 100644
--- a/meta/classes/create-spdx-3.0.bbclass
+++ b/meta/classes/create-spdx-3.0.bbclass
@@ -75,7 +75,8 @@  SPDX_IMPORTS[doc] = "SPDX_IMPORTS is the base variable that describes how to \
             SPDX 3 spec. Optional but recommended"
 
 SPDX_SBOM_EXT ??= ".spdx.json"
-SPDX_SBOM_EXT[doc] = "SBOM file extension name."
+SPDX_SBOM_EXT[doc] = "SBOM file extension name.\
+    If it ends with '.zst', SBOMs are automatically compressed using Zstd."
 
 # Agents
 #   Bitbake variables can be used to describe an SPDX Agent that may be used
diff --git a/meta/lib/oe/sbom30.py b/meta/lib/oe/sbom30.py
index 0f1f9281ad..2184c1a07f 100644
--- a/meta/lib/oe/sbom30.py
+++ b/meta/lib/oe/sbom30.py
@@ -1036,8 +1036,15 @@  def write_jsonld_doc(d, objset, dest):
         serializer = oe.spdx30.JSONLDInlineSerializer()
 
     objset.objects.add(objset.doc)
-    with dest.open("wb") as f:
-        serializer.write(objset, f, force_at_graph=True)
+
+    if dest.name.endswith(".zst"):
+        num_threads = int(d.getVar("BB_NUMBER_THREADS"))
+        with bb.compress.zstd.open(dest, "w", num_threads=num_threads) as f:
+            serializer.write(objset, f, force_at_graph=True)
+    else:
+        with dest.open("wb") as f:
+            serializer.write(objset, f, force_at_graph=True)
+
     objset.objects.remove(objset.doc)