| Message ID | 20251110171337.754568-1-fbberton@gmail.com |
|---|---|
| Series | spdx: Add software file externalRef support |
Hello Fabio,

thanks for your comments and patch!

On Mon, 2025-11-10 at 17:13 +0000, Fabio Berton wrote:
> Our first idea was to use 'downloadLocation', but what I understand is
> that this is a package property, and files fetched from the layer are
> 'software_File' type. Looking at the SPDX spec, it appears we could use
> the 'ExternalRef' for this purpose.

I'm not too familiar with the SPDX spec yet, but adding individual file entries as `ExternalRef` instead of `downloadLocation` to a recipe's SPDX sounds reasonable.

I think in the long term it would make sense to add a `SPDXRef-Layer-xyz` entry per layer with a `downloadLocation` pointing to the subpath of the layer inside a git repo. I'm not quite sure whether it would be possible to express a dependency on a file contained within a different SPDXRef, e.g.
```
SPDXRef-Layer-xyz:recipes-core/base-files/base-files/fstab
```
or whether we'd have to create an SPDXRef item for each file within a layer in order to reference it properly. That would make it even more verbose.

The approach of having a layer as an independent SPDXRef would mean that getting the git revision etc. for that layer would run only once per build and not once per `file://` entry in SRC_URI.

> The idea is to have two options to add this information: one to add the
> full path of a file, and another to add the git information

IMO the full path to the file is unneeded information; if the file is solely available locally, `NOASSERTION` would be appropriate.

> Should I add a variable like 'SPDX_FILE_LOCATION_GIT_REMOTE_<layername>
> = "remote_name"' to set a specific remote for each layer? Would setting
> the git remote be sufficient to cover most cases?

In my experiments I removed the per-layer setting again because tracking the `vardeps` for `do_create_spdx` gets more complicated with per-layer variables.

Sincerely
Daniel Wagenknecht
On 11/12/25 16:59, Daniel Wagenknecht wrote:
> Hello Fabio,
>
> thanks for your comments and patch!
>
> On Mon, 2025-11-10 at 17:13 +0000, Fabio Berton wrote:
>> Our first idea was to use 'downloadLocation', but what I understand is
>> that this is a package property, and files fetched from the layer are
>> 'software_File' type. Looking at the SPDX spec, it appears we could use
>> the 'ExternalRef' for this purpose.
>
> I'm not too familiar with the SPDX spec yet, but adding individual file
> entries as `ExternalRef` instead of `downloadLocation` to a recipe's
> SPDX sounds reasonable.
>
> I think in the long term adding a `SPDXRef-Layer-xyz` entry per layer
> with a `downloadLocation` pointing to the subpath of the layer inside a
> git repo. I'm not quite sure if it would be possible to formulate a
> dependency on a file contained within a different SPDXRef, e.g.
> ```
> SPDXRef-Layer-xyz:recipes-core/base-files/base-files/fstab
> ```
> or if we'd have to create a SPDXRef Item for each file within a layer
> in order to reference it properly. That would make it even more
> verbose.

Hi Daniel,

Yes, we should have a way to get the Git information at parse time to avoid querying Git for every `file://` entry. But I don't know exactly how to do this: if it requires adding a variable to every layer, which of course we can't do, we would still need a fallback for layers where the variable doesn't exist.

In my case, we don't use OE-Core from https://git.openembedded.org/. We keep all layers in an internal infrastructure, so we would need to change all variables to point to our forks. My idea in using the functions from 'oe.buildcfg' is to get the Git information from the layer itself rather than from variables, so it doesn't matter whether it's a fork or not. But I didn't cover the case where different remotes are used. I know that when using `repo` to manage Git repositories, it's common to use different remotes, e.g., https://github.com/Freescale/fsl-community-bsp-platform/blob/scarthgap/default.xml.

Honestly, I don't know whether adding a variable to set the "downloadLocation" would be better or not.

> The approach of having a layer as an independent SPDXRef would mean
> getting the git revision etc. for that layer would run only once per
> build and not per `file://` entry in SRC_URI.
>>
>> The idea is to have two options to add this information: one to add the
>> full path of a file, and another to add the git information
>
> IMO the full path to the file is unneeded information, if the file is
> solely available locally a `NOASSERTION` would be appropriate.

The 'path' option is for when the Git information is not used, e.g., when using a tarball and not a Git repo. The 'locator' will then be '/home/user/src/openembedded-core/meta/recipes-core/busybox/files/syslog' instead of 'git+https://git.openembedded.org/openembedded-core@ac5d9579a0db63b54bbebb5015de2ae860a462bf#meta/recipes-core/busybox/files/syslog'.

>> Should I add a variable like 'SPDX_FILE_LOCATION_GIT_REMOTE_<layername>
>> = "remote_name"' to set a specific remote for each layer? Would setting
>> the git remote be sufficient to cover most cases?
>
> In my experimentation I removed the per-layer setting again because
> tracking the `vardeps` for the `do_create_spdx` gets more complicated
> with per-layer variables.

Hmm, good point, I didn't think about `vardeps`.

> Sincerely
> Daniel Wagenknecht
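To illustrate the concern about querying Git once per layer rather than once per `file://` entry, here is a minimal sketch of a per-layer cache. It is not the patch's implementation: it calls plain `git` through `subprocess` as a stand-in for the 'oe.buildcfg' helpers, and the names `LayerGitCache`, `LayerGitInfo`, and `_run_git` are hypothetical. A layer without Git metadata (e.g., unpacked from a tarball) yields `None`, which a caller could map to the 'path' option or to `NOASSERTION`.

```python
import subprocess
from collections import namedtuple

# Hypothetical record of what a 'git' locator needs for one layer.
LayerGitInfo = namedtuple("LayerGitInfo", ["toplevel", "commit", "remote_url"])


def _run_git(layer_path, *args):
    """Run one git command inside layer_path and return its stripped stdout."""
    return subprocess.check_output(
        ["git", "-C", layer_path, *args],
        stderr=subprocess.DEVNULL,
        text=True,
    ).strip()


class LayerGitCache:
    """Query Git at most once per (layer, remote) pair and remember the result."""

    def __init__(self, run=_run_git):
        self._run = run      # injectable so the cache can be tested without git
        self._cache = {}

    def info(self, layer_path, remote="origin"):
        key = (layer_path, remote)
        if key not in self._cache:
            try:
                self._cache[key] = LayerGitInfo(
                    self._run(layer_path, "rev-parse", "--show-toplevel"),
                    self._run(layer_path, "rev-parse", "HEAD"),
                    self._run(layer_path, "remote", "get-url", remote),
                )
            except (subprocess.CalledProcessError, OSError):
                # Not a Git checkout, unknown remote, or git not installed:
                # cache the failure too so it is not retried per file.
                self._cache[key] = None
        return self._cache[key]
```

With a cache like this, every `file://` entry from the same layer reuses the first lookup. It does not address the `vardeps` question raised above, since the cached values still come from outside the datastore.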
Hi all,

When we started testing SPDX 3.0 in our projects, we noticed that we need more information for files fetched via the 'file://' protocol, such as the full path of the file or a URL with git information.

Our first idea was to use 'downloadLocation', but what I understand is that this is a package property, and files fetched from the layer are 'software_File' type. Looking at the SPDX spec, it appears we could use the 'ExternalRef' for this purpose.

The idea is to have two options to add this information: one to add the full path of a file, and another to add the git information 'git+https://host/repo@commit#path/to/file'.

The information is added as an 'externalRef' and can be configured using these types:

https://spdx.github.io/spdx-spec/v3.0.1/model/Core/Vocabularies/ExternalRefType/

When using the 'path' option, something like this is added:

```
"externalRef": [
  {
    "type": "ExternalRef",
    "externalRefType": "sourceArtifact",
    "locator": [
      "/home/user/src/openembedded-core/meta/recipes-core/busybox/files/syslog"
    ]
  }
],
```

This option is not reproducible: if the build path changes, the SPDX output will differ.

And with the 'git' option:

```
"externalRef": [
  {
    "type": "ExternalRef",
    "externalRefType": "sourceArtifact",
    "locator": [
      "git+https://git.openembedded.org/openembedded-core@ac5d9579a0db63b54bbebb5015de2ae860a462bf#meta/recipes-core/busybox/files/syslog"
    ]
  }
],
```

The implementation is not completely finished, but since there is already a thread on this subject, https://lists.openembedded.org/g/openembedded-core/topic/thoughts_on_spdx_for_files/116135395, I wanted to share my work and get opinions on how to improve this implementation.

My questions are:

Is the 'externalRef' the right way to add the information in the SPDX file?

I'm using the 'choices' type, but this only works when inheriting typecheck.bbclass, and this bbclass is not inherited when using OE-Core with 'nodistro'. Can this 'choices' type be used here?

I still need to find a way to cache the Git layer information to avoid calling the 'oe.buildcfg' functions every time. Maybe it would be possible to use something like https://git.openembedded.org/openembedded-core/tree/meta/classes/metadata_scm.bbclass to get the information at parse time. However, this information is only needed when SPDX_FILE_LOCATION is set to the git option, and then it is needed for all layers. Any ideas here?

For the git option, we need to get a git remote, but there can be more than one remote per layer, so we need a way to configure these remotes. In this first implementation, I'm assuming that all layers use the same remote name, which can be configured; this fits our current use case. Should I add a variable like 'SPDX_FILE_LOCATION_GIT_REMOTE_<layername> = "remote_name"' to set a specific remote for each layer? Would setting the git remote be sufficient to cover most cases?

Any feedback or suggestions would be appreciated.

Best regards,
Fabio

Fabio Berton (1):
  spdx: Add software file externalRef support

 meta/classes/create-spdx-3.0.bbclass | 24 +++++++
 meta/lib/oe/sbom30.py                | 14 ++++-
 meta/lib/oe/spdx30_tasks.py          | 93 ++++++++++++++++++++++++++++
 3 files changed, 130 insertions(+), 1 deletion(-)
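As an illustration of the 'git' option's locator format 'git+https://host/repo@commit#path/to/file', the following sketch assembles such a string from a remote URL, a commit hash, the repository's top-level directory, and the absolute path of the file. It is not the code from the patch; the function name `build_git_locator` and its parameters are hypothetical, and it assumes the remote URL is already in the form that should appear in the SPDX output.

```python
import os


def build_git_locator(remote_url, commit, repo_root, file_path):
    """Return 'git+<remote_url>@<commit>#<file path relative to repo_root>'.

    Hypothetical helper: raises ValueError when file_path is outside repo_root,
    since no meaningful in-repo path exists in that case.
    """
    rel = os.path.relpath(file_path, repo_root)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError(f"{file_path} is not inside {repo_root}")
    # Locators use forward slashes regardless of the host path separator.
    rel = rel.replace(os.sep, "/")
    return f"git+{remote_url}@{commit}#{rel}"


locator = build_git_locator(
    "https://git.openembedded.org/openembedded-core",
    "ac5d9579a0db63b54bbebb5015de2ae860a462bf",
    "/home/user/src/openembedded-core",
    "/home/user/src/openembedded-core/meta/recipes-core/busybox/files/syslog",
)
print(locator)
```

Because the result depends only on the remote, the commit, and the in-repo path, it does not change when the build directory moves, unlike the 'path' option.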