[RFC,08/21] utils: add Go mod h1 checksum support

Message ID 20241220112613.22647-9-stefan.herbrechtsmeier-oss@weidmueller.com
State Accepted, archived
Commit deefb01592f717efba68e3997fefd04dc7611d88
Series Concept for tightly coupled package manager (Node.js, Go, Rust)

Commit Message

Stefan Herbrechtsmeier Dec. 20, 2024, 11:25 a.m. UTC
From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>

Add support for the Go mod h1 hash. The hash is
based on the Go dirhash package. The package
defines hashes over directory trees and is used
for Go mod files and zip archives.

Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
---

 lib/bb/fetch2/__init__.py |  2 +-
 lib/bb/utils.py           | 25 +++++++++++++++++++++++++
 2 files changed, 26 insertions(+), 1 deletion(-)

Comments

Richard Purdie Dec. 23, 2024, 10:01 a.m. UTC | #1
On Fri, 2024-12-20 at 12:25 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote:
> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> 
> Add support for the Go mod h1 hash. The hash is
> based on the Go dirhash package. The package
> defines hashes over directory trees and is used
> for Go mod files and zip archives.
> 
> Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
> ---
> 
>  lib/bb/fetch2/__init__.py |  2 +-
>  lib/bb/utils.py           | 25 +++++++++++++++++++++++++
>  2 files changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
> index 7d8f71b20..0c2d6d73e 100644
> --- a/lib/bb/fetch2/__init__.py
> +++ b/lib/bb/fetch2/__init__.py
> @@ -34,7 +34,7 @@ _revisions_cache = bb.checksum.RevisionsCache()
>  
>  logger = logging.getLogger("BitBake.Fetcher")
>  
> -CHECKSUM_LIST = [ "md5", "sha256", "sha1", "sha384", "sha512" ]
> +CHECKSUM_LIST = [ "h1", "md5", "sha256", "sha1", "sha384", "sha512" ]
>  SHOWN_CHECKSUM_LIST = ["sha256"]
>  
>  class BBFetchException(Exception):
> 

The others are all generic checksum formats, so I'm wondering if we need
to indicate that this one is Go-specific, e.g. go-h1, goh1 or go_h1?

Cheers,

Richard
Stefan Herbrechtsmeier Jan. 2, 2025, 8:27 a.m. UTC | #2
On 23.12.2024 at 11:01, Richard Purdie wrote:
> On Fri, 2024-12-20 at 12:25 +0100, Stefan Herbrechtsmeier via lists.openembedded.org wrote:
>> From: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>>
>> Add support for the Go mod h1 hash. The hash is
>> based on the Go dirhash package. The package
>> defines hashes over directory trees and is used
>> for Go mod files and zip archives.
>>
>> Signed-off-by: Stefan Herbrechtsmeier <stefan.herbrechtsmeier@weidmueller.com>
>> ---
>>
>>   lib/bb/fetch2/__init__.py |  2 +-
>>   lib/bb/utils.py           | 25 +++++++++++++++++++++++++
>>   2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
>> index 7d8f71b20..0c2d6d73e 100644
>> --- a/lib/bb/fetch2/__init__.py
>> +++ b/lib/bb/fetch2/__init__.py
>> @@ -34,7 +34,7 @@ _revisions_cache = bb.checksum.RevisionsCache()
>>   
>>   logger = logging.getLogger("BitBake.Fetcher")
>>   
>> -CHECKSUM_LIST = [ "md5", "sha256", "sha1", "sha384", "sha512" ]
>> +CHECKSUM_LIST = [ "h1", "md5", "sha256", "sha1", "sha384", "sha512" ]
>>   SHOWN_CHECKSUM_LIST = ["sha256"]
>>   
>>   class BBFetchException(Exception):
>>
> The others are all generic checksum formats so I'm wondering if we need
> to indicate this one is go specific go-h1/goh1/go_h1?

It looks like the Go dirhash (h1) [1] differs from the dirhash
standard [2], so the name should be Go-specific. The SRC_URI parameters
use compound words without a separator, which would give goh1sum. Which
do you prefer:

SRC_URI[package_name-1.2.3.go-h1sum]
SRC_URI[package_name-1.2.3.goh1sum]
SRC_URI[package_name-1.2.3.go_h1sum]

[1] https://pkg.go.dev/golang.org/x/mod/sumdb/dirhash
[2] https://github.com/andhus/dirhash
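
For readers comparing the two representations: the patch's h1_file() returns a hex digest, while go.sum records the same SHA-256 value as "h1:" followed by standard base64. The following is a minimal sketch of that relationship, not part of the patch. The helper names h1_hex, hex_to_gosum and gosum_to_hex are hypothetical, and h1_hex works on an in-memory mapping of file names to bytes rather than on a zip archive.

```python
import base64
import hashlib


def h1_hex(files):
    """Hex SHA-256 over the sorted "<sha256>  <name>\\n" summary lines.

    `files` maps file names to bytes. The summary format mirrors the one
    built by the patch's h1_file(), but the input here is in-memory data
    (an assumption made to keep the sketch self-contained).
    """
    lines = []
    for name in sorted(files):
        digest = hashlib.sha256(files[name]).hexdigest()
        lines.append("%s  %s\n" % (digest, name))
    return hashlib.sha256("".join(lines).encode("utf-8")).hexdigest()


def hex_to_gosum(hex_digest):
    """Render a hex digest in go.sum style: "h1:" + base64 of the raw bytes."""
    raw = bytes.fromhex(hex_digest)
    return "h1:" + base64.b64encode(raw).decode("ascii")


def gosum_to_hex(entry):
    """Inverse of hex_to_gosum(); only the "h1:" prefix is accepted."""
    if not entry.startswith("h1:"):
        raise ValueError("unsupported hash prefix: %r" % entry)
    return base64.b64decode(entry[3:]).hex()


if __name__ == "__main__":
    # Hypothetical module content, used only to exercise the helpers.
    files = {"example.com/mod@v1.0.0/go.mod": b"module example.com/mod\n"}
    digest = h1_hex(files)
    entry = hex_to_gosum(digest)
    print(digest)
    print(entry)
    assert gosum_to_hex(entry) == digest
```

Whichever parameter name is chosen, a recipe generator would presumably need a conversion like gosum_to_hex() to turn go.sum entries into the hex form that the other checksum helpers in utils.py return.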
Patch

diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
index 7d8f71b20..0c2d6d73e 100644
--- a/lib/bb/fetch2/__init__.py
+++ b/lib/bb/fetch2/__init__.py
@@ -34,7 +34,7 @@  _revisions_cache = bb.checksum.RevisionsCache()
 
 logger = logging.getLogger("BitBake.Fetcher")
 
-CHECKSUM_LIST = [ "md5", "sha256", "sha1", "sha384", "sha512" ]
+CHECKSUM_LIST = [ "h1", "md5", "sha256", "sha1", "sha384", "sha512" ]
 SHOWN_CHECKSUM_LIST = ["sha256"]
 
 class BBFetchException(Exception):
diff --git a/lib/bb/utils.py b/lib/bb/utils.py
index e722f9113..131766e33 100644
--- a/lib/bb/utils.py
+++ b/lib/bb/utils.py
@@ -585,6 +585,31 @@  def sha512_file(filename):
     import hashlib
     return _hasher(hashlib.sha512(), filename)
 
+def h1_file(filename):
+    """
+    Return the hex string representation of the Go mod h1 checksum of the
+    filename. The Go mod h1 checksum uses the Go dirhash package. The package
+    defines hashes over directory trees and is used by go mod for mod files and
+    zip archives.
+    """
+    import hashlib
+    import zipfile
+
+    lines = []
+    if zipfile.is_zipfile(filename):
+        with zipfile.ZipFile(filename) as archive:
+            for fn in sorted(archive.namelist()):
+                method = hashlib.sha256()
+                method.update(archive.read(fn))
+                hash = method.hexdigest()
+                lines.append("%s  %s\n" % (hash, fn))
+    else:
+        hash = _hasher(hashlib.sha256(), filename)
+        lines.append("%s  go.mod\n" % hash)
+    method = hashlib.sha256()
+    method.update("".join(lines).encode('utf-8'))
+    return method.hexdigest()
+
 def preserved_envvars_exported():
     """Variables which are taken from the environment and placed in and exported
     from the metadata"""