Message ID | bhfQ.1727300278439114446.Ff3Z@lists.openembedded.org |
---|---|
State | New |
Headers | show |
Series | URL encoding/decoding bug with files like "serialgetty@.service" or "sshd@.service" with archiver.bbclass #bitbake #kirkstone #poky | expand |
On Wed, 25 Sept 2024 at 23:37, Brennan Coslett via lists.openembedded.org <brennan.coslett=bluerivertech.com@lists.openembedded.org> wrote: > It then can't find that file and so then the archiver doesn't package anything up. > > Hopefully bitbake-devel is the right mailing list for folks with more knowledge of this stuff? Not really. Archiver is implemented in oe-core, and so oe-core list is a better place. It also helps if you can provide specific numbered steps to reproduce the problem on poky master. Alex
I think the bug is in `bb.fetch.decodeurl`, I just noticed it because of the archiver. @Alex if that means it should still move over to the oe-core list, let me know and I can do that. Steps to replicate: 1. Checkout poky master (just replicated with 8634e46b4040b6008410b6d77fecb5cbaec7e90e) 2. source oe-init-build-env 3. append the following to the auto-generated local.conf: + require conf/distro/include/init-manager-systemd.inc + INHERIT += "archiver" + ARCHIVER_MODE[src] = "original" 4. Run "bitbake systemd-serialgetty", the source dir will be created but it wont contain anything: brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ ls tmp/deploy/sources/x86_64-poky-linux/systemd-serialgetty-1.0-r0/ brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ I think something in the regex inside the `decodeurl` function in `bitbake/lib/bb/fetch2/__init__.py` needs to change to stop it from parsing out `<name>@` in `<name>@.service` into the `user` field or maybe it needs additional logic inside that function when its a local file to put the parsed out `user` group back into the `location` group? This appears to fix it but i'm not sure if this is actually valid, if it is I can submit it as an actual patch. I think there should probably be a corresponding addition of a fetch test around these types of files? diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py index ddee4400bb..2ea834416b 100644 --- a/bitbake/lib/bb/fetch2/__init__.py +++ b/bitbake/lib/bb/fetch2/__init__.py @@ -369,6 +369,11 @@ def decodeurl(url): path = location[locidx:] elif type.lower() == 'file': host = "" + # parsing out a user doesn't make sense for + # local files? + if user: + location = "%s@%s" % (user, location) + user = None path = location else: host = location I was able to see the `serialgetty@.service` file in the output folder: brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ bitbake systemd-serialgetty ... NOTE: Tasks Summary: Attempted 969 tasks of which 871 didn't need to be rerun and all succeeded. Summary: There was 1 WARNING message. brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ ls tmp/deploy/sources/x86_64-poky-linux/systemd-serialgetty-1.0-r0/ serial-getty@.service Thanks, Brennan Coslett
On Thu, 26 Sept 2024 at 16:57, Brennan Coslett via lists.openembedded.org <brennan.coslett=bluerivertech.com@lists.openembedded.org> wrote: > > I think the bug is in `bb.fetch.decodeurl`, I just noticed it because of the archiver. > @Alex if that means it should still move over to the oe-core list, let me know and I can do that. Right, let's stay here. > Steps to replicate: > 1. Checkout poky master (just replicated with 8634e46b4040b6008410b6d77fecb5cbaec7e90e) > 2. source oe-init-build-env > 3. append the following to the auto-generated local.conf: > + require conf/distro/include/init-manager-systemd.inc > + INHERIT += "archiver" > + ARCHIVER_MODE[src] = "original" > 4. Run "bitbake systemd-serialgetty", the source dir will be created but it wont contain anything: > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ ls tmp/deploy/sources/x86_64-poky-linux/systemd-serialgetty-1.0-r0/ > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ > > I think something in the regex inside the `decodeurl` function in `bitbake/lib/bb/fetch2/__init__.py` needs to change to stop it from > parsing out `<name>@` in `<name>@.service` into the `user` field or maybe it needs additional logic inside that function when its a > local file to put the parsed out `user` group back into the `location` group? This appears to fix it but i'm not sure if this is actually valid, > if it is I can submit it as an actual patch. I think there should probably be a corresponding addition of a fetch test around these types of > files? Let's do test driven development here :) There are tests for fetchers in bitbake/lib/bb/tests/fetch.py - if you can check and add a test that exposes this issue (e.g. fails without the fix and passes with it), then we can discuss what the best fix would be. Run them with bitbake-selftest. Alex
Okay I added an additional test case to URLHandle which then tests encoding and decoding the original patch I sent worked for decoding but in the encoding test it converted "@" to "%40" which I dont think we need for local files. The test-case looks like: diff --git a/bitbake/lib/bb/tests/fetch.py b/bitbake/lib/bb/tests/fetch.py index 4e26e38811..dff7780e33 100644 --- a/bitbake/lib/bb/tests/fetch.py +++ b/bitbake/lib/bb/tests/fetch.py @@ -1390,6 +1390,7 @@ class URLHandle(unittest.TestCase): "file://somelocation;someparam=1": ('file', '', 'somelocation', '', '', {'someparam': '1'}), "https://somesite.com/somerepo.git;user=anyUser:idtoken=1234" : ('https', 'somesite.com', '/somerepo.git', '', '', {'user': 'anyUser:idtoken=1234'}), r'git://s.o-me_ONE:!#$%^&*()-_={}[]\|:?,.<>~`@ git.openembedded.org/bitbake;branch=main;protocol=https': ('git', ' git.openembedded.org', '/bitbake', 's.o-me_ONE', r'!#$%^&*()-_={}[]\|:?,.<>~`', {'branch': 'main', 'prot ocol' : 'https'}), + "file://serialgetty@.service" : ('file', '', 'serialgetty@.service', '', '', {}), } # we require a pathname to encodeurl but users can still pass such urls to # decodeurl and we need to handle them So adding the encodeurl change I needed: diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py index ddee4400bb..4ea561e1f8 100644 --- a/bitbake/lib/bb/fetch2/__init__.py +++ b/bitbake/lib/bb/fetch2/__init__.py @@ -369,6 +369,11 @@ def decodeurl(url): path = location[locidx:] elif type.lower() == 'file': host = "" + # parsing out a user doesn't make sense for + # local files? + if user: + location = "%s@%s" % (user, location) + user = None path = location else: host = location @@ -414,7 +419,12 @@ def encodeurl(decoded): # Standardise path to ensure comparisons work while '//' in path: path = path.replace("//", "/") - url.append("%s" % urllib.parse.quote(path)) + urllib_parsed = urllib.parse.urlparse(path).path + if type == "file": + # Local systemd template file paths don't need the + # escaped character. + urllib_parsed = urllib_parsed.replace('%40', '@') + url.append(urllib_parsed) if p: for parm in p: url.append(";%s=%s" % (parm, p[parm])) - Brennan On Thu, Sep 26, 2024 at 10:03 AM Alexander Kanavin <alex.kanavin@gmail.com> wrote: > On Thu, 26 Sept 2024 at 16:57, Brennan Coslett via > lists.openembedded.org > <brennan.coslett=bluerivertech.com@lists.openembedded.org> wrote: > > > > I think the bug is in `bb.fetch.decodeurl`, I just noticed it because of > the archiver. > > @Alex if that means it should still move over to the oe-core list, let > me know and I can do that. > > Right, let's stay here. > > > Steps to replicate: > > 1. Checkout poky master (just replicated with > 8634e46b4040b6008410b6d77fecb5cbaec7e90e) > > 2. source oe-init-build-env > > 3. append the following to the auto-generated local.conf: > > + require conf/distro/include/init-manager-systemd.inc > > + INHERIT += "archiver" > > + ARCHIVER_MODE[src] = "original" > > 4. Run "bitbake systemd-serialgetty", the source dir will be created but > it wont contain anything: > > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ ls > tmp/deploy/sources/x86_64-poky-linux/systemd-serialgetty-1.0-r0/ > > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ > > > > I think something in the regex inside the `decodeurl` function in > `bitbake/lib/bb/fetch2/__init__.py` needs to change to stop it from > > parsing out `<name>@` in `<name>@.service` into the `user` field or > maybe it needs additional logic inside that function when its a > > local file to put the parsed out `user` group back into the `location` > group? This appears to fix it but i'm not sure if this is actually valid, > > if it is I can submit it as an actual patch. I think there should > probably be a corresponding addition of a fetch test around these types of > > files? > > Let's do test driven development here :) There are tests for fetchers > in bitbake/lib/bb/tests/fetch.py - if you can check and add a test > that exposes this issue (e.g. fails without the fix and passes with > it), then we can discuss what the best fix would be. Run them with > bitbake-selftest. > > Alex >
Can you turn these into patches, and send them as RFC? Alex On Thu, 26 Sept 2024 at 17:31, Brennan Coslett <brennan.coslett@bluerivertech.com> wrote: > > Okay I added an additional test case to URLHandle which then tests encoding and decoding the original patch I sent worked for decoding > but in the encoding test it converted "@" to "%40" which I dont think we need for local files. The test-case looks like: > > diff --git a/bitbake/lib/bb/tests/fetch.py b/bitbake/lib/bb/tests/fetch.py > index 4e26e38811..dff7780e33 100644 > --- a/bitbake/lib/bb/tests/fetch.py > +++ b/bitbake/lib/bb/tests/fetch.py > @@ -1390,6 +1390,7 @@ class URLHandle(unittest.TestCase): > "file://somelocation;someparam=1": ('file', '', 'somelocation', '', '', {'someparam': '1'}), > "https://somesite.com/somerepo.git;user=anyUser:idtoken=1234" : ('https', 'somesite.com', '/somerepo.git', '', '', {'user': 'anyUser:idtoken=1234'}), > r'git://s.o-me_ONE:!#$%^&*()-_={}[]\|:?,.<>~`@git.openembedded.org/bitbake;branch=main;protocol=https': ('git', 'git.openembedded.org', '/bitbake', 's.o-me_ONE', r'!#$%^&*()-_={}[]\|:?,.<>~`', {'branch': 'main', 'prot > ocol' : 'https'}), > + "file://serialgetty@.service" : ('file', '', 'serialgetty@.service', '', '', {}), > } > # we require a pathname to encodeurl but users can still pass such urls to > # decodeurl and we need to handle them > > So adding the encodeurl change I needed: > > diff --git a/bitbake/lib/bb/fetch2/__init__.py b/bitbake/lib/bb/fetch2/__init__.py > index ddee4400bb..4ea561e1f8 100644 > --- a/bitbake/lib/bb/fetch2/__init__.py > +++ b/bitbake/lib/bb/fetch2/__init__.py > @@ -369,6 +369,11 @@ def decodeurl(url): > path = location[locidx:] > elif type.lower() == 'file': > host = "" > + # parsing out a user doesn't make sense for > + # local files? > + if user: > + location = "%s@%s" % (user, location) > + user = None > path = location > else: > host = location > @@ -414,7 +419,12 @@ def encodeurl(decoded): > # Standardise path to ensure comparisons work > while '//' in path: > path = path.replace("//", "/") > - url.append("%s" % urllib.parse.quote(path)) > + urllib_parsed = urllib.parse.urlparse(path).path > + if type == "file": > + # Local systemd template file paths don't need the > + # escaped character. > + urllib_parsed = urllib_parsed.replace('%40', '@') > + url.append(urllib_parsed) > if p: > for parm in p: > url.append(";%s=%s" % (parm, p[parm])) > - Brennan > > On Thu, Sep 26, 2024 at 10:03 AM Alexander Kanavin <alex.kanavin@gmail.com> wrote: >> >> On Thu, 26 Sept 2024 at 16:57, Brennan Coslett via >> lists.openembedded.org >> <brennan.coslett=bluerivertech.com@lists.openembedded.org> wrote: >> > >> > I think the bug is in `bb.fetch.decodeurl`, I just noticed it because of the archiver. >> > @Alex if that means it should still move over to the oe-core list, let me know and I can do that. >> >> Right, let's stay here. >> >> > Steps to replicate: >> > 1. Checkout poky master (just replicated with 8634e46b4040b6008410b6d77fecb5cbaec7e90e) >> > 2. source oe-init-build-env >> > 3. append the following to the auto-generated local.conf: >> > + require conf/distro/include/init-manager-systemd.inc >> > + INHERIT += "archiver" >> > + ARCHIVER_MODE[src] = "original" >> > 4. Run "bitbake systemd-serialgetty", the source dir will be created but it wont contain anything: >> > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ ls tmp/deploy/sources/x86_64-poky-linux/systemd-serialgetty-1.0-r0/ >> > brennan.coslett@lpgt-bcosl001:~/missing-source/poky/build$ >> > >> > I think something in the regex inside the `decodeurl` function in `bitbake/lib/bb/fetch2/__init__.py` needs to change to stop it from >> > parsing out `<name>@` in `<name>@.service` into the `user` field or maybe it needs additional logic inside that function when its a >> > local file to put the parsed out `user` group back into the `location` group? This appears to fix it but i'm not sure if this is actually valid, >> > if it is I can submit it as an actual patch. I think there should probably be a corresponding addition of a fetch test around these types of >> > files? >> >> Let's do test driven development here :) There are tests for fetchers >> in bitbake/lib/bb/tests/fetch.py - if you can check and add a test >> that exposes this issue (e.g. fails without the fix and passes with >> it), then we can discuss what the best fix would be. Run them with >> bitbake-selftest. >> >> Alex
diff --git a/meta/classes/archiver.bbclass b/meta/classes/archiver.bbclass index 4a5865d7b5..daf7b4605b 100644 --- a/meta/classes/archiver.bbclass +++ b/meta/classes/archiver.bbclass @@ -181,6 +181,7 @@ python do_ar_original() { ar_outdir = d.getVar('ARCHIVER_OUTDIR') bb.note('Archiving the original source...') urls = d.getVar("SRC_URI").split() + bb.debug(1, "BRENNAN full SRC_URI %s" % urls) # destsuffix (git fetcher) and subdir (everything else) are allowed to be # absolute paths (for example, destsuffix=${S}/foobar). # That messes with unpacking inside our tmpdir below, because the fetchers @@ -198,7 +199,7 @@ python do_ar_original() { del decoded[5][param] encoded = bb.fetch2.encodeurl(decoded) urls[i] = encoded - + bb.debug(1, "BRENNAN %s" % urls) # Cleanup SRC_URI before call bb.fetch2.Fetch() since now SRC_URI is in the # variable "urls", otherwise there might be errors like: # The SRCREV_FORMAT variable must be set when multiple SCMs are used