Message ID | 20241203200242.2955858-2-ross.burton@arm.com |
---|---|
State | New |
Series | [1/2] fetch2/wget: handle HTTP 308 Permanent Redirect |
Note that I’d like someone with more than passing knowledge in the fetcher to double check this.

Also I just noticed the commit shortlog is entirely wrong (from an early version).

Ross

> On 3 Dec 2024, at 20:02, Ross Burton via lists.openembedded.org <ross.burton=arm.com@lists.openembedded.org> wrote:
>
> ud.path has been unescaped (eg %20 is space) but as we're reconstructing
> a URL we should re-escape it. For example, unzip has a SRC_URI
> containing "UnZip%206.x%20%28latest%29/UnZip%206.0/unzip60.tar.gz" which
> then throws exceptions if the unescaped string " (latest)" is used.
>
> Also, this code uses the extracted ud.host and ud.path variables. These
> are unescaped but potentially stale as eg the cargo fetcher subclasses
> Wget() and reassigns ud.url on construction.
>
> Simplify the code by reconstructing a URL from ud.url directly instead
> of bouncing through intermediate variables that may be wrong or
> unescaped.
>
> Signed-off-by: Ross Burton <ross.burton@arm.com>
> ---
>  bitbake/lib/bb/fetch2/wget.py | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/bitbake/lib/bb/fetch2/wget.py b/bitbake/lib/bb/fetch2/wget.py
> index fcb71246d9a..198426065b4 100644
> --- a/bitbake/lib/bb/fetch2/wget.py
> +++ b/bitbake/lib/bb/fetch2/wget.py
> @@ -376,8 +376,8 @@ class Wget(FetchMethod):
>              opener = urllib.request.build_opener(*handlers)
>
>              try:
> -                uri_base = ud.url.split(";")[0]
> -                uri = "{}://{}{}".format(urllib.parse.urlparse(uri_base).scheme, ud.host, ud.path)
> +                parts = urllib.parse.urlparse(ud.url.split(";")[0])
> +                uri = "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)
>                  r = urllib.request.Request(uri)
>                  r.get_method = lambda: "HEAD"
>                  # Some servers (FusionForge, as used on Alioth) require that the
> --
> 2.34.1
diff --git a/bitbake/lib/bb/fetch2/wget.py b/bitbake/lib/bb/fetch2/wget.py
index fcb71246d9a..198426065b4 100644
--- a/bitbake/lib/bb/fetch2/wget.py
+++ b/bitbake/lib/bb/fetch2/wget.py
@@ -376,8 +376,8 @@ class Wget(FetchMethod):
             opener = urllib.request.build_opener(*handlers)
 
             try:
-                uri_base = ud.url.split(";")[0]
-                uri = "{}://{}{}".format(urllib.parse.urlparse(uri_base).scheme, ud.host, ud.path)
+                parts = urllib.parse.urlparse(ud.url.split(";")[0])
+                uri = "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)
                 r = urllib.request.Request(uri)
                 r.get_method = lambda: "HEAD"
                 # Some servers (FusionForge, as used on Alioth) require that the
ud.path has been unescaped (eg %20 is space) but as we're reconstructing
a URL we should re-escape it. For example, unzip has a SRC_URI
containing "UnZip%206.x%20%28latest%29/UnZip%206.0/unzip60.tar.gz" which
then throws exceptions if the unescaped string " (latest)" is used.

Also, this code uses the extracted ud.host and ud.path variables. These
are unescaped but potentially stale as eg the cargo fetcher subclasses
Wget() and reassigns ud.url on construction.

Simplify the code by reconstructing a URL from ud.url directly instead
of bouncing through intermediate variables that may be wrong or
unescaped.

Signed-off-by: Ross Burton <ross.burton@arm.com>
---
 bitbake/lib/bb/fetch2/wget.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
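
To illustrate the escaping issue described in the commit message, here is a minimal standalone sketch that uses only urllib.parse and does not touch the fetcher itself. The host name, the ";name=unzip" parameter and the rebuild_uri() helper are hypothetical and exist only for this example. The "unescaped" value imitates an already-decoded path being pasted back into a URL, which is roughly what the old code did with ud.path.

```python
import urllib.parse

# Hypothetical SRC_URI-style value: the host and the ";name=unzip"
# parameter are invented for this example.
url = ("https://downloads.example.com/"
       "UnZip%206.x%20%28latest%29/UnZip%206.0/unzip60.tar.gz;name=unzip")


def rebuild_uri(url):
    """Rebuild scheme://netloc/path, mirroring the two added lines in the patch."""
    # Drop the ";key=value" parameters, then parse what is left.
    parts = urllib.parse.urlparse(url.split(";")[0])
    # urlparse() does not decode percent-escapes, so parts.path is still escaped.
    return "{}://{}{}".format(parts.scheme, parts.netloc, parts.path)


escaped = rebuild_uri(url)

# Imitate the old behaviour: decode the path first, then paste it back in.
parts = urllib.parse.urlparse(url.split(";")[0])
unescaped = "{}://{}{}".format(
    parts.scheme, parts.netloc, urllib.parse.unquote(parts.path))

print(escaped)    # percent-escapes preserved
print(unescaped)  # contains literal spaces and parentheses
```

The first URL keeps %20, %28 and %29 and can be handed to urllib.request.Request() as is. The second contains the literal " (latest)" string that the commit message reports as causing exceptions.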